Segmentation fault

Hi everyone,
I tried to run the AbinitioRelax protocol to predict the structure of an amino acid sequence.
However, I can only run this protocol with fewer than 11 cores (CPU processors). If I increase the number of cores above 11, the protocol won't run, even though my PC has 12 or 16 processors. It returns "Segmentation fault (11)" and cancels the job.
The same problem doesn't happen when I run the FlexPepDock protocol.
Has anybody run into the same problem?
Can anybody help me solve it?
Thank you so much.
Hongtham

Mon, 2014-12-15 20:21
Hongtham

How are you specifying the number of processors? Is that through the MPI launcher application, or are you trying some other method? (Are you running an MPI compile?)

What is being printed to the tracer immediately prior to the segmentation fault? Would it be possible for you to recompile the application in debug mode and re-do the runs? There are a number of checks in debug mode which are removed from release mode - one of them may tell us better what is going on.

Tue, 2014-12-16 03:02
rmoretti

Hi rmoretti,
Yes, I am using the MPI compile. I specified the number of processors with the command below (here, 12 is the number of processors):
/usr/local/bin/mpiexec -np 12 ~/rosetta3.4/rosetta_source/bin/AbinitioRelax.mpi.linuxgccrelease @flags > log

This is the error file:

[node3:18264] [ 0] /lib64/libpthread.so.0() [0x38d4a0f710]
[node3:18264] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x38d4232925]
[node3:18264] [ 2] /lib64/libc.so.6(abort+0x175) [0x38d4234105]
[node3:18264] [ 3] /lib64/libc.so.6() [0x38d422ba4e]
[node3:18264] [ 4] /lib64/libc.so.6(__assert_perror_fail+0) [0x38d422bb10]
[node3:18264] [ 5] /home/khuong/SW/rosetta/debug/rosetta3.4/rosetta_source/build/src/debug/linux/2.6/64/x86/gcc/4.4/mpi/libprotocols_b.5.so(_ZNK7utility7pointer10owning_ptrIN9protocols7jobdist8BasicJobEEptEv+0x37) [0x2ad61de418c7]
[node3:18264] [ 6] /home/khuong/SW/rosetta/debug/rosetta3.4/rosetta_source/build/src/debug/linux/2.6/64/x86/gcc/4.4/mpi/libprotocols.1.so(_ZN9protocols7jobdist18BaseJobDistributor8next_jobERN7utility7pointer10owning_ptrINS0_8BasicJobEEERi+0x282) [0x2ad622e1f3ee]
[node3:18264] [ 7] /home/khuong/SW/rosetta/debug/rosetta3.4/rosetta_source/build/src/debug/linux/2.6/64/x86/gcc/4.4/mpi/libprotocols_b.5.so(_ZN9protocols8abinitio18AbrelaxApplication4foldERN4core4pose4PoseEN7utility7pointer10owning_ptrINS0_8ProtocolEEE+0x334c) [0x2ad61deff1ba]
[node3:18264] [ 8] /home/khuong/SW/rosetta/debug/rosetta3.4/rosetta_source/build/src/debug/linux/2.6/64/x86/gcc/4.4/mpi/libprotocols_b.5.so(_ZN9protocols8abinitio18AbrelaxApplication3runEv+0x1d9) [0x2ad61defff41]
[node3:18264] [ 9] /home/khuong/SW/rosetta/debug/rosetta3.4/rosetta_source/bin/AbinitioRelax.mpi.linuxgccdebug(main+0xc8) [0x40a68c]
[node3:18264] [10] /lib64/libc.so.6(__libc_start_main+0xfd) [0x38d421ed1d]
[node3:18264] [11] /home/khuong/SW/rosetta/debug/rosetta3.4/rosetta_source/bin/AbinitioRelax.mpi.linuxgccdebug() [0x40a509]
[node3:18264] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 13 with PID 18268 on node node3.local exited on signal 6 (Aborted).
--------------------------------------------------------------------------
Hongtham

Tue, 2014-12-16 04:29
Hongtham

I have had similar problems. I think you should generate at least one model for each processor used. Is your -nstruct currently set to 10? If so, I would try increasing it until you have more models than processors.
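
For anyone who wants to check this before launching, here is a rough sketch of a pre-flight test in Python. It is only an illustration, not part of Rosetta: the function names are made up, and it assumes the flags file sets the model count on a line of the form "-nstruct N" or "-out:nstruct N", with "#" starting a comment. It applies the "at least one model per processor" rule suggested above.

```python
import re


def read_nstruct(flags_text):
    """Return the last -nstruct / -out:nstruct value found, or None.

    Assumes one option per line and '#' comments (an assumption about
    the flags file layout, not a full parser for it).
    """
    value = None
    for line in flags_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^-(?:out:)?nstruct\s+(\d+)\b", line)
        if match:
            value = int(match.group(1))
    return value


def enough_models(flags_text, n_procs):
    """True if nstruct is set and is at least the MPI process count."""
    nstruct = read_nstruct(flags_text)
    if nstruct is None:
        # Not set in the file: treat as "not enough" so the user checks.
        return False
    return nstruct >= n_procs


if __name__ == "__main__":
    example_flags = "-nstruct 10  # models to generate\n"
    for n_procs in (8, 12):
        verdict = "ok" if enough_models(example_flags, n_procs) else "raise -nstruct"
        print(f"-np {n_procs}: {verdict}")
```

To use it on a real run, read the file passed as @flags (for example open("flags").read()) and pass the same number you give to mpiexec -np. Whether a larger -nstruct actually cures this particular crash is still only a guess from the thread.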

Fri, 2015-08-07 15:49
Sandy