Memory Leak: Relax Density MPI RAM overusage kills process

I am trying to use Rosetta relax to refine my homology-modelled enzyme models against my CCP4 maps, but Rosetta's RAM usage rockets up to 128 GB in a little less than a minute (regardless of the number of processors), and then Linux kills the process for running out of RAM. The system has 128 GB, and only 1-2 GB are used by other processes. Watching htop while the job runs, I see usage climb from ~30% all the way up, and the job dies quite clearly when it runs out of RAM. I have also tried running it with 32 or 8 MPI processes, with no difference in the result.

 

Indication of RAM overusage killing the process:

mpirun noticed that process rank 5 with PID 0 on node wiestml exited on signal 9 (Killed).
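
Signal 9 on its own does not say who sent it. One way to confirm that the kernel's out-of-memory killer was responsible is to search the kernel log. This is a minimal sketch, assuming a Linux system where `dmesg` or `journalctl -k` is readable (this often requires root). The helper name `oom_events` is made up for illustration, and the exact message wording varies between kernel versions, so the patterns are deliberately loose:

```shell
#!/bin/bash
# Filter kernel-log text on stdin for out-of-memory killer messages.
# Loose, case-insensitive match; wording differs between kernel versions.
oom_events() {
  grep -Ei 'out of memory|oom-kill|oom_reaper'
}

# Typical use (left commented; may need root):
#   dmesg -T | oom_events
#   journalctl -k --since "1 hour ago" | oom_events
```

If nothing is printed, the kill may have come from somewhere else (for example a resource limit or a scheduler) rather than from the kernel's OOM killer.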

 

My executable file running the MPI application:

mpirun -np 16 /opt/rosetta/rosetta_src_2021.16.61629_bundle/main/source/bin/relax.linuxgccrelease \
 -database /opt/rosetta/rosetta_src_2021.16.61629_bundle/main/database \
 -in:file:s score_pre4i4b_a2_b2_cat_INPUT_0001.pdb  \
 -relax:fast \
 -relax:jump_move true \
 -edensity:mapfile 4i4b_phases_2mFo-DFc.ccp4 \
 -edensity:mapreso 1.70 \
 -edensity:fastdens_wt 50.0 \
 -symmetry_definition 4i4b_a2_b2.symm \
 -ignore_unrecognized_res \
 -out::nstruct 5 \
 -ex1 -ex2


 

Happy to share the full output file if it is useful, but there are no errors in the output.

Truncated Output:

core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.conformation.symmetry.Conformation: Found disulfide between residues 294 720
core.conformation.symmetry.Conformation: current variant for 294 CYS
core.conformation.symmetry.Conformation: current variant for 720 CYS
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.conformation.symmetry.Conformation: Found disulfide between residues 294 720
core.conformation.symmetry.Conformation: current variant for 294 CYS
core.conformation.symmetry.Conformation: current variant for 720 CYS
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.conformation.symmetry.Conformation: Found disulfide between residues 294 720
core.conformation.symmetry.Conformation: current variant for 294 CYS
core.conformation.symmetry.Conformation: current variant for 720 CYS
core.conformation.symmetry.Conformation: current variant for 294 CYD
core.conformation.symmetry.Conformation: current variant for 720 CYD
core.conformation.symmetry.Conformation: Add symmetric chemical bond 720 to 294
core.conformation.symmetry.Conformation: Found disulfide between residues 294 720
core.conformation.symmetry.Conformation: current variant for 294 CYS
core.conformation.symmetry.Conformation: current variant for 720 CYS
core.conformation.symmetry.Conformation: current variant for 294 CYD
core.conformation.symmetry.Conformation: current variant for 720 CYD
core.conformation.symmetry.Conformation: Add symmetric chemical bond 720 to 294
core.conformation.symmetry.Conformation: Add symmetric chemical bond 294 to 720
core.conformation.symmetry.Conformation: Add symmetric chemical bond 294 to 720
core.conformation.symmetry.Conformation: current variant for 294 CYD
core.conformation.symmetry.Conformation: current variant for 720 CYD
core.conformation.symmetry.Conformation: Add symmetric chemical bond 720 to 294
core.conformation.symmetry.Conformation: Add symmetric chemical bond 294 to 720
core.conformation.symmetry.Conformation: current variant for 294 CYD
core.conformation.symmetry.Conformation: current variant for 720 CYD
core.conformation.symmetry.Conformation: Add symmetric chemical bond 720 to 294
core.conformation.symmetry.Conformation: Add symmetric chemical bond 294 to 720
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.conformation.symmetry.Conformation: current variant for 294 CYD
core.conformation.symmetry.Conformation: current variant for 720 CYD
core.conformation.symmetry.Conformation: Add symmetric chemical bond 720 to 294
core.conformation.symmetry.Conformation: current variant for 294 CYD
core.conformation.symmetry.Conformation: current variant for 720 CYD
core.conformation.symmetry.Conformation: Add symmetric chemical bond 720 to 294
core.conformation.symmetry.Conformation: Add symmetric chemical bond 294 to 720
core.conformation.symmetry.Conformation: Add symmetric chemical bond 294 to 720
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.pose.util: [ WARNING ] addVirtualResAsRoot() called but pose is already rooted on a VRT residue ... continuing.
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity: Loading Density Map
core.scoring.electron_density.ElectronDensity: Loading density map4i4b_phases_2mFo-DFc.ccp4
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:  Setting resolution to 1.7A
core.scoring.electron_density.ElectronDensity:           atom mask to 3.2A
core.scoring.electron_density.ElectronDensity:             CA mask to 6A
core.scoring.electron_density.ElectronDensity:  Read density map'4i4b_phases_2mFo-DFc.ccp4'
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.scoring.electron_density.ElectronDensity:      extent: 200 x 174 x 180
core.scoring.electron_density.ElectronDensity:      origin: 177 x 223 x 125
core.scoring.electron_density.ElectronDensity:   altorigin: 0 x 0 x 0
core.scoring.electron_density.ElectronDensity:        grid: 540 x 540 x 540
core.scoring.electron_density.ElectronDensity:     celldim: 226.082 x 226.082 x 226.082
core.scoring.electron_density.ElectronDensity:  cellangles: 90 x 90 x 90
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity:  voxel vol.: 0.0733866
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
core.scoring.electron_density.ElectronDensity: Effective resolution = 1.7
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Setting [kmin_,kmax_] to [0.0637226,3.25267]
core.scoring.electron_density.ElectronDensity: Bin 1:  B(C/N/O/S)=0 / 0 / 0 / 8.60156  sum=(0,0)
core.scoring.electron_density.ElectronDensity: Bin 1:  B(C/N/O/S)=0 / 0 / 0 / 8.60156  sum=(0,0)
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
Got some signal... It is:15
Signal 15 (SIGTERM) means that the process was terminated.  This usually means that something external to Rosetta, such as a queing system, aborted the process (e.g. due to a time or resource limit).
--------------------------------------------------------------------------
mpirun noticed that process rank 5 with PID 0 on node wiestml exited on signal 9 (Killed).

 

Sun, 2023-12-31 12:02
mmfarrugia

Sounds like some sort of memory leak.

First thing I'd do is try the most recent version of Rosetta (weekly 2023.45) and see if that has fixed things.

If it hasn't, then I'd recommend posting all the input files needed to reproduce the issue (you may need to use an external file hosting system), and we can take a look at what might be going on.

Sun, 2023-12-31 12:11
rmoretti

Hi Rocco,

 

I tried the most recent (weekly 2023.45, source code tar is rosetta.source.release-362) and had the same issue. 

 

mpi_relax_density_4i4b.sh Executable:

#!/bin/bash


mpirun -np 16 /opt/rosetta/rosetta.source.release-362/main/source/bin/relax.linuxgccrelease \
 -database /opt/rosetta/rosetta.source.release-362/main/database \
 -in:file:s score_pre4i4b_a2_b2_cat_INPUT_0001.pdb  \
 -relax:fast \
 -relax:jump_move true \
 -edensity:mapfile 4i4b_phases_2mFo-DFc.ccp4 \
 -edensity:mapreso 1.70 \
 -edensity:fastdens_wt 50.0 \
 -symmetry_definition 4i4b_a2_b2.symm \
 -ignore_unrecognized_res \
 -out::nstruct 5 \
 -ex1 -ex2
 

 

I will upload the files to a GitHub repo if possible, otherwise to a Google Drive folder, and share the link in an updated reply. It is just the PDB file, the symmetry definition file, and the CCP4 map, but the PDB is too large to upload.

Mon, 2024-01-08 10:03
mmfarrugia

@rmoretti

 

Link to the other files:

https://drive.google.com/drive/folders/1hoNTsq5hOyRyIHEsV3BQ7Ms34duUslKY?usp=sharing

 

The folder contains the symmetry definition, the input PDB (after the necessary preceding steps), and the script that calls mpirun.

Tue, 2024-01-09 08:28
mmfarrugia

Update: I have also tried this with the pre-compiled binaries (relax.static.linuxgccrelease) from 3.13 for Linux. RAM usage climbs to almost full, drops back down to ~80 GB, then jumps back up to 124 GB plus the full 16 GB of swap before crashing (128 GB RAM computer).

 

However, I have now run it with mpirun on 4 cores, and so far this seems to run at around 12.6 GB, with MEM% at only 2 per thread. I've done the math for the scaling, though, and it doesn't make much sense that usage would be so much higher at 16 cores. Either the problem is specific to a larger number of threads, or I simply haven't hit the memory spike/leak yet, since the job is slower on 4 cores than on 16 or 32.

I will update if it runs all the way through, or if it does turn out to be a memory-allocation problem or leak.
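
To tell a start-up spike from a steady leak, it helps to log the combined resident memory of all ranks over time rather than relying on occasional htop checks. This is a minimal sketch, assuming a Linux `ps` that supports `-eo rss=,args=` (RSS reported in KiB). The helper name `sum_rss`, the log file name, and the 5-second interval are arbitrary choices for illustration:

```shell
#!/bin/bash
# Sum the RSS column (KiB, as printed by `ps -eo rss=,args=`) over processes
# whose executable path (second field) matches the given regex.
# Matching on the executable only keeps mpirun and awk itself out of the total.
sum_rss() {  # usage: ps -eo rss=,args= | sum_rss <regex>
  awk -v pat="$1" '
    $2 ~ pat { kib += $1; n++ }
    END { printf "%d procs, %.1f GiB\n", n, kib / 1048576 }'
}

# Typical use (left commented because it loops until interrupted):
#   while sleep 5; do
#     printf "%s " "$(date +%T)"
#     ps -eo rss=,args= | sum_rss "relax"
#   done >> relax_mem.log
```

A curve that rises once and then settles points to set-up cost. One that keeps climbing for the whole run points to a leak.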

 

Thu, 2024-01-11 11:42
mmfarrugia

I think I have further narrowed down the problem. Once it has settled into a rhythm, the job uses a little under 5 GB per process. Initially, though, it intermittently spikes to well over 12 GB of RAM per process, sometimes holding that level for a while, before dropping to 8 or 10 GB per process and finally calming down to the <5 GB per process where it stays for the majority of the job, as far as I can tell from intermittent htop checks. I'd imagine this could be due to the setup of the calculation, since the files themselves are not that large. This is an ~800 AA system.

Fri, 2024-01-12 10:35
mmfarrugia
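
The per-process figures reported in the last post can be turned into a rough capacity check. This sketch uses only numbers quoted in this thread: 128 GB of RAM, spikes of at least 12 GB per process, and under 5 GB per process at steady state. The function names are made up for illustration, and because the 12 GB figure is a lower bound, the real limit may be lower:

```shell
#!/bin/bash
# Back-of-the-envelope check using figures quoted in this thread (whole GB).
total_ram_gb=128      # RAM in the machine
peak_per_rank_gb=12   # start-up spike per process (reported lower bound)
steady_per_rank_gb=5  # steady-state usage per process (reported upper bound)

# Combined footprint of N ranks at a given per-rank size.
mem_needed() {  # usage: mem_needed <ranks> <gb_per_rank>
  echo $(( $1 * $2 ))
}

# Largest rank count whose combined footprint fits in RAM (integer division).
max_ranks() {   # usage: max_ranks <total_gb> <gb_per_rank>
  echo $(( $1 / $2 ))
}

for n in 4 16 32; do
  echo "np=$n: peak >= $(mem_needed "$n" "$peak_per_rank_gb") GB," \
       "steady <= $(mem_needed "$n" "$steady_per_rank_gb") GB"
done
echo "ranks that fit at peak: $(max_ranks "$total_ram_gb" "$peak_per_rank_gb")"
```

With these figures, 16 ranks need at least 192 GB if they all spike together, which exceeds the 128 GB of RAM plus 16 GB of swap, while 4 ranks need about 48 GB. The steady-state total for 16 ranks, 80 GB, is consistent with the ~80 GB plateau seen before the crash. At most 10 ranks would fit at peak, and fewer if the spikes are larger than 12 GB.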