segmentation fault after core.kinematics.FoldTree in homology modeling

Hi, when I run homology modeling, I get a 'segmentation fault' message. Below is my output right before the error. Why could this be happening, and what could I do to solve it? Thank you for any insights.

......
protocols.looprelax: Repacking because in loop: 387
protocols.looprelax: Repacking because in loop: 388
protocols.looprelax: Repacking because in loop: 389
protocols.looprelax: Missing dens: 389
protocols.looprelax: Repacking required
protocols.looprelax: Detecting disulfides
protocols.looprelax: Annotated sequence before repack: M[MET_p:NtermProteinFull]GKMAAAVGSVATLATEPGEDAFRKLFRFYRQSRPGTADLEGVIDFSAAHAARGKGPGAQKVIKSQLNVSSVSEQNAYRAGLQPVSKWQAYGLKGYPGFIFIPNPFLPGYQWHWVKQCLKLYSQKPNVCNLDKHMSKEETQDLWEQSKEFLRYKEATKRRPRSLLEKLRWVTVGYHYNWDSKKYSADHYTPFPSDLGFLSEQVAAACGFEDFRAEAGILNYYRLDSTLGIHVDRSELDHSKPLLSFSFGQSAIFLLGGLQRDEAPTAMFMHSGDIMIMSGFSRLLNHAVPRVLPNPEGEGLPHCLEAPLPAVLPRDSMVEPCSMEDWQVCASYLKTARVNMTVRQVLATDQNFPLEPIEDEKRDISTEGFCHLDDQNSEVKRARINPDS[SER_p:CtermProteinFull]
core.pack.dunbrack: Dunbrack library took 0 seconds to load from binary
core.pack.interaction_graph.interaction_graph_factory: Instantiating DensePDInteractionGraph
core.pack.pack_rotamers: built 2431 rotamers at 320 positions.
core.pack.pack_rotamers: IG: 1647024 bytes
core.pack.annealer.AnnealerFactory: Creating FixbbSimAnnealer
core.optimize: AtomTreeMinimizer::run: nangles= 822 start_score: 36295.187 start_func: 36303.538 end_score: 3617.902 end_func: 2999.605
protocols.looprelax: 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 3617.902 -1225.216 3409.173 633.801 11.531 328.853 -18.366 -66.685 -62.207 -13.627 -8.827 0.000 0.000 0.000 0.000 75.799 60.759 557.945 3.909 -68.940====================================================================================
protocols.looprelax: ===
protocols.looprelax: === Refine
protocols.looprelax: ===
core.kinematics.FoldTree: FoldTree::reorder( 36 ) failed, new/old edge_list_ size mismatch
core.kinematics.FoldTree: 43 41
core.kinematics.FoldTree: FOLD_TREE EDGE 1 36 -1 EDGE 36 50 -1 EDGE 50 56 1 EDGE 50 54 -1 EDGE 55 56 -1 EDGE 56 74 -1 EDGE 74 81 2 EDGE 74 77 -1 EDGE 78 81 -1 EDGE 81 96 -1 EDGE 96 101 3 EDGE 96 98 -1 EDGE 99 101 -1 EDGE 101 123 -1 EDGE 123 208 4 EDGE 123 190 -1 EDGE 191 208 -1 EDGE 208 123 -1 EDGE 123 217 5 EDGE 123 209 -1 EDGE 210 217 -1 EDGE 217 123 -1 EDGE 123 219 6 EDGE 123 215 -1 EDGE 216 219 -1 EDGE 219 224 -1 EDGE 224 229 7 EDGE 224 225 -1 EDGE 226 229 -1 EDGE 229 230 -1 EDGE 230 235 8 EDGE 230 232 -1 EDGE 233 235 -1 EDGE 235 281 -1 EDGE 281 287 9 EDGE 281 284 -1 EDGE 285 287 -1 EDGE 287 291 -1 EDGE 291 335 10 EDGE 291 302 -1 EDGE 303 335 -1 EDGE 335 344 -1 EDGE 344 389 -1
Segmentation fault

Mon, 2012-10-08 14:27
pdbb

Unfortunately, a segmentation fault isn't an informative error message, so it doesn't help us diagnose the problem (it just means Rosetta crashed by accessing memory outside its allowed allocation). Can you post your command line?

Off the top of my head, the FoldTree is composed of a ton of tiny fragments. I don't know if that's appropriate for where you are in the protocol (but I'm guessing not, if it's vanilla homology modeling - Rocco, can you correct me?)

320 residues is also awfully large for a homology modeling problem, which may be contributing to the crash.

Mon, 2012-10-08 14:35
smlewis

FoldTree manipulation does happen in homology modeling, particularly in the loop remodeling stages, where the loops are represented as branches in the FoldTree relative to the fixed core. (This is to avoid lever arm effects when moving the loops.) The FoldTree mostly looks like a typical loop remodeling one (not surprising, given the protocols.looprelax lines), with polymer (-1) edges connecting the "fixed" residues, jumps (positive numbers) from fixed region to fixed region across the loops, and the loops themselves built from either side toward the middle.

However, it gets rather strange in the middle. Jump 4 goes across the loop from 123 to 208, with the cut loop running from 123 to 190 and from 208 to 191, but then there is a straight polymer run from 208 to 123, another from 123 to 209, then one from 217 to 123, and so on. It looks like the protocol is trying to model loops on loops and loops overlapping loops. My guess is that's why it's crashing. Are you specifying a loops file?
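
For reference, the overlap described above can be checked mechanically. The following is a minimal Python sketch, not part of Rosetta: it parses the FOLD_TREE line copied from the log and reports polymer (-1) edges whose residue ranges share more than a single endpoint, plus start residues used by more than one jump. The function names (parse_fold_tree, overlapping_polymer_edges, shared_jump_starts) are made up for illustration, and the assumption that polymer edges in a healthy tree only meet at their endpoints is a simplification.

```python
import re
from collections import Counter

# FOLD_TREE line copied verbatim from the log in the first post.
FOLD_TREE = (
    "FOLD_TREE EDGE 1 36 -1 EDGE 36 50 -1 EDGE 50 56 1 EDGE 50 54 -1 "
    "EDGE 55 56 -1 EDGE 56 74 -1 EDGE 74 81 2 EDGE 74 77 -1 EDGE 78 81 -1 "
    "EDGE 81 96 -1 EDGE 96 101 3 EDGE 96 98 -1 EDGE 99 101 -1 EDGE 101 123 -1 "
    "EDGE 123 208 4 EDGE 123 190 -1 EDGE 191 208 -1 EDGE 208 123 -1 "
    "EDGE 123 217 5 EDGE 123 209 -1 EDGE 210 217 -1 EDGE 217 123 -1 "
    "EDGE 123 219 6 EDGE 123 215 -1 EDGE 216 219 -1 EDGE 219 224 -1 "
    "EDGE 224 229 7 EDGE 224 225 -1 EDGE 226 229 -1 EDGE 229 230 -1 "
    "EDGE 230 235 8 EDGE 230 232 -1 EDGE 233 235 -1 EDGE 235 281 -1 "
    "EDGE 281 287 9 EDGE 281 284 -1 EDGE 285 287 -1 EDGE 287 291 -1 "
    "EDGE 291 335 10 EDGE 291 302 -1 EDGE 303 335 -1 EDGE 335 344 -1 "
    "EDGE 344 389 -1"
)


def parse_fold_tree(text):
    """Return (start, stop, label) tuples. Label -1 is a polymer edge,
    a positive label is a jump number."""
    pattern = r"EDGE\s+(\d+)\s+(\d+)\s+(-?\d+)"
    return [(int(a), int(b), int(c)) for a, b, c in re.findall(pattern, text)]


def overlapping_polymer_edges(edges):
    """Pairs of polymer edges whose residue ranges share more than one endpoint.
    Assumption (simplified): in a well-formed tree they only meet at endpoints."""
    spans = sorted((min(a, b), max(a, b)) for a, b, label in edges if label == -1)
    hits = []
    for i, (lo1, hi1) in enumerate(spans):
        for lo2, hi2 in spans[i + 1:]:
            if max(lo1, lo2) < min(hi1, hi2):
                hits.append(((lo1, hi1), (lo2, hi2)))
    return hits


def shared_jump_starts(edges):
    """Start residues that are used by more than one jump edge."""
    counts = Counter(a for a, b, label in edges if label > 0)
    return {res: n for res, n in counts.items() if n > 1}


if __name__ == "__main__":
    edges = parse_fold_tree(FOLD_TREE)
    print("jumps sharing a start residue:", shared_jump_starts(edges))
    for first, second in overlapping_polymer_edges(edges):
        print("overlapping polymer edges:", first, second)
```

On the tree above, every overlap this reports falls between residues 123 and 219, which is the region that looks wrong by eye.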

Tue, 2012-10-09 12:57
rmoretti

Hi, thank you for the replies,
Following are my flags related to loops. Do you think I could tackle this problem by changing these flags?
Thanks

-loops:frag_sizes 9 3 1
-loops:aat000_09_05.200_v1_3.gz aat000_03_05.200_v1_3.gz none
-loops:extended
-loops:build_initial
-loops:remodel quick_ccd
-loops:refine refine_ccd

Wed, 2013-03-27 11:06
pdbb

You haven't really provided enough information to say. What is the full command line you used to start the run (including which application you used), and your full flag files? It's best to just copy/paste the whole thing as-is, rather than trying to edit things, as you avoid things like "-loops:aat000_09_05.200_v1_3.gz" which would cause an early halt to the run because of bad options.
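
As an aside, this kind of copy/paste damage is easy to catch before posting or running. Below is a small Python sketch, not a Rosetta tool: it scans flag lines and reports option names that look like file names (the "-loops:aat000_09_05.200_v1_3.gz" case) and options that are given more than once. It does not know which options Rosetta actually accepts. The function name check_flags and the heuristic "an option name should not contain a dot" are assumptions made for illustration.

```python
from collections import Counter

# The loop-related flags exactly as posted above.
FLAGS = [
    "-loops:frag_sizes 9 3 1",
    "-loops:aat000_09_05.200_v1_3.gz aat000_03_05.200_v1_3.gz none",
    "-loops:extended",
    "-loops:build_initial",
    "-loops:remodel quick_ccd",
    "-loops:refine refine_ccd",
]


def check_flags(lines):
    """Return a list of human-readable warnings for suspicious flag lines.
    Heuristics only: this does not validate options against Rosetta itself."""
    names = []
    problems = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not line.startswith("-"):
            problems.append(f"not an option: {line}")
            continue
        name = line.split()[0]
        names.append(name)
        if "." in name:  # assumption: real option names contain no dots
            problems.append(f"option name looks like a file name: {name}")
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"option given {count} times: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_flags(FLAGS):
        print(problem)
```

A repeated option is not necessarily an error, but it is worth a second look when cleaning up a flags file.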

That said, the subset of flags shown probably isn't the cause of the issue. As mentioned before, the core of the problem is probably how you're specifying the loops or how Rosetta is auto-determining the loops.

Wed, 2013-03-27 14:00
rmoretti

Hi, thank you for the reply.

Following are my flags. I have tried both with and without a loop file, and in both cases I get a segmentation fault. It works okay when I run Rosetta on smaller parts of the target protein sequence. The template covers only parts of the target protein; there are two sections that are not covered by the template. I was wondering if it is possible to fold only the parts that are covered by the template and leave the two uncovered parts as two large loops (100 residues each), so that Rosetta won't try to fold these loops or break them down further into smaller loops. I don't think I can use homology modeling with extension, as I think these loops interact with each other. What do you think?
Thanks

-run:protocol threading
-in:file:alignment template_target.aln
-cm:aln_format general
-frag3 ./starting_files/fragments/aat000_03_05.200_v1_3.gz
-frag9 ./starting_files/fragments/aat000_09_05.200_v1_3.gz
-in:file:fasta target.fasta
-in:file:fullatom
-loops:frag_sizes 9 3 1
-loops:frag_files ./starting_files/fragments/aat000_09_05.200_v1_3.gz ./starting_files/fragments/aat000_03_05.200_v1_3.gz
-in:file:psipred_ss2 ./starting_files/t000_.psipred_ss2
-in:file:fullatom
-out:nstruct 10000
-in:file:template_pdb template.pdb
-database ../../programs/rosetta3.4/rosetta_database/
-loops:extended
-loops:build_initial
-loops:remodel quick_ccd
-loops:refine refine_ccd
-silent_decoytime
-random_grow_loops_by 4
-select_best_loop_from 1
-out:file:fullatom
-out:output
-out:level 400
-out:file:silent threaded_model.out
-out:file:silent_struct_type binary
-out:file:scorefile threaded_model.fasc

Sat, 2013-03-30 12:59
pdbb

I can't see anything untoward with those flags offhand.

As long as your alignment file is set up correctly (and there are no additional issues), Rosetta should be able to handle unaligned regions. And by "handle" I mean "not crash on", not necessarily "model accurately". If you have large loops (e.g. larger than 10 aa or so), Rosetta has a problem, as it doesn't have enough template information to model them correctly. Two large loops, especially ones interacting with each other, will likely give you a large number of possibilities, most of which are garbage.

If these domains are likely to be unimportant structurally with respect to the structure of the remaining protein (a reasonable possibility, given that the homologs don't have them), then one possibility is to attempt to model the protein without those loops, by truncating them or replacing them with a shorter dummy loop. If the loops are thought to be independently folding domains, you could try to model them ab initio (each separately, as 100 aa is in the range to do so, but 200 is probably too big), or by homology modeling with homologs, if you have them - either separately or with dummy connecting loops if they're a pair.

If there are homologs for the loops, albeit in different PDBs than the main portion of the protein, the new "hybrid protocol" might be useful to you. I believe (but am not certain) that it will be included in the upcoming 3.5 release. It will allow you to mix and match homologs for different sections of the protein to fold.

If you're still getting segmentation faults, I'd probably recommend running in debug mode to see if there's a more helpful error message. Past that, a debugger would be needed. (See https://www.rosettacommons.org/node/3226#comment-5904 and https://www.rosettacommons.org/node/3226#comment-5913 for specifics.)

Mon, 2013-04-01 11:53
rmoretti

-select_best_loop_from 1

If I understand this flag correctly, it does nothing when the argument is 1. An argument larger than one causes it to do extra modeling (N models) and then select a best sub-model from the N...but 1 model is 1 model.

-random_grow_loops_by 4

How close are your closest loops to each other in primary sequence? Is it possible that the loops could come into overlap because of the random growth (are the closest loops 8 or fewer residues apart in primary sequence?) Try it without this flag?

Posting your actual loops file may help us.
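
The question about loop spacing can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the loops file was never posted, so LOOPS below is read off the jump endpoints in the FoldTree from the log (these are anchor residues, not necessarily the loop definitions Rosetta started from), and growth is modeled as the worst case where every loop end extends by the full -random_grow_loops_by value. The helper names grow and colliding_pairs are hypothetical.

```python
# Assumption: (start, stop) pairs taken from the jump edges of the posted
# FoldTree, standing in for the loops file that was not posted.
LOOPS = [
    (50, 56), (74, 81), (96, 101), (123, 208), (123, 217),
    (123, 219), (224, 229), (230, 235), (281, 287), (291, 335),
]
NRES = 389  # length of the annotated sequence in the log


def grow(loop, grow_by, nres):
    """Extend a loop by grow_by residues on each side, clamped to 1..nres."""
    start, stop = loop
    return max(1, start - grow_by), min(nres, stop + grow_by)


def colliding_pairs(loops, grow_by, nres):
    """Pairs of loops that share at least one residue after worst-case growth."""
    ordered = sorted(loops)
    hits = []
    for i, first in enumerate(ordered):
        first_grown = grow(first, grow_by, nres)
        for second in ordered[i + 1:]:
            second_grown = grow(second, grow_by, nres)
            # Loops are sorted by start, so they collide when the later
            # one starts at or before the end of the earlier one.
            if second_grown[0] <= first_grown[1]:
                hits.append((first, second))
    return hits


if __name__ == "__main__":
    for grow_by in (0, 4):
        print(f"grow_by={grow_by}:")
        for first, second in colliding_pairs(LOOPS, grow_by, NRES):
            print("  possible collision:", first, second)
```

Comparing the output for grow_by=0 and grow_by=4 separates loops that already overlap as defined from loops that only collide once random growth is switched on.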

Mon, 2013-04-01 11:08
smlewis