Fix back bone design fixbb

Hi,
I am using Rosetta 3.3 to design a protein of 166 residues. The command is
fixbb.linuxgccrelease -s protein.pdb -resfile protein.resfile -ex1 -ex2 -nstruct 1 -database $ROSETTA_DATABASE -jran 111111 -fast_sc_moves True

However, it takes more than 2 hours to finish the job. When I use Rosetta 2.3 to design the same protein with the same parameters, it takes only 1-2 minutes.

Could you please tell me what parameters I should use to make the program faster?

Tue, 2012-03-06 12:30
Lindsay

My best guess is that you're running out of memory, in which case -linmem_ig 20 will help. -fast_sc_moves is ignored by fixbb to the best of my knowledge. You could have parameters in the resfile that slow the run down, or certain tweaks to the database. The most parsimonious explanation is that you have a debug build of rosetta instead of a release build, in which case you should try deleting the compiled code and recompiling (watch for the -O flag when it recompiles).

Tue, 2012-03-06 12:57
smlewis

I have 8 GB of memory and the run did not use all of it. Rosetta 2.3 does the same job quickly and uses less memory.

Tue, 2012-03-06 14:29
Lindsay

Forgot to mention: I use the release build.

Tue, 2012-03-06 14:30
Lindsay

If you feel like sending along the PDB and resfile, I'll see if it duplicates for me in my copy of 3.3. Otherwise I don't know what to tell you other than that it shouldn't be doing this.

You can try -linmem_ig 20 anyway if you want.

Tue, 2012-03-06 14:41
smlewis

Sure. How do I send my files to you? By the way, for all the proteins I have tried, Rosetta 3.3 is always much slower than Rosetta 2.3.

Tue, 2012-03-06 14:50
Lindsay

In this case I have set all the residues to ALA. I also tried the original PDB, and Rosetta 3.3 is still very slow.
The attached proteinA.jpg is the PDB file and the .png file is the resfile. Thank you!

Tue, 2012-03-06 14:53
Lindsay

10:40:17 contador ~/scr/rosetta33> time rosetta3.3/build/src/release/linux/3.0/64/x86/gcc/fixbb.linuxgccrelease -database ~/scr/rosetta33/rosetta_database/ -s ~/proteinA.pdb -resfile ~/proteinA.resfile -ex1 -ex2 -nstruct 1 -jran 111111
core.init: Mini-Rosetta version Release 3.3, from SVN 42942 from https://svn.rosettacommons.org/source/branches/releases/rosetta-3.3/rose...
core.init: command: rosetta3.3/build/src/release/linux/3.0/64/x86/gcc/fixbb.linuxgccrelease -database /home/smlewis/scr/rosetta33/rosetta_database/ -s /home/smlewis/proteinA.pdb -resfile /home/smlewis/proteinA.resfile -ex1 -ex2 -nstruct 1 -jran 111111
core.init: 'RNG device' seed mode, using '/dev/urandom', seed=538246942 seed_offset=0 real_seed=538246942
core.init.random: RandomGenerator:init: Normal mode, seed=538246942 RG_type=mt19937
...
core.pack.pack_rotamers: built 94858 rotamers at 166 positions.
core.pack.pack_rotamers: IG: 79203348 bytes
protocols.jd2.JobDistributor: proteinA_0001 reported success in 1016 seconds
protocols.jd2.JobDistributor: no more batches to process...
protocols.jd2.JobDistributor: 1 jobs considered, 1 jobs attempted in 1016 seconds
1012.635u 2.680s 16:56.45 99.8% 0+0k 5184+480io 67pf+0w

So, 17 minutes. It used 4.6 GB of memory to do it. I tested a different protein of the same length and it took 15 minutes and about half as much memory. I don't have a copy of 2.3 to test, but it isn't taking hours for me...

Wed, 2012-03-07 08:14
smlewis

Thank you! With Rosetta 2.3 it takes about 2 minutes to design one sequence for this protein (and nearly the same time for other proteins of around 100-200 residues). Rosetta 3.3 definitely takes much longer. I have 8 GB of memory and 8 CPUs. I am running it again to check the running time.

Wed, 2012-03-07 08:35
Lindsay

[blast]$ core.pack.pack_rotamers: IG: 4043873912 bytes
protocols.jd2.JobDistributor: 1eerA_0001 reported success in 1931 seconds
protocols.jd2.JobDistributor: no more batches to process...
protocols.jd2.JobDistributor: 1 jobs considered, 1 jobs attempted in 1931 seconds

[1]+ Done ~/software/Rosetta_3.3/rosetta_source/bin/fixbb.linuxgccrelease -database /data6/zhixli/software/Rosetta_3.3/rosetta_database/ -s 1eerA.pdb -resfile 1eer.resfile -ex1 -ex2 -nstruct 1 -jran 111111

It takes about 30 minutes. I don't know why it took more than 2 hours yesterday. But it is still much slower than Rosetta 2.3.

Wed, 2012-03-07 09:24
Lindsay

Rosetta 2.3 would decide to use the on-the-fly rotamer-pair energy calculations automatically. Rosetta 3.x does not, but you can tell Rosetta to use the on-the-fly code (-linmem_ig 10) and it will make things faster while using less memory. I published this here:

http://dl.acm.org/citation.cfm?id=1791526
[PDF here: http://www.springerlink.com/content/j7p752wm12373768/fulltext.pdf ]

The on-the-fly code uses 95% less memory and improves performance considerably. It's a double win.
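
As a rough illustration of why storing every rotamer-pair energy gets expensive, here is a minimal sketch that counts the bytes for a dense table holding one float per rotamer pair on every interacting pair of positions. The dense layout, the edge list, and the function name are assumptions made for illustration only; the actual interaction graphs in Rosetta are more elaborate than this.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical estimate: one float per rotamer pair for each interacting
// pair of positions (an "edge"). Not Rosetta's actual storage scheme.
std::size_t dense_pair_table_bytes(
    const std::vector<std::size_t>& rotamers_per_position,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    std::size_t entries = 0;
    for (const auto& edge : edges) {
        // The table for one edge grows with the product of the rotamer counts.
        entries += rotamers_per_position.at(edge.first) *
                   rotamers_per_position.at(edge.second);
    }
    return entries * sizeof(float);
}
```

Because each edge contributes a product of two rotamer counts, doubling the rotamers at every position roughly quadruples this estimate, whereas an on-the-fly scheme that stores no pair tables only needs memory proportional to the number of rotamers.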

Fri, 2012-03-09 13:42
AndrewLeaver-Fay

Andrew, I know this isn't exactly the place to ask, but why isn't it the default if it's a double win? Where would we not want to use the on-the-fly code?

Fri, 2012-03-09 13:48
jadolfbr

The relative performance of LazyIG, the regular IG, and linmem IG varies with the number of rotamers and their distribution across positions in the pose. Each is better in a certain domain. A simple approximation of "above X rotamers, use linmem IG" is a 90% solution. We discussed how to put it into fixbb (the app, not PackRotamersMover) but it remains to be seen when it'll get done.

For most of the work I did with AnchoredDesign, I did *not* see performance benefits from linmem_ig, although there were memory benefits. It generated a lot fewer rotamers than this user's problem, though. (Also, the lion's share of the time is spent in minimization not packing, so small increases/decreases would be lost to noise).
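
The "above X rotamers" rule could be sketched roughly as follows. This is only an illustration: the cutoff value is a placeholder (no value for X is given here), and the enum and function names are hypothetical, not part of Rosetta.

```cpp
#include <cstddef>

// Placeholder cutoff: the value of X is not specified, so this number is
// an arbitrary assumption and would need benchmarking.
constexpr std::size_t kAssumedRotamerCutoff = 50000;

// Hypothetical labels for the two choices being compared.
enum class IGKind { Precomputed, LinearMemory };

// Pick the linear-memory (on-the-fly) graph only for large rotamer counts.
IGKind choose_interaction_graph(std::size_t total_rotamers,
                                std::size_t cutoff = kAssumedRotamerCutoff) {
    return total_rotamers > cutoff ? IGKind::LinearMemory
                                   : IGKind::Precomputed;
}
```

A rule like this ignores how the rotamers are distributed across positions, which is why it is only an approximation.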

Fri, 2012-03-09 13:55
smlewis

Thanks Steven, that makes sense. So I guess just benchmark the design first?

Fri, 2012-03-09 14:27
jadolfbr

If you're doing design in a repeated, moving-BB context, yes. If it's in a single-run fixed BB context, it converges fast enough that the test run IS the production run, so it doesn't matter too much. None of the IGs has an effect on what sequence is ultimately chosen.

Sat, 2012-03-10 13:45
smlewis

-linmem_ig 10 does not work with -symmetry:symmetry_definition:


ERROR: !core::pose::symmetry::is_symmetric( pose )
ERROR:: Exit from: src/core/scoring/ScoreFunction.cc line: 577

ScoreFunction.cc

//fpd fail if this is called on a symmetric pose
runtime_assert( !core::pose::symmetry::is_symmetric( pose ) );

What does fpd stand for?

Tue, 2012-08-07 07:38
attesor

LinearMemoryInteractionGraph (-linmem_ig) does not appear to have any hooks into symmetry. This is likely to mean it's not compatible with symmetry. On the other hand, NONE of the interaction graphs have hooks into symmetry, so maybe they all work.

fpd is probably Frank DiMaio, who wrote a chunk of the symmetry code.

The code you have pasted is in the middle of the scorefunction's main scoring operation and has nothing to do with interaction graphs, and is not related to the flag -linmem_ig. All calls to scoring go through this code. Does this failure only occur when -linmem_ig is passed?

Tue, 2012-08-07 11:03
smlewis

I agree with Steven - this doesn't look to have anything to do with linmem_ig. What's happening here is that the protocol you're using (which one is it?) is trying to score a symmetric pose with a non-symmetric scorefunction, instead of a symmetry-aware scorefunction.

Symmetry is currently a rough area in Rosetta, but it's relatively straightforward to convert protocols to use a symmetry-aware scorefunction instead of the (regular) symmetry-unaware scorefunction. Crashing like this on a symmetric pose is definitely a bug. The trick is tracking down where the offending scorefunction calls are coming from, so any further details you could give on where/how you triggered this issue would be helpful.

Tue, 2012-08-07 11:37
rmoretti

But it appears to me that the -linmem_ig option is definitely responsible for the errors.
I ran the same protocol twice, differing only in the -linmem_ig 10 parameter:

fixbb.linuxgccrelease @flags_fixbb_symm -s my_monomer -symmetry:symmetry_definition my_symmdef -out:file:silent mysilent.out -out:file:scorefile myscore.sc

fixbb.linuxgccrelease @flags_fixbb_symm -s my_monomer -symmetry:symmetry_definition my_symmdef -out:file:silent mysilent.out -out:file:scorefile myscore.sc -linmem_ig 10

The flags_fixbb_symm file content:

-ndruns 1
-resfile resfile
-ex1
-ex2
-nstruct 2

The outputs of the two runs differ only at the end:
1st protocol (successful):

... // many same lines skipped here
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pack.task: Packer task: initialize from command line()
core.pack.interaction_graph.interaction_graph_factory: Instantiating PDInteractionGraph
core.pack.pack_rotamers: built 15365 rotamers at 53 positions.
core.pack.pack_rotamers: IG: 233310128 bytes
core.conformation.util: Non-ideal residue detected: Residue #1 atom #1( N ) : Ideal D=1.458, Inspected D=1.49154
core.io.silent: detected attempt to write non-ideal pose to silent-file...Automatically switching to binary silent-struct type
protocols.jd2.JobDistributor: 1ekt_decoy_1_sim_1_INPUT_0002 reported success in 64 seconds
protocols.jd2.JobDistributor: no more batches to process...
protocols.jd2.JobDistributor: 2 jobs considered, 2 jobs attempted in 129 seconds

2nd protocol (unsuccessful):

... // many same lines skipped here
protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pack.task: Packer task: initialize from command line()
core.pack.dunbrack: Dunbrack library took 0.02 seconds to load from binary
core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph
ERROR: !core::pose::symmetry::is_symmetric( pose )
ERROR:: Exit from: src/core/scoring/ScoreFunction.cc line: 577

See, one uses PDInteractionGraph and the other uses LinearMemoryInteractionGraph. Obviously the latter does not handle symmetry.

Wed, 2012-08-08 01:56
attesor

I think I know a fix if you want to play guinea pig. I'm not sure if it will work. For background, most of the packing machinery appears to be symmetry-agnostic; there's no code about symmetry. This is relevant only in that there are many other places in Rosetta where there has to be lots of special code to handle the symmetry case. Obviously PDInteractionGraph works with symmetry, so the only question is, why not LinearMemoryInteractionGraph?

I think the bug is on line 465 of src/core/pack/interaction_graph/OnTheFlyInteractionGraph.cc:

score_function_ = new scoring::ScoreFunction( sfxn );

See what happens if you try replacing this with:

score_function_ = sfxn.clone();

instead. The extant line of code will convert a SymmetricScoreFunction into an (asymmetric, base) ScoreFunction, leading to the crash. The new line of code won't, but should work the same otherwise.
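
To show the difference in isolation, here is a minimal self-contained sketch. The two classes are hypothetical stand-ins that model only the copy/clone behaviour, not the real Rosetta classes, and std::unique_ptr is used here for brevity rather than Rosetta's own pointer types.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the base scorefunction class.
class ScoreFunction {
public:
    ScoreFunction() = default;
    ScoreFunction(const ScoreFunction&) = default;
    virtual ~ScoreFunction() = default;

    // Virtual copy: each subclass returns a copy of its own dynamic type.
    virtual std::unique_ptr<ScoreFunction> clone() const {
        return std::make_unique<ScoreFunction>(*this);
    }
    virtual std::string name() const { return "ScoreFunction"; }
};

// Hypothetical stand-in for the symmetry-aware subclass.
class SymmetricScoreFunction : public ScoreFunction {
public:
    std::unique_ptr<ScoreFunction> clone() const override {
        return std::make_unique<SymmetricScoreFunction>(*this);
    }
    std::string name() const override { return "SymmetricScoreFunction"; }
};

// Analogous to `new scoring::ScoreFunction( sfxn )`: always builds the base
// class, so a symmetric input loses its derived type (object slicing).
std::unique_ptr<ScoreFunction> copy_as_base(const ScoreFunction& sfxn) {
    return std::make_unique<ScoreFunction>(sfxn);
}

// Analogous to `sfxn.clone()`: the dynamic type is preserved.
std::unique_ptr<ScoreFunction> copy_with_clone(const ScoreFunction& sfxn) {
    return sfxn.clone();
}
```

Passing a SymmetricScoreFunction to copy_as_base yields an object whose name() is "ScoreFunction", while copy_with_clone keeps it a "SymmetricScoreFunction".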

Wed, 2012-08-08 07:11
smlewis

I've filed you a bug in our soon-to-be-public bug tracker: https://carbon.structbio.vanderbilt.edu/mantisbt/view.php?id=63

Wed, 2012-08-08 07:16
smlewis

I made the change, recompiled Rosetta, and tried the same command (the one with -linmem_ig 10). Guess what? The error message is explicit now:


protocols.simple_moves_symmetry.SymDockingInitialPerturbation: Reading options...
core.pack.task: Packer task: initialize from command line()
core.pack.dunbrack: Dunbrack library took 0 seconds to load from binary
core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph
core.pack.pack_rotamers: built 15365 rotamers at 53 positions.
ERROR: Cannot use symmetry with on-the-fly interaction graph yet!
ERROR:: Exit from: src/core/pack/rotamer_set/symmetry/SymmetricRotamerSets.cc line: 88

Wed, 2012-08-08 07:56
attesor

Well, then... good news: it was a bug that it was crashing with a nonspecific message. Bad news: -linmem_ig isn't going to work with symmetry until someone invested in the code fixes it intensively. I'll alter the bug to point out that hanging promise in the code, but it's beyond my purview to fix.

Wed, 2012-08-08 08:01
smlewis

OK. I will subscribe to the RSS feed (which does not work currently) and look forward to the progress. Thanks again!

Wed, 2012-08-08 08:11
attesor

You are almost certainly the first person to have tried to use an RSS setup. It's probably a feature of Mantis (the bug tracking software) that we failed to either set up or disable, so it's in some nebulous partial state...

Wed, 2012-08-08 08:17
smlewis

I should mention - PDInteractionGraph is working for you - do you want linmem_ig just for speed, or because you're hitting a memory limit? Generating a symmetry-capable linmem_ig is not a minor task and it seems nobody is going to take care of it soon.

Wed, 2012-08-08 08:27
smlewis

Memory is not an issue for my job now. I need it for speed. So, I guess I will try to forget this function for the moment.

Wed, 2012-08-08 14:39
attesor