# cartesian space minimization


Hi folks--
Help! I am trying to energy minimize a protein with distance restraints in cartesian space. Instead of making nice regular bonds and bond angles, I get an arginine side chain that looks like a pitchfork and a phenylalanine ring that looks like somebody stepped on it. After trying different weight sets and minimizers (FastRelax, MinMover), I am frustrated and need some advice. Has anyone successfully regularized a distorted structure in PyRosetta with the setting MinMover.cartesian(True)? I don't see any bond length or bond angle weights in any of the wts files.

I am attaching an image showing pitchfork Arg and squashed Phe.

Here is the python script.

import rosetta
rosetta.init(extra_options="-ignore_unrecognized_res -extra_res_fa data/CRO.params data/HOH.params -mute all")
pose = rosetta.pose_from_pdb("1d3b.pdb")
mm = rosetta.MoveMap()
mm.set_bb(True)
mm.set_chi(True)
scorefxn = rosetta.create_score_function("score12_full")
minmover = rosetta.MinMover(mm, scorefxn, 'dfpmin_armijo_nonmonotone', 0.01, True)
minmover.cartesian(True)
minmover.apply(pose)
pose.dump_pdb("junk.pdb")

Fri, 2014-03-28 14:17
bystrc

You need to add the "cart_bonded" term to the scorefunction to properly use the cartesian minimizer. This scoreterm will appropriately penalize bad bond lengths and angles.

For your convenience, the "score12_cart" and "talaris2013_cart" scorefunctions should have the term turned on already.
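As a minimal sketch of that change applied to the script above: only the weights name differs from the original. The helper name `build_cartesian_minmover` and the `CART_WEIGHTS` tuple are made up for illustration and are not part of PyRosetta; the PyRosetta calls are the ones already used in the original script, and `rosetta.init(...)` is assumed to have been called beforehand.

```python
# Weight sets named in the reply above as already including cart_bonded.
CART_WEIGHTS = ("score12_cart", "talaris2013_cart")


def build_cartesian_minmover(weights="score12_cart"):
    """Hypothetical helper: MinMover set up for Cartesian minimization.

    Assumes rosetta.init(...) has already been called.
    """
    if weights not in CART_WEIGHTS:
        raise ValueError("expected one of %s, got %r" % (CART_WEIGHTS, weights))

    import rosetta  # lazy import: the name check above works without PyRosetta

    mm = rosetta.MoveMap()
    mm.set_bb(True)
    mm.set_chi(True)
    scorefxn = rosetta.create_score_function(weights)
    # Minimizer type, tolerance and final flag are unchanged from the original script.
    minmover = rosetta.MinMover(mm, scorefxn, "dfpmin_armijo_nonmonotone", 0.01, True)
    minmover.cartesian(True)
    return minmover
```

Rejecting other weight names up front is a design choice for this sketch only; it just makes it harder to run the Cartesian minimizer with a scorefunction that lacks the bonded term.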

Fri, 2014-03-28 15:54
rmoretti

Thanks. That fixed the distortions. But it now runs VERY slowly. It would be faster for me to dump_pdb, energy minimize in another program and read it back in.
Or am I doing something else wrong? The only difference from the script above is that I changed "score12_full" to "score12_cart"

Mon, 2014-03-31 15:09
bystrc

Okay, runtime. With Cartesian minimization, or even just flexible bond length/angle minimization, you want to use the 'lbfgs_armijo_nonmonotone' minimization scheme rather than the 'dfpmin_armijo_nonmonotone' scheme. This *greatly* reduces the runtime requirements (when I tried it, it went from >30 min to around 1.5 minutes). I was under the impression that setting cartesian to true automatically changed the minimizer, and it does for certain ways of setting it, but for the code path you use it doesn't.
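A minimal sketch of the original script with the minimizer swapped, assuming the "score12_cart" weights from the earlier reply. The function names `choose_min_type` and `cartesian_minimize` are hypothetical; the PyRosetta calls are the ones from the original script, and `rosetta.init(...)` is assumed to have been called by the caller.

```python
def choose_min_type(cartesian):
    """Hypothetical helper: pick a minimizer name following the advice above.

    Keeping the original dfpmin scheme for the torsion-only case is an
    assumption of this sketch, not a recommendation from the thread.
    """
    if cartesian:
        return "lbfgs_armijo_nonmonotone"
    return "dfpmin_armijo_nonmonotone"


def cartesian_minimize(pdb_in, pdb_out, weights="score12_cart", tolerance=0.01):
    """Minimize pdb_in in Cartesian space and write the result to pdb_out.

    Assumes rosetta.init(...) has already been called.
    """
    import rosetta  # lazy import so choose_min_type() is usable without PyRosetta

    pose = rosetta.pose_from_pdb(pdb_in)
    mm = rosetta.MoveMap()
    mm.set_bb(True)
    mm.set_chi(True)
    scorefxn = rosetta.create_score_function(weights)
    # Same constructor arguments as the original script, except the minimizer name.
    minmover = rosetta.MinMover(mm, scorefxn, choose_min_type(True), tolerance, True)
    minmover.cartesian(True)
    minmover.apply(pose)
    pose.dump_pdb(pdb_out)
    return pose
```

With PyRosetta initialized, `cartesian_minimize("1d3b.pdb", "junk.pdb")` would mirror the original run.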

The other thing you may want to reconsider is the use of the HOH.params. Rosetta and the Rosetta energy functions are set up around the concept of implicit waters. Indiscriminate loading of crystallographic waters is probably not going to gain you anything, and is just going to slow things down due to having to manipulate the extra residues. You could potentially gain with a small number of explicitly chosen water molecules, but unless you have compelling reasons to have explicit waters, I'd recommend going with implicit waters.

Tue, 2014-04-01 09:25
rmoretti

I'd recommend against using "-mute all" in your initialization line. No such message is printed (yet) in this case, but there are often informational messages that tell you if you have misapplied or questionable settings (for example, future PyRosetta versions will print something like "WARNING: Use of the 'lbfgs_armijo_nonmonotone' minimizer is recommended with Cartesian minimization."), and you won't see them if you mute all the tracers. If omitting it completely is too noisy for you, I'd recommend using "-out:levels all:warning" instead, which will quiet the output messages but will still print warning and error messages.

Regarding the scorefunction, if you're using a recent version of PyRosetta (after r55300 or so) the talaris score function should be the default. Note that the switch isn't just a change of weights files: a number of other parameters also change. It's recommended to go with the talaris2013 scorefunction over the score12 scorefunction (so use the talaris2013_cart weights), but if you want to stick with score12, add the "-restore_pre_talaris_2013_behavior" flag to the initialization line to set all the changed parameters back to their score12-compatible settings.
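A minimal sketch of an initialization line along those lines. The helper `build_init_options` and its parameters are hypothetical; the flags themselves are the ones quoted in this thread, and the resulting string would be passed as `rosetta.init(extra_options=...)`.

```python
def build_init_options(extra_res_fa=(), quiet=True, use_score12=False):
    """Hypothetical helper: assemble the extra_options string for rosetta.init().

    extra_res_fa: paths to full-atom params files (e.g. "data/CRO.params").
    quiet:        print warnings and errors only, instead of "-mute all".
    use_score12:  restore the score12-compatible settings.
    """
    opts = ["-ignore_unrecognized_res"]
    if extra_res_fa:
        opts.append("-extra_res_fa " + " ".join(extra_res_fa))
    if quiet:
        opts.append("-out:levels all:warning")
    if use_score12:
        opts.append("-restore_pre_talaris_2013_behavior")
    return " ".join(opts)


# Usage (requires PyRosetta):
#   import rosetta
#   rosetta.init(extra_options=build_init_options(["data/CRO.params"], use_score12=True))
```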

Tue, 2014-04-01 09:49
rmoretti

OK. I am trying 'lbfgs_armijo_nonmonotone'.

We are not using waters in this case. However, when we put this code into our design script, we want to be able to use explicit waters. We may be wrong, but we have our reasons for not depending on implicit solvation. Only buried waters are being modeled explicitly.

Ultimately we want to add an energy function for template-based H-bonding. This is described in our paper (Huang & Bystroff, IEEE, 2013)

Are there guidelines somewhere for coding in a new energy function? A template (code template) perhaps?

Minimizing w/ lbfgs_armijo_nonmonotone did the trick. The structure looks great.

Thanks!

Fri, 2014-04-04 13:58
bystrc

Judicious use of explicit waters is fine. I don't want to dissuade you from using explicit waters if you have a scientific rationale for it. I just wanted to caution you about indiscriminate loading of waters; most waters in PDBs as obtained from the RCSB aren't needed. If you've gone through and cleaned out those waters you don't want (e.g. all non-buried waters), then you're good.
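As a rough sketch of that cleanup step, the following plain-Python filter drops HOH records from a PDB file except for waters you explicitly list. The function name `strip_waters` and the `keep` argument are made up for illustration; deciding which waters count as buried is left to you and is not shown. It relies on the fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26).

```python
def strip_waters(pdb_lines, keep=frozenset()):
    """Return pdb_lines without HOH atoms, except waters listed in `keep`.

    keep: set of (chain_id, residue_number) tuples for waters to retain.
    Only ATOM/HETATM records are filtered; other record types pass through.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and line[17:20] == "HOH":
            key = (line[21], int(line[22:26]))
            if key not in keep:
                continue  # drop this water
        kept.append(line)
    return kept


# Usage sketch: keep only water 302 of chain A (a hypothetical buried water).
#   with open("1d3b.pdb") as handle:
#       cleaned = strip_waters(handle.readlines(), keep={("A", 302)})
#   with open("1d3b_clean.pdb", "w") as handle:
#       handle.writelines(cleaned)
```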

There really isn't a good overview on how to write a new scorefunction term. I'd recommend the Rosetta3 overview paper (http://www.sciencedirect.com/science/article/pii/B9780123812704000196) as that has the best description of the overall scoring system in Rosetta/PyRosetta. We do have a little bit of information on adding new scoreterms at https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d7/..., but that's a little light on details and the interface has changed a bit. What I'd probably recommend is taking a look at how an existing scorefunction term of the same class is implemented, and then doing something similar. (Note, though, that many of the common scorefunction terms are heavily optimized, so aren't great for learning. E.g. I wouldn't recommend anyone look at fa_atr, fa_rep or fa_sol as examples. Even the Hbonding terms may be slightly overbuilt for your purposes.) The fa_pair term (src/core/scoring/methods/PairEnergy.cc) may be a good place to start.

Keep in mind that scoring, packing and minimization each use different but overlapping sets of functions, so depending on what you want to do you could possibly defer implementation of some methods. In particular, the derivative calculation may be a bit tricky, as derivatives (even with Cartesian minimization) are calculated with respect to torsional space by the scheme of Abe, Braun, Noguti and Go (1984) http://www.sciencedirect.com/science/article/pii/0097848584850159 rather than in xyz coordinate space - but a proper implementation is only needed if you want to do minimization with your energy term. (Packing and scoring don't use the derivatives.)
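One generic way to sanity-check a derivative before wiring it into a minimizer is to compare it against a finite-difference estimate. The sketch below does this in plain Python for a stand-in energy, a harmonic distance restraint between two points; it is not the template-based H-bond term and does not use any Rosetta interface. It checks the gradient with respect to xyz coordinates only, and all function names are hypothetical.

```python
import math


def _distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def restraint_energy(xyz_a, xyz_b, d0=3.0, k=1.0):
    """Stand-in energy: E = k * (d - d0)^2, with d the a-b distance."""
    return k * (_distance(xyz_a, xyz_b) - d0) ** 2


def analytic_gradient_a(xyz_a, xyz_b, d0=3.0, k=1.0):
    """dE/d(xyz_a) = 2k (d - d0) (a - b) / d."""
    d = _distance(xyz_a, xyz_b)
    return [2.0 * k * (d - d0) * (a - b) / d for a, b in zip(xyz_a, xyz_b)]


def numeric_gradient_a(xyz_a, xyz_b, d0=3.0, k=1.0, h=1e-5):
    """Central-difference estimate of dE/d(xyz_a)."""
    grad = []
    for i in range(3):
        plus, minus = list(xyz_a), list(xyz_a)
        plus[i] += h
        minus[i] -= h
        diff = restraint_energy(plus, xyz_b, d0, k) - restraint_energy(minus, xyz_b, d0, k)
        grad.append(diff / (2.0 * h))
    return grad


def gradients_agree(xyz_a, xyz_b, d0=3.0, k=1.0, tol=1e-6):
    """True if analytic and numeric gradients match within tol."""
    ana = analytic_gradient_a(xyz_a, xyz_b, d0, k)
    num = numeric_gradient_a(xyz_a, xyz_b, d0, k)
    return all(abs(x - y) < tol for x, y in zip(ana, num))
```

If the two gradients disagree for a handful of test geometries, the analytic derivative is the first thing to suspect.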

Mon, 2014-04-07 09:09
rmoretti

Hello, could anybody share with me a curated parameter file for residue CRO (CRO.params) as found in GFP? You may contact me directly under [email address removed]. Thanks a lot in advance, Xavier

Tue, 2015-09-22 08:42
xgrcri

We spoke off the forums but I wanted to respond here too in case others are looking for this as well.

Here are some params for the GFP chromophore. Tom Linsky helped with the preparation of these params as well. I think we got all the hydrogens placed correctly, but you should double-check. I did a quick check of the params using PyRosetta and they seem to work all right during a simple minimization (the session is below). I have attached the residue type params file and a modified 4eul file; some of the atom names had to change in the structure. You will need to rename the params file to remove the .txt extension (the forum software will not let me upload files with the .params extension). You will also need to remember to use the -extra_res_fa /path/to/CRO.params flag every time you run Rosetta, although GFP is a common enough protein that I might go ahead and add these params to the database.

from rosetta import *

init(extra_options="-extra_res_fa CRO.params -ignore_unrecognized_res")

p = pose_from_pdb("4eul_mod.pdb")

sf = create_score_function( "talaris2013" )

mm = MoveMap()
mm.set_chi( True )
mm.set_bb( True )

minm = MinMover()
minm.movemap( mm )
minm.score_function( sf )
minm.type( "dfpmin" )

pmm = PyMolMover()
pmm.keep_history( True )

pmm.apply( p )
minm.apply( p )
pmm.apply( p )

Fri, 2015-09-25 14:45
renfrew

Hi There,

I tried using these parameters for protein-protein docking. Although I get a prepacked structure, the docking won't run. I've attached part of the run log file and the starting structure I used. Any advice or tips would be much appreciated.

Cheers,

Harley

Thu, 2016-02-18 08:25
Harley Worthy

For protein-protein docking, you also need centroid-mode parameters for the CRO residue. Centroid mode (in contrast to full atom mode) is a reduced representation that is used in certain protocols. For amino acids, it represents the sidechains by a single "superatom". For ligands, usually what we do is a united atom model, which removes hydrogens. This isn't entirely accurate, but it gives you something that Rosetta doesn't complain about.

See attached for the manually edited version of the params file for centroid mode. This should be passed on the command line with the option "-extra_res_cen /path/to/CRO.cen.params". For docking, you'll probably need to pass both the full-atom and the centroid versions of the params file.
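A minimal sketch of passing both files together. The helper `docking_params_flags` is hypothetical, and the paths are the placeholders used in this thread; only the two option names come from the posts above. The same string can be used on the command line or in a PyRosetta `init(extra_options=...)` call.

```python
def docking_params_flags(fa_params, cen_params):
    """Hypothetical helper: flags loading a residue type in both representations.

    fa_params:  path to the full-atom params file.
    cen_params: path to the centroid-mode params file.
    """
    return "-extra_res_fa %s -extra_res_cen %s" % (fa_params, cen_params)


# Placeholder paths; replace with the real locations of your files.
flags = docking_params_flags("/path/to/CRO.params", "/path/to/CRO.cen.params")

# Usage (requires PyRosetta):
#   from rosetta import init
#   init(extra_options=flags)
```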

Another thing to note is that the form of your -partners option might be off. I don't think it's supposed to include the square brackets.

Thu, 2016-02-18 14:22
rmoretti