As a new PyRosetta user, I've hit a roadblock regarding score function weights. How do you intelligently choose the weights for your score function? I've seen some scripts (e.g. dG_calc.py) where users arbitrarily set their weights, but how did they determine these weights? Some people have told me that the 'standard' weights should work for most apps, but when I ran the simple poly-alanine fold program (shown in the PyRosetta User Manual) 145 times using the standard weights instead of the custom weights, my poly-alanine chain never folded into an alpha helix. However, the custom weights (VDW 1, hbond_sr_bb 1) listed in the user manual could always lead to a good alpha-helix fold within 2-7 runs. I've heard of the OptE protocol also but I don't know what this is or how to use it and I haven't found any good papers describing it.
So bottom-line is: how do I choose the weights for my score function? My proteins of interest range from 35-100 amino acids and I'm just trying to calculate the stabilities of these proteins.
(Just a note, I'm not familiar with C++ so any advice in terms of python or simple layman language would be helpful!)
In general, you should use the weights that were used for the published protocol that's most similar to your intended application. For most regular full atom modeling tasks where the structure is close to native, that's going to be standard score function (or more precisely, the "standard" scorefunction with the score12 patches, often simply referred to as "score12").
In some applications, score12 doesn't work all that well - often because the problem is being treated at a cruder approximation than the full atom one that score12 has been parametrized for, or because you're trying to sample far from the starting structure. My guess is that's what you're seeing with the poly-alanine folding. The landscape for score12 is much more rugged than that of the simple poly-alanine weights function, so it's getting caught up and distracted by the many local minima. If you're doing a broad search, like ab initio folding, you want to have a much smoother energy function, with less local minima. (Typically in production runs you'd do multistage sampling, where you use the coarse energy function to find the overall fold, and then use a fine grained energy function like score12 to refine the atomistic details.)
On a similar note, if you're doing design, standard score12 doesn't always work that well, as the typical fixed-backbone sampling used may reject amino acid identities because of clashes that could be accommodated with slight backbone movements. In those sorts of situations, you may want to try a soft-rep energy function to sample the mutations, followed by a score12 minimization and scoring for final evaluation purposes.
Another reason you'd use a different energy function is if you're hoping to match published energy values from the literature. The typical score functions in Rosetta are in arbitrary units - good for looking at the difference between two different structures, but not all that enlightening if you want to predict things in terms of experimental kcal/mole. That's what's happening with the ddG-like calculations. For ddG calculations Kellogg et al. (Proteins 79 p. 830) looked at published literature values for experimental ddG, and then adjusted the energy function to better recapitulate those values (they used the soft-rep sampling/score12 refinement technique as well).
That's really how most energy function work proceeds. Someone identifies a problem they wish to solve (energy prediction, structure recapitulation, faster global search, etc.) and then tries several different variations on the energy function until the result they're looking in is optimized. (OptE is simply an program that helps to automate this process to a very, very limited extent.) To be honest, it's slow, tedious, and frustrating work. You're far better off searching for published papers where someone did something similar and just copying what they did (read: let someone else do all the hard work). If there is an obvious deficiency, you might try playing around with adding/adjusting a term or two, but it's not worth attempting to build an energy function from scratch.
As a final note, I should mention that there's a general trend in the Rosetta community away from specialized scorefunctions (at least in the fine-grained full atom mode) and toward a single, general scorefunction. After all, there's only one physical reality, and by having a single scorefunction you can capture this reality without over-optimizing for a particular use case. (There's a depressing trend in the molecular modeling literature to have specialized energy functions which can precisely recapitulate data on some small subset of proteins, but which fail miserably when conditions even slightly outside of the training set are tried.) - And score12 (and/or its successors in the future) is that scorefunction for Rosetta.
So, in summary, check the literature for an energy function specialized to your application. When in doubt, use score12 if you're doing full atom work and sampling close to the final structure. If you're doing ab initio like folding or loop modeling, a centroid representation with a smoother energy function may be required. If you're doing design, a soft-rep scorefunction for picking the mutations may be called for. But in production runs both sampling protocols are generally followed by a score12 minimization and final scoring.
For your particular case (predicting stabilities of the proteins), the answer depends on how the proteins differ. You'll probably get the best results with the prediction of the relative stabilities of proteins of the same length (i.e. ddG calculations like those of Kellogg et al. above). That work was specifically for point mutants, but the technique in general could be extended to multiple mutations - although I might advise benchmarking against an example set with known results to see what sort of error you might expect.
Extending it to proteins of different lengths is hard. Rosetta energy isn't normalized by lengths, so you'd expect a longer protein to have a lower Rosetta score. If I recall correctly, there's some work ongoing regarding true dG of folding calculations with an explicit unfolded reference state, which should be able to correct for protein length differences, but I don't believe anything has been published on that yet.
I am trying to calculate the Gibbs free energy of my mutants (ΔG or ΔΔG). After reading your comments, it seems that there is not a "standard" way to calculate Gibbs free energy. I am also trying to understand the paper of Kellogg et al. (Proteins 79 p. 830), which explains some optimisation strategy involved in the algorithm.
Can I ask
1) Is there a build-in scorefunction (e.g. get_fa_scorefunction) which is exactly for Gibbs free energy?
2) If there is no scorefunction for ΔG or ΔΔG, should I use score12, score12prime or something else to customise the weights of get_fa_scorefunction?
I will read Kellogg's paper in more detail. But in the meantime, could you please help me with that? Thank you very much.
(I am dealing with mutants from a protein of ~400 residues)
No, there isn't a scorefunction that corresponds numerically to kcal/mol or kJ/mol. One of the difficulties in making a general correlation line is that the numbers tend to vary slightly depending on what protocol you're using. More or less stringent sampling can change how large the numbers you get out are.
What people typically do (and what Kellogg et al. did) is to take a standard score function, run the protocol on a bunch of example cases which have known experimental values and then draw a line of best fit between the calculated values and the experimental values. You can then use this correlation to calculate what the experimental value for your unknowns might be, given the calculated values of the unknowns.
If you're interested in the Gibbs free energy change due to point mutants on monomer folding, I would recommend that you follow the protocol from Kellogg et al. (That's the standard protocol for that use case.) There's test cases in main/tests/scientific/biweekly/monomer_ddg (T4 lysozyme mutants), if you want to run the protocol on a known example and check that things look reasonable in your hands. (It should also help you double check the correlation that you get out.)
Really thank you for your help. It actually helps and I think I can ask questions more specifically now.
Can I ask some basic concepts regarding the paper of Kellogg et al. (Proteins 79 p. 830) ?
1. For mutations, can I regard
1) ΔG=G (folded state) - G (unfolded state) ?
2) ΔΔG= G (folded mutated state) - G (folded wild type state) ?
2. Can I regard
1) "backbone sampling" as changing the secondary structure?
2) "repacking" as maintaining the secondary structure and only changing the rotamers of side chains?
3. For “hard-rep” and “soft-rep”
From the supporting document, for soft-rep, I know I should use the weights of soft_rep_design;
but for hard-rep, what is “standard –score:patch score12”? I see a list of .wts files titled “score12…..wts”. What is the exact one I should choose? Is it “score12.wts”?
4. As a beginner, I do not understand the syntax of the following command (Supporting Document). How should I run the command on iPython using the following format? Could you please show me a real example (Row 9, 18 or 20 in the paper)? (I would appreciate it if there is a tutorial about it.)
-in:file:s <INPUT_PDB> -resfile <RESFILE> -database <DATABASE> -ignore_unrecognized_res –in:file:fullatom –constraints::cst_file <CONSTRAINTS_FILE>
5. When I create a mutant’s PDB based on wild type’s resfile, should I use NATRO (I used) or NATAA for the non-mutated residues?
6. In the Table 1 of the paper, both “correlation” and “stability-classification accuracy” are presented. Do you know the meaning of “accuracy”?
7. I could not find the file of
(I do not have the folder of “main’ within “PyRosetta”)
As you said, after knowing the answers for the questions above, I will
1) go through the 20 protocols based on mutants’ PDB files and obtain their individual ddG with regard to wild type,
2) Compare the ddG between computed values and experimental values, and determine the most correlated protocol.
Thank you very much. Have a good weekends!
1) Almost. In the context of the Kellogg et al. ddg_monomers protocol, that's what ΔG means, but ΔΔG = ΔG(mutant) - ΔG(wild type) (hence the delta of the delta G's)
2) The backbone sampling is backbone movement, but most of the time it's at a scale smaller than secondary structure changes. Often it's small (sub-Angstrom) movements of the backbone atoms. "Repacking" doesn't touch the backbone degrees of freedom (it leaves the backbone atoms in the same coordinates), but it does change the rotamers of the sidechain.
3) That's actually two options "-score:weights standard" and "-score:patch score12". That is, apply the "score12" patch ("score12.wts_patch") to the "standard" weight set ("standard.wts"). For convenience, several people have manually applied the "score12" patch to the "standard" weight set, resulting in the "score12" and "score12_full" weight sets. Using them directly with -score:weights ("-score:weights score12_full") should be equivalent to "-score:weights standard" and "-score:patch score12". (There's a slight difference with "score12.wts", in that the disulfide terms have a slightly older set of disulfide weightings.)
By the way, with the weekly releases (and recent PyRosetta releases - recent being "within the last year or so") the default scorefunction changed to talaris2013 scorefunction. This involved internal changes over and above just changing the weights, so the "-restore_pre_talaris_2013_behavior" flag is needed to switch back to using score12. Additionally, to avoid confusion, the "standard" weight set was renamed to "pre_talaris_2013_standard.wts" to avoid confusion.
4) You're supposed to replace the things in angle brackets with your own files. For example, you have your own input PDB, so you need to give the name of that file in place of <INPUT_PDB> Likewise with the directory which points to the database. We don't have a tutorial or a demo for ddg_monomer, but for examples you can take a look at the integration tests at tests/integration/tests/ddG_of_mutation or the scientific test at tests/scientific/biweekly/monomer_ddg in the standard Rosetta installation. These are set up to be run automatically, so unfortunately they might not be all that much clearer. Note that most of the commandline options are intended for use with the commandline application. If you're attempting to interface through the PyRosetta interface, you'll be controlling these via other means.
5) From what I understand, it shouldn't matter - for the ddg_monomer application, the resfile isn't used to specify the packing behavior, but instead is used only to specify which mutations should be made. The behavior in non-mutated residues isn't even looked at. Again, it also looks like the resfile reading is mainly for the commandline application - if you're interfacing through PyRosetta, you'll control the mutations directly on the protocols.ddg.ddGMover, rather than through a file.
7) The tests directory is not distributed with PyRosetta, but only with the regular Rosetta distribution. Note that the ddg_monomer application wasn't written with PyRosetta usage in mind - you should be able to use the protocols.ddg.ddGMover object through PyRosetta, which does the bulk of the work, but much of the setup code is done in code that isn't include with PyRosetta.
Thank you very much for your help regarding so many of my questions. I feel myself still spending so much energy to understand your words. I am happy to use PyRosetta in either iPython or commandline in a .py file but I do not understand C++ in Rosetta. My overall question is how to calculate the ddG of monomer point mutation by PyRosetta following the 20 protocols. If you do not mind, can I ask
For score weights,
1) Do you mean "standard.wts" as "pre_talaris_2013_standard.wts" instead of "talaris2013.wts"?
2) As you said, "score12.wts_patch" was used to generate "score12" and "score12_full". However, "hbond_sr_bb" is 0.5 in "score12.wts_patch" but 0.585 in "score12" and "score12_full". Why is that?
3) Should I use "-score:weights standard –score:patch score12" or "score12_full" for hard-rep scoring function?
4) What do you mean by saying:
the "-restore_pre_talaris_2013_behavior" flag is needed to switch back to using score12 ?
For the syntax to run 20 protocols, I downloaded "rosetta_2014.30.57114_bundle" and extracted it (but not install it).
1) I am sorry but I could not understand the "command" files in the folders of
As I only know python, can I run the 20 protocols by PyRosetta? If that is the case, would you like to show me an example of how to convert the Rosetta script into PyRosetta version? For example, in the Supporting Document, for row 1, Table I:
fix_bb_monomer_ddg.linuxgccrelease -ddg::weight_file soft_rep_design -ddg::iterations 1 -ddg::local_opt_only true -ddg::min_cst false -ddg::mean false -ddg::min true -ddg::sc_min_only false -ddg::opt_radius 0.1
I find some description in PyRosetta\GUIs\rosetta_flag_file_builder\AppOptions\Rosetta3-3\ddg_monomer.txt .
I also find the C++ script in rosetta_2014.30.57114_bundle\rosetta_2014.30.57114_bundle\main\source\src\apps\public\ddg\ddg_monomer.cc .
but I still have no idea how to write it in a python-based format.
2) Can I ask how to use "protocols.ddg.ddGMover" ?
Thank you very much.
If you just want to use the existing ddg_monomer protocol (e.g. replicate the protocols used in Kellogg et al) with regular (non-Py) Rosetta, you don't have to understand C++, you just need to be able to run commandline programs. Even with altered protocols there are some things you can do with commandline arguments. It's only when you need to change the protocol in ways the commandline arguments don't allow you to do that you would need to know C++ (or switch to PyRosetta). So if you're just aiming to apply the Kellogg et al. protocols as-is, it might be easier to just use the commandline version.
1) "standard.wts" in Rosetta3.X or some older (year+ ago) PyRosetta versions is the same as "pre_talaris_2013_standard.wts" in the weekly releases or recent PyRosetta versions. It is different than talaris2013.wts, which is the new default weight set. (Hence the name change. "standard.wts" is no longer the standard weightset.)
2) The "*= 0.5" in the wts_patch file means to multiply the existing weight by 0.5, rather than set it to 0.5. So starting with 1.17 in standard.wts, 1.17 * 0.5 = 0.585
3) Assuming Rosetta3.5, "-score:weights standard –score:patch score12" should be equivalent to "-score:weights score12_full". It really doesn't matter which one you use.
4) The talaris2013 scorefunction involved more changes than simply altering the weights file. A bunch of other settings in Rosetta had their defaults changed. If you're using a weekly release (or a recent version of PyRosetta) and you want to use score12-based scoring, you need to re-set those changed defaults back to their old values. That's what the "-restore_pre_talaris_2013_behavior" does. It's only for the weeklies and recent versions of PyRosetta, and only if you want to go back to using the old (pre-talaris2013) score12 scorefunction.
The examples in the tests/integration/tests/ddG_of_mutation and tests/scientific/biweekly/monomer_ddg are for using the commandline version of the protocol. The command files are basically bash shell scripts with various (Python syntax) substitutions which need to be made for the local running environment. The automated running system will make the substitutions and then launch the shell script, which will invoke the commandline version of the protocol.
There isn't an easy way to convert those commands into a PyRosetta program. You would need to recapitulate the functionality of the ddg_monomer.linuxgccrelease commandline application with a PyRosetta program. That is, all the setup, file reading and iteration that's done by the ddg_monomer.linuxgccrelease application would have to be done by you in a PyRosetta script you write yourself. You can use the rosetta_2014.30.57114_bundle\main\source\src\apps\public\ddg\ddg_monomer.cc file as a guide on how to set things up. It's that file that you don't have in your PyRosetta distribution that is needed for all the setup for the protocol. Unfortunately, that will require a bit of C++ reading, but the C++ here is rather simple, and should be close to what you'd write in Python. I'd recommend working "inside out" from the ddGMover on about line 382 or so. That's the block that is run if your mutational input is a resfile. Keep in mind that you don't have to re-implement all of ddg_monomer, just enough such that you can accomplish what you want.
You can get (limited) documentation by using the built-in help functionality of python or IPython. If you do a "help(protocols.ddg.ddGMover)" it should tell you about the mover, and what sort of functions are available. You can also call it on the methods to get function signature information (e.g. "help(protocols.ddg.ddGMover.set_min_cst)") You can also look at how an example is used to get an idea of how to use it. The usage in ddg_monomer.cc is a good start - it should tell you what sort of setup is needed to match the functionality of the application.
But again, if all you're interested in is running the exact protocols from Kellogg et al. on your own proteins, it's probably easiest just to stick with the commandline version.
Thank you very much for your very detailed help. Yes, I just want to use the same protocols as shown in Kellogg et al. So, I will install VMWare Player and ubuntu (e.g.14.04.1) on my Win 7 PC. Then I will install "rosetta_2014.30.57114_bundle.tgz". Therefore, I can run "ddg_monomer.cc". I will try to figure out how to specify the script with my own protein and obtain the ddG data from 20 protocols. (I am sure I will ask you soon.)
Just to clarify, do you refer "commandline" only to Linux-based console, or it can also be referred as cmd, iPython, etc?
When I talk about "commandline", I am indeed referring to commands given through the Linux (or Mac) based console, as opposed to a Python, iPython, or (Windows) cmd interface.
Thank you. I got it. :)
"How do you intelligently choose the weights for your score function? "
This is at least one and maybe a suite of unpublished papers. Andrew Leaver-Fay has brewed the OptE protocol for the purpose (and is working on a paper to go with it). It works great for fitting reference weights, and is at least functional for fitting energy term weights. It's not going to be a general solution because it requires trainable problems related to your question (not so specifically as the small-subset-of-proteins problem in the literature that Rocco bemoans).
"My proteins of interest range from 35-100 amino acids and I'm just trying to calculate the stabilities of these proteins."
If you mean experimental stabilities in terms of dG of unfolding in kcal/mol, we don't have a scorefunction for it. The ddG scorefunction produces something close to kcal/mol, but for point mutations, not folding/unfolding. I think it's more for ddG of binding, not of folding, but I could be wrong. I guess it's worth a shot.
"If I recall correctly, there's some work ongoing regarding true dG of folding calculations with an explicit unfolded reference state,"
Doug Renfrew created a fully functioning unfolded-like reference state to replace the reference energies for use with noncanonical amino acids (where reference weights can't be fit from the PDB). So, the latter half of that works. I don't know about the first part.
Is this the paper you referred to?
Chapter Six – Scientific Benchmarks for Guiding Macromolecular Energy Function Improvement
Andrew Leaver-Fay*, , 2, , Matthew J. O'Meara†, , 2, , Mike Tyka‡, Ron Jacak§, Yifan Song‡, Elizabeth H. Kellogg‡, James Thompson‡, Ian W. Davis¶, Roland A. Pache∥, Sergey Lyskov#, Jeffrey J. Gray#, Tanja Kortemme||, Jane S. Richardson**, James J. Havranek††, Jack Snoeyink†, David Baker‡, Brian Kuhlman*
Methods in Protein Design
Methods in Enzymology, Volume 523, 2013, Pages 109–143
On a related note, I am unclear as to how the talaris2013 weights were determined. There are references for how the features were selected and how the OptE protocol can be used to fit the reference weights and to explore weight space; however, I can't seem to find a statement indicating how the weights in talaris2013 were set. There are lots benchmarks indicating that the current weights are good.
I would greatly appreciate it if a high level explanation was provided (search strategy / dataset, etc.) or if I was directed to some place with the pertinent information.
Thanks in advance!
To my knowledge, there isn't a single "talaris2013" scorefunction paper. Talaris2013 was a combination of a number of different improvements which were made by multiple people over the years. (And are treated in varying extents in various papers.)
Some of the core papers are:
The "Scientific Benchmarks for Guiding Macromolecular Energy Function Improvement" paper listed above http://www.ncbi.nlm.nih.gov/pubmed/23422428
"A Combined Covalent-Electrostatic Model of Hydrogen Bonding Improves Structure Prediction with Rosetta." O'Meara et al. J Chem Theory Comput. 2015;11(2):609-622. http://www.ncbi.nlm.nih.gov/pubmed/25866491
"Structure-guided forcefield optimization." Song et al. Proteins. 2011 Jun;79(6):1898-909. http://www.ncbi.nlm.nih.gov/pubmed/21488100
As far as the optimization scheme, there wasn't necessarily a concerted overarching one. The general approach was that someone identifies a particular feature that the current scorefunction does not do well on. They then set up a benchmark set which demonstrates that deficiency. Next they play around with various hypotheses on how to fix it (upweight this, downweight that, add this term, replace that term, etc.). They do this until their benchmark says that their results are getting better. Then they run against other benchmark sets in order to check that other things haven't been made significantly worse.
There's some systematic effort and optimization involved (see the three papers for more details on that), but talaris2013 combines those systematic optimizations with other, not-so-systematic optimizations which were made and passed around as "lore" in the Rosetta community, validated by joint experience.
Thanks for the insightful response. It's much appreciated!
Thanks for the help. I've done some work with your advices and made some progress but now I have some more questions regarding scorefxns. What if I want to model a protein in different non-physiological environments (ex/ pure hexane solvent or pure water)? Are the energetics accounted for by the standard scorefxn still sufficient enough to accurately predict the protein's stability/structure? Or would I have to customize the scorefxn weights to simulate the desired environment? On this note, is the "standard" scorefxn weight best suited for protein stability/structure prediction in a physiological environment?
All the scorefunctions are trained on a crystalline environment (they're trained on crystal structures, anyway). That's as physiological as we have data for (and time to compute; explicit waters would certainly be more physiological). What you are trying to do sounds more like molecular dynamics than Rosetta-style Monte Carlo modeling.
Score12 has an implicit solvation term (Larzaridis-Karplus) and only the vaguest idea of what electrostatics are (the fa_pair term, which just checks residue pair distances/occurrences). These are explained in the 2004 Rohl review in Methods in Enzymology. It is not uncommon for users to swap fa_pair for an explicit electrostatics term (usually hack_elec). You could fiddle with the parameterizations of these terms to try to duplicate hexane or zero-salt, but who knows if it would work.
Anecdotally, I once did an experiment (it was for the original publication of the FloppyTail app, but we never used the data) where I turned the fa_pair term off in an attempt to mimic salt screening of charged interactions across an interface, and found that it worked well enough for that specific case.
There's also some membrane-related protocols in Rosetta (though I don't know how extensively they've been tested, or how well they work). If you're interested in modeling proteins in hydrophobic environments, you may want to look into how those protocols model the interior of the membrane. For example, in the database there's a couple of membrane_highres*.wts weights files. There's a number of differences to score12, but the major ones are setting fa_sol to zero, and turning on fa_mbenv, fa_mbsolv and Menv_smooth terms.
That said, it will likely be difficult to simulate non-physiological environments. Rosetta relies heavily on statistical potentials derived from experimental crystal structures (including a number of assumptions about implicit solvation). The further you move away from simulating proteins in environments like those used to crystallize proteins, the less benefit you'll get from what Rosetta does differently from AMBER/CHARMM-type molecular mechanics.
I wonder if the membrane scorefunction works well on proteins that aren't membrane proteins? I was envisioning a normal globular protein in a hexane environment, not so much evolved-for-membrane protein structures.
I discussed this over lunch with some other members of my lab and they pointed out you may be able to get some of the effect by hugely upweighting the hydrogen bonding terms. There's no way to penalize exposed unsatisfied polars in the normal scorefunction (which assumes water) so I don't know that there's a way to penalize them in a nonwater environment without rewriting C++ code.
I am working with a protein-peptide complex which contains few important aromatic pi-pi interactions. How can I effectively model those interactions? Right now I'm using score12. Is there any other good option?
This paper: Leaver-Fay A, O'Meara MJ, Tyka M, Jacak R, Song Y, Kellogg EH, Thompson J, Davis IW, Pache RA, Lyskov S, Gray JJ, Kortemme T, Richardson JS, Havranek JJ, Snoeyink J, Baker D, Kuhlman B. Scientific benchmarks for guiding macromolecular energy function improvement. Methods Enzymol. 2013;523:109-43. doi: 10.1016/B978-0-12-394292-0.00006-0. PubMed PMID: 23422428.
explains methods for choosing a scorefunction.
Immediately, score12prime is superior to score12 if you are doing design work. For pi-stacking, I've had good luck with score12/prime, simply because flat stacking is very VDW attractive. Steven Combs's unreleased orbitals code is probably the best for pi-pi interactions. I've sought his opinion.
For most general case use, score12prime works well in capturing pi-pi interactions through VDW attractive score, like Steven Lewis said. An explicit energy for such interactions are incorporated into the orbitals scfxn, but it is currently unreleased. When Rosetta3.5 is out, the new score function will be released. For rotamer recovery, aromatic residues were recovered slightly better with the orbitals scfxn over score12prime. Additionally, more pi-pi interactions were designed with the orbitals scfxn over score12prime. There should be a paper out soon describing the results in more detail.
For now, I would use score12prime if you are designing. Unfortunately, a lot of the improvements in both the score12 and orbitals score function are not incorporated in Rosetta3.4, but will be incorporated in Rosetta3.5.
All of the scorefunction stuff is in PyRosetta, if you wanted to use it now. Also, 3.5 is ready to go, we're just cleaning up some back-end issues before we release.