
ddG calculation


Hello!

I am trying to run ddG calculations for protein mutations but am not able to locate the Python script. The Rosetta manual says the script can be found here:

"The developmental version can be found in the Rosetta source code in source/src/python/bindings/app/membrane/compute_ddG.py"

However, compute_ddG.py is not at this location (/bindings/app/membrane/ does not exist!).

Could anyone suggest the reason for this, and a solution as well?

Many thanks in advance!

Malkeet

Sun, 2017-06-25 07:53
malkeet.singh

I can't find that file, nor can I find a file at that path looking through the git history.  Link me to the manual page you are looking at and we'll have a better shot at finding it.

Mon, 2017-06-26 08:54
smlewis

I'm told it is called "protocol_capture/mp_ddg/predict_ddG.py" and should be in the mp_ddg demo.  

Mon, 2017-06-26 10:06
smlewis

Hello!

I tried to run the ddG_monomer application and got the following error. I tried to generate the constraint file with "minimize_with_cst.static.linuxgccrelease", using both a single PDB and a list of PDBs, but it ended with this error:

("ERROR: Unable to open weights/patch file. None of (./)standard or (./)standard.wts or /home/gnss/singhma/rosetta/software/rosetta_2017/main/database/scoring/weights/standard or /home/gnss/singhma/rosetta/software/rosetta_2017/main/database/scoring/weights/standard.wts exist")

genesis:/home/gnss/singhma/rosetta/test> /home/gnss/singhma/rosetta/software/rosetta_2017/main/source/bin/minimize_with_cst.static.linuxgccrelease -l pdblist -in:file:fullatom -ignore_unrecognized_res -fa_max_dis 9.0 -database /home/gnss/singhma/rosetta/software/rosetta_2017/main/database/ -ddg::harmonic_ca_tether 0.5 -score:weights standard -ddg::constraint_weight 1.0 -ddg::out_pdb_prefix min_cst_0.5 -ddg::sc_min_only false -score:patch /home/gnss/singhma/rosetta/software/rosetta_2017/main/database/scoring/weights/score12.wts_patch > mincst.log

Thanks!

Malkeet
 

Tue, 2017-06-27 03:31
malkeet.singh

In the protein minimizer application, the protein can be minimized with constraints. For the demo, the cst file is already provided. If we want to do this for another protein, how do we generate the specific cst file? Can anyone suggest how, or share a link?

 

Thanks!
 

Tue, 2017-06-27 04:31
malkeet.singh

Constraints documentation is here: https://www.rosettacommons.org/docs/latest/rosetta_basics/file_types/constraint-file

 

The sample file is all-backbone-heavy-atom coordinate constraints.  You don't actually need that for simple relax - I'm pretty sure there's a flag for it in relax already (constrain_relax_to_start_coords).

Wed, 2017-06-28 05:58
smlewis

hello!
 

In the ddG minimization protocol:

score12, score13 and pre_talaris_2013_standard are all available. Here, pre_talaris_2013_standard plus the score12 patch is equal to score12. So what is the difference between score13 and pre_talaris_2013_standard? The files have different values.

 

thanks!

Tue, 2017-06-27 05:58
malkeet.singh

https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/Scorefunction-History

score12 and pre_talaris_13 are the same thing - the latter is a backwards-compatibility mode created after we switched to Talaris.

 

I don't know what score13 is exactly, but it's probably something purpose built for DDG, or maybe it's related to the scorefunction development that was meant to replace score12 but became talaris13 instead.

Wed, 2017-06-28 06:00
smlewis

score13 was indeed an abandoned attempt at making a successor to score12.  It's not really related in any way to talaris2013, other than being part of the long effort to improve upon score12.

Thu, 2017-06-29 07:32
rmoretti

Hello!

Thanks to rosetta developers !!!!

I finished calculating ddG for my protein mutations and it worked fine! But I have some questions regarding the protocol.

1. Do I need to run the relax protocol before performing ddG calculations, given that minimization is also performed in the ddG protocol (minimize_with_cst)?

2. Could I prepare the protein using the relax protocol and generate the .cst file separately, without minimizing the protein in the ddG protocol (minimize_with_cst)? With minimize_with_cst, my protein's terminus is disturbed significantly. In the relax protocol I can control this, but I don't know how to control it in the minimize_with_cst step of the ddG calculation.

3. How do I prepare a .cst file for the minimizer?

Many thanks!

Wed, 2017-06-28 03:48
malkeet.singh

1.  Should you relax first?  I don't think anyone knows.  The Kellogg ddg paper from several years ago makes a determination of whether you should or not (I don't remember the answer) - so if you are copying that protocol exactly, do what the paper says.  Otherwise, results vary depending on the setting.  Global relax tends to introduce a lot of noise, but no relax at all also leaves in problems left over from the crystal structure.

 

2.  I don't understand the question.  Probably yes.

 

3.  I've linked you to the constraint file documentation above.  I don't know where the script used to make that exact sample file is but I've asked.

Wed, 2017-06-28 06:14
smlewis

Hello!

After an interesting time with the ddG tutorials, I moved on to real work: my protein. My protein has an ATP molecule and a Mg ion in the PDB file.

1. In the 'relax' run, ATP was not recognized, so I ignored it. It is not present in the output PDB either, but I need it for further calculations. How can I keep it and process it in relax (without ignoring it)?

2. Then I used minimize_with_cst and finally ddg_monomer......linuxgccrelease. Here it ends with the error 'caught exception atom 'CA 227' not found', where residue 227 is the Mg2+ ion. How should I proceed? (I have already used the flag -ignore_unrecognized_res, but the error persists.)

 

Thanks!

 

Wed, 2017-06-28 06:41
malkeet.singh

1. You need a params file for ATP.  It's common, so one way is via https://www.rosettacommons.org/docs/latest/build_documentation/Build-Documentation#setting-up-rosetta-3_obtaining-additional-files_pdb-chemical-components-dictionary.  You can also generate a parameters file yourself with molfile_to_params - look around the forums / wiki for more information.

2. I believe the ddg_monomer application is just flatly incompatible with ligands (thus "monomer").

Wed, 2017-06-28 07:05
smlewis

Hello smlewis!

Thanks for your reply!  :) 

 

1. I tried your suggestion using the "-in:file:load_PDB_component" flag. I downloaded the CCD (Chemical Component Dictionary), copied it to /main/database/chemical, and ran the relax protocol. It worked fine and ATP and Mg2+ are present in the relax output PDB file, but the positions of ATP and Mg2+ are highly disturbed (the protein backbone is OK). Could you suggest a reason for this?

2. When the -in:file:load_PDB_component flag is used, how are the ligands treated? I mean, is the crystal structure conformation of the ligand simply carried over into the relax output PDB, or is the ligand conformation optimized with respect to the protein conformation?

3. As you said that ddg_monomer is incompatible with ligands, I tried the predict_ddG.py script, but I am getting this error.

genesis12:/home/gnss/singhma/rosetta/test/cftr> /home/gnss/singhma/rosetta/software/rosetta_2017/demos/public/mp_ddg/predict_ddG.py -p 2pze_relax-zn.pdb -r 75 -m P -a 10 -t true -v 7.4
Traceback (most recent call last):
  File "/home/gnss/singhma/rosetta/software/rosetta_2017/demos/public/mp_ddg/predict_ddG.py", line 25, in <module>
    import rosetta.protocols.membrane
ImportError: No module named rosetta.protocols.membrane
 

Could you please suggest a solution? Moreover, is it possible to use a protein with ATP and Mg2+ to calculate ddG using the predict_ddG.py script?

 

Thanks!

malkeet

Thu, 2017-06-29 04:32
malkeet.singh

On 1., check the tracer output of the runs - if you're getting a notice about repacking the ATP residue because of missing atoms, that might be the cause.

On 2. The -in:file:load_PDB_component option is a very minimal approach to keep non-protein residues present. The ligand should indeed be present and interacting with the rest of the protein during the protocol, but there's little treatment of the dynamic flexibility of the ligand. Instead it relies on the input coordinates to be present and preserved through the protocol. If you repack (even an implicit repacking due to missing atoms) then there's a reasonable chance you'll mess up the conformation of your ligand.

Another thing to keep in mind is that the -in:file:load_PDB_component option for ligand loading will take the version of the ligand that's in the wwPDB's chemical components dictionary. Depending on how they've represented things, this may not be what you want. For example, many of their compounds are in their uncharged form, rather than the protonation state they'd be in for neutral solution. This may be an issue with ATP, as you may end up with protonated phosphates. It may be better to manually create an ATP params file and feed that to Rosetta with -extra_res_fa instead: https://www.rosettacommons.org/demos/latest/tutorials/prepare_ligand/prepare_ligand_tutorial  This will also allow you to make a rotamer library for ATP, which will help address the packing issue.

3. Do you have PyRosetta installed? The predict_ddG.py script is a PyRosetta script, and thus requires a working installation of PyRosetta. If you just have the commandline version of Rosetta, it won't work.

The other caveat is that predict_ddG.py  is intended for membrane proteins. If you have a soluble protein, it might not work out well.

 

Thu, 2017-06-29 07:46
rmoretti

Thanks for your super quick reply!

 

The ATP conformation problem seems to be due to repacking, as the -in:file:load_PDB_component option takes its information from the wwPDB's chemical components dictionary. So a params file might overcome this problem. I will try it.

The ultimate question is: I have to analyze the ddG change in protein mutants, and my protein has Mg2+ and is not a transmembrane protein. What should I do to handle it, given that the predict_ddG.py script targets membrane proteins and ddg_monomer can't handle ligands (Mg2+)? Is there any way to proceed with this work using the Rosetta ddG applications?

Thanks!

 

Thu, 2017-06-29 10:09
malkeet.singh

There are various ddG-of-folding scripts for RosettaScripts floating around, which can probably be re-purposed for your needs. The group who has put in the most effort into these protocols recently is probably the Kortemme lab at UCSF. I'm not sure if they've published anything on their work yet, though. Their most recent publication in the area that I can find (https://doi.org/10.1371/journal.pone.0130433) apparently still uses the ddg_monomer application (see https://guybrush.ucsf.edu/benchmarks/benchmarks/DDG)

 

That said, I believe the ddg_monomer application has had some bug fixes in it to allow non-protein residues. If you're not using a very recent weekly release, you may want to try updating your Rosetta version and see if that fixes things.

Thu, 2017-06-29 13:15
rmoretti

Hello!

I ran the relax protocol with the following parameters. It ran successfully, but the ligand was not incorporated in the final PDB.

For the metal, I used -in:auto:setup_metals >> the metal is incorporated in the final PDB.

For the crystal ligand, I used its params file with -extra:res_fa 'path to param file' >> no ligand in the final PDB.

Could you please suggest how to handle the crystal ligand in the relax protocol?

With another PDB (metal and ligand), this error is displayed: segmentation fault (core dumped) >> what is this?

 

malkeet

Thu, 2017-07-06 07:12
malkeet.singh

For the first question:

Do we know that it read and tried to use your parameter file?  Did you leave the chemical components dictionary active, or did you turn that flag off?  I suspect Rocco wrote it such that it will use a user-passed parameter file over an automatic parameter set from the chemical components dictionary - but if they aren't lined up perfectly, it will use the wrong one and you won't notice the problem.  So, check these things:

1) make sure that the params file and the PDB are using the exact same 3-letter code for the ligand

2) make sure that -ignore_unrecognized_res and the flags for the chemical components dictionary are OFF
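The first check above can be done by eye, but a small script makes it harder to miss a mismatch. What follows is only a minimal sketch, not part of Rosetta: the function names are made up for illustration, and it assumes the params file carries its three-letter code on an IO_STRING line and that the ligand appears as HETATM records with the residue name in the standard PDB columns 18-20.

```python
def params_io_code(params_text):
    """Return the three-letter code from the IO_STRING line of a params file.

    Assumption: the params file has a line such as 'IO_STRING ATP Z'.
    """
    for line in params_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "IO_STRING":
            return fields[1]
    return None


def pdb_hetero_codes(pdb_text):
    """Collect residue names (PDB columns 18-20) from all HETATM records."""
    codes = set()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            codes.add(line[17:20].strip())
    return codes


def ligand_code_matches(params_text, pdb_text):
    """True if the params file's code is used by some HETATM residue in the PDB."""
    code = params_io_code(params_text)
    return code is not None and code in pdb_hetero_codes(pdb_text)
```

To use it, read both files with open(...).read() and pass the text in. If ligand_code_matches returns False, the params file is probably not the one being applied to the ligand.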

 

For the second question: If you google segmentation fault, you'll see that it is a highly generic error that just means that illegal memory was accessed.  We can't tell you what the problem was just from the fact that it segfaulted.  The standard advice is A) look at the log file to see if there is something useful nearby, B) recompile and rerun in debug mode, which will sometimes produce more useful errors (at the cost of being slower).  (The extra checks debug mode does are what make it slow, but they also make the errors clear instead of just crashing.)

Thu, 2017-07-06 08:08
smlewis

Hello!

In the relax protocol, if run without constraints, the repacking of side chains significantly changes the final structure. Is there some specific reason for repacking the side chains instead of doing a simple minimization?

Is there a paper explaining the basics of this protocol?

Mon, 2017-07-10 06:46
malkeet.singh

One way to look at this is "how fixed are those sidechains in reality"?  For most structures, only the core sidechains have sufficient density to unambiguously assign rotamers, and even then they'll move as the protein 'breathes'.  

The point of relax is to take a structure and modify it to fit Rosetta's scorefunction.  It can achieve much lower scores by repacking the sidechains - both by editing out lots of sidechain-involved clashes, and by refitting rotamers into rotamer wells.  The rotamericity of crystal structures is sometimes pretty poor (or at least without sufficient evidence to go off-rotamer).  If the point of relax is to find lower-energy structures, repacking is better because it finds lower-energy structures.  If you are relaxing for some other reason, it might not be best.

Relaxing with strong constraints is better when either A) you know the crystal is right and the change Rosetta wants to make is wrong, or B) you just want to improve the worst parts of the structure without giving Rosetta free rein.

There are no papers on relax directly.  Relax was created as the back half of ab-initio: it lets you start with a centroid model from ab initio structure prediction and turn it into a fully atomic model.  This HAS to repack, because there was no sidechain data in the starting model.  I assume some of the older (but not oldEST) papers on structure prediction describe relax.

Mon, 2017-07-10 10:49
smlewis

Hello!!

In the relax protocol, I used another PDB which doesn't have any metal ion (2RNU). I deleted the sulfate ion and used chain A of this structure, which contains the protein, waters and the crystal ligand.

For relax, I used -relax:constrain_relax_to_start_coords and the crystalligand.param file of the crystal ligand (CL) in two ways: with only the crystal conformation, and with a pool of generated conformers.

When I input the pool of CL conformers with the flag -extra_res_fa (.param file) >> the CL is badly displaced in the final output PDB.

When I used only the CL's crystal conformation >> the output ligand is weird: the molecule is broken into pieces (bonds are broken).

PS: when the PDB has only protein (no ligand), the whole procedure is successful (I tried it on the tutorial PDB 1ubq.pdb)!

 

Clean PDB with ligand: I copied this script into my working directory and ran chmod +x to make it executable. When I ran it on my PDB, the following error appeared:

genesis:/home/gnss/singhma/rosetta/test/> python clean_pdb_keep_ligand.py 2nru.pdb
Traceback (most recent call last):
  File "clean_pdb_keep_ligand.py", line 19, in <module>
    from amino_acids import longer_names
ImportError: No module named amino_acids

 

Thanks in advance!

Malkeet

Tue, 2017-07-11 05:44
malkeet.singh

Assuming source/src/apps/public/relax_w_allatom_cst/clean_pdb_keep_ligand.py is the one you are using - it works for me, locally.  I have a vague memory of trying to use it and having an issue, but I don't remember how it was resolved.  It was either adding something to the $PATH or $PYTHONPATH, or maybe finding the module it wanted to import from somewhere else in Rosetta's hierarchy (probably in tools) and copying/symlinking it to the directory I wanted to run in.  Rosetta/tools/python_pdb_structure/amino_acids.py is probably the file it is looking for.
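If the missing module really is amino_acids.py, the "find it and make it importable" approach described above might look roughly like the following. This is a minimal sketch under that assumption: the helper names are invented here, and the directory you pass in as the root is whatever your own Rosetta checkout is.

```python
import os
import sys


def find_module_dir(root, filename="amino_acids.py"):
    """Walk the tree under root and return the first directory holding filename."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return dirpath
    return None


def make_importable(root, filename="amino_acids.py"):
    """Put the directory containing filename at the front of sys.path.

    Returns the directory that was found, or None if the file is not under root.
    """
    module_dir = find_module_dir(root, filename)
    if module_dir is not None and module_dir not in sys.path:
        sys.path.insert(0, module_dir)
    return module_dir
```

Calling make_importable on your Rosetta tools directory before the import has the same effect as adding that directory to $PYTHONPATH for a single run, without copying or symlinking anything.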

If you only want to do it a few times, it's just as simple to run a functioning clean_pdb and then copy the ligand back in with a text editor.

 

 

> When I input the pool of CL conformers with the flag -extra_res_fa (.param file) >> the CL is badly displaced in the final output PDB.

Rosetta is presumably telling you that it doesn't like where the ligand is, for whatever reason.  You'll have to use your own knowledge of the system to determine if Rosetta is wrong or if the system is wrong.  Constraints are the way to override Rosetta.

> When I used only the CL's crystal conformation >> the output ligand is weird: the molecule is broken into pieces (bonds are broken).

How large is the molecule?  It's not split up into multiple params files?  Are you using cartesian minimization in your relax?  My best guess is that the cartesian step, if present, doesn't understand the molecule (lack of parameters for it; the tools that generate ligand params can't do so for cartesian minimization) and is letting the bonds it doesn't know about break.

Tue, 2017-07-11 07:45
smlewis

Hello Rosetta Developers!

In continuation of my previous discussion on this forum: I completed ddG calculations for ~40 protein mutations without considering the ligand and metal ion, and got good results. Now I would like to calculate ddG values with the ligand (ATP) and the metal ion (Mg) in the active site, which should further improve the results.

1. First I need to relax the protein-crystal ligand (CL) complex. If I use the CL without constraints, the CL in the final PDB is totally disturbed. I tried to understand the 'cstfile' provided in the minimizer tutorial, but I could not quite follow it. I need to impose restraints on the ATP molecule, for which I have coordinates available. Could you please explain how to proceed?

2. In the relax protocol there are parameters for Mg2+. Can I use them (perhaps with some changes) to include and score Mg2+ in the ddg_monomer run (ddG calculation)?

3. Similarly, is there any way to include ATP (with restraints) in the ddG calculations?

Many Thanks!

Malkeet

Wed, 2017-08-23 09:38
malkeet.singh

See the constraint file documentation for more information on putting together a constraint file. You should be able to use that during the relax/minimization phases.  

I'd probably recommend against coordinate constraints, and look more toward atom pair (distance) constraints. That is, you pick pairs of atoms which should stay about the same distance apart, and impose a constraint between them. I'd suggest an iterative approach, where you do the relax/minimization, look at what moved but shouldn't, add constraints to those pairs, rerun the relax/minimization, and then iterate the process until you get what you want.
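To make the atom pair suggestion concrete, here is a minimal sketch that measures the current distance between chosen atom pairs in a PDB and writes one AtomPair line per pair. It is not an official Rosetta tool and the function names are made up. It assumes the constraint-file form "AtomPair name1 res1 name2 res2 HARMONIC x0 sd", where x0 is the target distance and sd the standard deviation (a smaller sd gives a tighter constraint). It also assumes the residue numbers in the PDB are the ones Rosetta will use, so check the constraint file documentation on residue numbering if your PDB does not start at 1.

```python
import math


def parse_atoms(pdb_text):
    """Map (atom name, residue number) -> (x, y, z) for ATOM/HETATM records."""
    atoms = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()       # atom name, PDB columns 13-16
            resnum = int(line[22:26])        # residue number, columns 23-26
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms[(name, resnum)] = xyz
    return atoms


def atom_pair_lines(atoms, pairs, sd=0.5):
    """Build one AtomPair HARMONIC line per pair, targeting the current distance.

    pairs is a list of ((name1, res1), (name2, res2)) tuples.
    """
    lines = []
    for first, second in pairs:
        dist = math.dist(atoms[first], atoms[second])   # needs Python 3.8+
        lines.append("AtomPair %s %d %s %d HARMONIC %.2f %.2f"
                     % (first[0], first[1], second[0], second[1], dist, sd))
    return lines
```

For the iterative approach, you would start with a handful of pairs, rerun relax, and then append pairs for whatever still moved. Which atoms to pair, and how small to make sd, are judgment calls that this sketch does not make for you.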

Adding ATP to the run should theoretically be as simple as adding the appropriate params file for ATP. However, I know that various versions of the ddG protocol have historically had issues with non-protein residues, so there may be bugs you'll need to work around/through.

Wed, 2017-08-23 11:45
rmoretti

Hello!

Bingo! My ddg_monomer run worked with the metal and ligand. Below are the hassles I ran into.

- The first step in the ddg_monomer protocol (ddG calculation) is protein minimization with constraints using minimize_with_cst.static.linuxgccrelease. In this step the protein, metal and ligand combination works well and a xxx.cst file (constraints file) is generated.

- The final step is the ddG calculation using ddg_monomer.static.linuxgccrelease, where the metal and ligand were not recognized and the run failed.

- The reason is that the xxx.cst file generated from minimize_with_cst.static.linuxgccrelease puts its constraints on CA atoms, and if you open the file you will see that the metal and ligand atoms are also labelled CA, which they are not. Changing the names of these atoms to the correct ones solved the problem. For instance, I have a Mg ion in my protein, so I changed the CA at the Mg position to MG and it worked.
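For anyone repeating this on a larger file, the manual edit can be scripted. The following is only a sketch with a made-up function name. It assumes the constraint lines have the form "AtomPair CA res1 CA res2 HARMONIC ..." and that you supply the mapping from residue number to the correct atom name yourself (for example, residue 227 to MG, if 227 is the Mg ion as in the earlier error message).

```python
def fix_atom_names(cst_text, atom_for_residue):
    """Replace 'CA' with the right atom name for the listed residue numbers.

    atom_for_residue maps residue numbers (as strings) to atom names,
    e.g. {"227": "MG"}. Lines that need no change are kept exactly as they were.
    """
    fixed = []
    for line in cst_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == "AtomPair":
            changed = False
            # Layout assumed: AtomPair name1 res1 name2 res2 <function ...>
            for name_idx, res_idx in ((1, 2), (3, 4)):
                if fields[name_idx] == "CA" and fields[res_idx] in atom_for_residue:
                    fields[name_idx] = atom_for_residue[fields[res_idx]]
                    changed = True
            if changed:
                line = " ".join(fields)
        fixed.append(line)
    return "\n".join(fixed)
```

For a ligand such as ATP you would have to decide which ligand atom each constraint should really refer to. The script cannot know that, so the mapping is left entirely to the user.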

 

Dear Rosetta developers! 

Thanks for bearing with me and replying timely to my questions.

Best of luck!

malkeet

 

Tue, 2017-08-29 01:38
malkeet.singh

Hello!

My ddG calculations using ref2015 are taking more than 1 day per mutation on a 2.8 GHz processor. Is this expected, or is something fishy? This was not the case when I used talaris on the same CPU.

 

Thanks!

Malkeet

Tue, 2017-08-29 04:46
malkeet.singh

Hello!

Before using my protein for ddG calculations, I want to run relax, and I don't want the metal and ATP to move inappropriately. As per your suggestions, I tried constraints.

- Atom pair: here I used all N and P atoms of ATP and the CA atoms of two residues (~16 constraint lines). For Mg, I used 4 constraints to the CA atoms of 4 amino acids. I tried constraint values from 0.5 up to 50, yet my ATP and Mg are still disturbed in the final structure, and I don't understand why. I am using an ATP.param file containing a single structure (obtained from the crystal structure) because I don't want different conformations of ATP.

- Coordinate constraint: here I used the X, Y, Z coordinates of ATP and Mg, with the CA of one amino acid as the reference atom for the constraint. Here too, Mg and ATP are totally disturbed in the final structure. Moreover, in the sample cstfile, two numbers are given after Harmonic. Could you tell me the significance of both (e.g. 0.2  0.1)?

These are the general flags that I used:

-run:min_type lbfgs_armijo_nonmonotone
-run:min_tolerance 0.001
-constraints:cst_file cstfile
-score:weights ref2015_cst
-out:suffix _minwithcsts
-relax:constrain_relax_to_start_coords
-out:level 400
-relax:default_repeats 1
-ignore_waters
-use_input_sc
-nstruct 1

-extra_res /path/ATP.params

 

Kindly suggest a solution for these hassles.

Many thanks in advance!

Malkeet 

 

Tue, 2017-08-29 09:16
malkeet.singh