
Total score problems with high-res ddg_monomer


Good Afternoon,

I'm running the high-res protocol of ddg_monomer on a 439-residue enzyme structure (following the Kellogg paper and the online documentation), and I'm seeing some odd score outputs.

There are three types of odd outputs in the ddg_predictions.out file:

1: Residues with one to two mutations with extreme total score(s):
On the whole, the total scores for most mutations are around 0 +/- 4. However, throughout the protein a significant number of residues have one or two mutations with a total score of -60 to -80 or +80 to +120. An example of a residue with one of these mutations (D to Y) is in the first output file.

2: Residues with completely extreme negative or positive scores:
As above, most residues have mutation total scores around 0, but for some residues every mutation scores extremely negative or extremely positive (around -80 or +90). The second output file is an example of this.

3: Inconsistent results from the two cluster runs:
I'm splitting the mutation alphabet for each residue in half (i.e., A to L and M to Y), so each residue gets two ddg_monomer runs (due to walltime limits on my cluster). While rare, for some residues the mutations in one run have total scores on the order of 0 to 4, while every mutation in the second run has a total score on the order of +80 or -80. An example of this is in output file three.
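
To make the three cases above easier to spot across many residues, here is a minimal Python sketch that sorts each residue into "ok", "some_extreme" (case 1) or "all_extreme" (cases 2 and 3). It assumes the total scores have already been read into a dictionary per residue. The function name, the cutoff of 20 score units, and the example values are all hypothetical and are not taken from the attached files.

```python
def classify_residue(totals, cutoff=20.0):
    """Classify one residue from its {mutation: total score} mapping.

    `cutoff` is an assumed threshold; typical totals in this thread are
    around 0 +/- 4, so anything beyond +/- 20 is treated as extreme.
    """
    extreme = {mut for mut, score in totals.items() if abs(score) > cutoff}
    if not extreme:
        return "ok", extreme
    if len(extreme) == len(totals):
        return "all_extreme", extreme
    return "some_extreme", extreme


# Hypothetical values for illustration only.
one_bad = {"D23A": 1.2, "D23F": -0.8, "D23Y": 95.4, "D23K": 2.9}
all_bad = {"E88A": -79.5, "E88F": -81.2, "E88K": -80.1}

print(classify_residue(one_bad))
print(classify_residue(all_bad))
```

Residues that come back as "all_extreme" in only one of the two half-alphabet runs would correspond to the third case.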

I've run this protocol with this structure on Rosetta 3.5, 3.4, and currently 3.3, and have seen similar results. Setting a constant seed has helped with the third issue (i.e., reusing the random seed from the low-scoring run in the other run). I also tried -use_input_sc, but saw no improvement with that flag.

Is this behavior normal? Should I set a constant seed for all runs? Is it just that the structure is too big?

Thank you in advance!

Command line (For Rosetta 3.3):
/mnt/home/rosetta_source/bin/ddg_monomer.linuxgccrelease -in:file:s /mnt/home/min_cst_0.5.L_PMF1_0001.pdb -resfile mutations.res -ddg:weight_file soft_rep_design -database /mnt/home/rosetta_database/ -fa_max_dis 9.0 -ddg::iterations 50 -ddg::dump_pdbs false -ignore_unrecognized_res -ddg::local_opt_only false -ddg::min_cst true -constraints::cst_file /mnt/home/input.cst -ddg::suppress_checkpointing true -in::file::fullatom -ddg::mean false -ddg::min true -ddg::sc_min_only false -ddg::ramp_repulsive true -unmute core.optimization.LineMinimizer -ddg::output_silent true -use_input_sc

Attachments:
1ddg_predictions.txt (4.74 KB)
2ddg_predictions.txt (4.74 KB)
3ddg_predictions.txt (4.74 KB)
Tue, 2015-03-10 15:54
jklesmith

I'm guessing that you have an issue with pre-optimizing your structure. For the ddg_monomer protocol (as with many protocols), you need to pre-optimize the structure to get rid of major errors in the starting conformation. If you don't, you'll see variations in your results where some mutations get anomalously high or low values because a major problem in the structure just happens to be relieved by the repacking/relaxation during structure optimization.

Did you use the minimize_with_cst application on your input structure prior to running ddg_monomer? (https://www.rosettacommons.org/docs/latest/ddg-monomer.html) Depending on how things are working, that may or may not work sufficiently well. Take a look at the output of that application (the structure you're going to use for ddg_monomer input), paying particular attention to the per-residue energies at the end of the output PDB. If there are any that are abnormally positive, you may need to adjust the preparation protocol to fix up those issues prior to running ddg_monomer.
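
As a rough way to do that check, the sketch below pulls the per-residue totals out of a PDB and lists the most positive ones. It assumes the file ends with a whitespace-separated energies table delimited by `#BEGIN_POSE_ENERGIES_TABLE` and `#END_POSE_ENERGIES_TABLE`, with a `label` header row that includes a `total` column; check your own output, since the layout may differ between Rosetta versions. The function names, the cutoff of 2.0, and the sample lines are hypothetical.

```python
def residue_totals(pdb_lines):
    """Map residue label -> total score from the energies table (assumed layout)."""
    totals, in_table, total_col = {}, False, None
    for line in pdb_lines:
        if line.startswith("#BEGIN_POSE_ENERGIES_TABLE"):
            in_table = True
            continue
        if line.startswith("#END_POSE_ENERGIES_TABLE"):
            break
        fields = line.split()
        if not in_table or not fields:
            continue
        if fields[0] == "label":
            # Header row: find which column holds the total.
            total_col = fields.index("total")
        elif fields[0] not in ("weights", "pose") and total_col is not None:
            totals[fields[0]] = float(fields[total_col])
    return totals


def worst_residues(totals, cutoff=2.0):
    """Residues whose total exceeds the (assumed) cutoff, most positive first."""
    bad = [(name, score) for name, score in totals.items() if score > cutoff]
    return sorted(bad, key=lambda item: -item[1])


# Hypothetical table for illustration only.
sample = """#BEGIN_POSE_ENERGIES_TABLE example.pdb
label fa_atr fa_rep total
weights 0.8 0.44 NA
pose -900.0 150.0 -750.0
MET_1 -2.1 0.3 -1.8
ASP_23 -3.0 12.5 9.5
TRP_47 -5.2 0.9 -4.3
#END_POSE_ENERGIES_TABLE""".splitlines()

print(worst_residues(residue_totals(sample)))
```

For a real file, pass `open(path).read().splitlines()` instead of `sample`.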

Wed, 2015-03-25 12:30
rmoretti

Hello - I have a similar problem to jklesmith's, though it's not identical.

I cannot reproduce my results when I run exactly the same experiment a second time (just copying all files to a new directory and running again). Attached is a PNG of the correlation between runs 1 and 2. The black line is the 1:1 correlation. You can see that some mutants look OK (green ring) while others differ very badly (red rings).
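
To put numbers on that plot, here is a minimal Python sketch that computes the Pearson correlation between two runs and lists the mutants whose ddG differs by more than a tolerance. It assumes the ddG values have already been read into one dictionary per run. The function names and the tolerance of 1.0 REU are hypothetical, and apart from N47A (+3 vs. -6 REU) the example values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def compare_runs(run1, run2, tol=1.0):
    """Return (correlation, {mutant: (ddG run 1, ddG run 2)}) for shared mutants.

    A mutant is reported when the two runs differ by more than `tol`,
    which is an assumed tolerance in REU.
    """
    shared = sorted(set(run1) & set(run2))
    r = pearson([run1[m] for m in shared], [run2[m] for m in shared])
    discordant = {m: (run1[m], run2[m]) for m in shared
                  if abs(run1[m] - run2[m]) > tol}
    return r, discordant


# N47A values are from this post; the rest are hypothetical.
run1 = {"N47A": 3.0, "L12A": 1.5, "K30E": -0.4, "G55P": 4.2}
run2 = {"N47A": -6.0, "L12A": 1.3, "K30E": -0.1, "G55P": 4.6}

r, discordant = compare_runs(run1, run2)
print(round(r, 2), discordant)
```

Mutants that land in `discordant` and also change sign between runs are the ones that would end up in different stabilizing/destabilizing buckets.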

I have checked the structures (I output all 50 from the iterations) and see almost zero difference between the structures in the region surrounding the mutation, and only very minor differences in the loop regions. I've also attached a couple of images to demonstrate this. They show the N47A mutation highlighted on the plot (ddG is +3 REU, or -6 REU in the repeat). The differences in the two TRP residues near the mutation (pink sticks in the image) occur between the repacked WT and the mutant, but there is no discernible difference in packing between the mutant and the repeated mutant.

I have used the minimization prep as instructed, and included the cst file in the calculation.

I have also tried increasing the number of iterations to 50 (the correlation between repeats was even worse with only 20 iterations) and setting the ddg::mean true flag to see whether this would even things out (thanks to Julia K for the suggestions!), but this is still the best I can come up with.

Does anyone have any further suggestions? I don't really mind whether the absolute numbers are correct; I'm just looking to bucket mutations into stabilizing in the presence of ligand / stabilizing in the absence of ligand / stabilizing in either / destabilizing, and I would like to be able to reproduce a trend.
Thank you for your help!

Fri, 2015-03-27 05:56
jennifer