
"ddg_monomer.linuxgccrelease" by following "High Resolution Protocol"


(I would appreciate comments on any of the following.)

Dear friends,
I am trying to use "ddg_monomer.linuxgccrelease" by following "High Resolution Protocol" in
https://www.rosettacommons.org/docs/latest/ddg-monomer.html

1. To pre-minimize, the command line is

/path/to/minimize_with_cst.linuxgccrelease -in:file:l lst -in:file:fullatom -ignore_unrecognized_res -fa_max_dis 9.0 -database /path/to/rosetta_database/ -ddg::harmonic_ca_tether 0.5 -score:weights standard -ddg::constraint_weight 1.0 -ddg::out_pdb_prefix min_cst_0.5 -ddg::sc_min_only false -score:patch rosetta_database/scoring/weights/score12.wts_patch > mincst.log

Can I ask
1) As there is no "standard" score weights file, should I use "score12.wts"?

2) It seems that the following run succeeds:

~/Cheng/rosetta_2014.30.57114_bundle/main/source/bin/minimize_with_cst.linuxgccrelease -in:file:l /mnt/hgfs/Downloads/lst.txt -in:file:fullatom -ignore_unrecognized_res -fa_max_dis 9.0 -database ~/Cheng/rosetta_2014.30.57114_bundle/main/database -ddg::harmonic_ca_tether 0.5 -score:weights score12 -ddg::constraint_weight 1.0 -ddg::out_pdb_prefix min_cst_0.5 -ddg::sc_min_only false -score:patch ~/Cheng/rosetta_2014.30.57114_bundle/main/database/scoring/weights/score12.wts_patch > mincst.log

However, I got the error message at the end of "mincst.log" as below (attached):

"Error: FileData::dump_pdb: Unable to open file:.//min_cst_0.5./mnt/hgfs/Downloads/ddg_test_0001.pdb for writing!!!"

Is this a vital problem, or can I ignore this?

2. To generate the distance restraints file, the command line is

./convert_to_cst_file.sh mincst.log > input.cst

1) At first I was told "Permission denied". After adding "sudo", I was told "...convert_to_cst_file.sh: command not found":

lanselibai@ubuntu:/mnt/hgfs/Downloads$ ~/Cheng/rosetta_2014.30.57114_bundle/main/source/src/apps/public/ddg/convert_to_cst_file.sh mincst.log > input.cst
bash: /home/lanselibai/Cheng/rosetta_2014.30.57114_bundle/main/source/src/apps/public/ddg/convert_to_cst_file.sh: Permission denied

lanselibai@ubuntu:/mnt/hgfs/Downloads$ sudo ~/Cheng/rosetta_2014.30.57114_bundle/main/source/src/apps/public/ddg/convert_to_cst_file.sh mincst.log > input.cst
[sudo] password for lanselibai:
sudo: /home/lanselibai/Cheng/rosetta_2014.30.57114_bundle/main/source/src/apps/public/ddg/convert_to_cst_file.sh: command not found

Can I ask how to solve this?

3. To use "ddg_monomer.linuxgccrelease"
1) "PDB numbering" for mutation: the rule is to number consecutively from 1 to the very end. However, my antibody has a heavy chain (HC) and a light chain (LC), and the residues of each chain are numbered separately, starting from 1. Is there a way to specify whether the mutation point is in the HC or in the LC?

This problem also happens when dealing with the loop modeling and homology comparative modeling. I have to treat the two chains separately and manually combine the two PDB files together. Is there a more efficient way to do this?

I also find that "pmut_scan_parallel.linuxgccrelease" can specify the chain identity, which is also much simpler than "ddg_monomer.linuxgccrelease". Should I simply use "pmut_scan_parallel.linuxgccrelease"?

2) Why are there red and green command lines listed in the "Option" section on the website?

3) Is the "fix_bb_monomer_ddg.linuxgccrelease" mentioned in the paper just the "ddg_monomer.linuxgccrelease"?

4) I tested "ddg_monomer.linuxgccrelease" and was told

Warning: Unable to locate database file chemical/atom_type_sets/fa_standard/
ERROR: Unable to open atomset file: ~/Cheng/rosetta_2014.30.57114_bundle/main/database/chemical/atom_type_sets/fa_standard//atom_properties.txt
ERROR:: Exit from: src/core/chemical/AtomTypeSet.cc line: 178
caught exception
[ERROR] EXCN_utility_exit has been thrown from: src/core/chemical/AtomTypeSet.cc line: 178
ERROR: Unable to open atomset file: ~/Cheng/rosetta_2014.30.57114_bundle/main/database/chemical/atom_type_sets/fa_standard//atom_properties.txt

If "main/database/" is not the right one, can I ask what is the correct directory for database?

Thank you very much.

Yours sincerely
Cheng

Attachment: mincst.log (102.31 KB)
Wed, 2014-10-01 09:54
lanselibai

1.1) With the weekly releases, we changed the default scorefunction to talaris2013. With this we renamed the "standard" weights (which were no longer standard) to "pre_talaris_2013_standard". What you should do is change "standard" to "pre_talaris_2013_standard". You will also have to add the flag "-restore_pre_talaris_2013_behavior" to the commandline to reverse all the non-weight related changes to the scorefunction that also occurred with the change-over. (You'll want to do this with any associated ddg_monomer runs, too, even ones where "standard" isn't used.)

1.2) It looks like the minimize_with_cst application does some non-standard filename manipulation, assuming that the filenames which you give it are in the current directory, rather than absolute paths. If you actually want the minimized structure, the error is vital to correct. You'll need to change your setup to run minimize_with_cst in the same directory as your input PDBs, and to change your lst.txt file to use just the filename of the input PDB, rather than the full path.
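
A minimal sketch of that setup, assuming the input PDBs sit together in one directory. The temporary directory and the file name ddg_test.pdb are placeholders for illustration only; the point is that the list file holds bare filenames and that the run starts from the directory containing them.

```shell
# Placeholder directory standing in for wherever your input PDBs live.
pdb_dir=$(mktemp -d)
touch "$pdb_dir/ddg_test.pdb"   # stands in for a real input structure

cd "$pdb_dir" || exit 1

# Write bare filenames (no directory part) into the list file.
for f in *.pdb; do
    basename "$f"
done > lst

cat lst
# minimize_with_cst would then be launched from this same directory with
# "-in:file:l lst", so the prefixed output name stays a valid relative path.
```

With a bare filename in the list, the prefix given by -ddg::out_pdb_prefix is prepended to a plain name rather than to a path containing slashes.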

2.1) That script is not marked as executable on your system, so your shell is refusing to run it like a program. You can either make it executable with "chmod a+x convert_to_cst_file.sh", or you can explicitly specify the interpreter to use, like "tcsh ./convert_to_cst_file.sh mincst.log > input.cst" (as it is a tcsh shell script).
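
The two options can be tried on a throwaway stand-in script. This is only a sketch: the stand-in is a two-line sh script created for the demonstration, not the real convert_to_cst_file.sh (which is tcsh), so "sh" here plays the role that "tcsh" plays above.

```shell
# Stand-in script (hypothetical), created without the execute bit.
script=$(mktemp)
printf '%s\n' '#!/bin/sh' 'echo "converted"' > "$script"
chmod a-x "$script"

if [ ! -x "$script" ]; then
    echo "no execute bit: running it directly would give 'Permission denied'"
fi

# Option 1: set the execute bit once.
chmod a+x "$script"

# Option 2: name the interpreter explicitly; this works even without the execute bit.
sh "$script"
```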

3.1) The resfile syntax input should be in PDB numbered form, where you can specify both the chain and the (possibly non-consecutive) PDB numbering. You can specify heavy or light chain with the chain letter for each (e.g. H & L, or A & B, however you have the input PDB set up). The ddg_monomer-specific multi-mutation enabled format uses pose numbering, though. That's an unfortunate circumstance of some Rosetta input files. You either have to renumber your inputs such that PDB numbering matches pose numbering, or do the conversion manually whenever the circumstances call for it.
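
A rough sketch of doing that conversion with awk, under the assumption that pose numbering simply counts residues from 1 in file order, that every residue has exactly one CA atom record (no alternate locations), and that Rosetta does not drop any residue on input. The toy two-chain file and the function name pdb_to_pose are made up for illustration.

```shell
pdb=$(mktemp)
# Toy input (hypothetical): chain H residues 1-2, then chain L residues 1-2.
# The printf format lays the fields out in the fixed PDB ATOM columns.
{
  printf 'ATOM  %5d %4s %3s %s%4d\n' 1 ' CA ' ALA H 1
  printf 'ATOM  %5d %4s %3s %s%4d\n' 2 ' CA ' GLY H 2
  printf 'ATOM  %5d %4s %3s %s%4d\n' 3 ' CA ' ASP L 1
  printf 'ATOM  %5d %4s %3s %s%4d\n' 4 ' CA ' SER L 2
} > "$pdb"

# Print "chain pdb_number pose_number" for each CA atom, in file order.
# Columns used: atom name 13-16, chain ID 22, residue number 23-26.
pdb_to_pose() {
  awk 'substr($0,1,4)=="ATOM" && substr($0,13,4)==" CA " {
         n++
         print substr($0,22,1), substr($0,23,4)+0, n
       }' "$1"
}

pdb_to_pose "$pdb"
```

In this toy file, residue 1 of chain L comes out as pose residue 3, which is the number the pose-numbered format would expect.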

3.2) That's an artifact of how the webpage is put together. It's trying to be clever with "syntax highlighting", but it's not understanding Rosetta option file syntax. The colors mean nothing.

3.3) My understanding is that they are equivalent, yes.

3.4) The expansion of "~" to your home directory is something that the shell does to commandline parameters; Rosetta doesn't do it for option file options. So if you give the database directory in an option file, you have to write out the full path to your home directory, rather than using the "~" shortcut. Otherwise Rosetta doesn't know where to find "~".
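
One way to avoid typing the home directory by hand is to let the shell expand $HOME while writing the option file. A small sketch, assuming the bundle lives under the home directory as in the commands above; the temporary option file is only a placeholder.

```shell
opts=$(mktemp)   # placeholder for your real option file

# Double quotes let the shell substitute $HOME before the line is written,
# so the file ends up containing an absolute path instead of "~".
printf '%s\n' "-database ${HOME}/Cheng/rosetta_2014.30.57114_bundle/main/database" > "$opts"

cat "$opts"
```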

Mon, 2014-10-06 16:37
rmoretti

Hi R Moretti,
Thank you very much for answering each point. I really appreciate it.

1. To pre-minimize, as you suggested

A) use "pre_talaris_2013_standard.wts"
B) add "-restore_pre_talaris_2013_behavior" to all the command lines
C) use the current path (i.e. "./name.pdb") instead of the absolute path for the input PDB file in the "lst":

~/Cheng/rosetta_2014.30.57114_bundle/main/source/bin/minimize_with_cst.linuxgccrelease -in:file:l lst -in:file:fullatom -ignore_unrecognized_res -fa_max_dis 9.0 -database ~/Cheng/rosetta_2014.30.57114_bundle/main/database -ddg::harmonic_ca_tether 0.5 -score:weights ~/Cheng/rosetta_2014.30.57114_bundle/main/database/scoring/weights/pre_talaris_2013_standard.wts -restore_pre_talaris_2013_behavior -ddg::constraint_weight 1.0 -ddg::out_pdb_prefix min_cst_0.5 -ddg::sc_min_only false -score:patch ~/Cheng/rosetta_2014.30.57114_bundle/main/database/scoring/weights/score12.wts_patch > mincst.log

I got the message in the command line:

Number of residue types is greater than MAX_RESIDUE_TYPES. Rerun with -override_rsd_type_limit. Or if you have introduced a bunch of patches, consider declaring only the ones you want to use at the top of your app (with the options) with the command option[ chemical::include_patches ].push_back( ... ).

and error message in the "mincst.log" (last line, attached)

Error: FileData::dump_pdb: Unable to open file:.//min_cst_0.5../name_0001.pdb for writing!!!

As you also said at https://www.rosettacommons.org/node/3757, is this common if "-restore_pre_talaris_2013_behavior" is included? How can I adjust the command line to pre-minimize? In addition, no pre-minimized PDB was generated.

2. It works fine for me now as you said.

3.
3.1) Okay, I will use a resfile. Can I ask what you think of "pmut_scan_parallel.linuxgccrelease"? I think it is much simpler than "ddg_monomer.linuxgccrelease". For myself, I only want to know the ddg of different mutations.

3.2) I got that.

3.3) I see.

3.4) Wow, that works as you said!

Thank you very much.

Yours sincerely
Cheng

Tue, 2014-10-07 13:51
lanselibai

Dear friends,
Can I ask about the usage of the options file (attached)?

1) The options file is based on the "High Resolution Protocol". Based on previous posts, I should use the old scorefunction and restore the old behavior with -restore_pre_talaris_2013_behavior. In particular, are the following three flags correct in the options file?

-ddg:minimization_scorefunction pre_talaris_2013_standard
-ddg::minimization_patch /home/lanselibai/Cheng/rosetta_2014.30.57114_bundle/main/database/scoring/weights/score12.wts_patch
-restore_pre_talaris_2013_behavior

Do I really need to add "-restore_pre_talaris_2013_behavior"? After a 10-hour run, I was told "Number of residue types is greater than MAX_RESIDUE_TYPES" in the error file (attached). However, the discussion in https://www.rosettacommons.org/node/3757 seems to say we should not include "-restore_pre_talaris_2013_behavior". So what exactly should I do?

2) Of course it is easy to use the old scorefunction. I am just curious why we cannot simply use the new standard scorefunction, talaris2013. I know the paper was published before 2013, but why do we have to keep using the old scorefunction?

3) Maybe there is a tiny error on
https://www.rosettacommons.org/docs/latest/ddg-monomer.html
In the section of "Options A) High Resolution Protocol Flags"
The description for constraint-file is "<cbeta-distance-constraint-file>". Should it be "calpha"?

Thank you very much.

Yours sincerely
Cheng

Sat, 2014-12-13 04:42
lanselibai

You certainly can use the new scorefunction, if you want. The only caveat there is that the protocol was benchmarked with the older scorefunction, so if you want to replicate the published protocol - particularly if you're thinking about using the numbers derived from the paper as comparisons - you'll want to use the older scorefunction. I'm guessing the newer scorefunction will probably work. You may want to double check or recalibrate with known systems, though.

Regarding the "Number of residue type" issue, the quick fix is just to add "-override_rsd_type_limit" to the command line or option file. The message is there primarily for developers, rather than users, and in recent versions of Rosetta it no longer results in a program halt.

For the constraint, the constraint file is general enough to accept both C-beta and C-alpha distance constraints. Either one should work, although the behavior will of course be slightly different in the two cases.

Tue, 2014-12-16 12:05
rmoretti

Hi, R Moretti,
Thank you for your help. Can I ask

1) I do not fully understand the caveat regarding the new scorefunction. For example, what is the meaning of

"the protocol was benchmarked with the older scorefunction" (how to benchmark?)
"the numbers derived from the paper" (what number?)
"recalibrate with known system" (how to recalibrate?)

As far as I can understand, the "Performance" in Table 1 of the paper, which indicates the ddG prediction and ddG wet data, is based on score12. Therefore, the performance may be different if score2013 is used. Is this correct?

2) If there is nothing more complicated about using the score2013, I think I will go for score2013. So I will use

-ddg:minimization_scorefunction score2013
-override_rsd_type_limit
#DO NOT specify -ddg::minimization_patch
#DO NOT use -restore_pre_talaris_2013_behavior

Is that correct?

3) If there is something more that needs to be modified, I may use the score12. So I will use

-ddg:minimization_scorefunction score12
-override_rsd_type_limit
#DO NOT specify -ddg::minimization_patch
-restore_pre_talaris_2013_behavior

Is that correct? Should I use "score12" or "pre_talaris_2013_standard"? Are they the same?

4) Actually, I tried to test mutating one residue by including the following options and it worked.

-ddg:minimization_scorefunction score12
-ddg::minimization_patch score12
# NOT use -restore_pre_talaris_2013_behavior & -override_rsd_type_limit. Of course, as they are necessary, I will include them later.

4.1) I am confused by the patch usage now. In the supplementary Word document of the paper (E. Kellogg, 2011), for Row 16 in Table 1, "-ddg::minimization_scorefunction standard -ddg::minimization_patch score12" is used. Because I thought "standard" at that time meant score12, I used

-ddg:minimization_scorefunction score12
-ddg::minimization_patch score12

But my confusion is: if score12 stands for score12.wts, why is "score12.wts" used as the patch instead of "score12.wts_patch"? As I understand it, the patch file modifies the score file, so the final score used should be weight * patch (i.e. *= in the patch file), or weight replaced by patch (i.e. = in the patch file).

4.2) For the REU, the pre-minimized wild-type input PDB scores -737.91 REU. However, according to ddg.log after this successful run, all 50 iteration structures score lower than -1100 REU. Does that mean my input PDB is far from minimized? My homology protein has 442 residues, and it is based on 100 relax (i.e. -relax:quick), so it is about -1.67 REU/residue. I know that is somewhat away from 2 REU/residue, but do you think -1100 REU (i.e. -2.49 REU/residue) is over-relaxed?

As a result, the ddG for that point mutation is -1.555. The ddG is therefore based on those -1100 REU iteration structures rather than the -737.91 REU input. So can I theoretically regard this mutation as improving the score by ~0.14% (i.e. -1.555/-1100) rather than ~0.21% (i.e. -1.555/-737.91) compared to the wild-type input?

4.3) Is my resfile usage correct? The successful run is based on a 1A.mutfile:

total 1

1

D 1 A

However, when I change it to a resfile my_input_resfile:

1 A PIKAA A

I was told
"core.pack.task.ResfileReader: On line 1 command '1' is not recognized." in the ddg.log, and

"ERROR:: Exit from: src/core/pack/task/ResfileReader.cc line: 1520
terminate called after throwing an instance of 'utility::EXCN_utility_exit'
/cm/local/apps/sge/current/default/spool/node-f34/job_scripts/4679132: line 20: 22715 Aborted ddg_monomer.linuxgccrelease @ /home/ucbechz/Scratch/20141215_ddg_renumber_multi_test/input/options_1 > ddg.log" in the error file.

Thank you very much. Maybe I have asked too many questions.

Yours sincerely
Cheng

Wed, 2014-12-17 13:23
lanselibai

1) See the original Kellogg et al. paper. By "benchmarking" I mean the tests they did there to see if the protocol performed well when compared to experimental values. By "numbers" I mean the correlation between the experimental result and the computational protocol. And by "recalibrate" I mean re-running the protocol on systems with known experimental values and calculating the new correlation between the experimental results and the computational protocol.
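
As a rough illustration of the recalibration step, the correlation can be computed with a few lines of awk once you have experimental and predicted ddG values side by side. The four value pairs below are made-up placeholders, not numbers from the paper, and the function name pearson is hypothetical.

```shell
data=$(mktemp)
# Columns: experimental ddG, predicted ddG (made-up placeholder values).
cat > "$data" <<'EOF'
1.0 1.0
2.0 3.0
3.0 2.0
4.0 4.0
EOF

# Pearson correlation coefficient between column 1 and column 2.
pearson() {
  awk '{ n++; sx+=$1; sy+=$2; sxx+=$1*$1; syy+=$2*$2; sxy+=$1*$2 }
       END { printf "%.3f\n", (n*sxy - sx*sy) / sqrt((n*sxx - sx*sx) * (n*syy - sy*sy)) }' "$1"
}

pearson "$data"   # prints 0.800 for the placeholder data
```

Re-running the protocol with a different scorefunction on the same known systems and comparing the two coefficients is the kind of check meant here.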

You're exactly correct - if you switch from score12 to talaris2013, the performance observed will be different - it may be better, it may be slightly worse. In particular, use of computed values to quantitatively predict experimental results (rather than just qualitative more stable/less stable comparisons) would be greatly affected, as the magnitudes of talaris2013 scores don't match up with those of score12.

2) It's talaris2013 instead of score2013, but yes.

3&4) score12 is pre_talaris_2013_standard with modifications from score12.wts_patch. So "-ddg:minimization_scorefunction score12" would be the same as "-ddg:minimization_scorefunction pre_talaris_2013_standard -ddg::minimization_patch score12"

Which of score12.wts and score12.wts_patch is used when you specify "score12" is contextual. If you want a full weights file, score12.wts is used. If you need a patch, score12.wts_patch is used instead.

4.2) Different scorefunctions have different minima and different magnitudes. The minimized score for talaris2013 won't be the same as the minimized score for score12 - it won't even be the same as the no-movement rescoring of the same structure with score12.

4.3) See the resfile documentation (https://www.rosettacommons.org/docs/latest/resfiles.html) for the resfile format - you need to have the "start" line to tell the resfile you've moved out of the general section and into the per-residue section.
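
A minimal sketch of a resfile with that "start" line, written from the shell. The default command NATAA in the general section is an assumption chosen for illustration (the resfile documentation lists the alternatives); the per-residue line is the one from the question.

```shell
resfile=$(mktemp)   # placeholder for my_input_resfile

cat > "$resfile" <<'EOF'
NATAA
start
1 A PIKAA A
EOF

cat "$resfile"
```

Everything above "start" applies to all residues by default; the line below it restricts residue 1 of chain A to alanine.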

Fri, 2015-01-02 11:23
rmoretti

Hi R Moretti,
Thank you very much.

1, 2, 3) I got it.

4.2) I see. So it is the difference between talaris2013 and score12 that makes them different. Since the talaris2013 is already there, it is not necessary to still stick to the old score12. So I will use the new talaris2013.

4.3) Okay, I will look at it.

Yours sincerely
Cheng

Sat, 2015-01-03 08:18
lanselibai