A problem with RosettaLigand

Hello, everyone,

I just started to learn Rosetta, and was wondering if I could get some help here.

I was trying to follow the Ligand Docking example that was used during the workshop held in April this year.
The protocol was written for Rosetta 3.5, but I am using Rosetta 3.4, and I ran into problems during the run.

I prepared the following files as instructed:

receptor: 2Q5K_mutant.PDB
Ligand: SAQ_aligned.pdb
Ligand conformers: SAQ_conformers.pdb
Ligand parameter file: SAQ.params
Water: water.pdb
XML file: water_dock.xml
option file: options.txt

and, from the 'bin' directory, I ran the following command:
./rosetta_scripts.macosgccrelease -database /users/cheuser/rosetta3.4-build/rosetta_database @options.txt

During the run, the error message was shown as below:
segmentation fault 11

What should I modify to fix this?
The files I used are attached.
Thanks in advance!

Post Situation: 
Fri, 2013-07-19 16:09
Rosettasz

First off, I wouldn't recommend running Rosetta out of the bin directory. Rosetta has a tendency to expect input files in, and write output files to, the current directory. It's better in the long run to get in the habit of creating a new directory for each project or portion of a project and running Rosetta from that directory (specifying the path to the Rosetta executable, as appropriate).

Secondly, posting the example files as PDFs is less than ideal, as it makes it hard for those helping you to download them and try them out on their own machine. (Also, errors in input can sometimes be caused by spacing/alignment issues, which converting to PDF may mask.) Posting them as plain text (with a .txt extension) is preferred.

Thirdly, you neglected to post your options.txt file. (I can guess the basic flags, but if there's some strange flag interaction you're seeing, having the full set of flags would be necessary.)

##

At any rate, the format of your SAQ_conformers.pdb file is all messed up. You want individual records for each conformer sequentially, separated by TER or END cards. It looks like you have a single giant molecule with all of the atoms smooshed together. That doesn't appear to be an issue with the segmentation fault. It just means you're not going to get the results you should.
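
To check whether a conformer file has that layout, a short script can count the conformers and the atoms in each. The following is a minimal Python sketch, not part of Rosetta: the function names are made up for illustration, and it assumes conformers are delimited by TER or END records as described above.

```python
def split_conformers(pdb_text):
    """Split multi-conformer PDB text into per-conformer lists of atom lines.

    Assumes each conformer's ATOM/HETATM records are closed by a TER or END record.
    """
    conformers, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            current.append(line)
        elif record in ("TER", "END") and current:
            # A separator closes the conformer collected so far.
            conformers.append(current)
            current = []
    if current:
        # Trailing atoms without a terminating TER/END still count as one conformer.
        conformers.append(current)
    return conformers


def atoms_per_conformer(pdb_text):
    """Return the number of atom records in each conformer, in file order."""
    return [len(conformer) for conformer in split_conformers(pdb_text)]
```

Something like atoms_per_conformer(open("SAQ_conformers.pdb").read()) should then give one entry per conformer, all with the same atom count. A single entry holding several times the ligand's atom count suggests the separators are missing.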

Assuming you're not doing anything funky with your flags, one possibility for the segfault is that you don't have your ligand in your starting PDB. Your XML is trying to rotate and translate chain X (and chain W), but you don't have any chain X. Even if you're combining the SAQ_aligned.pdb with the protein structure, that has the ligand as chain B, not as chain X.

Rosetta tries to find a chain X and fails, resulting in the crash. (Ideally it should fail more gracefully, though.) Fix your chain designations, and make sure your starting pose has those chains, and that particular crash should go away.
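
A quick way to confirm which chains the starting structure actually contains is to tally the chain ID column of the PDB file. This is a small Python sketch under the assumption that the file follows the standard fixed-column PDB layout (chain ID in column 22); the helper names are hypothetical and not part of Rosetta.

```python
from collections import Counter


def chain_counts(pdb_text):
    """Count ATOM/HETATM records per chain ID (column 22 of a PDB line)."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            counts[line[21]] += 1  # index 21 is column 22
    return counts


def missing_chains(pdb_text, required):
    """Return the chain IDs from `required` that have no atoms in the PDB text."""
    present = chain_counts(pdb_text)
    return [chain for chain in required if chain not in present]
```

For the files in this thread, missing_chains(open("2Q5K_mutant.pdb").read(), ["X", "W"]) would list whichever of the chains named in the XML are absent from the input structure.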

Fri, 2013-07-19 20:01
rmoretti

Hi rmoretti,

Thanks a lot for your suggestions and the notes. I'll keep those in mind in the future.

I originally changed the chain name from "chain X" to "chain B" in the PDB file because the error message said that the "SlideTogether" mover requires a chain tag.
While searching the internet I saw that someone had tried changing the chain tag name, and I thought it might work (I have barely any programming background other than C in college).

So, what should I do to get rid of the above chain tag trouble?

The file information is as follows:

Input file: 2Q5K_mutant.pdb
Ligand file: SAQ_aligned.pdb
Ligand conformers file: SAQ_conformers.pdb
Ligand parameter file: SAQ.params
water file: water.pdb
water_dock file: water_dock.xml
option file: options.txt

Flags:
./rosetta_scripts.macosgccrelease -database /users/cheuser/rosetta3.4-build/rosetta_database @options.txt

Error message:
ERROR: 'SlideTogether' mover requires chain tag.

PS: this was still run from the bin directory, since I couldn't get it to work in a new folder. I'll need to get help from others with that, but in the short term I am planning on using the bin directory.

Thanks!

Mon, 2013-07-22 13:08
Rosettasz

The error you're getting with SlideTogether is because the interface has changed between versions. It used to take the "chain" tag, and only took in a single chain. Now it takes a "chains" tag, and can take a comma separated list of chains. The switch happened between Rosetta 3.4 and Rosetta 3.5. You're using 3.4, but using parameters for 3.5. You need to use the old "chain" (no s) designation and a single chain letter to run with Rosetta 3.4. Luckily, I think all you have to do is change the "chains" to "chain" for the SlideTogether mover in the water_dock.xml file.

The letter you use to specify the chain shouldn't matter, as long as the same letter is used in all locations where that particular ligand is specified (all X's or all B's, both in the pdbs and the XML) and that the chain letter uniquely specifies that particular ligand.

P.S. What problems were you experiencing when you tried to run Rosetta somewhere outside the bin directory?

Mon, 2013-07-22 17:52
rmoretti

The tag in SlideTogether is "chain", not "chains", so the chains=X attribute on the SlideTogether mover in your XML should say chain=X (replace X with whatever your chain is actually called).

I've fixed the bug in the current development version of Rosetta, so in the future specifying a nonexistent chain will cause Rosetta to exit with an error like "chain_id X does not exist" instead of a messy crash.

Mon, 2013-07-22 17:43
delucasl

Thank you very much for the help.
I finally got it to work by making the changes you suggested in the .xml file.

As for the bin directory problem:
I made a new directory called "BinTest" in the bin directory, and copied the above files into it, as well as a file called "rosetta_scripts.macosgccrelease".
The command was: ./rosetta_scripts.macosgccrelease -database /users/cheuser/rosetta3.4-build/rosetta_database @options.txt

When I ran this in the new directory, it showed:
-bash: ./rosetta_scripts.macosgccrelease: No such file or directory.

It looks like rosetta_scripts.macosgccrelease doesn't work when it is simply copied to a new place.

Thu, 2013-07-25 08:02
Rosettasz

When running a program from the commandline you need to give the path to the executable. Either a relative path ("./" means "in the current directory"), or an absolute path.

Instead of running "./rosetta_scripts.macosgccrelease", you'll want to use something like "/users/cheuser/rosetta3.4-build/rosetta_source/bin/rosetta_scripts.macosgccrelease" (where "/users/cheuser/rosetta3.4-build/rosetta_source/bin/" is the absolute path to the rosetta_scripts.macosgccrelease executable).

If that's too much typing for you, you can try setting up your PATH environment variable to include the Rosetta bin directory in the executable search path. This is how regular system commands can be run without the path. - If you're not careful you can mess up your system trying to do this, though, so I might recommend using the full path to the executable until you're much more comfortable with the commandline.
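
The lookup rules can be illustrated with a few lines of Python. This is only a sketch of how a shell typically resolves a command, with a made-up function name: a command containing a slash is checked relative to the current directory (or as an absolute path), while a bare name is searched for in the directories listed in PATH.

```python
import os
import shutil


def resolve_executable(command, search_path=None):
    """Return the path that would be run for `command`, or None if not found.

    `search_path` is an optional os.pathsep-separated directory list used in
    place of the PATH environment variable (handy for experimenting safely).
    """
    if os.sep in command:
        # "./tool" or "/full/path/tool": only this exact location is checked.
        candidate = os.path.abspath(command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
        return None
    # Bare name such as "tool": search each directory on the search path.
    return shutil.which(command, path=search_path)
```

Run from a fresh project directory, resolve_executable("./rosetta_scripts.macosgccrelease") finds nothing because no such file exists there, whereas passing the full path to the bin directory's executable succeeds regardless of the current directory.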

Thu, 2013-07-25 12:09
rmoretti

Again, thank you very much for your time helping with the basic questions.
I was able to run it successfully in a new directory by giving the full path to the executable, as you suggested.

Now, I encountered another problem, and I'd appreciate your further help.
I am trying to use the same protocol to dock a ligand (Folate) and cofactor (NADPH) to my receptor protein (2W3M).
For this purpose I tried two methods and I got problems for both of them.

Method 1: Treat NADPH as a second ligand, and give it the same settings that Rosetta used for Folate during ligand docking.

STEP:
- generate conformer libraries for Folate, NADPH
- generate parameter files for Folate, NADPH
- align Folate and NADPH to the binding pocket of receptor 2W3M according to the crystal structure
- modify the XML file so that Folate and NADPH have the same docking procedure

Input files:
- receptor: 2W3M_wt.pdb
- ligand: FOL_aligned.pdb
- ligand: NDP_aligned.pdb
- ligand conformer library: FOL_confs.pdb
- ligand conformer library: NDP_confs.pdb
- ligand parameter file: FOL.params
- ligand parameter file: NDP.params
- XML file: Dock.xml
- option file: options.txt

Flag:
/users/cheuser/rosetta3.4-build/rosetta/bin/rosetta_scripts.macosgccrelease -database /users/cheuser/rosetta3.4-build/rosetta_database @options.txt

Error message:
ERROR: unrecognized mm_atom_type_name W
ERROR:: Exit from: src/core/chemical/MMAtomTypeSet.hh line: 79

Mon, 2013-07-29 11:32
Rosettasz

The "unrecognized mm_atom_type_name W" error is because your NDP.params file has a "W" in the fourth column of the ATOM lines. I'm not sure why that would be the case, unless you manually changed that. Replace the W with X (as it is in the FOL.params file), and that error should disappear.

Mon, 2013-07-29 12:31
rmoretti

Thanks a lot!
It works now.
Yes, as you guessed, I replaced X with W in the parameter file. Since the two ligands, NADPH and Folate, are being docked simultaneously, I thought Rosetta should recognize them by different chain tags, W and X respectively.

Tue, 2013-07-30 07:00
Rosettasz

The params files don't contain any notion of chain or residue numbering. The matching of residue types (params files) to PDB residue numbers and chains is done primarily through the three-letter residue name, and secondarily (for types that share the same three-letter name) through heuristics based on the number of matching atom names.

The "X" in that column is actually the molecular mechanics atom type for that particular atom. (The third column is the Rosetta atom type). Since most protocols don't use the molecular mechanics atom types, molfile_to_params.py doesn't bother trying to figure out what the appropriate atom type would be. Instead it just sticks an "X" value in that location, indicating that the MM atom types were not assigned.
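
To spot an accidentally edited MM atom type column before running Rosetta, the ATOM lines of a params file can be scanned with a few lines of Python. This is an illustrative sketch with hypothetical helper names; it assumes whitespace-separated ATOM lines (record, atom name, Rosetta atom type, MM atom type, charge) and that "X" is the only expected MM type, as in params files written by molfile_to_params.py. A params file with genuinely assigned MM types would also be flagged.

```python
def mm_atom_types(params_text):
    """Map atom name -> MM atom type (fourth field) for each ATOM line."""
    types = {}
    for line in params_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "ATOM":
            # fields: ATOM, atom name, Rosetta atom type, MM atom type, charge
            types[fields[1]] = fields[3]
    return types


def unexpected_mm_types(params_text, allowed=("X",)):
    """Return the atoms whose MM atom type is not in `allowed` (assumed: just "X")."""
    return {
        name: mm_type
        for name, mm_type in mm_atom_types(params_text).items()
        if mm_type not in allowed
    }
```

Calling unexpected_mm_types(open("NDP.params").read()) on the edited file would list every atom carrying "W", which is the value the "unrecognized mm_atom_type_name W" error complains about.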

Wed, 2013-07-31 13:06
rmoretti

Method 2: Treat NADPH like the water in the previous protocol, and give it the same settings that Rosetta used for water during ligand docking.

STEP:
- generate conformer libraries for Folate, NADPH
- generate parameter files for Folate, NADPH
- align Folate and NADPH to the binding pocket of receptor 2W3M according to the crystal structure
- copy the NADPH parameter file (NDP.params) to the following directory: /users/cheuser/rosetta3.4-build/rosetta_database/chemical/residue_type_sets/fa_standard/residue_types/NADPH
- add a line "residue_types/NADPH/NDP.params" to the end of the file "residue_types.txt", located in : /users/cheuser/rosetta3.4-build/rosetta_database/chemical/residue_type_sets/fa_standard

Input files:
- receptor: 2W3M_wt.pdb
- ligand: FOL_aligned.pdb
- ligand conformer library: FOL_confs.pdb
- ligand parameter file: FOL.params
- NADPH: NDP_aligned.pdb
- XML file: Dock.xml
- option file: options.txt

Modified files in Rosetta:
NDP_params.params
residue_types_modified.txt

Flag:
/users/cheuser/rosetta3.4-build/rosetta/bin/rosetta_scripts.macosgccrelease -database /users/cheuser/rosetta3.4-build/rosetta_database @options.txt

Error message:
ERROR: unrecognized mm_atom_type_name W
ERROR:: Exit from: src/core/chemical/MMAtomTypeSet.hh line: 79

Mon, 2013-07-29 11:36
Rosettasz

Finally, I have another problem regarding using non-canonical amino acids.
I am trying to integrate non-canonical amino acids in the current version of Rosetta LigandDock, and the supplementary material of the paper "Incorporation of noncanonical amino acids into rosetta and use in computational protein-peptide interface design" published in PloS ONE in 2012 was used as a reference.
When I tried to run the UnfoldedStateEnergyCalculator, it showed:
Reading in rot lib /users/cheuser/rosetta3.4-build/rosetta_database//rotamer/ncaa_rotlibs/C40.rotlib...done!
Segmentation fault: 11.
 

What we did was as follows:
Since the parameter files for NCAA are already in our Rosetta system,
 (Path: /users/cheuser/rosetta_database/chemical/residue_type_sets/fa_standard/residue_types/l-ncaa) 

1. Add a line "residue_types/l-ncaa/ornithine.params" at the end of the file "residue_types.txt" which is located in ~/database/chemical/residue_type_sets/fa_standard.

2. Copy the whole rotamer library named "ncaa_rotlibs" to ~/rosetta3.4-build/rosetta_database.

3. Add two lines "NCAA_ROTLIB_PATH C40.rotlib" and "NCAA_ROTLIB_NUM_ROTAMER_BINS 3 3 3 3" to the end of the file "ornithine.params", located in ~/rosetta3.4-build/rosetta_database/chemical/residue_type_sets/fa_standard/residue_types/l-ncaa.

4. Run the command:
 /users/cheuser/rosetta3.4-build/rosetta/bin/UnfoldedStateEnergyCalculator.macosgccrelease -database /users/cheuser/rosetta3.4-build/rosetta_database -ignore_unrecognized_res -ex1 -ex2 -extrachi_cutoff 0 -l /users/cheuser/desktop/shun/C40Test/inputs/cullpdb_pc20_res1.6_R0.25_d110520_chains1859_list_pruned -residue_name C40 -mute all -unmute devel.UnfoldedStateEnergyCalculator -unmute protocols.jd2.PDBJobInputer -no_optH true -detect_disulf false
 
5. After a few seconds, the above error message appeared.
Reading in rot lib /users/cheuser/rosetta3.4-build/rosetta_database//rotamer/ncaa_rotlibs/C40.rotlib...done!
Segmentation fault: 11.

 
The differences we found between the supplementary material and our run are:
1. The path to ornithine.params.
We used "~/rosetta_database/chemical/.../ornithine.params", while the paper used "~/minirosetta_database/chemical/.../ornithine.params".

2. We added "residue_types/l-ncaa/ornithine.params" at the end of the file "residue_types.txt" to match the format, while the paper added "l-ncaa/ornithine.params".

Are these possible reasons? Would you please advise us to fix this problem?

Thanks in advance!

Mon, 2013-07-29 11:38
Rosettasz

I might suggest upgrading to the more recent 3.5 release to see if any bugs in the protocol have since been fixed.

If the segfault still occurs (or you really want to stick with 3.4), the next step would be to recompile in debug mode. (With the scons command, do "mode=debug" instead of "mode=release" and then use the ".macosgccdebug" executables. Both debug and release can exist together.) If you're lucky, running in debug mode will pop up a useful error message. If not, you may need to run the program under a debugger, and get a backtrace at the point where the error occurs.

Mon, 2013-07-29 12:48
rmoretti

Thank you very much for the suggestions.
It finally worked after moving the rotamer library to a new directory.
Feeling excited about almost getting to the end of this.

I have another question, which came up when comparing the supplementary material from the above paper with the Rosetta database:

In the supplementary material, the parameters for ornithine (C40) are as follows:
BOLZMANN UNFOLDED ENERGIES:
fa_atr: -2.462
fa_rep: 1.545
fa_sol: 1.166
mm_lj_intra_rep: 1.933
mm_lj_intra_atr: -1.997
mm_twist: 2.733
pro_close: 0.009
hbond_sr_bb: -0.006
hbond_lr_bb: 0.000
hbond_bb_sc: -0.001
hbond_sc: 0.000
dslf_ss_dst: 0.000
dslf_cs_ang: 0.000
dslf_ss_dih: 0.000
dslf_ca_dih: 0.000

However, the values are different in our database:
PATH: ~/rosetta3.4-build/rosetta_database/scoring/score_functions/unfolded/unfolded_state_residue_energies_mm_std

AA   fa_atr    fa_rep   fa_sol   mm_lj_intra_rep  mm_lj_intra_atr  mm_twist  pro_close  hbond_sr_bb  hbond_lr_bb  hbond_bb_sc  hbond_sc
C40  -1.54459  1.82841  0.84529  1.87870          -1.92507         3.05321   0.01208    0.00000      0.00000      -0.00080     -0.00003

So, where does this difference come from, and which values are the correct ones to use if we want to dock our ligands to the receptor?

Thank you.

Thu, 2013-08-01 13:10
Rosettasz

Hi Rosettasz

I am not 100% sure where the difference is coming from. I suspect that the values in the protocol capture (the paper's supporting information) may have been run on a smaller subset of proteins or with different options than in the paper, for demonstration purposes. The values in the file are the ones that the energy function was trained with (to set the weights on the individual terms), so I would use those.

Doug

Fri, 2013-08-02 09:36
renfrew

Dear Dr. Renfrew,

Thanks a lot for your reply.
The use of non-canonical amino acids in Rosetta LigandDock is now working on our computer!

Wed, 2013-08-07 09:19
Rosettasz