You are here

Scoring residue vs. organic ligand

13 posts / 0 new
Last post
Scoring residue vs. organic ligand
#1

Hi all,

I have a pdb files with coordinates of a residue conformations interacting with a organic molecule. I want to score this conformations with score.linuxrelease app. The program crash when reading the coordinates of the ligand can not identify the name of the ligand (LIG).

There is a way to overcome this? Is necesary some kind of parametrization?

Thanks in advance

Yasser

Post Situation: 
Tue, 2014-04-08 12:18
almeida

In order to read a residue/ligand which Rosetta doesn't natively recognize you need to produce "params" files for Rosetta. In the manual there's a page on preparing ligands (https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/df/...), and there's also more information in the ligand docking manual (https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d4/...).

Basically, you process an mol/sdf or mol2 file of your ligand with the molfile_to_params.py script (run with just -h to see options), which will produce a params file for you. When using it in Rosetta you need to be careful that you match up the name of the ligand in both the params file and the input PDB file, as well as matching the name of the atoms. (molfile_to_params.py also produces a pdb file which has correctly named residue and atoms.)

Tue, 2014-04-08 16:59
rmoretti

Thanks for your answer!

I used the molfile_to_params.py script that generated the params file and the fixed pdb file of the ligand.

I don't need to run a ligand_dock protocol to search conformers. I just want to score them. Find attached one of the pdb file I want to score. I concat the coordinates of Arg, which act as receptor and the ligand fixed with molfile_to_params.py.

If this possible with any Rosetta app?

Wed, 2014-04-09 06:51
almeida

Digging in the Rosetta score app documentation and using the pdb file previously attached, I could get a scoring. Find the output default.sc score file. I realize that there are many fields with 0 and -nan.

Maybe this is the result to use a protein-protein scoring function? There is a way for specify a protein-ligand scoring function?

Wed, 2014-04-09 07:12
almeida

The reason for your zeros is that you have all zeros in the occupancy column of the arginine. Rosetta discards atoms with zero occupancy and tries to rebuild them. The only problem is that once it throws out all the zero occupancy atoms on the arginine, there's nothing left, so it ignores the residue completely. Without the arginine, you're just scoring the ligand by itself, so most of the terms are zero, as they either deal with residue-residue interactions only, with amino acids only, or with other features (such as prolines and disulfides) which you don't have. The -nan's you're getting are from the gdtmm calculations, which are likely being thrown off by there being no amino acids in the pose (division by zero results in nan's).

Pass the option "-ignore_zero_occupancy false" to your application to force Rosetta to read the arginine atoms in.

We do have a scoring function for protein-ligand interactions, (ligand.wts) but it differs mainly in the weighting of the terms, rather than which terms are active. The one big difference (which may be important in your case) is that the ligand.wts adds in the hack_elec/fa_elec Coulombic electrostatic term. In standard score12 this is taken care of by the (amino acid specific) fa_pair term. Note that for the weekly releases we've now switched to the talaris2013 scorefunction, which replaces fa_pair by fa_elec, even for protein-protein interactions.

In practice for protein-ligand interactions you're mainly be looking at the fa_atr, fa_rep, (the two LJ potential terms), the hbond_sc (the sidechain-sidechain hydrogen bonding term - for Rosetta a ligand is all sidechain), and the hack_elec/fa_elec term. fa_sol is also important, but with just isolated residues the solvation term is not likely to be very informative. The other terms are mainly concerned with protein-specific features.

Wed, 2014-04-09 07:51
rmoretti

Thanks!

I ran the score app with "-ignore_zero_occupancy false" and "-score:weights ligand" and worked very well. As you set, the modifications in the ligand score function are relevants for my purpose.

I'm using the last stable version of Rosetta 3.5, which don't include the talaris2013 score function. I downloaded a version named mytalaris2013 score function from http://ralph.bakerlab.org/download/35/mytalaris.wts, but I don't know if this is the same of the original one. Where I can get the talaris2014.wts file? I put the file mytalaris2013.wts in .../rosetta-3.5/rosetta_database/scoring/weights and I run the app I get:

ERROR: Unable to open weights/patch file.

How to include mytalaris2014.wts in Rosetta 3.5? Do I have to recompile all the program???????

Wed, 2014-04-09 08:56
almeida

First off, it's talaris201*3* - there is no talaris201*4*. We included the date to be less arbitrary about versioning than something like "score12", but it's not something we're going to be necessarily updating every year.

Also, talaris2013 isn't just a weights file -- there's a number of other changes that go along with it. You would have to obtain one of the weekly release versions in order to get the full benefit of talaris2013. (By the way, it's slightly inaccurate to classify the difference between the weeklies and Rosetta 3.5 as calling 3.5 "stable". The process for validating the weeklies is basically the same as that for the 3.3, 3.4 and 3.5 releases, it just happens more regularly. There's nothing less "developmental" about 3.5 versus the weeklies, it's just that we changed our release process from being once or twice a year to being about once a week.)

That said, the benefit of using talaris2013 on protein/ligand interactions is still up in the air. For the moment, at least, we still recommend using the ligand.wts scorefunction for protein-ligand docking. As the ligand.wts scorefunction was parametrized in a pre-talaris context, this requires addition of the -restore_pre_talaris_2013_behavior flag to the weekly releases, and would result in (more or less) the same results as you would get with Rosetta3.5.

Wed, 2014-04-09 09:33
rmoretti

Hi again... following the thread,

I ran the score app with ligand.wts score function, for a ligand and a tyrosine. I used two versions of TYR with the same ligand. The output file default.sc give many 0 in the fields you pointed out before and almost all the rest. Please see the files attached.

Why is this?

;)

Wed, 2014-04-16 11:01
almeida

So the energies you're getting are entirely due to the rotameric propensities of the tyrosine, as well as the "reference" energy of the tyrosine. You're getting no inter-residue energies.

I'm guessing that you're having difficulties reading/recognizing your ligand. I'd remove the "-ignore_unrecognized_res true" from your flags file and see what happens. I bet you get an "unrecognized residue" error. You need to match up the three letter residue name in your LG.params file (the IO_STRING line) with the name in the PDB. You also need to make sure that the atom names match. You currently are naming it "LIG", but I suspect your params file has it named "LG1".

The other thing you need to fix is your column alignments on xyz_TYR-H_chemfinder_1a3v_holo_single_point.pdb You're shifted slightly. PDBs are column based, rather than whitespace separated, so alignment matters.

One thing you can do is add the "-output" option to your score run to output a PDB which will show your structure as Rosetta sees it. This is a good way to tell if Rosetta is having problems reading in residues, or is moving around atoms, etc.

Thu, 2014-04-17 09:05
rmoretti

Thanks... I will check this.

On the other hand, working with other ligand, methanol, when I run molfile_to_params.py I got this error:

Centering ligands at ( -1.123, 2.688, 9.604)
Atom names contain duplications -- renaming all atoms.
Total naive charge -0.215, desired charge 0.000, offsetting all atoms by 0.036
WARNING: fragment 1 has 6 total atoms including H; protein residues have 7 - 24 (DNA: 33)
WARNING: fragment 1 has 2 non-H atoms; protein residues have 4 - 14 (DNA: 22)
Average 6.0 atoms (2.0 non-H atoms) per fragment
(Proteins average 15.5 atoms (7.8 non-H atoms) per residue)
WARNING: no root atom specified, using auto-selected NBR atom instead.
Traceback (most recent call last):
File "/home/almeida/bin/molfile_to_params.py", line 1419, in
sys.exit(main(sys.argv[1:]))
File "/home/almeida/bin/molfile_to_params.py", line 1392, in main
build_fragment_trees(m)
File "/home/almeida/bin/molfile_to_params.py", line 571, in build_fragment_trees
(nbr, nbr_dist) = choose_neighbor_atom(molfile, frag_id)
File "/home/almeida/bin/molfile_to_params.py", line 55, in decorated
cache[args] = f(*args)
File "/home/almeida/bin/molfile_to_params.py", line 751, in choose_neighbor_atom
if not one_is_ok: raise ValueError("No acceptable neighbor atoms in molecule!")
ValueError: No acceptable neighbor atoms in molecule!

Find attached the original pdb file and the mol2 file, generated with Chimera.

Mon, 2014-04-21 09:10
almeida

You've hit upon one of the edge cases in molfile_to_params. It requires that the autodetermined "neighbor atom" be an atom that is bonded to at least two heavy atoms. Since methanol only has two heavy atoms, it can't find a suitable neighbor atom.

The way around this is to specify the atom you wish to use for a neighbor atom. For mol2 input files this involved making an "m-ctrl" file. This is a text file that has the appropriate molfile_to_params specific "M" records in it. So in your case you would have a text file with a single line like "M NBR 2" to tell molfile_to_params to use the second atom (the carbon) as the "neighbor" atom. You would then pass that m-ctrl file to molfile_to_params with the --m-ctrl option. (All this should be in the help information for molfile_to_params.py -- simply run with the -h option to view it.)

Wed, 2014-04-23 13:09
rmoretti

When I run molfile_to_params with --m-ctrl it work fine with a single ligand. Now a I have a single mol2 file with multiple (130) conformers of methanol. When I ran molfile_to_params (molfile_to_params.py --m-ctrl neighbor_atom sampling.mol2. See files attached), it wrote the corresponding pdb files for each ligand in mol2 file. This is correct? All these output pdb files have completely diferent coordinates, centered in the C-O bond of methanol. On the other hand, when run molfile_to_param for mol2 files independently, it work fine and the output pdb coordinates are the same of the source pdb -> mol2 files.

Thu, 2014-04-24 08:06
almeida

Multiple conformers for ligands are typically only used in packing, and when they are used, they're always superimposed on the neighbor atom. So it doesn't matter what their coordinates are in the PDB_ROTAMERS input file -- they'll be aligned on the neighbor atom when used for packing.

For this reason, it's a little pointless to have 130 conformers of methanol. You only have one rotatable bond so you're effectively just sampling 130 hydroxyl proton locations, which is a little excessive especially if you're using a protocol with minimization.

Assuming that you don't want superposition of the ligand, and you really want to include rigid body orientation in your methanol rotamers, what you'd want to do is add three "virtual atoms" to your residue (use atoms with element type "V" - we don't currently do any modeling with Vandium) to establish the reference coordinate frame. Bond them together with single bonds, and then add a bond from one of them to the carbon atom of the methanol. You'd then specify the "center" virtual atom (one not bonded to the carbon atom) as your neighbor atom, so you align on the virtual atom, allowing the carbon and thus the rest of the methanol to move around.

For making the rotamer library you then have two options. The first is to add the virtual atoms and bonds to each of the various conformations in your mol2 file (being sure to put them in the same absolute xyz location for all conformers). The other is to just manually build you rotamer library file, pasting in ATOM lines for the virtual atoms. The only trick here is to make sure that the naming convention is the same in the params file and the rotamer library file. (Using the "--keep-names" flag of molfile_to_params may help with this.)

Thu, 2014-04-24 14:43
rmoretti