RosettaDock - Constraints File


Hello, again.

I am attempting to run some global docking simulations of two proteins. Not much is known about the interaction; however, I do know residues that are not involved in forming the protein-protein interface. Is there a way to put this information into a constraints file so that decoys that use these residues are not considered (or thrown out)?

I know that having the line:
SiteConstraint CA 16B A FLAT_HARMONIC 0 1 5

Is basically forcing the CA of residue 16 on chain B to be involved in the docking interaction with Chain A. Is there a way to make this command the inverse (say that resi 16 can't be involved in the interaction)?

On the RosettaDock tutorial on the Gray Lab webpage (http://graylab.jhu.edu/~mdaily/tutorial/constraints.html), they give an example of assigning penalties to certain residues in a constraints file with the following format:

1
20
16 B
0 B
filter 10

I understand that this will assign a penalty of 20 to any decoy using resi 16, and filter out any structures having a score greater than 10. I tried to implement this, but the following error was returned (I am assuming that the format for the Site Constraints listed on that webpage is not compatible with the current version of Rosetta):

ERROR: ConstraintFactory::newConstraint: 1 does not name a known ConstraintType --> check spelling or register new Constraint type in ConstraintFactory!

Thanks for your help!
--Eddie

Wed, 2012-09-19 06:23
edpryor

What you're really looking to do is change the function associated with the constraint.

Right now you have a Flat Harmonic function, centered on 0, with a spring constant of 1 and a window of 5. That is, there's no penalty within 5 units of zero, and then starting at 5 (and -5, though I think for your purposes negative values never come into play) the penalty goes up as d^2 for every d units you go past 5.
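
To make the shape of that penalty concrete, here is a minimal Python sketch of a flat harmonic function with the three parameters from the constraint line (0, 1, 5). It assumes the middle parameter acts as a width that divides the distance past the window before squaring; with a value of 1 that assumption makes no difference. The function name and defaults are mine, not Rosetta's.

```python
def flat_harmonic(x, x0=0.0, sd=1.0, tol=5.0):
    """Zero inside the flat window |x - x0| <= tol, quadratic outside it.

    Assumption: 'sd' divides the excess distance before squaring.
    """
    excess = abs(x - x0) - tol
    if excess <= 0.0:
        return 0.0
    return (excess / sd) ** 2


if __name__ == "__main__":
    # Distances inside the 5-unit window cost nothing; beyond it the
    # penalty grows with the square of the overshoot.
    for d in (0.0, 4.0, 5.0, 6.0, 8.0):
        print(f"distance {d:4.1f} -> penalty {flat_harmonic(d):.1f}")
```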

I might recommend trying a SIGMOID function, 1/(1+exp(-slope*(x-x0))) - 0.5. This is set up to give a favorable value near zero (and for negative numbers) when the slope is positive, but you could try a negative slope to give a favorable value further away from zero. e.g.

SiteConstraint CA 16B A SIGMOID 5.0 -2.0

Would give a sigmoid constraint centered around 5, with values being unfavorable near zero and favorable beyond 5 (slope -2.0). You will likely need to play around with the slope and cutoff to get things the way you want. Also keep in mind that a sigmoid never quite levels off, so the constraint will always want to push things apart, but with a steep enough slope the effect will be negligible.
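
If it helps to see the numbers, this is a small Python sketch that evaluates the sigmoid formula quoted above with x0 = 5.0 and slope = -2.0, the values from the example line. It only evaluates the formula as written in this post; it is not taken from the Rosetta source, and the function name is my own.

```python
import math


def sigmoid(x, x0=5.0, slope=-2.0):
    """The formula from the post: 1/(1+exp(-slope*(x-x0))) - 0.5."""
    return 1.0 / (1.0 + math.exp(-slope * (x - x0))) - 0.5


if __name__ == "__main__":
    # With a negative slope the value is positive (penalized) below x0,
    # zero at x0, and negative (favorable) above it.
    for d in (0.0, 3.0, 5.0, 7.0, 10.0):
        print(f"distance {d:4.1f} -> score {sigmoid(d):+.4f}")
```

Trying a few slopes in this way is a quick check on how sharp the transition around the cutoff will be before running any docking.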

--

I should also mention there are a number of different functional forms available for constraints. Some are documented in the constraint file format documentation ( http://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/de/d... ), but some are undocumented, and you'd have to look at the code in rosetta_source/src/core/scoring/constraints/ (or see http://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d1/d... ) to learn about them.

Wed, 2012-09-19 11:23
rmoretti

Also, the documentation page you are reading is for Rosetta++ (Rosetta 2.3), not Rosetta 3.4. The concepts apply, but command lines will not.

Wed, 2012-09-19 12:14
smlewis

Hi All,
Regarding site constraints:
SiteConstraint CA 16B A FLAT_HARMONIC 0 1 5
Does this mean that the CA of residue 16B should be close to any CA atom of chain A, or close to any atom of chain A?
And how would I change the constraint to consider the distance between any atom of 16B and any atom of chain A ( SiteConstraint 16B A FLAT_HARMONIC 0 1 5 ???), or between a CB from chain B and a CA from chain A ( SiteConstraint CA 16B CB A FLAT_HARMONIC 0 1 5 ??)?

Thanks.

Fri, 2013-12-06 06:02
nawsad

It's to any CA atom of chain A. There's no way to change that - it's hard coded in the definition of SiteConstraint. You can change which atom you want in the target residue, but not on the partner chain. For example:

SiteConstraint NZ 21C D FLAT_HARMONIC 0 1 5

Will specify that the terminal lysine nitrogen of residue 21 on chain C should be within 5 angstroms of any CA atom of chain D.

Unfortunately there isn't any constraint currently which will do an any/any constraint or even a specific-atom/any-in-chain constraint. You'd have to simulate that yourself with an AmbiguousConstraint wrapping a number of AtomPair constraints. (This is effectively how the SiteConstraint works internally - it just sets up an AmbiguousConstraint which has AtomPair constraints between the specified atom and all the C-alphas on the specified chain.)
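
As a rough illustration of what that amounts to, the Python sketch below finds the distance from one named atom to the closest CA atom on a partner chain, which is the distance the best-scoring AtomPair alternative would see when every alternative uses the same function. It is a standalone sketch, not Rosetta code: the fixed-column PDB parsing, the helper names, and the three-residue demo coordinates are all made up for the example.

```python
import math


def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-column PDB ATOM record (demo helper, hypothetical data)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


def site_distance(pdb_lines, atom_name, resnum, chain, partner_chain):
    """Distance from one atom to the nearest CA atom of partner_chain."""
    target = None
    partner_cas = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        ch = line[21]
        num = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if ch == chain and num == resnum and name == atom_name:
            target = xyz
        elif ch == partner_chain and name == "CA":
            partner_cas.append(xyz)
    if target is None or not partner_cas:
        raise ValueError("target atom or partner CA atoms not found")
    return min(math.dist(target, ca) for ca in partner_cas)


demo = [
    atom_line(1, "NZ", "LYS", "C", 21, 0.0, 0.0, 0.0),
    atom_line(2, "CA", "ALA", "D", 5, 3.0, 4.0, 0.0),
    atom_line(3, "CB", "ALA", "D", 5, 1.0, 0.0, 0.0),   # closer, but not a CA
    atom_line(4, "CA", "GLY", "D", 6, 10.0, 0.0, 0.0),
]

if __name__ == "__main__":
    d = site_distance(demo, "NZ", 21, "C", "D")
    print(f"closest CA on chain D is {d:.2f} A from NZ of 21C")
```

Note that the CB atom in the demo is ignored even though it is nearer, which mirrors the CA-only behavior described above.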

Fri, 2013-12-06 09:21
rmoretti

Hi,

Is it possible to introduce a constraint or filter into the ligand docking application... say if you know that a certain portion of your ligand forms a hydrogen bond with a protein residue in the binding pocket?

Thanks

Mon, 2013-12-16 14:06
joeg

Yes, although the recommended format for ligand docking is the enzdes/match style constraints, rather than the standard constraint format. (They convert to the same thing internally, but the enzdes/match style constraints are in a format that's slightly better suited to representing the types of interactions you typically see in ligand-protein interactions.)

The format of the enzdes/match style constraints is discussed at https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d5/...

To apply the constraints during ligand docking, pass the filename with the option -enzdes:cstfile if you're using the ligand_dock application, or use the AddOrRemoveMatchCsts mover if you're doing the ligand docking through RosettaScripts.

Tue, 2013-12-17 07:43
rmoretti

Hello rmoretti,

Thanks so much for this reply. I've been trying to get this to work and was wondering if I could ask some more specific questions.

I've been following the Combs et al 2013 Nature Protocol paper to help learn ligand docking. The command used is this:
$ /Users/jplaks/Rosetta/rosetta-3.5/main/source/bin/rosetta_scripts.default.macosclangrelease @ligand_dock.options -database /Users/jplaks/Rosetta/rosetta-3.5/main/database -nstruct 1
and I've attached the options and xml file used. (in practice, the files have the proper extensions, but to upload them to the forum, I've changed them to text files)

Question 1:
I guess this is docking through Rosetta scripts. Why are there multiple executables in source/bin for doing ligand docking (including the above command, the command without "default", and Ligand_dock commands) and which is the best to use?

Question 2:
This is a bit more specific, but when I try to replicate the ligand docking portion of the above paper using the commands and files I've attached, it works just fine with Rosetta 3.5. However, running it with week 52 weekly release gives me this error...

basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt
protocols.jd2.parser.ScoreFunctionLoader: defined score function "ligand_soft_rep" with weights "ligand_soft_rep"
setting ligand_soft_rep weight hack_elec to 0.42

ERROR: unrecognized score_type type hack_elec
ERROR:: Exit from: src/core/scoring/ScoreTypeManager.cc line: 429
Error: ERROR: Exception caught by JobDistributor while trying to get pose from job 'co-dock_0001'
Error:

[ERROR] EXCN_utility_exit has been thrown from: src/core/scoring/ScoreTypeManager.cc line: 429
ERROR: unrecognized score_type type hack_elec

Error: Treating failure as bad input; canceling similar jobs
protocols.jd2.FileSystemJobDistributor: job failed, reporting bad input; other jobs of same input will be canceled: co-dock_0001
protocols.jd2.JobDistributor: no more batches to process...
protocols.jd2.JobDistributor: 1 jobs considered, 1 jobs attempted in 4 seconds
rl1-1-221-125-dhcp:try jplaks$

Deleting the hack_elec references in my xml file eliminates this error, but I'm confused as to why this happens in the newer weekly release and not the original 3.5 release.

Question 3 (and final... sorry for being so verbose):

Trying to specify constraints in a cst file, I have added the AddOrRemoveMatchCsts mover to my xml file (new file also attached), written a cst file (py.txt), and added appropriate "remarks" to my input PDB, and I get the following error:

protocols.rosetta_scripts.ParsedProtocol: =======================BEGIN MOVER AddOrRemoveMatchCsts - py_const=======================
{
protocols.toolbox.match_enzdes_util.EnzConstraintIO: read enzyme constraints from py.txt ... done, 1 cst blocks were read.
protocols.toolbox.match_enzdes_util.EnzCstTemplateRes: Found residue LG1 for CstBlock 1 without REMARK line in pose at position 260.
Error: catalytic map in pdb file and information in cst file don't match, unequal number of constraints. should be 1, is 0
ERROR:: Exit from: src/protocols/toolbox/match_enzdes_util/EnzConstraintIO.cc line: 358
protocols.jd2.JobDistributor:

[ERROR] Exception caught by JobDistributor for job co-dock_0001

[ERROR] EXCN_utility_exit has been thrown from: src/protocols/toolbox/match_enzdes_util/EnzConstraintIO.cc line: 358

protocols.jd2.JobDistributor: co-dock_0001 reported failure and will NOT retry
protocols.jd2.JobDistributor: no more batches to process...
protocols.jd2.JobDistributor: 1 jobs considered, 1 jobs attempted in 83 seconds
rl1-1-221-125-dhcp:try jplaks$

I'm confused, because I added the REMARK to the PDB and yet it is telling me there is no REMARK line in the pose.

Thanks so much for your time. Sorry this is such a long post. I'm grateful for any insights you can give me.
-joe

Thu, 2014-01-30 08:49
joeg

The ligand_dock application in the bin directory was the application that was used prior to the advent of RosettaScripts. The RosettaScripts interface is much more flexible, however, and can do everything that the ligand_dock application can, and more as well.

The various extensions in the bin directory have to do with how Rosetta compiles applications. The "full" name of the executable has three dotted sections, e.g. "rosetta_scripts.default.linuxgccrelease". The "default" means a regular compile, as opposed to specialty compiles like MPI ("rosetta_scripts.mpi.linuxgccrelease"). For convenience, there's also a version without the middle section in the bin directory. That's identical to the version which was last compiled. So if you just have the default version compiled, rosetta_scripts.linuxgccrelease is identical to rosetta_scripts.default.linuxgccrelease. If you compiled the regular version and then the MPI version (they can exist in the same tree), then it would be identical to rosetta_scripts.mpi.linuxgccrelease. Under some settings I believe you get a no-dot version (just "rosetta_scripts"), and again this is equal to the last version that you compiled - MPI or not, debug or release, etc. It doesn't matter which extension you use, as long as it corresponds to the compile that you want (if you only compiled under one setting, they're all the same).
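
For anyone scripting around these names, here is a small Python sketch that splits an executable name into the dotted sections described above. The dictionary keys ("app", "mode", "build") are labels I made up for this example; Rosetta itself does not use them.

```python
def split_executable(name):
    """Split a Rosetta executable name into its dotted sections.

    Returns a dict with hypothetical keys: 'app', 'mode', 'build'.
    'mode' is None for the two-section convenience name, and both
    'mode' and 'build' are None for the no-dot name.
    """
    parts = name.split(".")
    if len(parts) == 3:
        app, mode, build = parts
    elif len(parts) == 2:
        (app, build), mode = parts, None
    elif len(parts) == 1:
        app, mode, build = parts[0], None, None
    else:
        raise ValueError(f"unexpected executable name: {name!r}")
    return {"app": app, "mode": mode, "build": build}


if __name__ == "__main__":
    for exe in ("rosetta_scripts.default.linuxgccrelease",
                "rosetta_scripts.mpi.linuxgccrelease",
                "rosetta_scripts.linuxgccrelease",
                "rosetta_scripts"):
        print(exe, "->", split_executable(exe))
```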

Q2 - Going from Rosetta3.5 to the weekly releases we had some significant changes in how Rosetta does scoring. The most visible outcome is that the default scorefunction went from score12 to talaris2013. This change isn't too important for ligand docking, as it uses a specialized scorefunction, but a relevant change is the renaming of the "hack_elec" score term to "fa_elec" - just change the name in the XML file.
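
If you have several XML files to update, the rename can be scripted. The Python sketch below does a whole-word text substitution rather than parsing the XML, on the assumption that the script may not be strict XML. The sample fragment is hypothetical and only loosely modeled on the "setting ligand_soft_rep weight hack_elec to 0.42" log line; it is not copied from the attached file.

```python
import re


def rename_score_term(xml_text, old="hack_elec", new="fa_elec"):
    """Replace whole-word occurrences of a score term name.

    Returns (updated_text, number_of_replacements).
    """
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    return pattern.subn(new, xml_text)


# Hypothetical fragment for illustration only.
sample = """<SCOREFXNS>
  <ligand_soft_rep weights=ligand_soft_rep>
    <Reweight scoretype=hack_elec weight=0.42/>
  </ligand_soft_rep>
  <hard_rep weights=ligand>
    <Reweight scoretype=hack_elec weight=0.42/>
  </hard_rep>
</SCOREFXNS>"""

if __name__ == "__main__":
    updated, count = rename_score_term(sample)
    print(f"replaced {count} occurrence(s)")
    print(updated)
```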

There's a bunch of other parameter changes to the scorefunction too, which will affect you even when you're using the ligand scorefunctions. To recapitulate the behavior of ligand docking under Rosetta3.5, you'll want to add the flag "-restore_pre_talaris_2013_behavior" to your ligand docking commandlines.

Q3 - When you're using the rosetta_scripts application with enzdes style constraints (as opposed to the enzyme_design or match applications), you need to provide the flag "-run:preserve_header" on the commandline to tell Rosetta that it needs to preserve the header information.
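
Putting the Q2 and Q3 flags together, this Python sketch assembles a command line of the same shape as the one in the question, with the two extra flags appended. The flag names come from this thread; the function, its parameter names, and the paths in the demo are placeholders of mine, so substitute your own.

```python
import shlex


def build_command(executable, options_file, database, nstruct=1,
                  legacy_scoring=True, keep_header=True):
    """Assemble a rosetta_scripts command line as a list of arguments."""
    cmd = [executable, f"@{options_file}",
           "-database", database,
           "-nstruct", str(nstruct)]
    if legacy_scoring:
        # Recapitulate Rosetta3.5 scoring behavior (see Q2).
        cmd.append("-restore_pre_talaris_2013_behavior")
    if keep_header:
        # Keep the PDB header so the REMARK lines reach the constraints (see Q3).
        cmd.append("-run:preserve_header")
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace with your own installation.
    cmd = build_command("/path/to/bin/rosetta_scripts.default.macosclangrelease",
                        "ligand_dock.options",
                        "/path/to/database")
    print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run if you want to launch the job from Python rather than printing it.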

Thu, 2014-01-30 16:22
rmoretti

Thanks so much; it's working splendidly now.

Sat, 2014-02-08 09:57
joeg