
Using Degenerate Protons in Rosetta3.x


Hello,

How does Rosetta3.x handle degenerate protons or how do I implement those in a constraint file?

I am trying to add NMR derived constraints for methyl-groups to my AbInitio Modelling. I found a reference and the corresponding website that describes this issue in Rosetta2.3 (http://www.rosettacommons.org/guide/NMR).

However, adding either

AmbiguousNMRDistance HG2# 3 H 11 BOUNDED 1.5 5.00 0.3

or

AmbiguousNMRDistance #HG2 3 H 11 BOUNDED 1.5 5.00 0.3 (with -IUPAC flag)

results in errors similar to the following:

std::cerr: Exception was thrown:
Atom HG2# 3 not found
std::cout: Exception was thrown:
Atom HG2# 3 not found

(with #HG2 in the latter constraint file format case)

If such an option no longer exists, I would change my constraint file to use the respective heavy atoms and add some additional distance bias.

Best regards and many thanks for looking into this,
Marcel

Wed, 2011-06-08 03:25
jurkm

I'm not aware of an option to do this anymore.

I think what you need is AmbiguousConstraints, which allow you to provide two possible constraints, of which only the lowest-scoring is applied at any one time. (Both are evaluated, one is scored).

The format is:
AmbiguousConstraint
constraint 1
constraint 2
...
END_AMBIGUOUS

http://www.rosettacommons.org/manuals/archive/rosetta3.2.1_user_guide/co...

Is this what you needed...?

Wed, 2011-06-08 07:50
smlewis

Though the AmbiguousConstraint format will blow up the constraint file considerably (certainly depending on the implementation), it should do exactly what I was looking for.

Thank you (once again) for your quick and useful answer.

By the way, does Rosetta handle AmbiguousNMRDistance constraints in the same way? I.e. evaluating all given ones and only scoring the lowest or applicable ones? I observed that if I give my constraints in the AtomPair format the scores are way off and the structures do not converge. With AmbiguousNMRDistance this is not the case but not all constraints seem to be considered. Is this all about the "curvature" option?

AmbiguousNMRDistance atom_name_i res_no_i atom_name_j res_no_j BOUNDED min_dist max_dist curvature

In the literature this value is often set to 0.3.

Wed, 2011-06-08 23:44
jurkm

w/r/t curvature:

In general, constraints are defined like so:
CONSTRAINT_TYPE constraint arguments FUNCTION_TYPE function arguments.

In your example, BOUNDED is the function type, so curvature is an argument to BOUNDED, not to AmbiguousNMRDistance. As you can read at the link in my previous post, BOUNDED provides a zero-potential well with sloping edges. Curvature controls how quickly the score ramps up as the constrained quantity leaves the zero-potential well.
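To make the shape of that well concrete, here is a minimal Python sketch of a BOUNDED-style function. It follows my reading of the Rosetta constraint documentation (zero between the bounds, quadratic just outside, linear far above the upper bound, with a switch point at 0.5 width units); treat the exact form as an assumption and check it against the documentation or source before relying on it. The parameter called "curvature" in this thread is named `sd` here, and `rswitch` is likewise an assumed name.

```python
def bounded(x, lb, ub, sd, rswitch=0.5):
    """BOUNDED-style score: zero in [lb, ub], rising outside (assumed form)."""
    if lb <= x <= ub:
        return 0.0
    # Distance outside the well, in units of sd.
    delta = (lb - x) / sd if x < lb else (x - ub) / sd
    if x > ub and delta > rswitch:
        # Linear tail: the tangent line to the quadratic at delta == rswitch.
        return 2.0 * rswitch * delta - rswitch ** 2
    return delta ** 2


# Values for the BOUNDED 1.5 5.00 0.3 example from the first post.
for d in (1.2, 3.0, 5.1, 6.0):
    print(d, round(bounded(d, 1.5, 5.0, 0.3), 3))
```

A smaller `sd` makes the walls steeper; a larger one makes the penalty grow more gently outside the well.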

w/r/t AmbiguousNMRDistanceConstraint:
I dug into the code for AmbiguousNMRDistanceConstraint. It inherits from AmbiguousConstraint and appears to work the same way. I didn't initially realize it was a Rosetta3 thing; I misunderstood you and thought it was a leftover from Rosetta2. (Unfortunately the constraint documentation was never updated to include it.) Looking at the read_def function for AmbiguousNMRDistanceConstraint makes me suspect that the problem you had before was the # sign in your hydrogen definitions. Why did you have a # sign?

Thu, 2011-06-09 08:32
smlewis

I see. Due to the lack of documentation I thought that AmbiguousNMRDistance could only be used in the way I posted (because I found it in different NMR-related Rosetta literature).

I added the # sign to account for the degeneracy of NMR-derived proton constraints. In my spectra I can't distinguish between hydrogens HG11, HG12 and HG13 of a valine methyl group, for instance. The group rotates so fast that any putative difference is averaged out. Therefore, I only obtain the distance between the mean position of all three methyl hydrogens and whatever other proton shows up in my NOESY spectra.
There are two ways to deal with this in structure calculation software that I am aware of: a) averaging the hydrogen coordinates, or b) adding some bias and using the heavy atom those hydrogens are bonded to (as described in the old Rosetta2.3 NMR guide). Option a) seems more accurate, but given that distances from NOEs are quite error-prone, this 0.3-0.5 A offset is usually insignificant (unless the NOE-derived distances were calculated quite laboriously from different experiments and with quantum-mechanical approaches).

What would be your suggestions for the following points:

1) Shall I format my constraints before feeding them to Rosetta to relate only to heavy-atoms to account for the proton degeneracy? This seems easier to handle than to make ambiguous constraints for each group of degenerate protons.

2) Since AmbiguousNMRConstraint is evaluated in the same way as AmbiguousConstraint I understand now why quite a lot of constraints seem to be neglected. Would ATOMPAIR in combination with the BOUNDED function be better? Until now I only tested ATOMPAIR + HARMONIC which, following your advice, is definitely the wrong way to handle this.

Fri, 2011-06-10 00:29
jurkm

Rosetta won't rotate methyl hydrogens the way real physics does. Once they're placed they're static. It is certainly a reasonable idea to rely on the nearby heavy atom instead, as you suggest. (The only hydrogens I ever use in constraints are backbone hydrogens, which of course aren't degenerate.)
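As a rough illustration of the heavy-atom approach, the sketch below swaps a methyl proton group for its carbon and pads the upper bound. Everything in it is hypothetical: the lookup table covers only the two valine methyl groups as an example, and the 0.5 A padding is simply the upper end of the 0.3-0.5 A offset mentioned earlier in the thread, not a recommended value.

```python
# Hypothetical lookup for valine only: methyl proton group -> bonded carbon.
# Extend it with the atom names that apply to your own residues.
METHYL_HEAVY_ATOM = {"HG1": "CG1", "HG2": "CG2"}


def to_heavy_atom(atom, upper, pad=0.5):
    """Return (atom, upper bound), replacing methyl protons by their carbon.

    The upper bound is widened by `pad` angstroms only when a swap happens.
    """
    if atom in METHYL_HEAVY_ATOM:
        return METHYL_HEAVY_ATOM[atom], upper + pad
    return atom, upper


print(to_heavy_atom("HG1", 5.0))  # methyl group: swapped and padded
print(to_heavy_atom("H", 4.5))    # amide proton: left unchanged
```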

Ambiguous constraints are used when you want the model to satisfy any of a group of possibilities. For example, you have five positive charges on one side of an interface, and 5 negative charges on the other side, and you want at least one of the many possible pairs to form.

Here, ambiguity to account for degeneracy would be used like so:

AmbiguousConstraint
AtomPair HG11 3 H 11 BOUNDED 1.5 5 3
AtomPair HG12 3 H 11 BOUNDED 1.5 5 3
AtomPair HG13 3 H 11 BOUNDED 1.5 5 3
END_AMBIGUOUS

This says, "whichever of HG11/2/3 has the best distance, score only that one". (Since 1.5 to 5 is a large range, probably at least two would return 0 score at any one time anyway).
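The "evaluate all, score only the best" behaviour can be sketched in a few lines of Python. This is not Rosetta code: `zero_well` is a simplified stand-in for BOUNDED (quadratic on both sides, no linear tail), and the distances are made-up numbers for illustration.

```python
def zero_well(d, lb, ub, sd):
    """Simplified stand-in for BOUNDED: zero in [lb, ub], quadratic outside."""
    if d < lb:
        return ((lb - d) / sd) ** 2
    if d > ub:
        return ((d - ub) / sd) ** 2
    return 0.0


def ambiguous_score(distances, lb, ub, sd):
    """Score every member constraint, but report only the lowest one."""
    scores = {name: zero_well(d, lb, ub, sd) for name, d in distances.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical distances (angstroms) from each methyl proton of residue 3
# to the amide proton of residue 11.
distances = {"HG11": 6.2, "HG12": 5.6, "HG13": 7.0}
print(ambiguous_score(distances, 1.5, 5.0, 3.0))
```

With these numbers only HG12, the closest proton, contributes to the score; the other two are evaluated and then ignored.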

Now that I understand better what you're doing, I may be able to point you to something inside AmbiguousNMRDistanceConstraint. If you load up the file rosetta_source/src/core/scoring/constraints/AmbiguousNMRDistanceConstraint.cc and go to line 95 or so, you will see a huge block of code translating atom names. Translated into English, it appears this code is doing things like "if the name is HG, then convert it to mean 1HG2, 2HG2, 3HG2, 1HG1, 2HG1". So maybe AmbiguousNMRDistanceConstraint is handling the degeneracy internally. Unfortunately there doesn't appear to be a well-documented key to what it's doing, so you'll want to look at that code directly.

w/r/t BOUNDED versus HARMONIC functional forms: I suggest you fire up a plotting program and input the functional forms for those to see what they look like at different distances (or do it by hand, whatever). The constraint and the functional form are not interdependent, you can mix and match. I use BOUNDED for just about everything because I can have a wide zero-score basin. I also like to combine all my constraints (via ambiguous) with CONSTANT constraints set at a score of a few units; this prevents highly-unsatisfied constraints from blowing up the protein during minimization (if atoms are 100 angstroms apart, I want the score to stay equal if it moves to 99 angstroms apart, and I want the absolute value to be small).
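For anyone who prefers a quick table to a plotting program, the sketch below tabulates a harmonic form against a zero-well form, and also shows the effect of capping the score, which is roughly what pairing a constraint with a CONSTANT inside an ambiguous block achieves. The harmonic form ((x - x0)/sd)^2 matches the documented HARMONIC function as far as I know; `zero_well` is a simplified stand-in for BOUNDED, and the centre of 3.25 A and the cap of 5 units are arbitrary illustrative values.

```python
def harmonic(x, x0, sd):
    """Quadratic penalty around a single ideal distance x0."""
    return ((x - x0) / sd) ** 2


def zero_well(x, lb, ub, sd):
    """Simplified stand-in for BOUNDED: zero in [lb, ub], quadratic outside."""
    if x < lb:
        return ((lb - x) / sd) ** 2
    if x > ub:
        return ((x - ub) / sd) ** 2
    return 0.0


def capped(score, cap):
    """Lowest of the real score and a constant, as an ambiguous pair would give."""
    return min(score, cap)


print("dist  harmonic  zero_well  capped_zero_well")
for d in (2.0, 3.25, 4.5, 6.0, 99.0, 100.0):
    h = harmonic(d, 3.25, 0.3)
    w = zero_well(d, 1.5, 5.0, 0.3)
    print(f"{d:6.2f} {h:12.2f} {w:12.2f} {capped(w, 5.0):8.2f}")
```

The harmonic column penalises every distance other than 3.25 A, while the zero-well column is flat across the whole allowed range. The capped column stays at the same small value at 99 and 100 A, so a badly violated constraint exerts no pull during minimization.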

Fri, 2011-06-10 13:17
smlewis

Thank you very much. Your explanations helped a lot.
I had a close look at the source file and tried a run with a constraint file as you suggested. Namely using HG1 or HD1 as a general descriptor for all three protons of a methyl-group. This works indeed very well. So there is no need for any "#".

For the record:
Degenerate protons are implemented in Rosetta using the AmbiguousNMRDistance constraint definition and the respective general identifier (e.g. HG1 for HG11, HG12 and HG13) without any symbols or extra characters.
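A small helper for writing such lines might look like the sketch below. It is purely illustrative: the function name and the `DEGENERATE` set are hypothetical, and the set has to be filled with whichever general identifiers apply to your residues. The output layout simply mirrors the constraint lines shown in this thread.

```python
# Hypothetical set of general identifiers that stand for a whole methyl group.
DEGENERATE = {"HG1", "HD1"}


def constraint_line(atom_i, res_i, atom_j, res_j, lb, ub, sd):
    """Format one BOUNDED distance constraint in the layout used in this thread.

    AmbiguousNMRDistance is chosen when either atom is a degenerate group,
    AtomPair otherwise.
    """
    degenerate = atom_i in DEGENERATE or atom_j in DEGENERATE
    kind = "AmbiguousNMRDistance" if degenerate else "AtomPair"
    return f"{kind} {atom_i} {res_i} {atom_j} {res_j} BOUNDED {lb} {ub:.2f} {sd}"


print(constraint_line("H", 10, "HG1", 50, 1.5, 4.5, 0.3))
print(constraint_line("H", 6, "H", 105, 1.5, 5.5, 0.3))
```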

Unfortunately, this brings me directly to a follow-up problem:
If I mix AtomPair and AmbiguousNMRDistance constraints I receive the following error which indicates that Rosetta might not be reading and therefore not using all constraints. (Occurs right at the beginning of abinitio "Stage 4".)

core.io.constraints: read constraints from ./constraints.fa_cst
core.io.constraints: read constraints section --NO_SECTION---
core.io.constraints: no section header [ xxx ] found, try reading line-based format... DON'T MIX
core.io.constraints: read constraints from ./constraints.fa_cst
core.io.constraints: ERROR: reading of AtomPair failed.

Sorting the restraints to have all AtomPair constraints and after that the AmbiguousNMRDistance constraints in the input file gives a similar error:

core.io.constraints: ERROR: reading of AmbiguousNMRDistance failed.

I understand that Rosetta usually looks for sections in the constraints input file. If I define all AtomPair constraints in a "[ atompairs ]" section Rosetta quits with a segmentation fault error message during reading of the constraints file (even if there are only AtomPair definitions present).

How should I combine AtomPair and AmbiguousNMRDistance constraints? Or how does a section end? According to the ConstraintIO.cc definition something like this seems not implemented; i.e. one section ends when another begins.

PS: Would it be hard to have a copy of the AmbiguousNMRDistance constraint that inherits from AtomPair rather than from AmbiguousConstraint (maybe called NMRDistance or something like that)? The idea would be to have degenerate hydrogens and full scoring for all constraints.

Tue, 2011-06-14 06:49
jurkm

Can you post your constraint file for me to look at?

I've never tried using sections for anything (unless the standard ambiguous constraint I mentioned counts). The constraints themselves are a mess of different compatibilities - some can be read in from a file, some must be constructed in code. I always get the "NO_SECTION" statement and it works anyway.

I was wrong (misled by poor naming schemes) about how the inheritance hierarchy works for these constraints. The inheritance is irrelevant to the function here anyway. I think the ambiguity is only with respect to interpreting HG1 for HG11; HG12; HG13. The near-total lack of documentation and commentary doesn't make interpreting it any easier.

Tue, 2011-06-14 07:28
smlewis

My constraint file looks like this:

AtomPair H 6 H 105 BOUNDED 1.5 5.50 0.3
AtomPair H 6 H 26 BOUNDED 1.5 5.50 0.3
AtomPair H 8 H 101 BOUNDED 1.5 3.75 0.3
AtomPair H 8 H 7 BOUNDED 1.5 5.50 0.3
AtomPair H 8 H 9 BOUNDED 1.5 5.50 0.3
AtomPair H 9 H 10 BOUNDED 1.5 4.50 0.3
AtomPair H 9 H 23 BOUNDED 1.5 4.50 0.3
AmbiguousNMRDistance H 10 HG1 50 BOUNDED 1.5 4.50 0.3
AtomPair H 11 H 10 BOUNDED 1.5 4.50 0.3
AtomPair H 13 H 14 BOUNDED 1.5 3.25 0.3
AmbiguousNMRDistance H 14 HG1 10 BOUNDED 1.5 5.50 0.3
AtomPair H 15 H 16 BOUNDED 1.5 3.75 0.3
AmbiguousNMRDistance H 17 HE1 83 BOUNDED 1.5 5.50 0.3

No error is displayed when the AmbigNMRDist lines are commented out or changed to Atompair. What is quite strange, though, is that changing one of the other AtomPair definitions to AmbigNMRDist does not give an error. So this is somehow not entirely reproducible.

Everything else works, so I am still very thankful for your help and the main issue is definitely solved.

Thu, 2011-06-16 00:50
jurkm

It has belatedly occurred to me that part of the problem may be centroid/fullatom. If you are doing abinitio modeling, it starts in centroid mode; only the amide proton is actually in the structure. That's the clear difference between the failing lines and the not-failing lines. I have no idea why AtomPair with HGs doesn't fail on non-existent atoms if that is indeed the cause of the failure.

Fri, 2011-06-17 13:20
smlewis

Actually, this was not a centroid/fullatom issue. After I got my constraints correctly formatted (http://www.rosettacommons.org/content/maximum-number-constraints) everything runs quite smoothly now.

Thanks for the helpful suggestions and advice.

Fri, 2011-06-24 00:12
jurkm