Potentially useful experimental data takes many forms. The very nature of Monte Carlo simulation makes it easy to incorporate any type of experimental constraint, because all the constraint needs to do is influence the distribution of generated structures.
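As a minimal sketch of this idea (a one-dimensional toy, not Rosetta code), the example below runs Metropolis Monte Carlo on a double-well energy with and without an extra penalty term. The functions physical_energy and restraint_penalty, the target value, and the weight are all hypothetical and chosen only for illustration.

```python
import math
import random

def physical_energy(x):
    # Toy double-well "force field" with equal minima at x = -1 and x = +1.
    return (x * x - 1.0) ** 2

def restraint_penalty(x, target=1.0, weight=5.0):
    # Hypothetical experimental term: harmonic penalty for deviating from target.
    return weight * (x - target) ** 2

def metropolis(energy, steps=20000, kT=0.2, step_size=0.5, seed=0):
    """Sample x with the Metropolis criterion and return every visited value."""
    rng = random.Random(seed)
    x = 0.0
    e = energy(x)
    samples = []
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
        samples.append(x)
    return samples

def fraction_right(samples):
    # Fraction of samples found in the x > 0 well.
    return sum(1 for s in samples if s > 0) / len(samples)

unbiased = metropolis(physical_energy)
biased = metropolis(lambda x: physical_energy(x) + restraint_penalty(x))

print("fraction in right-hand well, unbiased:", fraction_right(unbiased))
print("fraction in right-hand well, biased:  ", fraction_right(biased))
```

The sampler itself is unchanged between the two runs; only the energy it is handed differs, and that is enough to shift where the samples accumulate.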

Constraints vs. Restraints

Rosetta calls constraints what other computational packages refer to as restraints. In Rosetta, a constraint does not fix a degree of freedom and remove it from sampling and scoring. Rather, it adds a scoring term that penalizes deviations from particular values of that degree of freedom. What other packages call constraints (actually fixing a degree of freedom) has no particular name in Rosetta: it is just a matter of making sure that degree of freedom is not free, for example in a MoveMap.
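The distinction can be sketched with a toy two-parameter system. This is an illustration in plain Python, not the Rosetta API: harmonic_constraint, the movable mask (standing in for the role a MoveMap plays), and all numbers are assumptions made for the example.

```python
import random

def harmonic_constraint(value, target, sd):
    # Rosetta-style "constraint": a score penalty. The DOF can still move.
    return ((value - target) / sd) ** 2

def base_score(dofs):
    # Toy energy function that prefers both DOFs at 0.
    return 0.01 * (dofs[0] ** 2 + dofs[1] ** 2)

def constrained_score(dofs):
    # Base score plus a penalty pulling DOF 0 toward a hypothetical measured value of 60.
    return base_score(dofs) + harmonic_constraint(dofs[0], target=60.0, sd=5.0)

def sample(dofs, movable, score, steps=5000, step_size=1.0, seed=1):
    """Greedy random search that only perturbs DOFs flagged as movable."""
    rng = random.Random(seed)
    current = list(dofs)
    best = score(current)
    free = [i for i, can_move in enumerate(movable) if can_move]
    for _ in range(steps):
        i = rng.choice(free)
        trial = list(current)
        trial[i] += rng.uniform(-step_size, step_size)
        s = score(trial)
        if s < best:
            current, best = trial, s
    return current

start = [0.0, 30.0]

# Constraint: DOF 0 is free but penalized, so it settles at a compromise value.
restrained = sample(start, movable=[True, True], score=constrained_score)

# Fixed DOF: DOF 0 is excluded from sampling and never changes, whatever the score says.
frozen = sample(start, movable=[False, True], score=constrained_score)

print("restrained:", restrained)
print("frozen:    ", frozen)
```

In the restrained run, DOF 0 ends up between the value the toy energy prefers (0) and the constraint target (60), because the penalty competes with the rest of the score. In the frozen run it stays exactly where it started.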

Input structures

The strongest experimentally derived sampling bias you can bring to a biological problem is any input structure that might be available. After all, you are using those structures precisely because you trust them enough not to want to perform ab initio structure prediction, so you want the bias that starting from them provides. At the same time, input structures are not perfect:

  • Crystal structures are of variable resolution and frequently lack hydrogens.
  • Quality can vary surprisingly widely within a single crystal structure.
  • At resolutions where hydrogens cannot be visualized (at least 90% of the PDB), asparagine and glutamine side-chain oxygens and nitrogens cannot be distinguished from each other (ditto histidine tautomers) and are frequently misassigned.
  • NMR structures are frequently determined from a few hundred constraints, rather than the thousands upon thousands available for crystal structures.

Most of all, the force fields used to refine these experimental structures are arithmetically and algorithmically distinct from the Rosetta energy function. It is therefore critical to obtain structures that are geometrically similar to the starting structure but that lie closer to a local minimum of the Rosetta scoring function. This is important because every unit of strain energy in your starting structure can inappropriately bias sampling: bad moves that would otherwise have been rejected can be accepted because they relieve strain that should already have been addressed. There is a complete write-up of preparing starting structures appropriately.
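A short worked example shows how strain distorts acceptance, assuming a standard Metropolis acceptance rule. The energy values and kT below are hypothetical numbers picked for illustration, not Rosetta scores.

```python
import math

def metropolis_accept_probability(delta_e, kT=1.0):
    # Downhill or neutral moves are always accepted; uphill moves
    # are accepted with probability exp(-delta_e / kT).
    return 1.0 if delta_e <= 0 else math.exp(-delta_e / kT)

clash_cost = 3.0     # hypothetical penalty introduced by a bad move
strain_relief = 4.0  # hypothetical pre-existing strain the same move happens to relieve

# Starting from a relaxed structure, the move is judged on its own cost.
p_relaxed = metropolis_accept_probability(clash_cost)

# Starting from a strained structure, the relieved strain masks the cost.
p_strained = metropolis_accept_probability(clash_cost - strain_relief)

print("acceptance from relaxed start: ", p_relaxed)
print("acceptance from strained start:", p_strained)
```

The same move that is rarely accepted from a relaxed start becomes a net downhill move from the strained start, so it is always accepted.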

Specialized Rosetta executables

Rosetta has individual modules to handle particular forms of experimental constraint:

  • mr_protocols is typically used alongside Phaser; it uses Rosetta's comparative modeling to rebuild gaps and insertions in the template, as well as missing density, from fragments, followed by relaxation with constraints to experimental density. You can then use Phaser again to re-score against crystallographic data.
  • ERRASER refines RNA structures from electron density (crystallographic data); it consists of a workflow of erraser_minimize, swa_rna_analytical_closure, and swa_rna_main. It requires the use of the refinement program PHENIX.
  • loops from density is a script that takes badly fit electron density data and a cutoff suggesting how much of the pose you're willing to rebuild, and generates input "loops" files for loop modeling.
  • Chemical shift files provide data to a variety of protocols, often collectively referred to as CSROSETTA, that incorporate NMR constraints to refine structures.
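To make the loops from density idea concrete, here is a rough sketch of how per-residue fit scores and a rebuild fraction could be turned into loop definitions. It is not the actual script: the function names, the merging rule, and the example scores are hypothetical, and the emitted lines assume the usual loops-file layout of LOOP, start, end, cutpoint, skip rate, and extend flag, with the last three left at 0 as placeholders.

```python
def worst_fit_segments(fit_scores, rebuild_fraction, max_gap=1):
    """Group the worst-fitting residues into contiguous (start, end) segments.

    fit_scores: per-residue density fit, higher is better (residues are 1-based).
    rebuild_fraction: fraction of residues you are willing to rebuild.
    max_gap: selected residues separated by at most this many unselected
             residues are merged into one segment.
    """
    n_rebuild = int(round(rebuild_fraction * len(fit_scores)))
    ranked = sorted(range(len(fit_scores)), key=lambda i: fit_scores[i])
    chosen = sorted(i + 1 for i in ranked[:n_rebuild])
    segments = []
    for res in chosen:
        if segments and res - segments[-1][1] <= max_gap + 1:
            segments[-1][1] = res  # extend the current segment
        else:
            segments.append([res, res])  # start a new segment
    return [tuple(seg) for seg in segments]

def loops_file_lines(segments):
    # Cutpoint, skip rate, and extend flag are placeholder zeros.
    return ["LOOP {} {} 0 0.0 0".format(start, end) for start, end in segments]

# Hypothetical per-residue fit scores for a 10-residue pose.
scores = [0.9, 0.8, 0.2, 0.3, 0.85, 0.9, 0.1, 0.95, 0.9, 0.88]
segments = worst_fit_segments(scores, rebuild_fraction=0.3)
lines = loops_file_lines(segments)
print("\n".join(lines))
```

A real workflow would also pad segments and avoid single-residue loops; those refinements are left out to keep the sketch short.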

Experimental constraints

Frequently, you will encounter situations where you have knowledge about the experimental system that does not neatly fit into any of the above situations, or which provides very sparse or even conflicting information. This is all right: Rosetta's general capacity for working with constraints can still be used to incorporate it.

See also: Guides to specific types of structural perturbations using RosettaScripts