Rosetta 3.2.1 Release Manual
src/apps/public/flexpep_docking/FlexPepDocking.cc
protocols/flexpep_docking/FlexPepDockingProtocol.cc
test/integration/tests/flexpepdock/
) and demo folder (mini/demo/barak).
Raveh B*, London N* and Schueler-Furman O. Sub-angstrom Modeling of Complexes between Flexible Peptides and Globular Proteins. Proteins, 2010 (in press, DOI 10.1002/prot.22716)
(see test/integration/tests/flexpepdock/input/1ER8_rb1_tor10_5.pdb for an example).
Native structure:
This is a reference structure for RMSD comparisons and statistics of final decoys. In case a native structure is not available, use a dummy structure or the starting structure
(see test/integration/tests/flexpepdock/input/1ER8.pdb for an example).
Constraint file (optional):
Specified as in any other Rosetta protocol; please refer to the constraints file documentation page for more information.
Note that the -flexpep_prepack, -rbMCM/-torsionsMCM and -flexPepDockingMinimizeOnly flags denote different modes of functionality, and are therefore mutually exclusive (-rbMCM and -torsionsMCM can be used together, but not with the other flags).
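The exclusivity rule above can be checked before launching a run. The following is a minimal sketch in Python, not part of Rosetta: the function names and the grouping of flags into three named modes are assumptions made for illustration, and only the flag strings themselves come from this page.

```python
# Mode flags named in the text above. Grouping them into three "modes"
# is our reading of that text, not something Rosetta exposes.
PREPACK = "-flexpep_prepack"
MINIMIZE_ONLY = "-flexPepDockingMinimizeOnly"
MCM_FLAGS = {"-rbMCM", "-torsionsMCM"}


def selected_modes(args):
    """Return the set of distinct modes requested in a list of command-line tokens."""
    modes = set()
    for token in args:
        if token == PREPACK:
            modes.add("prepack")
        elif token == MINIMIZE_ONLY:
            modes.add("minimize_only")
        elif token in MCM_FLAGS:
            modes.add("mcm")  # -rbMCM and -torsionsMCM together count as one mode
    return modes


def check_mode_flags(args):
    """Raise ValueError if the tokens mix mutually exclusive modes."""
    modes = selected_modes(args)
    if len(modes) > 1:
        raise ValueError(
            "mutually exclusive modes requested: " + ", ".join(sorted(modes))
        )
    return modes
```

For example, `check_mode_flags(["-rbMCM", "-torsionsMCM", "-ex1"])` passes, while combining `-flexpep_prepack` with `-rbMCM` raises an error.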
More information on common Rosetta flags can be found in the relevant Rosetta manual pages. In particular, flags related to the job-distributor (jd2), scoring function, constraint files and packing resfiles are identical to those in any other Rosetta protocol.
| Flag | Description |
|------|-------------|
| -in:file:s or -in:file:silent | Specify the starting structure (-in:file:s for PDB format, -in:file:silent for silent file format). |
| -in:file:silent_struct_type, -out:file:silent_struct_type | Format of the silent file to be read in/out. For silent output, use the binary file type since other types may not support ideal form. |
| -native | Specify the native structure against which to compare in RMSD calculations. This is a required flag. When the native is not known, use the starting structure as native. |
| -nstruct | Number of decoys to create in the simulation. |
| -unboundrot | Add the rotamers of the specified structure to the rotamer library (usually used to include rotamers of the unbound monomer). |
| -use_input_sc | Pass accepted rotamers from the input structure between Monte-Carlo with Minimization (MCM) cycles. Unlike the -unboundrot flag, not all rotamers from the input structure are added each time to the rotamer library, but only those accepted at the end of each round; the remaining conformations are lost. |
| -ex1/-ex1aro -ex2/-ex2aro -ex3 -ex4 | Add extra side-chain rotamers (highly recommended). The -ex1 and -ex2aro flags were used in our own tests, and are therefore recommended as default values. |
| -database | The Rosetta database. |
| Flag | Description | Type | Default |
|------|-------------|------|---------|
| -rep_ramp_cycles | The number of outer cycles for the protocol. In each cycle, the repulsive energy of Rosetta is gradually ramped up and the attractive energy is ramped down, before inner cycles of Monte-Carlo with Minimization (MCM) are applied. | Integer | 10 |
| -mcm_cycles | Number of inner cycles for both rigid-body and torsion-angle Monte-Carlo with Minimization (MCM) procedures. | Integer | 8 |
| -smove_angle_range | Defines the perturbation size of small/shear moves. | Real | 6.0 |
| -extend_peptide | Start the protocol with the peptide in extended conformation (neglect the original peptide conformation; extend from the anchor residue). | Boolean | false |
- A typical run in three steps:
  - Pre-pack your initial complex:
      FlexPepDocking.{ext} -database ${mini_db} -s input.pdb -native native.pdb -flexpep_prepack -ex1 -ex2aro [-unboundrot unbound.pdb]
  - Generate 100 (or more) decoys with the -lowres_preoptimize flag, and an additional 100 decoys (or more) without this flag, in two separate runs (the low-resolution run can be skipped if you are in a hurry):
      FlexPepDocking.{ext} -database ${mini_db} -s start.pdb -native native.pdb -out:file:silent decoys.silent -out:file:silent_struct_type binary -rbMCM -torsionsMCM -ex1 -ex2aro -use_input_sc -nstruct 100 -unboundrot unbound.pdb [ -lowres_preoptimize ]
  - Open the output score file of both runs (score.sc by default), sort it by decoy score (second column), and choose the top-scoring decoys as candidate models.
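The sorting in the last step can be scripted. The following is a minimal Python sketch, not a Rosetta tool. It assumes a whitespace-delimited score file in which the second column holds the decoy score (as stated above), the last column holds the decoy name, and a lower (more negative) score is better, as is the usual Rosetta convention. The function names and the last-column assumption are ours.

```python
def rank_decoys(lines, top_n=10):
    """Return (score, decoy_name) pairs for the top_n best-scoring decoys.

    Assumes the score sits in the second whitespace-delimited column and
    the decoy name in the last one; lower scores are treated as better.
    """
    ranked = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        try:
            score = float(fields[1])  # second column, as described in the text
        except ValueError:
            continue  # header line or any other non-numeric row
        ranked.append((score, fields[-1]))
    ranked.sort(key=lambda pair: pair[0])
    return ranked[:top_n]


def rank_score_files(paths, top_n=10):
    """Pool the lines of several score files and rank them together."""
    lines = []
    for path in paths:
        with open(path) as handle:
            lines.extend(handle)
    return rank_decoys(lines, top_n)
```

Calling `rank_score_files` with the score files of both runs (the paths are up to the user) pools the decoys before ranking, so that candidates from the runs with and without -lowres_preoptimize compete on the same scale.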
- Always pre-pack:
Unless you know what you are doing, always pre-pack the input structure (using the pre-packing mode) before running the peptide docking protocol. Our docking protocol focuses on the interface between the peptide and the receptor. However, we rank the structures based on their overall energy. Therefore, it is important to create a uniform energetic background in non-interface regions. The main cause of energetic differences between decoys is non-optimal side-chain rotamers in these regions. Therefore, pre-packing the side-chains of each monomer before docking is highly recommended, and may significantly affect the final decoy ranking.
- Decoy Selection:
In order to get good results, it is recommended to generate a large number of decoys (at least 200, optimally 2000). The selection of decoys should be made based on their score. While selection of the single top-scoring decoy may suffice in some cases, it is recommended to inspect the top-5 or top-10 scoring decoys. In particular, this set of models allows hot-spot and motif residues to be identified as those with particularly strong sub-Angstrom structural convergence, compared to more variable side-chain conformations at other positions.
- Low-resolution pre-optimization:
The -lowres_preoptimize flag can be used to add a preliminary centroid-mode optimization step before performing full-atom, high-resolution docking. As a rule of thumb, it is recommended to use this flag when the initial starting structure is less accurate (roughly more than 3 Å peptide backbone-RMSD), and thus sampling an extended range makes sense. In theory, this flag can also be specified independently (without the -rbMCM or -torsionsMCM flags). In this case, only low-resolution sampling followed by side-chain repacking will be performed. This mode of operation was not tested.
- The unbound rotamers flag:
In many cases, the unbound receptor (or peptide) may contain side-chain conformations that are more similar to the final bound structure than those in the rotamer library. In order to retain this useful information, it is possible to specify a structure whose side-chain conformations will be appended to the rotamer library during prepacking or docking; this may improve the chances of getting a low-scoring near-native result. This option was originally developed for the RosettaDock protocol.
- Extra rotamer flags:
It is highly recommended to use the Rosetta extra rotamer flags, which increase the number of rotamers used for prepacking. We used the -ex1 and -ex2aro flags in our own runs; feel free to experiment with other flags if you know what you are doing, and otherwise stick to -ex1 and -ex2aro.
- When you should / should not use FlexPepDock:
FlexPepDocking is not intended for fully blind docking. It is intended for obtaining high-resolution peptide models given a coarse-grain starting structure that should be somewhat close to the native solution (about 5 Å backbone-RMSD from the native peptide, even though in some cases the protocol works well for starting structures with up to 12 Å bb-RMSD from the native). The initial structure can be obtained from homologues, from known experimental or computational information about the correct binding site, etc. In many cases, it may be useful to use a constraint file to force the peptide to reach the vicinity of a known binding site.
It is also assumed that the secondary structure of the peptide in the initial coarse-grain structure is approximately identical to the native. While the protocol is designed to allow substantial peptide backbone flexibility, it is not designed to switch between secondary structures (from strand to helix conformation, etc.). An initial secondary structure can be assigned based on prior information (homologue structures, etc.), from experimental information (CD experiments, etc.) or from complementary computational predictions (e.g. conformational sampling and ab-initio folding).
- Typical running time:
In our tests, producing 200 models typically takes 10 CPU hours (approximately 3 minutes per decoy). A substantial speedup is obtained by running parallel processes using the appropriate job-distributor flags.
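The arithmetic behind these figures is simple enough to sketch in Python for planning larger runs. The sketch assumes that decoys are independent, that every decoy takes the same time, and that work splits evenly over processes with no overhead; the function name and the idealized scaling are assumptions, and actual timings depend on the system and hardware.

```python
import math


def estimate_hours(n_decoys, minutes_per_decoy=3.0, n_processes=1):
    """Rough wall-clock estimate in hours, assuming ideal parallel scaling.

    The default of 3 minutes per decoy is the approximate figure quoted
    in the text above.
    """
    # The busiest process gets ceil(n_decoys / n_processes) decoys.
    decoys_per_process = math.ceil(n_decoys / n_processes)
    return decoys_per_process * minutes_per_decoy / 60.0


print(estimate_hours(200))                  # 10.0 (single process)
print(estimate_hours(200, n_processes=20))  # 0.5
```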
The output of a FlexPepDock run is a score file (score.sc by default) and k decoy structures (as specified by the -nstruct flag and the other common Rosetta input and output flags). The score of each decoy is the second column of the score file. Decoy selection should be made based on this column.
Interpretation of FlexPepDock-specific score terms: (for the common Rosetta scoring terms, please also see the relevant manual page).
| Score term | Description |
|------------|-------------|
| total_score* | Total score of the complex |
| I_bsa | Buried surface area of the interface |
| I_hb | Number of hydrogen bonds across the interface |
| I_pack | Packing statistics of the interface |
| I_sc | Interface score (sum over energy contributed by interface residues of both partners) |
| pep_sc | Peptide score (sum over energy contributed by the peptide to the total score; consists of the internal peptide energy and the interface energy) |
| I_unsat | Number of buried unsatisfied HB donors and acceptors at the interface |
| rms (ALL/BB/CA) | RMSD between output decoy and the native structure, over all peptide (heavy/backbone/C-alpha) atoms |
| rms (ALL/BB/CA)_if | RMSD between output decoy and the native structure, over all peptide interface (heavy/backbone/C-alpha) atoms |
| startRMS(all/bb/ca) | RMSD between start and native structures, over all peptide (heavy/backbone/C-alpha) atoms |
Except for decoy selection by total score (see Outputs section), no special post-processing steps are needed. However, advanced users may optionally use Rosetta cluster_commands for assessing whether top-scoring models converge to a consensus solution. As a general rule, we saw in Raveh et al. that interface side-chains that point towards the receptor, in particular those of hot-spot residues and of known binding motif residues, tend to converge spatially better than side-chains of other residues (see Figure 4 in Raveh et al.). That said, clustering is an optional step, and is not considered an integral part of the FlexPepDock protocol as described and tested in Raveh et al.