|Rosetta 3.2 Release Manual|
bin/loopmodel.<my_os>gccrelease. The major protocol movers are
src/protocols/moves/KinematicMover, which performs a localized kinematic perturbation to a peptide chain,
src/protocols/loops/LoopMover_KIC, which wraps the KinematicMover into a Monte Carlo protocol for loop modeling, and
src/protocols/loops/LoopRelaxMover, which allows the protocols in LoopMover_KIC to be combined with other loop modeling protocols. A basic usage example that briefly remodels an 8-residue loop is the
kinematic_looprelax integration test, which resides at
-loops:vicinity_sampling is set, the non-pivot torsions are sampled around their starting values by
-loops:vicinity_degree degrees to focus sampling around the starting conformation, rather than drawing random Ramachandran values for larger perturbations.
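The vicinity sampling idea can be illustrated with a minimal Python sketch. It is not Rosetta code: the function names are hypothetical, and the uniform distribution of the deviations is an assumption made for illustration only, since the text above states only that torsions stay within the given number of degrees of their starting values.

```python
import random


def wrap_degrees(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def sample_vicinity(start_torsions, vicinity_degree=1.0, rng=random):
    """Perturb each non-pivot torsion by at most +/- vicinity_degree degrees.

    Assumption: deviations are drawn uniformly. The distribution actually
    used by Rosetta is not specified in this manual section.
    """
    return [
        wrap_degrees(t + rng.uniform(-vicinity_degree, vicinity_degree))
        for t in start_torsions
    ]
```

A smaller vicinity_degree keeps the sampled torsions closer to the starting conformation, which is the behavior the flag description refers to as tighter sampling.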
The KIC loop modeling protocols are used to address either of two general problems:
It is also possible to use the KIC protocols for a number of new applications (see New things since last release below):
mini/bin/loopmodel.<my_os>gccrelease -database <minirosetta_database_path> -loops:remodel perturb_kic -loops:refine refine_kic -loops:input_pdb <my_starting_structure>.pdb -loops:loop_file <my_loopfile>.loop -nstruct <num_desired_models> -ex1 -ex2 -extrachi_cutoff 0 -overwrite
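For scripted runs, the command line above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in the example; the function name and its parameters are hypothetical helpers, not part of Rosetta.

```python
def build_kic_command(executable, database, input_pdb, loop_file, nstruct):
    """Return the argument list for a combined KIC remodel and refine run.

    Only flags from the quick start example are used. The executable name
    depends on the platform, so it is passed in by the caller.
    """
    return [
        executable,
        "-database", database,
        "-loops:remodel", "perturb_kic",
        "-loops:refine", "refine_kic",
        "-loops:input_pdb", input_pdb,
        "-loops:loop_file", loop_file,
        "-nstruct", str(nstruct),
        "-ex1", "-ex2",
        "-extrachi_cutoff", "0",
        "-overwrite",
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library. Keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.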
-loops:remodel perturb_kic and refine, activated by
-loops:refine refine_kic, which may be invoked separately or sequentially by command line flags.
-loops:remodel perturb_kic. If the 'extend loop' field of the loop definition is set to '1' (see Input Files), the loop will first be placed into a random closed conformation with idealized bond lengths, bond angles, and omega angles. Otherwise, the input conformation is used as the starting structure. Each loop is then subject to a single cycle of simulated annealing Monte Carlo.
score4L low-resolution scoring function. The number of Monte Carlo steps is determined by the
inner_cycles. The number of
outer_cycles is set by
-loops:outer_cycles. The number of inner cycles is
min( 1000, number_of_loop_residues * 20 ), or is set to
-loops:max_inner_cycles if provided. At the end of each outer cycle, the pose is set to the lowest energy conformation observed so far in the simulation, unless the flag
-loops:kic_recover_last is set, in which case the last accepted conformation passes on to the next outer cycle. From the first step to the last, the temperature decreases exponentially from
score12 with an upweighted chain break term). This stage is invoked by
-loops:refine refine_kic. At the beginning of this stage, unless the flag
-loops:fix_natsc is provided, all residues within the neighbor distance of a loop (defined by
-loops:neighbor_dist) are repacked and then subject to rotamer trials. The backbones of all loop residues, and the side-chains of all loop residues and neighbors are then subject to energy minimization (using the DFP algorithm). If
-loops:fix_natsc is set, only the loop residues (and not the neighbors) will be subject to repacking, rotamer trials, and minimization. Consequently, if this stage has been preceded by the centroid stage, and the
-loops:fix_natsc flag is omitted, the side-chains surrounding the loop will be optimized for the perturbed loop conformation rather than the starting loop conformation. This provides a more challenging task for benchmarking purposes, since the wild-type neighboring side-chains must be reconstructed in addition to the loop backbone and side-chain conformations. The simulation then proceeds through a single cycle of simulated annealing Monte Carlo, following the same scheduling as the centroid stage, except that two rounds of kinematic closure moves are attempted per inner cycle, and if
-loops:max_inner_cycles is not set, the number of inner cycles is
min( 200, 10 * total_number_loop_residues ). Here, the number of loop residues includes all the loop definitions (if multiple loops are defined), because refine-stage kinematic closure moves are applied randomly to any of the loops. After each kinematic move, the loop and neighbor residues are subject to rotamer trials, and DFPmin is applied to the loop backbone, and loop and neighbor side-chains. Every
repack_cycle, which is set by
-loops:refine_repack_cycles, all of the loop and neighbor side-chains are repacked. For all of these optimizations, if the
-loops:fix_natsc flag is set, the neighbor residues will remain fixed, and only the loop residues will be subject to rotamer trials, minimization, and repacking. From the first step to the last, the temperature decreases exponentially from
-loops:refine refine_kic, with the 'extend loop' field in the loop definition file set to '1'.
For loop refinement, just use
For details on the geometric steps taken by the underlying kinematic solver, please see the supplementary material of Mandell et al. referenced above. New things since last release.
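The cycle counts and annealing schedule described for the two stages can be sketched in a few lines of Python. The inner cycle formulas are taken from the protocol description; the exact form of the exponential temperature decay (a constant ratio between consecutive steps) is an assumption, and all function names are hypothetical.

```python
def centroid_inner_cycles(num_loop_residues, max_inner_cycles=None):
    """Inner cycles for perturb_kic: min(1000, 20 * residues) unless overridden."""
    if max_inner_cycles is not None:
        return max_inner_cycles
    return min(1000, 20 * num_loop_residues)


def refine_inner_cycles(total_loop_residues, max_inner_cycles=None):
    """Inner cycles for refine_kic: min(200, 10 * residues over all loops)."""
    if max_inner_cycles is not None:
        return max_inner_cycles
    return min(200, 10 * total_loop_residues)


def exponential_schedule(init_temp, final_temp, num_steps):
    """Temperatures decaying from init_temp to final_temp over num_steps.

    Assumption: each step multiplies the temperature by a constant ratio,
    so the first value is init_temp and the last is final_temp.
    """
    if num_steps < 2:
        return [init_temp] * num_steps
    ratio = (final_temp / init_temp) ** (1.0 / (num_steps - 1))
    return [init_temp * ratio ** i for i in range(num_steps)]
```

For example, an 8-residue loop with the default 3 outer cycles gives 3 * centroid_inner_cycles(8) steps in the centroid stage, and exponential_schedule(2.0, 1.0, ...) spans the default perturb_kic temperatures listed under the options.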
-loops:input_pdb. The starting structure must have real coordinates for all residues outside the loop definition, plus the first and last residue of each loop region.
loops:loop_file and shared across all loop modeling protocols. For each loop to be modeled, include the following on one line:
column1 "LOOP": The loop file identifier tag
column2 "integer": Loop start residue number
column3 "integer": Loop end residue number
column4 "integer": Cut point residue number, >=startRes, <=endRes. Default: let the loop modeling code choose the cut point
column5 "float": Skip rate. Default: never skip
column6 "boolean": Extend loop. Default: false
test/integration/tests/kinematic_looprelax/input/4fxn.loop, which looks like this:
LOOP 88 95 92 0 1
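To make the column layout concrete, here is a minimal Python sketch that parses one loop definition line into its fields. It is an illustration, not the Rosetta parser: the class and function names are hypothetical, and treating a cut point of 0 as "not set" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoopDefinition:
    start: int
    end: int
    cut: Optional[int] = None  # None: let the loop modeling code choose
    skip_rate: float = 0.0     # 0.0: never skip
    extend: bool = False       # False: keep the input loop conformation


def parse_loop_line(line):
    """Parse one 'LOOP start end [cut [skip_rate [extend]]]' line."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "LOOP":
        raise ValueError("expected 'LOOP <start> <end> ...': %r" % line)
    loop = LoopDefinition(start=int(fields[1]), end=int(fields[2]))
    if loop.start > loop.end:
        raise ValueError("loop start must not exceed loop end")
    if len(fields) > 3:
        cut = int(fields[3])
        if cut != 0:  # assumption: 0 means no explicit cut point
            if not loop.start <= cut <= loop.end:
                raise ValueError("cut point must lie within the loop")
            loop.cut = cut
    if len(fields) > 4:
        loop.skip_rate = float(fields[4])
    if len(fields) > 5:
        loop.extend = fields[5].lower() in ("1", "true")
    return loop
```

Applied to the example line above, the sketch yields a loop spanning residues 88 to 95 with a cut point at 92, a skip rate of 0, and the extend field set.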
-database  Path to the Rosetta database. [Path]
-loops:remodel  Selects a protocol for the centroid remodeling stage. Legal values: 'perturb_kic', 'perturb_ccd', 'quick_ccd', 'quick_ccd_moves', 'old_loop_relax', 'no'. default = 'quick_ccd'. For KIC, use 'perturb_kic'. [String]
-loops:refine  Selects the all-atom refinement stage protocol. Legal values: 'refine_kic', 'refine_ccd', 'no'. default = 'no'. For KIC, use 'refine_kic'. [String]
-loops:input_pdb  Path/name of input pdb file. Note that this is NOT -s/-l like normal. [File]
-loops:loop_file  Path/name of loop definition file. default = 'loop_file'. [File]
-in:file:native  Path/name of native pdb file. Backbone rmsd to this structure will be reported in each output decoy. If no native structure is provided, backbone rmsd to the starting structure is reported. [File]
-in:file:fullatom  Read the input structure in full-atom mode. Set this flag to avoid repacking the input structure before modeling in non-KIC protocols (KIC refine always begins by repacking the loop side-chains, including the neighboring side-chains if -loops:fix_natsc is 'false'). default = 'false'. [Boolean]
-loops:fix_natsc  Don't repack, rotamer trial, or minimize loop residue neighbors. default = 'false'. [Boolean]
-ex1 (-ex2, -ex3, -ex4)  Include extra chi1 rotamers (or also chi2, chi3, chi4)
-extrachi_cutoff 0  Set to 0 to include extra rotamers regardless of number of neighbors
-extra_res_cen  Path to centroid parameters file for non-protein atoms or ligands
-extra_res_fa  Path to all-atom parameters file for non-protein atoms or ligands
-overwrite  Overwrite existing models (Rosetta will not output without this flag if same-named model exists)
-loops:neighbor_dist  Only optimize side-chains with C-beta atoms within this many angstroms of any loop C-beta atom. default = '10.0'. To speed up runs, try '6.0'. [Float]
-loops:vicinity_sampling  Sample non-pivot torsions within a vicinity of their input values. default = 'false'. For a description of pivot and non-pivot torsions, please see Purpose, above. [Boolean]
-loops:vicinity_degree  Number of degrees allowed to deviate from current non-pivot torsions when using vicinity sampling (smaller number makes tighter sampling). default = '1.0'. [Float]
-loops:kic_max_seglen  Maximum number of residues in a KIC move segment. default = '12'. [Integer]
-loops:remodel_init_temp  Initial temperature for simulated annealing in 'perturb_kic'. default = '2.0'. [Float]
-loops:remodel_final_temp  Final temperature for simulated annealing in 'perturb_kic'. default = '1.0'. [Float]
-loops:refine_init_temp  Initial temperature for simulated annealing in 'refine_kic'. default = '1.5'. [Float]
-loops:refine_final_temp  Final temperature for simulated annealing in 'refine_kic'. default = '0.5'. [Float]
-loops:max_kic_build_attempts  Number of times to attempt initial closure in 'perturb_kic' protocol. Try increasing to 10000 if initial closure is failing. default = 100. [Integer]
-loops:outer_cycles  Number of outer cycles for Monte Carlo (described above in Protocol). default = '3'. [Integer]
-loops:max_inner_cycles  Maximum number of inner cycles for Monte Carlo (default described above in Protocol). [Integer]
-loops:kic_recover_last  Keep the last sampled conformation at the end of each outer cycle instead of the lowest energy conformation sampled so far. default = 'false'. [Boolean]
-loops:optimize_only_kic_region_sidechains_after_move  Perform rotamer trials and minimization after every KIC move, but only within loops:neighbor_dist of the residues in the moved KIC segment. Speeds up execution when using very large loop definitions (such as when whole chains are used for ensemble generation). default = 'false'. [Boolean]
-loops:fast  Significantly reduces the number of inner cycles. For quick testing, not production runs. default = 'false'. [Boolean]
-run:test_cycles  Sets the number of outer cycles and inner cycles to 3. For extremely quick testing and sanity checks, not for production runs. default = 'false'. [Boolean]
-ex2. To consistently reconstruct long loops (e.g., 12 residues or longer) to high accuracy, it is recommended to generate 1000 models by using
-nstruct 1000 (or by running several smaller jobs over multiple processor cores). The KIC protocol was optimized for de novo reconstruction of 12-residue protein loops in different environments with different end-to-end distances. Shorter loops or largely buried peptide segments may require substantially fewer models. The KIC method was also shown to reconstruct 9 different 18-residue loops from SH3 domains to sub-angstrom accuracy, for which 5000 models were generated per case. On average, each model generated by the combined remodel and refine protocol shown in the Quick Start Example section takes 15-20 minutes for a 12-residue loop on a single CPU-core, although the time required can vary depending on loop burial and amino acid composition.
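The timing figures above translate into a rough compute budget. The following Python sketch does the arithmetic; the function names are hypothetical, and the estimate assumes that every model takes the same time and that jobs are spread evenly over cores, which real runs will only approximate.

```python
def cpu_hours(num_models, minutes_per_model):
    """Total single-core time, in hours, to generate num_models models."""
    return num_models * minutes_per_model / 60.0


def wall_clock_hours(num_models, minutes_per_model, num_cores):
    """Elapsed time when models are split evenly across num_cores cores.

    Each core runs ceil(num_models / num_cores) models one after another.
    """
    models_per_core = -(-num_models // num_cores)  # ceiling division
    return models_per_core * minutes_per_model / 60.0
```

With 1000 models at 15 to 20 minutes each, the total is roughly 250 to 333 CPU-hours for a 12-residue loop. Spread over 100 cores, that is 10 models per core, or about 2.5 to 3.3 hours of elapsed time.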
If the starting structure includes non-protein ligands, it is required to convert these HETATMs into Rosetta atom types and include centroid (for remodel) and all-atom (for refine) parameter files via the
-extra_res_fa command lines. The script
rosetta/mini/src/python/apps/public/molfile_to_params.py may be used to create the all-atom parameter file (include the '-c' option to also generate the centroid parameter file). The molfile_to_params.py script requires an MDL Molfile of the ligand as input. OpenBabel may be used to convert PDB ligands to Molfiles.
KIC has also been used to generate backbone ensembles for flexible backbone design. A recent study found that designing on a backbone ensemble generated by KIC correctly predicted an average of 82% of amino acids across 17 positions observed in phage display experiments on the Herceptin-HER2 interface (for details see Babor et al., referenced above). Loop definitions followed the description given in Whole protein ensemble generation under New things since last release. The command line options were
loopmodel.linuxgccrelease -database minirosetta -loops:refine refine_kic -loops:input_pdb structure.pdb -loops:loop_file modeling.loops -loops:outer_cycles 1 -loops:refine_init_temp 1.2 -loops:refine_final_temp 1.2 -loops:vicinity_sampling -loops:vicinity_degree 3 -loops:optimize_only_kic_region_sidechains_after_move -ex1 -ex2 -nstruct 100
-in:file:native is provided, the reported rmsd is the backbone (N, Ca, C, O) rmsd to the native loop(s). If not, the reported rmsd is the backbone rmsd to the starting loop(s) conformation. Example output looks like this:
loop_cenrms: 3.49022 (rmsd to native/start after centroid stage)
loop_rms: 0.858667 (rmsd to native/start after all-atom stage)
total_energy: -389.076 (total score of the system)
chainbreak: 0.0254188 (score of the chainbreak term, smaller value means well-closed loops. should be < 1.0)
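When many models are generated, these values are usually collected by a script. The following is a minimal Python sketch that extracts the four reported quantities from a block of text and applies the chainbreak check mentioned above. The function names are hypothetical, and the sketch assumes the values appear as 'name: number' pairs as in the example output.

```python
import re

# Matches the four score names from the example output, each followed by a number.
SCORE_PATTERN = re.compile(
    r"\b(loop_cenrms|loop_rms|total_energy|chainbreak):\s*"
    r"(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_loop_scores(text):
    """Return a dict mapping each score name found in text to its float value."""
    return {name: float(value) for name, value in SCORE_PATTERN.findall(text)}


def is_well_closed(scores, threshold=1.0):
    """True if the chainbreak score is below the threshold (1.0 per the manual)."""
    return scores["chainbreak"] < threshold
```

Models that fail the chainbreak check can then be filtered out before ranking the remainder by total_energy or loop_rms.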
-resfile is added to the command line, the packer will include the specified residues for design every
-loops:refine_repack_cycles cycles in the
refine_kic protocol. This feature can be used, for example, to redesign protein interfaces exhibiting conformational plasticity.
-loops:outer_cycles 1 -loops:refine_init_temp 1.2 -loops:refine_final_temp 1.2 -loops:vicinity_sampling -loops:vicinity_degree 3. These parameters were also used to recover an average of 82% of residues observed in phage display experiments across 17 positions in the Herceptin-HER2 interface (see Babor et al., above).
src/protocols/loops/LoopRelaxMover.cc. The key lines are reproduced here.
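// Read the loop definitions from the file given by -loops:loop_file.
protocols::loops::Loops loops = protocols::loops::get_loops_from_file();

// Build a fold tree from the loop definitions and apply it to the pose.
core::kinematics::FoldTree f_new;
protocols::loops::fold_tree_from_loops( pose, loops, f_new, true );
pose.fold_tree( f_new );

// Centroid stage: fetch the "perturb_kic" loop mover and apply it.
IndependentLoopMoverOP remodel_mover(
	static_cast< loops::IndependentLoopMover * >(
		loops::get_loop_mover( "perturb_kic", loops ).get() ) );
remodel_mover->apply( pose );

// All-atom stage: refine the loops with KIC.
protocols::loops::LoopMover_Refine_KIC refine_kic( loops );
refine_kic.apply( pose );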