"Skeleton" XML format
Copy, paste, fill in, and enjoy
<dock_design>
  <SCOREFXNS>
  </SCOREFXNS>
  <TASKOPERATIONS>
  </TASKOPERATIONS>
  <FILTERS>
  </FILTERS>
  <MOVERS>
  </MOVERS>
  <APPLY_TO_POSE>
  </APPLY_TO_POSE>
  <PROTOCOLS>
  </PROTOCOLS>
</dock_design>
Anything outside of the < > notation is ignored by the parser and can be used to comment the xml file.
General Description and Purpose
RosettaScripts is meant to provide an xml-scriptable interface for conducting all of the tasks that interface design developers produce. With such a scriptable interface, it is hoped, it will be possible for non-programmers to 'mix-and-match' different design strategies and apply them to their own needs. It is also hoped that through a common interface, code-sharing between different people will be smoother. Note that at this point, the only movers and filters that are implemented in this application are the ones described below. More will be made available in future releases. At this point these include protocols from the protein-interface design, protein docking, enzyme-design, ligand-docking and -design, and DNA-interface design groups. General movers for loop modeling are also available.
At the most abstract level, all of the computations that are needed in interface design fall into two categories: Movers and Filters. Movers change the conformation of the complex by acting on it, e.g., docking/design/minimization, and filters decide whether a given conformation should go on to the subsequent steps. Filters are meant to reduce the amount of computation that is conducted on conformations that show no promise. Then, a RosettaScript is merely a sequence of movers and filters.
The implementation for this behaviour is done by the following components:
- DockDesign, Filter, and Mover
DockDesign maintains a vector of pairs of movers and their associated filters. By using the TrueFilter or the NullMover, filters and movers can be essentially decoupled by any protocol. The setup of having pairs of movers and filters is used simply because in most contexts filters will be conceptually associated with a mover and vice versa.
- DockDesignParser.cc This function parses an xml file and populates DockDesignMover with pairs of Movers and Filters. All of the movers and filters that are supported should also be defined in this function.
Example XML file
The following simple example will compute ala-scanning values for each residue in the protein interface:
<dock_design>
  <SCOREFXNS>
    <interface weights=interface/>
  </SCOREFXNS>
  <FILTERS>
    <AlaScan name=scan partner1=1 partner2=1 scorefxn=interface interface_distance_cutoff=10.0 repeats=5/>
    <Ddg name=ddg confidence=0/>
    <Sasa name=sasa confidence=0/>
  </FILTERS>
  <MOVERS>
    <Docking name=dock fullatom=1 local_refine=1 score_high=soft_rep/>
  </MOVERS>
  <APPLY_TO_POSE>
  </APPLY_TO_POSE>
  <PROTOCOLS>
    <Add mover_name=dock filter_name=scan/>
    <Add filter_name=ddg/>
    <Add filter_name=sasa/>
  </PROTOCOLS>
</dock_design>
Rosetta will carry out the order of operations specified in PROTOCOLS, starting with docking (in this case this is all-atom docking using the soft_rep weights). It will then apply alanine scanning, repeated 5 times for better convergence, for every residue on both sides of the interface computing the binding energies using the interface weight set (counting mostly attractive energies). The binding energy (ddg) and surface area (sasa) will also be computed. All of the values will be output in a .report file. Notice that since ddg and sasa are assigned confidence=0, they are not used here as filters that can terminate a trajectory per se, but rather for reporting the values for the complex. An important point is that filters never change the sequence or conformation of the structure, so the ddg and sasa values are reported for the input structure following docking, with the alanine-scanning results ignored.
Additional example xml scripts, including examples for docking, protein interface design, and prepacking a protein complex, amongst others, can be found at: https://svn.rosettacommons.org/trac/browser/trunk/mini/demo/rosetta_scripts/
The following command line would run the above protocol, given that the protocol file name is ala_scan.xml:
bin/rosetta_scripts.linuxgccrelease -s < INPUT PDB FILE NAME > -use_input_sc -nstruct 20 -jd2:ntrials 2 -database ~/minirosetta_database/ -ex1 -ex2 -parser:protocol ala_scan.xml -parser:view
The ntrials flag specifies how many trajectories to start per nstruct. In this case, each of 20 trajectories would make two attempts at outputting a structure. If no ntrials is specified, a default value of 1 is assumed.
The parser:view flag may be used with rosetta executables that have been compiled using the extras=graphics switch in the following way (from the Rosetta root directory):
scons mode=release -j3 bin extras=graphics
When running with -parser:view, a graphical viewer will open that shows many of the steps in a trajectory. This is extremely useful for making sure that sampling is following the intended trajectory.
Input and Output Files
Running a typical protocol requires input of an xml file and a starting pdb file, as in the example commandline above. Alternatively, to run the protocol on many structures, save a simple list of the pdb files to be used and replace the flag -s <INPUT PDB FILE NAME> in the commandline with -l <INPUT LIST FILE NAME>. Some movers and filters require specific input files (for example, a pdb file containing stub residues for hot-spot residue placement for PlaceStub or PlaceSimultaneously movers), and in such cases the required input file/s are described below and are generally called via the xml script.
During a run, if any defined filters are not satisfied then the trajectory will be killed and no output files returned, and Rosetta will continue on to the next ntrial (or if all ntrials have been attempted and failed, Rosetta will continue with any remaining nstructs as defined in the commandline). For a successful run in which all filters are satisfied, the output will include a pdb file and a score.sc file. The pdb file ends with an energy table for all residues and lists the values of any filters in the same order they are used in the xml protocol. The output pdb name is identical to the input pdb file name with a suffix denoting the nstruct number. The score.sc file tabulates the energy terms and filter values for every successful nstruct.
Using an IntelliSense editor to help with generating RosettaScripts
An xml-schema was generated for us by Avner Aharoni (Microsoft) using Visual Studio. Using this schema in a compatible editor provides a specific editor for writing RosettaScripts, complete with word completion, grammatical error warnings and help with options. We are currently aware of two editors that are fully compatible with this schema:
Editing RosettaScripts in emacs
The nXML emacs add-on is compatible with the RosettaScripts.rnc schema (found in src/apps/public/rosetta_scripts/RosettaScripts.rnc).
- Download nXML from http://www.thaiopensource.com/nxml-mode/
- Read the nXML portion of the emacsWiki at http://www.emacswiki.org/cgi-bin/wiki/NxmlMode
- Load the RosettaScripts.rnc file into emacs+nXML
- Load your protocol
- Have fun!
Editing RosettaScripts in VisualStudio
MS-Windows users can download Visual Studio Express (free of charge) which provides an xml editor that is compatible with the RosettaScripts.xsd schema (found in src/apps/public/rosetta_scripts/RosettaScripts.xsd). The following instructions were provided by Avner Aharoni:
- Download VB express from http://www.microsoft.com/express/download/
- Save the schema in the following folder C:\Program Files\Microsoft Visual Studio 9.0\Xml\Schemas
- Create an empty xml file on disk (a file with the .xml suffix)
- Open it in Visual Studio Express, go to its properties (View -> Properties Window, F4) and set the RosettaScripts.xsd schema for use.
Options Available in the XML Protocol File
This file lists the Movers, Filters, their defaults, meanings and uses as recognized by RosettaScripts. It is written in an xml format, and many free viewers (e.g., vi) will highlight key xml notations, so long as the file has the extension .xml.
Whenever an xml statement is shown, the following convention will be used:
- <...> defines a branch statement (a statement that has more leaves).
- <.../> defines a leaf statement.
- "" defines input expected from the user, with an ampersand (&) defining the type that is expected (string, float, etc.).
- () defines the default value that the parser will use if that is not provided by the protocol.
The following are defined internally in the parser, and the protocol can use them without defining them explicitly.
NullMover: has an empty apply. Will be used as the default mover in <PROTOCOLS> if no mover_name is specified. Can be explicitly specified, with the name "null".
TrueFilter: returns true. Useful for defining a mover without using a filter. Can be explicitly specified, with the name "true_filter".
This is a special filter that uses previously defined filters to construct a compound logical statement with AND, OR, XOR, NAND and NOR operations. By making compound statements of compound statements, essentially all logical statements can be defined.
<CompoundStatement name=(&string)>
  <OPERATION filter_name=(true_filter &string)/>
  <....
</CompoundStatement>
where OPERATION is any of the operations defined in CAPS above. Note that the operations are performed in the order in which they are defined. No precedence rules are enforced, so any precedence has to be written explicitly by making compound statements of compound statements. Also note that the first OPERATION is ignored, and the value of the first filter is simply assigned to the filter's result.
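As a minimal sketch of this syntax, the FILTERS block below combines the ddg and sasa filters from the alanine-scanning example into one compound filter and then nests it inside a second one to make the precedence explicit. The names ddg_and_sasa and good_or_scan are hypothetical labels, and the Ddg, Sasa and AlaScan filters are shown with few or no options, which is an assumption made for brevity.

```xml
<FILTERS>
  <Ddg name="ddg"/>
  <Sasa name="sasa"/>
  <AlaScan name="scan" partner1="1" partner2="1"/>
  Hypothetical compound filter. The first operation is ignored, so ddg simply seeds the result.
  <CompoundStatement name="ddg_and_sasa">
    <AND filter_name="ddg"/>
    <AND filter_name="sasa"/>
  </CompoundStatement>
  Nesting makes the grouping explicit: (ddg AND sasa) OR scan.
  <CompoundStatement name="good_or_scan">
    <OR filter_name="ddg_and_sasa"/>
    <OR filter_name="scan"/>
  </CompoundStatement>
</FILTERS>
```

Because operations are evaluated strictly in the order written, nesting is the only way to express grouping.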
- score12: The default all-atom scorefunction used by rosetta ab-initio and design
- score_docking: high resolution docking scorefxn (standard+docking_patch)
- score_docking_low: low resolution docking scorefxn (interchain_cen)
- soft_rep: soft_rep_design weights.
- score4L: low resolution scorefunction used for loop remodeling (chainbreak weight on)
- score_empty: all weights = 0.
This section defines scorefunctions that will be used in Filters and Movers. This can be used to define any of the scores defined in the rosetta_database
<"scorefxn_name" weights=(standard &string) patch="&string">
  <Reweight scoretype="&string" weight="&Real"/>
</"scorefxn_name">
where scorefxn_name will be used in the Movers and Filters sections to use the scorefunction. The name should therefore be unique and not repeat the predefined score names. The Reweight tag is optional and allows you to change/add the weight for a given scoretype.
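A short sketch of a user-defined scorefunction may help here. The name my_soft_rep is a hypothetical label, fa_rep is used as an example scoretype, and the weight of 0.5 is an arbitrary illustrative value rather than a recommendation.

```xml
<SCOREFXNS>
  Hypothetical scorefunction name. It must not repeat a predefined name such as score12.
  <my_soft_rep weights="soft_rep_design" patch="dock">
    Illustrative reweighting only. The value 0.5 is an assumption.
    <Reweight scoretype="fa_rep" weight="0.5"/>
  </my_soft_rep>
</SCOREFXNS>
```

Movers and filters would then refer to this scorefunction by name, e.g. scorefxn="my_soft_rep".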
Global Scorefunction modifiers
The apply_to_pose section may set up constraints, in which case it becomes necessary to set the weights in all of the scorefunctions that are defined. The default weights for all the scorefunctions are defined globally in the apply_to_pose section, but each scorefunction definition may change this weight. For example, to multiply the favor_native_residue bonus by 6.0
<my_spiffy_score weights="soft_rep_design" patch="dock" fnr=6.0/>
The following modifiers are recognized:
fnr=(the value set by apply_to_pose for favor_native_residue &float)
hs_hash=(the value set by apply_to_pose for hotspot_hash &float)
This is a section that is used to change the input structure. The most likely use for this is to define constraints to a structure that has been read from disk.
Sets constraints on the sequence of the pose that can be based on a sequence alignment or an amino-acid transition matrix.
<profile weight=(0.25 &Real) file_name=(<input file name >.cst &string)/>
Sets residue_type constraints on the pose based on a sequence profile. file_name defaults to the input file name with the suffix changed to ".cst". So, a file called xxxx_yyyy.25.jjj.pdb would imply xxxx_yyyy.cst. To generate sequence-profile constraint files with these defaults use DockScripts/seq_prof/seq_prof_wrapper.sh
SetupHotspotConstraints (formerly hashing_constraints)
<SetupHotspotConstraints stubfile=(stubs.pdb &string) redesign_chain=(2 &integer) cb_force=(0.5 &float) worst_allowed_stub_bonus=(0.0 &float) apply_stub_self_energies=(1 &bool) apply_stub_bump_cutoff=(10.0 &float) pick_best_energy_constraint=(1 &bool) backbone_stub_constraint_weight=(1.0 &Real)/>
- stubfile: a pdb file containing the hot-spot residues
- redesign_chain: which is the host_chain for design. Anything other than chain 2 has not been tested.
- cb_force: the Hooke's law spring constant to use in setting up the harmonic restraints on the Cb atoms.
- worst_allowed_stub_bonus: triage stubs that have energies higher than this cutoff.
- apply_stub_self_energies: evaluate the stub's energy in the context of the pose.
- pick_best_energy_constraint: when more than one restraint is applied to a particular residue, only sum the one that makes the highest contribution.
- backbone_stub_constraint_weight: the weight on the score-term in evaluating the constraint. Notice that this weight can be overridden in the individual scorefxns.
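As a sketch, a hot-spot constraint setup using mostly the defaults listed above might look like the following. The stub file name stubs.pdb is the documented default and is assumed to exist in the working directory.

```xml
<APPLY_TO_POSE>
  Assumes stubs.pdb holds the hot-spot residues and that chain 2 is the chain being designed.
  <SetupHotspotConstraints stubfile="stubs.pdb" redesign_chain="2" cb_force="0.5" worst_allowed_stub_bonus="0.0" apply_stub_self_energies="1" pick_best_energy_constraint="1" backbone_stub_constraint_weight="1.0"/>
</APPLY_TO_POSE>
```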
Each mover definition has the following structure
<"mover_name" name="&string" .../>
where "mover_name" belongs to a predefined set of possible movers that the parser recognizes and are listed below, name is a unique identifier for this mover definition and then any number of parameters that the mover needs to be defined.
ParsedProtocol (formerly DockDesign)
This is a special mover that allows making a single compound mover and filter vector (just like protocols).
<ParsedProtocol name=( &string)>
  <Add mover_name=( null &string) filter_name=( true_filter &string)/>
  ...
</ParsedProtocol>
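For illustration, the sketch below wraps the docking step and the two reporting filters of the alanine-scanning example into a single compound mover. The name dock_and_check is hypothetical, and the filters ddg and sasa are assumed to be defined in the FILTERS section as in that example.

```xml
<MOVERS>
  <Docking name="dock" fullatom="1" local_refine="1" score_high="soft_rep"/>
  Hypothetical compound mover. Each Add line pairs a mover with a filter, as in PROTOCOLS.
  <ParsedProtocol name="dock_and_check">
    <Add mover_name="dock" filter_name="ddg"/>
    <Add filter_name="sasa"/>
  </ParsedProtocol>
</MOVERS>
<PROTOCOLS>
  <Add mover_name="dock_and_check"/>
</PROTOCOLS>
```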
Implements a simple IF( filter( pose ) ) THEN mover( pose )
<If name=( &string) filter_name=(&string) mover_name=(&string)/>
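A minimal sketch of the If mover follows. The names min and min_if_ddg are hypothetical, and the filter ddg is assumed to be defined in the FILTERS section.

```xml
<MOVERS>
  <MinMover name="min" scorefxn="score12" chi="1" bb="0" type="dfpmin_armijo_nonmonotone" tolerance="0.001"/>
  Hypothetical conditional mover. min is applied only when the ddg filter passes.
  <If name="min_if_ddg" filter_name="ddg" mover_name="min"/>
</MOVERS>
```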
Allows looping over a mover, using either a number of iterations or a filter as the stopping condition (whichever is satisfied first). Combining LoopOver with the ParsedProtocol (formerly DockDesign) mover above can be useful, e.g., when certain moves are expensive and we want to exhaust other, cheaper moves first.
<LoopOver name=(&string) mover_name=(&string) filter_name=( false_filter &string) iterations=(10 &Integer) drift=(true &bool)/>
drift: true- the state of the pose at the end of the previous iteration will be the starting state for the next iteration. false- the state of the pose at the start of each iteration will be reset to the state when the mover is first called. Note that "falling off the end" of the iteration will revert to the original input pose, even if drift is set to true.
This mover is somewhat deprecated in favor of the more general GenericMonteCarlo mover.
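The following sketch loops a minimization step up to five times and stops early if the filter passes. The names min and min_loop are hypothetical, and the filter ddg is assumed to be defined in the FILTERS section.

```xml
<MOVERS>
  <MinMover name="min" scorefxn="score12" chi="1" bb="0" type="dfpmin_armijo_nonmonotone" tolerance="0.001"/>
  Hypothetical loop. With drift on, each iteration starts from the pose left by the previous one.
  <LoopOver name="min_loop" mover_name="min" filter_name="ddg" iterations="5" drift="1"/>
</MOVERS>
```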
Allows sampling structures by Monte Carlo with a mover. The score of the pose during MC can be evaluated not only by a ScoreFunctionOP but also by any FilterOP that implements report_sm(). You can choose either of the following formats.
1) scoring by filterOP
<GenericMonteCarlo name=(&string) mover_name=(&string) filter_name=(&string) trials=(10 &integer) sample_type=(low &string) temperature=(0 &Real) drift=(1 &bool)/>
2) scoring by ScoreFunctionOP
<GenericMonteCarlo name=(&string) mover_name=(&string) scorefxn_name=(&string) trials=(10 &integer) sample_type=(low &string) temperature=(0 &Real) drift=(1 &bool)/>
sample_type: low- sample structures having lower scores; high- sample structures having higher scores.
drift: true- the state of the pose at the end of the previous iteration will be the starting state for the next iteration.
false- the state of the pose at the start of each iteration will be reset to the state when the mover is first called ( Of course, this is not MC ).
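As a sketch of the second format (scoring by scorefunction), the block below runs 20 Monte Carlo trials of a minimization mover. The names min and mc_min are hypothetical, and the temperature of 0.6 is an arbitrary illustrative value.

```xml
<MOVERS>
  <MinMover name="min" scorefxn="score12" chi="1" bb="1" type="dfpmin_armijo_nonmonotone" tolerance="0.001"/>
  Hypothetical Monte Carlo wrapper. The temperature shown is an assumption, not a recommended setting.
  <GenericMonteCarlo name="mc_min" mover_name="min" scorefxn_name="score12" trials="20" sample_type="low" temperature="0.6" drift="1"/>
</MOVERS>
```

To score by a filter instead, replace scorefxn_name with filter_name and give the name of a previously defined filter.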
Calling another RosettaScript from within a RosettaScript
<Subroutine name=(&string) xml_fname=(&string)/>
- xml_fname: the name of the RosettaScript to call.
This definition in effect generates a Mover that can then be incorporated into the RosettaScripts PROTOCOLS section. This allows a simplification and modularization of RosettaScripts.
Recursions are allowed but will cause havoc.
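A minimal sketch of calling one script from another is shown below. The file name prepack.xml and the mover name prepack are hypothetical; the file is assumed to be a complete RosettaScript in the working directory.

```xml
<MOVERS>
  Hypothetical subroutine. prepack.xml is assumed to exist and to be a valid RosettaScript.
  <Subroutine name="prepack" xml_fname="prepack.xml"/>
</MOVERS>
<PROTOCOLS>
  <Add mover_name="prepack"/>
</PROTOCOLS>
```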
Placement and Placement-associated Movers & Filters
All movers described in this section are described by their developers as highly experimental. The objective of the placement methods is to help in the task of generating hot-spot based designs of protein binders. The starting point for all of them is a protein target (typically chain A), libraries of hot-spot residues, and a scaffold protein.
A few keywords used throughout the following section have special meaning and are briefly explained here.
- Hot-spot residue: typically a residue that forms optimized interactions with the target protein. The goal here is to find a low-energy conformation of the scaffold protein that incorporates as many such hot-spot residues as possible.
- Stub: used interchangeably with hot-spot residue. This is a dismembered residue in a specified location against the target surface.
- Placement: positioning of the scaffold protein such that it incorporates the hot-spot residue optimally.
This is a special mover associated with PlaceSimultaneously, below. It carries out the auctioning of residues on the scaffold to hotspot sets without actually designing the scaffold. If pairing is unsuccessful Auction will report failure.
<Auction name=( &string) host_chain=(2 &integer) max_cb_dist=(3.0 &Real) cb_force=(0.5 &Real)>
  <StubSets>
    <Add stubfile=(&string)/>
  </StubSets>
</Auction>
Note that none of the options, except for name, needs to be set up by the user if PlaceSimultaneously is notified of it. If PlaceSimultaneously is notified of this Auction mover, PlaceSimultaneously will set all of these options.
Map out the residues that might serve as a hotspot region on a target. This requires massive user guidance. Each hot-spot residue should be roughly placed by the user (at least as backbone) against the target. Each hot-spot residue should have a different chain ID. The method iterates over all allowed residue identities and all allowed rotamers for each residue, tests its filters, and, for the subset that passes, selects the lowest-energy residue by score12. Once the first hot-spot residue is identified, it iterates over the next, and so on, until all hot-spot residues are placed. The output contains one file per residue identity combination.
<MapHotspot name="&string" clash_check=(0 &bool) file_name_prefix=(map_hs &string)>
  <Jumps>
    <Add jump=(&integer) explosion=(0 &integer) filter_name=(true_filter &string) allowed_aas=("ADEFIKLMNQRSTVWY" &string) scorefxn_minimize=(score12 &string) mover_name=(null &string)/>
    ....
  </Jumps>
</MapHotspot>
- clash_check: whether the rotamer set is prescreened by the packer for clashes. Advised to be off always.
- file_name_prefix: Prefix for the output file names.
- explosion: How many chi angles to explode (giving more rotamers).
- allowed_aas: 1-letter codes for the allowed residues.
- scorefxn_minimize: which scorefxn to use during rb/sc minimization.
- mover_name: a mover (no restrictions) to run just before hot-spot residue minimization.
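As an illustrative sketch, the block below maps two hot-spot residues, each attached to the target by its own jump. The jump numbers, the restricted residue lists and the mover name map are assumptions for this example; they depend on how the input structure is set up.

```xml
<MOVERS>
  Hypothetical mapping of two hot-spot residues. Jump numbers 1 and 2 are assumed to connect them to the target.
  <MapHotspot name="map" clash_check="0" file_name_prefix="map_hs">
    <Jumps>
      <Add jump="1" explosion="1" allowed_aas="FWY" scorefxn_minimize="score12"/>
      <Add jump="2" allowed_aas="DE"/>
    </Jumps>
  </MapHotspot>
</MOVERS>
```

Options that are left out (filter_name, mover_name, and explosion on the second jump) fall back to the defaults given in the definition above.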
This is a special mover associated with PlaceSimultaneously, below. It carries out the rigid-body minimization towards all of the stubsets.
<PlacementMinimization name=( &string) minimize_rb=(1 &bool) host_chain=(2 &integer) optimize_foldtree=(0 &bool) cb_force=(0.5 &Real)>
  <StubSets>
    <Add stubfile=(&string)/>
  </StubSets>
</PlacementMinimization>
Remodels loops using kinematic loop closure, including insertion and deletion of residues. Handles hotspot constraint application through these sequence changes.
<PlaceOnLoop name=( &string) host_chain=(2 &integer) loop_begin=(&integer) loop_end=(&integer) minimize_toward_stub=(1 &bool) stubfile=(&string) score_high=(score12 &string) score_low=(score4L &string) closing_attempts=(100 &integer) shorten_by=(&comma-delimited list of integers) lengthen_by=(&comma-delimited list of integers)/>
- Currently only minimize_toward_stub is available.
- closing_attempts: how many kinematic-loop closure cycles to use.
- shorten_by, lengthen_by: by how many residues to change the loop. No change is also added by default.
At each try, a random choice of loop change will be picked and attempted. If the loop cannot close, failure will be reported.
Hotspot-based sidechain placement. This is the main workhorse of the
hot-spot centric method for protein-binder design. A paper describing
the method and a benchmark will be published soon. The "stub" (hot-spot
residue) is chosen at random from the provided stub set. To minimize
towards the stub (during placement), the user can define a series of
movers (StubMinimize tag) that can be combined with a weight. The weight
determines the strength of the backbone stub constraints that will
influence the mover it is paired with. Finally, a series of user-defined
design movers (DesignMovers tag) are made and the result is filtered
according to final_filter. There are two main ways to use PlaceStub:
- PlaceStub (default). Move the stub so that it's on top of the current scaffold position, then move forward to try to recover the original stub position.
- PlaceScaffold. Move the scaffold so that it's on top of the stub. You'll keep the wonderful hotspot interactions, but suffer from lever effects on the scaffold side. PlaceScaffold can be used as a replacement for docking by deactivating the "triage_positions" option.
<PlaceStub name=(&string) place_scaffold=(0 &bool) triage_positions=(1 &bool) chain_to_design=(2 &integer) score_threshold=(0.0 &Real) allowed_host_res=(&string) stubfile=(&string) minimize_rb=(0 &bool) after_placement_filter=(true_filter &string) final_filter=(true_filter &string) max_cb_dist=(4.0 &Real) hurry=(1 &bool) add_constraints=(1 &bool) stub_energy_threshold=(1.0 &Real) leave_coord_csts=(0 &bool) post_placement_sdev=(1.0 &Real)>
  <StubMinimize>
    <Add mover_name=(&string) bb_cst_weight=(10 &Real)/>
  </StubMinimize>
  <DesignMovers>
    <Add mover_name=(&string) use_constraints=(1 &bool) coord_cst_std=(0.5 &Real)/>
  </DesignMovers>
  <NotifyMovers>
    <Add mover_name=(&string)/>
  </NotifyMovers>
</PlaceStub>
- place_scaffold: use PlaceScaffold instead of PlaceStub. this will place the scaffold on the stub's position by using an inverse rotamer approach.
- triage_positions: remove potential scaffold positions based on distance/cst cutoffs. Speeds up the search, but must be turned off to use place_scaffold=1 as a replacement for docking (that is, when placing the scaffold at positions regardless of the input structure). triage_positions=1 triages placements based on whether the hotspot is close enough (within max_cb_dist) and whether the hotspot's vectors align with those of the host position (with some tolerance).
- allowed_host_res: A list of residues on the host scaffold where the stub may be placed. The list should be comma-separated and may contain either rosetta indices (e.g. 123) or pdb indices (e.g. 123A). Note that allowed residues must still pass the triage step (if enabled) and other restrictions on which residues may be designed (e.g. not proline).
- stubfile: using a stub file other than the one used to make constraints. This is useful for placing stubs one after the other.
- minimize_rb: do we want to minimize the rb dof during stub placement? This will allow a previously placed stub to move a little to accommodate the new stub. It's a good idea to use this with the previously placed stub adding its implied constraints.
- after_placement_filter: The name of a filter to be applied immediately after stub placement and StubMinimize movers, but before the DesignMovers run. Useful for quick sanity check on the goodness of the stub.
- final_filter: The name of a filter to be applied at the final stage of stub placement as the last test, after DesignMovers run. Useful, e.g., if we want a stub to form an hbond to a particular target residue.
- max_cb_dist: the maximum cb-cb distance between stub and potential host residue to be considered for placement
- hurry: use a truncated scorefxn for minimization. large speed increases, doesn't seem to be less accurate.
- add_constraints: should we apply the coordinate constraints to this stub?
- stub_energy_threshold: Decoys are only considered if the single-residue energy of the stub is below this value
- leave_coord_csts: should the coordinate constraints be left on when placement is completed successfully? This is useful if you plan on making moves after placement and you want the hotspot's placement to be respected. Note that designing a residue that has constraints on it is likely to yield crashes. You can use task operations to disallow that residue from designing.
- post_placement_sdev: relating to the item above. The lower the sdev (towards 0) the more stringent the constraint.
The available tracers are:
- protocols.ProteinInterfaceDesign.movers.PlaceStubMover - light-io documentation of the run
- STATS.PlaceStubMover - statistics on distances and score values during placement
- DEBUG.PlaceStubMover - more io intensive documentation
Submovers are used to determine what moves are used following stub
placement. For example, once a stub has been selected, a StubMinimize
mover can try to optimize the current pose towards that stub. A
DesignMover can be used to design the pose around that stub. Using
DesignMover submovers within PlaceStub (instead of RepackMinimize movers
outside PlaceStub) allows one to have a "memory" of which stub has been
used. In this way, a DesignMover can fail a filter without causing the
trajectory to completely reset. Instead, the outer PlaceStub mover will
select another stub, and the trajectory will continue.
There are two types of sub movers that can be called within the mover.
Without defining this submover, the protocol will simply perform a rigid body minimization as well as sc minimization of previous placed stubs in order to minimize towards the stub. Otherwise, a series of previously defined movers can be added, such as backrub, that will be applied for the stub minimization step. Before and after the list of stub minimize movers, there will be a rigid body minimization and a sc minimization of previously placed stubs. The bb_cst_weight determines how strong the constraints are that are derived from the stubs.
- mover_name: a user previously defined design or minimize mover.
- bb_cst_weight: determines the strength of the constraints derived from the stubs. This value is a weight on the cb_force, so larger values are stronger constraints.
Valid/sensible StubMinimize movers are:
Design movers are typically used once the stubs are placed to fill up the remaining interface, since PlaceStub does not actually introduce any further design other than stub placement.
- mover_name: a user previously defined design or minimize mover.
- use_constraints: whether we should use coordinate constraints during this design mover
- coord_cst_std: the std of the coordinate constraint for this mover. The coord constraints are harmonic, and the force constant, k=1/std. The smaller the std, the stronger the constraint
Valid/sensible DesignMovers are:
Movers placed in this section will be notified not to repack the PlaceStub-placed residues. This is not necessary if placement movers are used in a nested (recursive) fashion, as the placement movers automatically notify movers nested in them of the hot-spot residues. Essentially, you want to make the downstream movers (those you list under this section) aware of the placement decisions in this upstream mover. These movers will not be run in this PlaceStub, but will be aware of the placed residues in subsequent use. Useful for running design moves after PlaceStub is done, e.g., in loops. Put task awareness only in the deepest PlaceStub mover (if PlaceStub is nested), where the final decisions about which residues harbour hot-spot residues are made.
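Putting the three submover sections together, a PlaceStub definition might be sketched as follows. All mover names are hypothetical, the filter ddg is assumed to be defined in the FILTERS section, and the use of MinMover and PackRotamersMover as submovers is an assumption made for illustration rather than a list of recommended choices.

```xml
<MOVERS>
  <MinMover name="bb_min" scorefxn="score12" chi="1" bb="1" type="dfpmin_armijo_nonmonotone" tolerance="0.001"/>
  <PackRotamersMover name="design" scorefxn="score12"/>
  <PackRotamersMover name="late_design" scorefxn="score12"/>
  Hypothetical placement. stubs.pdb is assumed to hold the hot-spot residue library.
  <PlaceStub name="place" stubfile="stubs.pdb" chain_to_design="2" final_filter="ddg" add_constraints="1">
    <StubMinimize>
      <Add mover_name="bb_min" bb_cst_weight="10"/>
    </StubMinimize>
    <DesignMovers>
      <Add mover_name="design" use_constraints="1" coord_cst_std="0.5"/>
    </DesignMovers>
    late_design is not run here. It is only told which residues were placed.
    <NotifyMovers>
      <Add mover_name="late_design"/>
    </NotifyMovers>
  </PlaceStub>
</MOVERS>
<PROTOCOLS>
  <Add mover_name="place"/>
  <Add mover_name="late_design"/>
</PROTOCOLS>
```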
Places hotspot residues simultaneously on a scaffold, rather than iteratively as in PlaceStub. It is faster, therefore allowing more backbone sampling, and should be useful in placing more than 2 hotspots.
<PlaceSimultaneously name=(&string) chain_to_design=(2 &Integer) repack_non_ala=(1 &bool) optimize_fold_tree=(1 &bool) after_placement_filter=(true_filter &string) auction=(&string) stub_score_filter=(&string)>
  <DesignMovers>
    <Add mover_name=(null_mover &string) use_constraints=(1 &bool) coord_cst_std=(0.5 &Real)/>
  </DesignMovers>
  <StubSets explosion=(0 &integer) stub_energy_threshold=(1.0 &Real) max_cb_dist=(3.0 &Real) cb_force=(0.5 &Real)>
    <Add stubfile=(&string) filter_name=(&string)/>
  </StubSets>
  <StubMinimize min_repeats_before_placement=(0 &Integer) min_repeats_after_placement=(1 &Integer)>
    <Add mover_name=(null_mover &string) bb_cst_weight=(10.0 &Real)/>
  </StubMinimize>
  <NotifyMovers>
    <Add mover_name=(&string)/>
  </NotifyMovers>
</PlaceSimultaneously>
Most of the options are similar to PlaceStub above. Differences are mentioned below:
- explosion: which chis to explode
- stub_energy_threshold: after placement and minimization, what energy cutoff to use for each of the hotspots.
- after_placement_filter: After all individual placement filters pass, this is called (might be redundant?)
- min_repeats: How many minimization repeats (over StubMinimize movers) after placement
- movers defined under NotifyMovers will not be allowed to change the identities or rotamers of their hot-spot residues beyond what PlaceSimultaneously has decided on. This would be useful for avoiding losing the hot-spot residues in design movers after placement.
- filters specified in the StubSets section may be set during PlaceSimultaneously's execution by PlaceSimultaneously. This allows filters to be set specifically for placed hot-spot residues. One such filter is AtomicContact.
- rb_stub_minimization: a StubMinimization mover that will be run before PlaceSimultaneously.
- auction: an Auction mover that will be run before PlaceSimultaneously.
- stub_score_filter: a StubScoreFilter that will be run before PlaceSimultaneously.
rb_stub_minimization, auction and stub_score_filter allow the user to specify the first moves and filtering steps of PlaceSimultaneously before PlaceSimultaneously is called proper. In this way, a configuration can be quickly triaged if it isn't compatible with placement (through Auction's filtering). If the configuration passes these filters and movers then PlaceSimultaneously can be run within loops of docking and placement, until a design is identified that produces reasonable ddg and sasa.
This is actually a filter (and should go under FILTERS), but it is tightly associated with the placement movers, so it's placed here. A special filter that is associated with PlaceSimultaneously. It checks whether, in the current configuration, the scaffold is 'feeling' any of the hotspot stub constraints. This is useful for quick triaging of hopeless configurations.
<StubScore name=(&string) chain_to_design=(2 &integer) cb_force=(0.5 &Real)>
  <StubSets>
    <Add stubfile=(&string)/>
  </StubSets>
</StubScore>
Note that none of the flags of this filter need to be set if PlaceSimultaneously is notified of it. In that case, PlaceSimultaneously will set this StubScore filter's internal data to match its own.
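The sketch below shows how Auction, StubScore and PlaceSimultaneously might be wired together so that PlaceSimultaneously configures the other two. All names and stub file names are hypothetical. Leaving the Auction and StubScore definitions empty relies on the notification behaviour described above, and omitting filter_name in the StubSets entries is an assumption.

```xml
<FILTERS>
  Only the name is set. PlaceSimultaneously is expected to fill in the rest.
  <StubScore name="stub_score"/>
</FILTERS>
<MOVERS>
  <Auction name="auction"/>
  <PackRotamersMover name="design" scorefxn="score12"/>
  Hypothetical placement of two hot-spot libraries at once.
  <PlaceSimultaneously name="place_sim" chain_to_design="2" auction="auction" stub_score_filter="stub_score">
    <DesignMovers>
      <Add mover_name="design" use_constraints="1" coord_cst_std="0.5"/>
    </DesignMovers>
    <StubSets max_cb_dist="3.0" cb_force="0.5">
      <Add stubfile="stubs_set1.pdb"/>
      <Add stubfile="stubs_set2.pdb"/>
    </StubSets>
  </PlaceSimultaneously>
</MOVERS>
```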
These movers are general and should work in most cases. They are usually not aware of things like interfaces, so may be most appropriate for monomers or basic tasks.
<FavorNativeResidue bonus=(1.5 &Real)/>
Sets residue_type_constraint on the pose and sets the bonus (default 1.5).
Does minimization over sidechain and/or backbone
<MinMover name="&string" scorefxn=(score12 &string) chi=(&bool) bb=(&bool) jump=(&string) type=(dfpmin_armijo_nonmonotone &string) tolerance=(0.01&Real)/>
Note that defaults are as for the MinMover class! Check MinMover.cc for the default constructor.
- MinMover is also sensitive to a MoveMap block, just like FastRelax.
- scorefxn: scorefunction to use during minimization
- chi: minimize sidechains?
- bb: minimize backbone?
- jump: comma-separated list of jumps to minimize over (be sure this jump exists!)
- type: minimizer type. linmin, dfpmin, dfpmin_armijo, dfpmin_armijo_nonmonotone. dfpmin minimizers can also be used with absolute tolerance (add "atol" to the minimizer type). See https://www.rosettacommons.org/internal/doc/mini/all_else/minimization_overview.html for details. Note that you should almost ALWAYS manually set this to some variant of dfpmin, as the constructor default is linmin!
- tolerance: criteria for convergence of minimization. The default is very loose, it's recommended to specify something less than 0.01.
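As an illustration of the options above, here is a minimal sketch of a MinMover that minimizes sidechains and one jump, with the minimizer type set explicitly to a dfpmin variant and a tolerance tighter than 0.01. The mover name and the specific values are assumptions chosen for illustration, and jump 1 is assumed to exist in the pose.

```xml
<MOVERS>
  Hypothetical name and values: sidechains and jump 1 are minimized, the backbone is held fixed
  <MinMover name="min_sc" scorefxn="score12" chi="1" bb="0" jump="1" type="dfpmin_armijo_nonmonotone" tolerance="0.001"/>
</MOVERS>
```

Setting type explicitly avoids relying on the constructor default.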
Performs minimization. Accepts TaskOperations via the task_operations option to configure which positions are minimized. The options chi=(&bool) and bb=(&bool) control sidechain or backbone freedom. Defaults to sidechain minimization. The options scorefxn, jump, type, and tolerance are passed to the underlying MinMover.
Performs the fast relax protocol.
<FastRelax name="&string" scorefxn=(score12 &string) repeats=(8 &int) task_operations=(&string, &string, &string)> <MoveMap> <Chain number=(&integer) chi=(&bool) bb=(&bool)/> <Jump number=(&integer) setting=(&bool)/> <Span begin=(&integer) end=(&integer) chi=(&bool) bb=(&bool)/> </MoveMap> </FastRelax>
- scorefxn (default "score12")
- repeats (default 8)
- task_operations ( default implicitly defined as InitializeFromCommandline, IncludeCurrent, and RestrictToRepacking )
The MoveMap is initially set to minimize all degrees of freedom. The movemap lines are read in the order in which they are written in the xml file, and can be used to turn on or off dofs. The movemap is parsed only at apply time, so that the foldtree and the kinematic structure of the pose at the time of activation will be respected.
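The ordering rule for MoveMap lines can be sketched as follows. Here the Chain line is expected to switch off backbone movement for all of chain 1, and the later Span line to switch it back on for residues 20-30 only. The mover name, chain number, residue range, and jump number are hypothetical.

```xml
<FastRelax name="relax_segment" scorefxn="score12" repeats="8">
  <MoveMap>
    Lines are read top to bottom, so the Span line is applied after the Chain line
    <Chain number="1" chi="1" bb="0"/>
    <Span begin="20" end="30" chi="1" bb="1"/>
    <Jump number="1" setting="0"/>
  </MoveMap>
</FastRelax>
```

Because the MoveMap is parsed at apply time, the jump and residue numbers refer to the pose as it exists when the mover runs.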
Convert pose into poly XXX ( XXX can be any amino acid )
<MakePolyX name="&string" aa="&string" keep_pro=(0 &bool) keep_gly=(1 &bool) keep_disulfide_cys=(0 &bool) />
- aa ( default "ALA" ): amino acid type to convert to
- keep_pro ( default 0 ): if true, Pro is not converted to XXX
- keep_gly ( default 1 ): if true, Gly is not converted to XXX
- keep_disulfide_cys ( default 0 ): if true, disulfide-bonded Cys is not converted to XXX
Repacks sidechains with user-supplied options, including TaskOperations
<PackRotamersMover name="&string" scorefxn=(&string) task_operations=(&string,&string,&string)/>
- scorefxn: scorefunction to use for repacking
- task_operations: comma-separated list of task operations. These must have been previously defined in the TASKOPERATIONS section.
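A minimal sketch of how the task operations are defined first and then referenced by name. InitializeFromCommandline and RestrictToRepacking are the operations named in the FastRelax defaults on this page; the names "init", "rtr", and "repack" are placeholders.

```xml
<TASKOPERATIONS>
  Placeholder names; each operation must be defined here before a mover refers to it
  <InitializeFromCommandline name="init"/>
  <RestrictToRepacking name="rtr"/>
</TASKOPERATIONS>
<MOVERS>
  <PackRotamersMover name="repack" scorefxn="score12" task_operations="init,rtr"/>
</MOVERS>
```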
Applies a file-defined constraint set to the pose
<ConstraintSetMover name="&string" cst_file=(&string)/>
Multistate design of a protein interface. The target state is the bound (input) complex and the two competitor states are the unbound partners and the unbound, unfolded partners. Uses genetic algorithms to select, mutate and recombine among a population of starting designed sequences. See Havranek & Harbury NSMB 10, 45 for details.
<ProteinInterfaceMS name="&string" generations=(20 &integer) pop_size=(100 &integer) num_packs=(1 &integer) pop_from_ss=(0 &integer) numresults=(1 &integer) fraction_by_recombination=(0.5 &real) mutate_rate=(0.5 &real) boltz_temp=(0.6 &real) anchor_offset=(5.0 &real) checkpoint_prefix=("" &string) gz=(0 &bool) checkpoint_rename=(0 &bool) scorefxn=(score12 &string) unbound=(1 &bool) unfolded=(1&bool) input_is_positive=(1&bool) task_operations=(&comma-delimited list) unbound_for_sequence_profile=(unbound &bool) profile_bump_threshold=(1.0 &Real) compare_to_ground_state=(see below & bool) output_fname_prefix=("" &string)> <Positive pdb=(&string) unbound=(0&bool) unfolded=(0&bool)/> <Negative pdb=(&string) unbound=(0&bool) unfolded=(0&bool)/> . . . </ProteinInterfaceMS>
The input file (-s or -l) is considered as either a positive or negative state (depending on the input_is_positive option). If unbound and unfolded are true in the main option line, then the unbound and the unfolded states are added as competitors. Any number of additional positive and negative states can be added. Unbound and unfolded take a different meaning for these states: if unbound is set, the complex will be broken apart and the unbound state will be added. If unfolded is set, then the unbound and unfolded protein will be added.
unbound_for_sequence_profile: use the unbound structure to generate an ala pose and prune out residues that would clash in the monomeric structure. Defaults to true if unbound is used as a competitor state. profile_bump_threshold: what bump threshold to use above. The difference between the computed bump and the bump in the ala pose is compared to this threshold.
compare_to_ground_state: by default, if you add states to the list using the Positive/Negative tags, then the energies of all additional states are zeroed at their 'best-score' values. This allows the user to override this behaviour. See code for details.
output_fname_prefix: All of the positive/negative states that are defined by the user will be output at the end of the run using this prefix. Each state will have its sequence changed according to the end sequence and then a repacking and scoring of all states will take place according to the input taskfactory.
Rules of thumb for parameter choice. The fitness F is defined as:
F = Sum_+( exp(-E/T) ) / ( Sum_+( exp(-E/T) ) + Sum_-( exp(-E/T) ) + Sum_+( exp(-(E+anchor)/T) ) )
where Sum_- and Sum_+ are the sums over the negative and positive states, respectively.
The values for F range from 1 (perfect bias towards +state) to 0 (perfect bias towards -state). The return value from the PartitionAggregateFunction::evaluate method is -F, with values ranging from -1 to 0, correspondingly. You can follow the progress of MSD by looking at the reported fitnesses for variants within a population at each generation. If all of the parameters are set properly (temperature etc.), expect to see a wide range of values in generation 1 (-0.99 to 0), which is gradually replaced by higher-fitness variants. At the end of the simulation, the population will have shifted to -1.0 to -0.5 or so.
For rules of thumb, it's useful to consider a two-state, +/- problem, ignoring the anchor (see below; that's tantamount to setting anchor very high). In this case the fitness simplifies to:
F = 1/(exp( (dE)/T ) + 1 )
and the derivative with respect to dE (in absolute value) is:
F' = 1/(T*(exp(-dE/T) + exp(dE/T) + 2))
where dE=E_+ - E_-
A good value for T would then be such where F' is sizable (let's say more than 0.05) at the dE values that you want to achieve between the positive and negative state. Since solving F' for T is not straightforward, you can plot F and F' at different temperatures to identify a reasonable value for T, where F'(dE, T) is above a certain threshold. If you're lazy like me, set T=dE/3. So, if you want to achieve differences of at least 4.5 e.u between positive and negative states, use T=1.5.
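As a quick numerical check of the T=dE/3 shortcut, using only the two-state formulas above, take an energy gap of 4.5 e.u. in favour of the positive state (dE = -4.5) and T = 1.5:

```latex
\[
F = \frac{1}{e^{-4.5/1.5} + 1} = \frac{1}{e^{-3} + 1} \approx 0.953,
\qquad
|F'| = \frac{1}{1.5\,\left(e^{3} + e^{-3} + 2\right)} \approx \frac{1}{1.5 \times 22.14} \approx 0.030
\]
```

At this temperature the target gap already gives a fitness close to 1, while the slope there comes out somewhat below the 0.05 figure mentioned above, so the shortcut is probably best treated as a starting point to be refined by plotting.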
To make a plot of these functions use MatLab or some webserver, e.g., http://www.walterzorn.com/grapher/grapher_e.htm.
The anchor_offset value is used to set a competitor (negative) state at a certain energy above the best energy of the positive state. This is a computationally cheap assurance that as the specificity changes in favour of the positive state, the stability of the system is not overly compromised. Set anchor_offset to a value that corresponds to the amount of energy that you're willing to forgo in favour of specificity.
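Putting the rules of thumb together, a sketch of a ProteinInterfaceMS block might look as follows. It assumes a target energy gap of about 4.5 e.u. (hence boltz_temp=1.5) and a willingness to give up about 5 e.u. of stability (anchor_offset=5.0). The mover name and the pdb file name are placeholders.

```xml
<ProteinInterfaceMS name="msd" generations="20" pop_size="100" boltz_temp="1.5" anchor_offset="5.0" scorefxn="score12" unbound="1" unfolded="1" input_is_positive="1">
  One additional competitor complex; the file name is hypothetical
  <Negative pdb="off_target_complex.pdb" unbound="0" unfolded="0"/>
</ProteinInterfaceMS>
```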
The "off rotamer" sidechain-only Monte Carlo sampler. For a rather large setup cost, individual moves can be made efficiently.
The underlying mover is still under development/benchmarking, so it may or may not work with backbone flexibility or amino acid identity changes.
<SidechainMC name="&string" scorefxn=(&string) ntrials=(&int) temperature=(&real) preserve_detailed_balance=(&bool) prob_uniform=(&real) prob_withinrot=(&real) prob_random_pert_current=(&real)/>
- ntrials: number of Monte Carlo trials to make per mover application - should be at least several thousand
- temperature: Boltzmann acceptance temperature - usually around 1.0
- prob_uniform: probability of a "uniform" move - all sidechain chis are uniformly randomized between -180° and 180°
- prob_withinrot: "within rotamer" - sidechain chis are picked from the Dunbrack distribution for the current rotamer
- prob_random_pert_current: "random perturbation of current position" - the current sidechain chis are perturbed ±10° from their current positions, biased by the resulting Dunbrack energy. Note that if your score function contains a Dunbrack energy term, this will result in double counting issues.
- If the previous three probabilities do not add to 1.0, the remainder is assigned to a "between rotamer" move - a random rotamer of the current amino acid is chosen, and chi angles for that rotamer are selected from the Dunbrack distribution
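A sketch of how the move probabilities combine, with illustrative values. Here the three listed probabilities sum to 0.6, so the remaining 0.4 would go to "between rotamer" moves. prob_random_pert_current is set to 0 on the assumption that the score function contains a Dunbrack term, which would otherwise be double counted. The mover name is a placeholder.

```xml
<MOVERS>
  0.1 uniform + 0.5 within-rotamer + 0.0 perturbation = 0.6; the remaining 0.4 is left for between-rotamer moves
  <SidechainMC name="sc_mc" scorefxn="score12" ntrials="10000" temperature="1.0" prob_uniform="0.1" prob_withinrot="0.5" prob_random_pert_current="0.0"/>
</MOVERS>
```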
This mover goes through each repackable/redesignable position in the pose, taking every permitted rotamer in turn, and evaluating the energy. Each position is then updated to the lowest energy rotamer. It does not consider coordinated changes at multiple residues, and may need several invocations to reach convergence.
In addition to the score function, the mover takes a list of task operations to specify which residues to consider. (See TaskOperations(Parser/RosettaScripts).)
<RotamerTrialsMover name="&string" scorefxn=(&string) task_operations=(&string,&string,&string) show_packer_task=(0 &bool) />
This mover goes through each repackable/redesignable position in the pose, taking every permitted rotamer in turn, minimizing it in the context of the current pose, and evaluating the energy. Each position is then updated to the lowest energy minimized rotamer. It does not consider coordinated changes at multiple residues, and may need several invocations to reach convergence.
In addition to the score function, the mover takes a list of task operations to specify which residues to consider. (See TaskOperations(Parser/RosettaScripts).)
<RotamerTrialsMinMover name="&string" scorefxn=(&string) task_operations=(&string,&string,&string)/>
Protein Interface Design Movers
These movers are at least somewhat specific to the design of protein-protein interfaces. Attempting to use them with, for example, protein-DNA complexes may result in unexpected behavior.
Computational 'affinity maturation' movers (highly experimental)
These movers are meant to take an existing complex and improve it by subtly changing all relevant degrees of freedom while optimizing the interactions of key sidechains with the target. The basic idea is to carry out iterations of relax and design of the binder, designing a large sphere of residues around the interface (to get second/third shell effects).
We start by generating high affinity residue interactions between the design and the target. The foldtree of the design is cut such that each target residue has a cut N- and C-terminally to it, and jumps are introduced from the target protein to the target residues on the design, and then the system is allowed to relax. This produces deformed designs with high-affinity interactions to the target surface. We then use the coordinates of the target residues to generate harmonic coordinate restraints and send this to a second cycle of relax, this time without deforming the backbone of the design. Example scripts are available in demo/rosetta_scripts/computational_affinity_maturation/
Introduces a random mutation at a position that is allowed to be redesigned, choosing from the residue identities allowed at that position. Control the residues and the target identities through task_operations. This can be used in conjunction with GenericMonteCarlo to generate trajectories of affinity maturation.
<RandomMutation name=(&string) task_operations=(&string comma-separated taskoperations) scorefxn=(score12 &string)/>
Creates a disjointed foldtree where each selected residue has cuts N- and C-terminally to it.
<HotspotDisjointedFoldTree name=(&string) ddG_threshold=(1.0 &Real) resnums=("" comma-delimited list of residues &string) scorefxn=(score12 &string) chain=(2 &Integer) radius=(8.0 &Real)/>
- ddG_threshold: The procedure can look for hot-spot residues automatically by using this threshold. If you want to shut it off, specify a number above 100 R.e.u. and set the residues in resnums
- chain: Anything other than chain 1 is untested, but should not be a big problem to make work.
- radius: what distance from the target protein constitutes interface. Used in conjunction with the ddG_threshold to set the target residues automatically.
Adds harmonic constraints to sidechain atoms of target residues (to be used in conjunction with HotspotDisjointedFoldTree). Save the log files as those would be necessary for the next stage in affinity maturation.
<AddSidechainConstraintsToHotspots name=(&string) chain=(2 &Integer) coord_sdev=(1.0 &Real) resnums=(comma-delimited list of residue numbers)/>
- resnums: the residues for which to add constraints. Notice that this list will be treated in addition to any residues that have cut points on either side.
- coord_sdev: the standard deviation on the coordinate restraints. The lower the tighter the restraints.
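A sketch of the two movers used together, with automatic hot-spot detection switched off by a ddG_threshold above 100 R.e.u. and the target residues given explicitly. The residue numbers and mover names are hypothetical.

```xml
<MOVERS>
  Hypothetical residues 31 and 45 on chain 2; the threshold of 101.0 disables automatic hot-spot detection
  <HotspotDisjointedFoldTree name="disjoint_ft" ddG_threshold="101.0" resnums="31,45" scorefxn="score12" chain="2" radius="8.0"/>
  <AddSidechainConstraintsToHotspots name="hs_cst" chain="2" coord_sdev="1.0" resnums="31,45"/>
</MOVERS>
```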
Does both centroid and full-atom docking
<Docking name="&string" score_low=(score_docking_low &string) score_high=(score12 &string) fullatom=(0 &bool) local_refine=(0 &bool) movable_jumps=(1 &Integer vector) optimize_fold_tree=(1 &bool) conserve_foldtree=(0 &bool) design=(0 &bool) task_operations=("" comma-separated list)/>
- score_low is the scorefxn to be used for centroid-level docking
- score_high is the scorefxn to be used for full atom docking
- movable_jumps is a comma-separated list of jump numbers over which to carry out rb motions
- optimize_fold_tree: should DockingProtocol make the fold tree for this pose? This should be turned to 0 only if AtomTree is used
- conserve_foldtree: should DockingProtocol reset the fold tree to the input one after it is done
- design: Enable interface design for all chains downstream of the rb_jump
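For instance, a full-atom local refinement over the first jump could be sketched as below. The mover name is a placeholder, and the combination of options is an illustrative assumption rather than a recommended setup.

```xml
<MOVERS>
  Hypothetical example: full-atom local refinement over jump 1, without design
  <Docking name="dock_local" score_low="score_docking_low" score_high="score12" fullatom="1" local_refine="1" movable_jumps="1" design="0"/>
</MOVERS>
```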
Performs something approximating r++ prepacking (but less rigorously, without rotamer-trial minimization) by doing sc minimization and repacking. Separates chains based on jump_number, does prepacking, then reforms the complex. If jump_number=0, then it will NOT separate chains at all.
<Prepack name=(&string) scorefxn=(score12 &string) jump_number=(1 &integer) task_operations=(comma-delimited list)/>
RepackMinimize does the design/repack and minimization steps using different score functions as defined by the protocol. repack_partner1 (and 2) defines which of the partners to design. If no particular residues are defined, the interface is repacked/designed. If specific residues are defined, then a shell of residues around those target residues is repacked/designed and minimized. repack_non_ala decides whether or not to change positions that are not ala. Useful for designing an ala_pose so that positions that have been changed in previous steps are not redesigned. minimize_rb minimizes the rigid body orientation (as in docking).
<RepackMinimize name="&string" scorefxn_repack=(score12 &string) scorefxn_minimize=(score12 &string) repack_partner1=(1 &bool) repack_partner2=(1 &bool) design_partner1=(0 &bool) design_partner2=(1 &bool) interface_cutoff_distance=(8.0 &Real) repack_non_ala=(1 &bool) minimize_bb=(1 &bool * see below for more details) minimize_rb=(1 &bool) minimize_sc=(1 &bool) optimize_fold_tree=(1 & bool) task_operations=("" &string)> <residue pdb_num/res_num, see below/> </RepackMinimize>
- interface_cutoff_distance: Residues farther away from the interface than this cutoff will not be designed or minimized.
- repack_non_ala: if true, change positions that are not ala. if false, leave non-ala positions alone. Useful for designing an ala_pose so that positions that have been changed in previous steps are not redesigned.
- minimize_bb*: minimize backbone conformation? (*see line below)
- minimize_bb_ch1 and/or minimize_bb_ch2: specifies which chain(s)' backbone will be minimized
- minimize_rb: minimize rigid body orientation? (as in docking)
- optimize_fold_tree: see above
- task_operations: comma-separated list of task operations. This is a safer way of working with user-defined restrictions than automatic_repacking=false.
If no repack_partner1/2 options are set, you can specify repack=0/1 to control both. Similarly with design_partner1/2 and design=0/1
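A sketch of a RepackMinimize block that designs partner 2 around a single target residue while keeping the backbone fixed. The mover name and residue 31B are hypothetical, and the residue tag follows the pdb_num form used elsewhere on this page.

```xml
<RepackMinimize name="des_min" scorefxn_repack="score12" scorefxn_minimize="score12" repack_partner1="1" repack_partner2="1" design_partner1="0" design_partner2="1" interface_cutoff_distance="8.0" minimize_bb="0" minimize_rb="1" minimize_sc="1">
  Hypothetical target residue; a shell around it is repacked/designed and minimized
  <residue pdb_num="31B"/>
</RepackMinimize>
```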
Same as for RepackMinimize with the addition that a list of target residues to be hbonded can be defined. Within a sphere of 'interface_cutoff_distance' of the target residues, the residues will be set to be designed. The residues that are allowed for design are restricted to hbonding residues according to whether donors (STRKWYQN) or acceptors (EDQNSTY) or both are defined. If residues have been designed that do not, after design, form hbonds to the target residues with energies lower than the hbond_energy, then those are turned to Ala.
<DesignMinimizeHbonds name=(design_minimize_hbonds &string) hbond_weight=(3.0 &float) scorefxn_design=(score12 &string) scorefxn_minimize=(score12 &string) donors="design donors? &bool" acceptors="design acceptors? &bool" bb_hbond=(0 &bool) sc_hbond=(1 &bool) hbond_energy=(-0.5 &float) interface_cutoff_distance=(8.0 &float) repack_partner1=(1 &bool) repack_partner2=(1 &bool) design_partner1=(0 &bool) design_partner2=(1 &bool) repack_non_ala=(1 &bool) min_rigid_body=(1 &bool) task_operations=("" &string)> <residue pdb_num="pdb residue and chain, e.g., 31B &string"/> <residue res_num="serially defined residue number, e.g., 212 &integer"/> </DesignMinimizeHbonds>
- hbond_weight: sets the fold increase of the hbonding terms in each of the scorefunctions that are defined.
- bb_hbond: do backbone-backbone hbonds count?
- sc_hbond: do backbone-sidechain and sidechain-sidechain hbonds count?
- hbond_energy: the energy threshold below which an hbond is counted as such.
- repack_non_ala, task_operations: see RepackMinimize
- optimize_fold_tree: see DockingProtocol
Turns either or both sides of an interface to Alanines (except for prolines and glycines that are left as in input) in a sphere of 'interface_distance_cutoff' around the interface. Useful as a step before design steps that try to optimize a particular part of the interface. The alanines are less likely to 'get in the way' of really good rotamers.
<build_Ala_pose name=(ala_pose &string) partner1=(0 &bool) partner2=(1 &bool) interface_distance_cutoff=(8.0 &float) task_operations=("" &string)/>
- task_operations: see RepackMinimize, above
To be used after an ala pose was built (and the design moves are done) to retrieve the sidechains from the input pose that were set to Ala by build_Ala_pose. OR, to be used inside mini to recover sidechains after switching residue typesets. By default, sidechains that are different than Ala will not be changed, unless allsc is true. Please note that naming your mover "SARS" is almost certainly bad luck and strongly discouraged.
<SaveAndRetrieveSidechains name=(save_and_retrieve_sidechains &string) allsc=(0 &bool) task_operations=("" &string)/>
- task_operations: see RepackMinimize, above.
Sets up an atom tree for use with subsequent movers. Connects pdb_num on host_chain to the nearest residue on the neighboring chain. Connection is made through connect_to on host_chain pdb_num residue
<AtomTree name=(&string) docking_ft=(0 &bool) pdb_num/res_num=(see above) connect_to=(see below for defaults &string) anchor_res=(pdb numbering) connect_from=(see below) host_chain=(2 &integer)/>
- docking_ft: set up a docking foldtree? if this is set all other options are ignored.
- connect_to: Defaults to using the farthest carbon atom from the mainchain for each residue, e.g., Cdelta for Gln.
- connect_from: user can specify which atom the jump should start from. Currently only the pdb naming works. If not specified, the "optimal" atomic connection for anchor residue is chosen (that is to their functional groups).
Allows random spin around an axis that is defined by the jump. Works best in combination with a LoopOver or, better still, a GenericMonteCarlo, together with other movers. Use SetAtomTree to define the jump atoms.
<SpinMover name=(&string) jump=(1 &integer)/>
Produces a set of rotamers from a given residue. Use after AtomTree to generate inverse rotamers of a given residue.
<TryRotamers name=(&string) pdb_num/res_num=(see above) automatic_connection=(1 &bool) jump_num=(1 &Integer) scorefxn=(score12 &string) explosion=(0 &integer) shove=(&comma-separated residue identities)/>
- explosion: range from 0-4 for how much rotamer explosion to include. Explosion in this context means EX_FOUR_HALF_STEP_STDDEVS (+/- 0.5, 1.0, 1.5, 2.0 sd)
- 1 = explode chi1
- 2 = explode chi1,2
- 3 = explode chi1,2,3
- 4 = explode chi1,2,3,4
- shove: use the shove atom-type (reducing the repulsive potential on backbone atoms) for a comma-separated list of residue identities. e.g., shove=3A,40B.
- automatic_connection: should TryRotamers set up the inverse-rotamer fold tree independently?
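A sketch of the AtomTree and TryRotamers pairing described above, generating inverse rotamers for one residue with chi1 and chi2 exploded. Residue 31B, the host chain, and the mover names are hypothetical.

```xml
<MOVERS>
  Hypothetical residue 31B on chain 2; explosion=2 explodes chi1 and chi2
  <AtomTree name="tree" pdb_num="31B" host_chain="2"/>
  <TryRotamers name="try_rot" pdb_num="31B" jump_num="1" scorefxn="score12" explosion="2"/>
</MOVERS>
```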
Do backrub-style backbone and sidechain sampling.
<Backrub name=(backrub &string) partner1=(0 &bool) partner2=(1 &bool) interface_distance_cutoff=(8.0 &Real) moves=(1000 &integer) sc_move_probability=(0.25 &float) scorefxn=(score12 &string) small_move_probability=(0.0 &float) bbg_move_probability=(0.25 &float) temperature=(0.6 &float) task_operations=("" &string)> <residue pdb_num="pdb residue and chain, e.g., 31B &string"/> <residue res_num="serially defined residue number, e.g., 212 &integer"/> <span begin="pdb or rosetta-indexed number, eg 10 or 12B &string" end="pdb or rosetta-indexed number, e.g., 20 or 30B &string"/> </Backrub>
With the values defined above, backrub will only happen on residues 31B, serial 212, and the serial span 10-20. If no residues and spans are defined then all of the interface residues on the defined partner will be backrubbed by default. Note that setting partner1=1 makes all of partner1 flexible. Adding segments has the effect of adding these spans to the default interface definition. Temperature controls the Monte Carlo accept temperature. A setting of 0.1 allows only very small moves, whereas 0.6 (the default) allows more exploration. Note that small moves and bbg_moves introduce motions that, unlike backrub, are not confined to the region that is being manipulated and can cause downstream structural elements to move as well. This might cause large lever motions if the epitope that is being manipulated is a hinge. To prevent lever effects, all residues in a chain that is allowed to backrub will be subject to small moves. Set small_move_probability=0 and bbg_move_probability=0 to eliminate such motions.
bbg_moves are backbone-Gaussian moves. See The J. Chem. Phys., Vol. 114, pp. 8154-8158.
- task_operations: see RepackMinimize, above
Removes Hotspot BackboneStub constraints from all but the best_n residues, then reapplies constraints to only those best_n residues with the given cb_force constant. Useful to prune down a hotspot-derived constraint set to avoid multiple residues becoming frustrated during minimization.
<BestHotspotCst name=(&string) chain_to_design=(2 &integer) best_n=(3 &integer) cb_force=(1.0 &Real)/>
- best_n: how many residues to cherry-pick. If there are fewer than best_n residues with constraints, only those few residues will be chosen.
- chain_to_design: which chain to reapply constraints to
- cb_force: Cbeta force to use when reapplying constraints
Dumps a pdb. Recommended ONLY for debugging as you can't change the name of the file during a run.
<DumpPdb name=(&string) fname=(dump.pdb &string)/>
DomainAssembly (Not tested thoroughly)
Do domain-assembly sampling by fragment insertion in a linker region. frag3 and frag9 specify the fragment-file names for 3-mer and 9-mer fragments, respectively.
<DomainAssembly name=(&string) linker_start_(pdb_num/res_num, see above) linker_end_(pdb_num/res_num, see above) frag3=(&string) frag9=(&string)/>
Finds loops in the current pose and loads them into the DataMap for use by subsequent movers (e.g., LoopRemodel)
<LoopFinder name="&string" interface=(1 &Size) ch1=(0 &bool) ch2=(1 &bool) min_length=(3 &Integer) max_length=(1000 &Integer) iface_cutoff=(8.0 &Real) resnum/pdb_num=(see above) CA_CA_distance=(15.0 &Real) mingap=(1 &Size)/>
- interface: only keep loops at the interface? value = jump number to use (0 = keep all loops)
- ch1: keep loops on partner 1
- ch2: keep loops on partner 2
- resnum/pdb_num: if specified, loop finder only takes the loops that are within the defined CA_CA_distance. If this option is omitted, it extracts loops given by the ch1, ch2 and interface options. So omit it if you don't know the residue.
- CA_CA_distance: cutoff for CA distances between defined residue and any interface loop residue
- iface_cutoff: distance cutoff for interface
- min_length: minimum loop length (inclusive)
- max_length: maximum loop length (inclusive)
- mingap: minimum gap size between loops (exclusive, so mingap=1 -> single-residue gaps are disallowed). Setting this to 0 will almost certainly cause problems!
Perturbs and/or refines a set of user-defined loops. Useful to sample a variety of loop conformations.
<LoopRemodel name="&string" auto_loops=(0 &bool) loop_start_(pdb_num/res_num, see above) loop_end_(pdb_num/res_num, see above) hurry=(0 &bool) cycles=(10 &Size) protocol=(ccd &string) perturb_score=(score4L &string) refine_score=(score12 &string) perturb=(0 &bool) refine=(1 &bool) design=(0 &bool)/>
- auto_loops: use loops defined by previous LoopFinder mover? (overrides loop_start/end)
- loop_start_pdb_num: start of the loop
- loop_end_pdb_num: end of the loop
- hurry: 1 = fast sampling and minimization. 0 = Use full-blown loop remodeling.
- cycles: if hurry=1, number of modeling cycles to perform. Each cycle is 50 steps of mc-accepted kinematic loop modeling, followed by a repack of the surrounding area. if hurry=0 and protocol=remodel, this controls the max number of times to attempt closure with the remodel protocol (low cycles might leave chain breaks!)
- protocol: Only activated if hurry=0. Choose "kinematic", "ccd", or "remodel". ccd appears to work best at the moment.
- perturb_score: scorefunction to use for loop perturbation
- refine_score: scorefunction to use for loop refinement
- perturb: perturb loops for greater diversity?
- refine: refine loops?
- design: perform design during loop modeling?
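A sketch of LoopFinder feeding LoopRemodel through auto_loops. The mover names and the loop-length limits are illustrative assumptions; LoopFinder must run before LoopRemodel so that the loops are present in the DataMap.

```xml
<MOVERS>
  Hypothetical setup: find interface loops of 3-12 residues on partner 2, then perturb and refine them with ccd
  <LoopFinder name="find_loops" interface="1" ch1="0" ch2="1" min_length="3" max_length="12" iface_cutoff="8.0" mingap="1"/>
  <LoopRemodel name="remodel_loops" auto_loops="1" hurry="0" protocol="ccd" perturb="1" refine="1" design="0"/>
</MOVERS>
```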
Introduces a disulfide bond into the interface. The best-scoring position for the disulfide bond is selected from among the residues in targets. This could be quite time-consuming, so specifying a small number of residues in targets is suggested. If no targets are specified on either interface partner, all residues on that partner are considered when searching for a disulfide. Thus including only a single residue for targets results in a disulfide from that residue to the best position across the interface from it, and omitting the targets param altogether finds the best disulfide over the whole interface.
Disulfide bonds created by this mover, if any, are guaranteed to pass a DisulfideFilter.
<DisulfideMover name="&string" targets=(&string)/>
- targets: A comma-separated list of residue numbers. These can be either with rosetta numbering (raw integer) or pdb numbering (integer followed by the chain letter, e.g., '123A'). Targets are required to be located in the interface. Optional. Default: all residues in the interface.
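For example, mixing the two numbering styles in a single targets list might look like this. The residue numbers and mover name are hypothetical, and both residues are assumed to lie in the interface.

```xml
<MOVERS>
  Hypothetical targets: 123A uses pdb numbering, 45 uses rosetta numbering
  <DisulfideMover name="add_disulf" targets="123A,45"/>
</MOVERS>
```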
Adds constraints to the pose using the constraints' read-from-file functionality.
<ConstraintSetMover name=(&string) cst_file=(&string)/>
cst_file: the file containing the constraint data, e.g.:
...
CoordinateConstraint CA 1 CA 380 27.514 34.934 50.283 HARMONIC 0 1
CoordinateConstraint CA 1 CA 381 24.211 36.849 50.154 HARMONIC 0 1
...
Change a single residue to a different type. For instance, mutate Arg31 to an Asp.
<MutateResidue name=(&string) target=(&string) new_res=(&string) />
- target: The location to mutate (e.g., 31A (pdb number) or 177 (rosetta index)). Required
- new_res: The name of the residue to introduce. This string should correspond to the ResidueType::name() function (e.g., ASP). Required
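The Arg31-to-Asp case mentioned above could be written as follows, assuming residue 31 sits on chain A. The mover name is a placeholder.

```xml
<MOVERS>
  Hypothetical: mutate pdb residue 31 on chain A to aspartate
  <MutateResidue name="r31d" target="31A" new_res="ASP"/>
</MOVERS>
```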
Test a design mover for its recapitulation of the native sequence. Similar to SequenceRecovery filter below, except that this mover encompasses a design mover more specifically.
<InterfaceRecapitulation name=(&string) mover_name=(&string)/>
The specified mover needs to be derived from either the DesignRepackMover or the PackRotamersMover base class, and must have the packer task show which residues have been designed. The mover then computes how many residues were allowed to be designed and the number of residues that have changed and produces the sequence recapitulation rate. The pose at parse-time is used for the comparison.
VLB (aka Variable Length Build)
Under development! All kudos to Andrew Ban of the Schief lab for making this mover. Inserts, deletes, and rebuilds segments of variable length. This mover will ONLY work with non-overlapping segments!
IMPORTANT NOTE!!!!: VLB uses its own internal tracking of ntrials! This allows VLB to cache fragments between ntrials, saving a very significant amount of time. But each ntrial trajectory will also get ntrials extra internal VLB apply calls. For example, "-jd2:ntrials 5" will cause a maximum of 25 VLB runs (5 for each ntrial). Success of a VLB move will break out of this internal loop, allowing the trajectory to proceed as normal.
<VLB name=(&string) scorefxn=(&string)> <VLB TYPES GO HERE/> </VLB>
Default scorefxn is score4L. If you use another scorefxn, make sure the chainbreak weight is > 0. Do not use a full atom scorefxn with VLB!
There are several move types available to VLB, each with its own options. The most popular movers will probably be SegmentRebuild and SegmentInsert.
<SegmentInsert left=(&integer) right=(&integer) ss=(&string) aa=(&string) pdb=(&string) side=(&string) keep_bb_torsions=(&bool)/> Insert a pdb into an existing pose. To perform a pure insertion without replacing any residues within a region, use an interval with a zero as the left endpoint, e.g. [0, insert_after_this_residue]. If inserting before the first residue of the Pose, then interval = [0,0]. If inserting after the last residue of the Pose, then interval = [0, last_residue].
- ss: secondary structure specifying the flanking regions, with a character '^' specifying where the insert is to be placed. Default is L^L.
- aa: amino acids specifying the flanking regions, with a character '^' specifying the insert.
- keep_bb_torsions: attempt to keep a few torsions from around the insert. This should be false for pure insertions. (default false)
- side: specifies insertion on its N-side ("N"), C-side ("C") or decide randomly between the two (default "RANDOM"). Random is only random on parsing, not per ntrial.
<SegmentRebuild left=(&integer) right=(&integer) ss=(&string) aa=(&string)/> Instruction to rebuild a segment. Can also be used to insert a segment, by specifying secondary structure longer than the original segment.
Very touchy. Watch out. <SegmentSwap left=(&integer) right=(&integer) pdb=(&string)/> instruction to swap a segment with an external pdb
<Bridge left=(&integer) right=(&integer) ss=(&string) aa=(&string)/> connect two contiguous but disjoint sections of a Pose into one continuous section
<ConnectRight left=(&integer) right=(&integer) pdb=(&string)/> instruction to connect one PDB onto the right side of another
<GrowLeft pos=(&integer) ss=(&string) aa=(&string)/> Use this for n-side insertions, but typically not n-terminal extensions unless necessary. It does not automatically cover the additional residue on the right endpoint that needs to move during n-terminal extensions due to invalid phi torsion. For that case, use the SegmentRebuild class replacing the n-terminal residue with desired length+1.
<GrowRight pos=(&integer) ss=(&string) aa=(&string)/> instruction to create a c-side extension
For more information, see the various BuildInstructions in src/protocols/forge/build/
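A sketch of a VLB block that rebuilds a six-residue segment as loop, using the default centroid score function. The residue range and mover name are hypothetical, and leaving out aa is an assumption that the attribute can be omitted.

```xml
<MOVERS>
  Hypothetical: rebuild residues 20-25 as loop; ss has one character per rebuilt residue
  <VLB name="rebuild_seg" scorefxn="score4L">
    <SegmentRebuild left="20" right="25" ss="LLLLLL"/>
  </VLB>
</MOVERS>
```

An ss string longer than the original segment would instead insert residues, as noted for SegmentRebuild above.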
EnzRepackMinimize, similar in spirit to the RepackMinimize mover, performs design/repack followed by minimization of a protein-ligand (or TS model) interface with enzyme-design style constraints (if present, see the AddOrRemoveMatchCsts mover) using the specified score functions and minimization dofs. Design/repack alone or minimization alone can be selected by setting the appropriate tags. A shell of residues around the ligand is repacked/designed and/or minimized. If constrained optimization (cst_opt) is specified, ligand neighbors are converted to Ala, minimization is performed, and the original neighbor sidechains are placed back.
<EnzRepackMinimize name="&string" scorefxn_repack=(score12 &string) scorefxn_minimize=(score12 &string) cst_opt=(0 &bool) repack_only=(0 &bool) design=(0 &bool) constraints=(1 &bool) fix_catalytic=(0 &bool) minimize_rb=(1 &bool) minimize_bb=(0 &bool) minimize_sc=(1 &bool) minimize_lig=(0 &bool) min_in_stages=(0 &bool) backrub=(0 &bool) cycles=(1 &integer)/>
- scorefxn_repack: scorefunction to use for repack/design (defined in the SCOREFXNS section, default=score12)
- scorefxn_minimize: similarly, scorefunction to use for minimization (default=score12)
- cst_opt: perform minimization of enzdes constraints with a reduced scorefunction and in a polyAla background. (default= 0)
- repack_only: if true, only repack sidechains without changing sequence. (default=0)
- design: optimize sequence of residues spatially around the ligand (detection of neighbors needs to be specified in the flagfile or resfile, default=0)
- minimize_bb: minimize backbone conformation of backbone segments that surround the ligand (contiguous neighbor segments of >3 residues are automatically chosen, default=0)
- minimize_sc: minimize sidechains (default=1)
- minimize_rb: minimize rigid body orientation of ligand (default=1)
- minimize_lig: minimize ligand internal torsion degrees of freedom (allowed deviation needs to be specified by flag, default=0)
- min_in_stages: first minimize non-backbone dofs, followed by backbone dofs only, and then everything together (default=0)
- constraints: use enzdes style constraints during repack/minimization (default=1)
- fix_catalytic: fix catalytic residues during repack/minimization (default=0)
- cycles: number of cycles of repack-minimize (default=1 cycle) (Note: In contrast to the enzyme_design application, all cycles use the provided scorefunction.)
- backrub: use backrub to minimize (default=0).
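As an illustration of how the tags above combine, the fragment below sketches two EnzRepackMinimize definitions. The mover names are hypothetical, and the attribute values are only examples. Per the option descriptions, design relies on neighbor detection being set in the flagfile or resfile, and minimize_lig relies on a separately specified allowed deviation. Text outside the angle brackets is ignored by the parser and serves as commentary.

```xml
<MOVERS>
  Constrained optimization only: poly-Ala background, no sequence changes.
  <EnzRepackMinimize name="cst_opt_min" cst_opt="1" minimize_rb="1" minimize_sc="1" minimize_bb="0"/>

  Two cycles of design followed by minimization, including ligand torsions.
  <EnzRepackMinimize name="design_min" scorefxn_repack="score12" scorefxn_minimize="score12" design="1" minimize_lig="1" cycles="2"/>
</MOVERS>
```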
Add or remove enzyme-design style pairwise (residue-residue) geometric constraints to/from the pose. A cstfile specifies these geometric constraints, which can be supplied in the flags file (-enzdes:cstfile) or in the mover tag (see below).
The "-run:preserve_header" option should be supplied on the command line to allow the parser to read constraint specifications in the pdb's REMARK lines. (The "-enzdes:parser_read_cloud_pdb" also needs to be specified for the parser to read the matcher's CloudPDB default output format.)
<AddOrRemoveMatchCsts name="&string" cst_instruction=( "void", "&string") cstfile="&string" keep_covalent=(0 &bool) accept_blocks_missing_header=(0 &bool) fail_on_constraints_missing=(1 &bool)/>
- cst_instruction: 1 of 3 choices - "add_new" (read from file), "remove", or "add_pregenerated" (i.e. if enz csts existed at any point previously in the protocol, add them back)
- cstfile: name of file to get csts from (can be specified here if one wants to change the constraints, e.g. tighten or relax them, as the pose progresses down a protocol.)
- keep_covalent: during removal, keep constraints corresponding to covalent bonds between protein and ligand intact (default=0).
- accept_blocks_missing_header: allow more blocks in the cstfile than specified in header REMARKs (see enzdes documentation for details, default=0)
- fail_on_constraints_missing: see enzdes documentation for details (default=1).
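A minimal sketch of the three cst_instruction choices follows. The mover names and the cstfile name are hypothetical placeholders; only attributes listed above are used.

```xml
<MOVERS>
  Read constraints from a cstfile and add them to the pose.
  <AddOrRemoveMatchCsts name="cst_add" cst_instruction="add_new" cstfile="my_enzyme.cst"/>

  Remove the constraints but keep those for covalent protein-ligand bonds.
  <AddOrRemoveMatchCsts name="cst_remove" cst_instruction="remove" keep_covalent="1"/>

  Re-add constraints that existed earlier in the protocol.
  <AddOrRemoveMatchCsts name="cst_readd" cst_instruction="add_pregenerated"/>
</MOVERS>
```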
Movers for ligand docking
These movers replace the executable for ligand docking and provide greater flexibility to the user in customizing the docking protocol. An example XML file for ligand docking is found here (link forthcoming). The movers below are listed in the order found in the old executable.
<StartFrom name="&string" chain="&string"> <Coordinates x=(&float) y=(&float) z=(&float)/> </StartFrom>
Provide a list of XYZ coordinates. One starting coordinate will be chosen at random and the specified chain will be recentered at this location.
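For example, a StartFrom definition with two candidate starting positions might look like the sketch below. The chain letter and all coordinate values are arbitrary placeholders, not recommended settings.

```xml
<MOVERS>
  One of the listed coordinates is picked at random and chain X is recentered there.
  <StartFrom name="start_from_X" chain="X">
    <Coordinates x="-1.7" y="13.2" z="5.0"/>
    <Coordinates x="10.4" y="2.1" z="-3.3"/>
  </StartFrom>
</MOVERS>
```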
<Translate name="&string" chain="&string" distribution=[uniform|gaussian] angstroms=(&float) cycles=(&int)/>
The Translate mover is for performing a coarse random movement of a small molecule in xyz-space. This movement can be anywhere within a sphere of radius specified by "angstroms". The chain to move should match that found in the PDB file (a 1-letter code). "cycles" specifies the number of attempts to make such a movement without landing on top of another molecule. The first random move that does not produce a positive repulsive score is accepted. The random move can be chosen from a uniform or gaussian distribution. This mover uses an attractive-repulsive grid for lightning fast score lookup.
<Rotate name="&string" chain="&string" distribution=[uniform|gaussian] degrees=(&int) cycles=(&int)/>
The Rotate mover is for performing a coarse random rotation throughout all rotational degrees of freedom. Usually 360 is chosen for "degrees" and 1000 is chosen for "cycles". Rotate accumulates poses that pass an attractive and repulsive filter, and are different enough from each other (based on an RMSD filter). From this collection of diverse poses, 1 pose is chosen at random. "cycles" represents the maximum # of attempts to find diverse poses with acceptable attractive and repulsive scores. If a sufficient # of poses are accumulated early on, fewer rotations than specified by "cycles" will occur. This mover uses an attractive-repulsive grid for lightning fast score lookup.
<SlideTogether name="&string" chain="&string"/>
The initial translation and rotation may move the ligand to a spot too far away from the protein for docking. Thus, after an initial low resolution translation and rotation of the ligand it is necessary to move the small molecule and protein into close proximity. If this is not done then high resolution docking will be useless. Simply specify which chain to move. This mover then moves the small molecule toward the protein 2 angstroms at a time until the two clash (evidenced by repulsive score). It then backs up the small molecule. This is repeated with decreasing step sizes, 1A, 0.5A, 0.25A, 0.125A.
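The low-resolution stage described so far (Translate, Rotate, then SlideTogether) could be declared roughly as follows. The mover names, the chain letter, and the Translate values are assumptions for illustration; the Rotate values are the "usual" ones mentioned in the text.

```xml
<MOVERS>
  Coarse random placement within a 5 A sphere, up to 50 attempts.
  <Translate name="translate_X" chain="X" distribution="uniform" angstroms="5.0" cycles="50"/>

  Coarse random rotation over all rotational degrees of freedom.
  <Rotate name="rotate_X" chain="X" distribution="uniform" degrees="360" cycles="1000"/>

  Bring the ligand back into contact with the protein.
  <SlideTogether name="slide_X" chain="X"/>
</MOVERS>
```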
<HighResDocker name="&string" repack_every_Nth=(&int) scorefxn="&string" movemap_builder="&string"/>
The high res docker performs cycles of rotamer trials or repacking, coupled with small perturbations of the ligand(s). The "movemap_builder" describes which side-chain and backbone degrees of freedom exist. The Monte Carlo mover is used to decide whether to accept the result of each cycle. Ligand and backbone flexibility as well as which ligands to dock are described by LIGAND_AREAS provided to INTERFACE_BUILDERS, which are used to build the movemap according to the XML option.
<FinalMinimizer name="&string" scorefxn="&string" movemap_builder="&string"/>
Do a gradient based minimization of the final docked pose. The "movemap_builder" makes a movemap that will describe which side-chain and backbone degrees of freedom exist.
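A sketch of the high-resolution stage follows. The score function name and the two movemap builder names are hypothetical; they are assumed to be defined elsewhere in the same script. The repack_every_Nth value is an arbitrary example.

```xml
<MOVERS>
  Rotamer trials or repacking coupled with small ligand perturbations.
  <HighResDocker name="high_res_docker" repack_every_Nth="3" scorefxn="my_scorefxn" movemap_builder="docking_movemap"/>

  Gradient-based minimization of the final docked pose.
  <FinalMinimizer name="final_min" scorefxn="my_scorefxn" movemap_builder="final_movemap"/>
</MOVERS>
```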
<InterfaceScoreCalculator name=(string) chains=(comma separated chars) scorefxn=(string) native=(string)/>
InterfaceScoreCalculator calculates a myriad of ligand specific scores and appends them to the output file. After scoring the complex the ligand is moved 1000 Å away from the protein. The model is then scored again. An interface score is calculated for each score term by subtracting separated energy from complex energy. If a native structure is specified, 4 additional score terms are calculated:
- ligand_centroid_travel. The distance between the native ligand and the ligand in our docked model.
- ligand_radius_of_gyration. An outstretched conformation would have a high radius of gyration. Ligands tend to bind in outstretched conformations.
- ligand_rms_no_super. RMSD between the native ligand and the docked ligand.
- ligand_rms_with_super. RMSD between the native ligand and the docked ligand after aligning the two in XYZ space. This is useful for evaluating how much ligand flexibility was sampled.
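As a usage sketch, the definition below scores the interface of ligand chain X and, because a native is given, would also report the four native-dependent terms listed above. The mover name, score function name, and native file name are hypothetical.

```xml
<MOVERS>
  <InterfaceScoreCalculator name="interface_scores" chains="X" scorefxn="my_scorefxn" native="native_complex.pdb"/>
</MOVERS>
```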
Movers for ligand design
These movers work in conjunction with ligand docking movers. An example XML file for ligand design is found here (link forthcoming). These movers presuppose the user has created or acquired a fragment library. Fragments have incomplete connections as specified in their params files. Combinatorial chemistry is the degenerate case in which a core fragment has several connection points and all library fragments have only one connection point.
<GrowLigand name="&string" chain="&string"/>
Randomly connects a fragment from the library to the growing ligand. The connection point for connector atom1 must specify that it connects to atoms of connector atom2's type, and vice versa.
<AddHydrogens name="&string" chain="&string"/>
Saturates the incomplete connections with H. Currently the length of the bonds to these added hydrogens is incorrect: each one is given the length that a bond between connector atoms 1 and 2 would have.
Each filter definition has the following format:
<"filter_name" name="&string" ... confidence=(1 &Real)/>
where "filter_name" belongs to a predefined set of possible filters that the parser recognizes and that are listed below, name is a unique identifier for this filter definition, and the remaining attributes are whatever parameters the filter needs.
If confidence is 1.0, then the filter is evaluated as in predicate logic (T/F). If the value is less than 0.999, then the filter is evaluated as fuzzy, so that it will return True in (1.0 - confidence) fraction of times it is probed. This should be useful for cases in which experimental data are ambiguous or uncertain.
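To make the confidence attribute concrete, the sketch below declares the same filter twice, once strict and once fuzzy. It borrows the Ddg filter documented further down; the filter names are hypothetical, and the behaviour noted in the commentary simply restates the rule given above.

```xml
<FILTERS>
  Strict: evaluated as a plain true/false predicate.
  <Ddg name="ddg_strict" threshold="-15" confidence="1"/>

  Fuzzy: per the rule above, returns True in (1.0 - 0.5) = half of the times it is probed.
  <Ddg name="ddg_fuzzy" threshold="-15" confidence="0.5"/>
</FILTERS>
```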
Do two residues have any pair of atoms within a cutoff distance? Somewhat more subtle than ResidueDistance (which works by neighbour atoms). Iterates over all atom types of a residue, according to the user specified restrictions (sidechain, backbone, protons)
<AtomicContact name=(&string) residue1=(&integer) residue2=(&integer) sidechain=1 backbone=0 protons=0 distance=(4.0 &Real)/>
Some movers (e.g., PlaceSimultaneously) can set a filter's internal residue on-the-fly during protocol operation. To get this behaviour, do not specify residue2.
A special filter that allows movers to set its value (pass/fail). This value can then be used in the protocol together with IfMover to control the flow of execution depending on the success of the mover. Currently, none of the movers uses this filter.
Reports to tracers which residues are repackable/designable according to user-defined task_operations. Useful for automatic interface detection (use the ProteinInterfaceDesign task operation for that). The residue numbers reported use pdb numbering.
<DesignableResidues name=(&string) task_operations=(comma-separated list) designable=(1 &bool) packable=(0 &bool)/>
Looks for voids at protein/protein interfaces using Will Sheffler's packstat. The number reported is the difference in the holes score between bound/unbound conformations. Be sure to set the -holes:dalphaball option!
<InterfaceHoles name=(&string) jump=(1 &integer) threshold=(200 &integer)/>
- jump: Which jump to calculate InterfaceHoles across?
- threshold: return false if above this number
Calculates the Calpha RMSD over a user-specified set of residues. Superimposition is optional. Selections are additive, so choosing a chain, an individual residue, and a span will result in RMSD calculation over all residues selected. If no residues are selected, the filter uses all residues in the pose. Use -in:file:native <filename> to choose an alternate reference pose.
<Rmsd name=(&string) chains=("" &string) threshold=(5 &integer) superimpose=(1 &bool)> <residue res/pdb_num=(see above) /> <span begin_(res/pdb_num)=("" &integer) end_(res/pdb_num)=(""&integer)/> </Rmsd>
- chains: list of chains (eg - "AC") to use for RMSD calculation
- residue: add a new leaf for each residue to include (can use rosetta index or pdb number)
- span: contiguous span of residues to include (rosetta index or pdb number)
- threshold: accept at this rmsd or lower
- superimpose: perform superimposition before rmsd calculation?
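The additive selection can be illustrated with the sketch below, which combines a chain, one residue, and a span. The filter name and residue choices are hypothetical. The attribute spellings pdb_num, begin_res_num, and end_res_num are one reading of the "res/pdb_num" shorthand in the tag definition and should be treated as an assumption.

```xml
<FILTERS>
  RMSD over all of chain A, plus residue 45 of chain B, plus residues 60-75 (rosetta numbering).
  <Rmsd name="rmsd_sel" chains="A" threshold="2" superimpose="1">
    <residue pdb_num="45B"/>
    <span begin_res_num="60" end_res_num="75"/>
  </Rmsd>
</FILTERS>
```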
Calculates the fraction sequence recovery of a pose compared to a reference pose. This is similar to InterfaceRecapitulation mover above, but does not require a design mover. Instead, the user can provide a list of task operations that describe which residues are designable in the pose.
<SequenceRecovery name=(&string) rate_threshold=(0.0 &Real) task_operations=(comma-delimited list of task_operations) />
- rate_threshold: what is an acceptable recovery rate?
The reference pose against which the recovery rate will be computed can be defined using the -in:file:native command-line flag. If that flag is not defined, the starting pose will be used as a reference.
True if all residues in the interface are more than <distance> residues from the N or C terminus. If the filter fails, it reports how far the failing residue was from the terminus. If it passes, it returns "1000".
<TerminusDistance name=(&string) jump_number=(1 &integer) distance=(5 &integer)/>
- jump_number: Which jump to use for calculating the interface?
- distance: how many residues must each interface residue be from a terminus? (sequence distance)
Computes the binding energy for the complex and returns true if it is below the threshold, otherwise false. Useful for identifying complexes that have poor binding energy and killing their trajectory.
<Ddg name=(ddg &string) scorefxn=(score12 &string) threshold=(-15 &float) jump=(1 &Integer) repeats=(1 &Integer) repack=(true &bool)/>
- jump specifies which chains to separate. Jump=1 would separate the chains interacting across the first chain termination, jump=2, second etc.
- repeats: averages the calculation over the number of repeats. Note that ddg calculations show noise of about 1-1.5 energy units, so averaging over 3-5 repeats is recommended for many applications.
- repack: Should the complex be repacked in the bound and unbound states prior to taking the energy difference? If false, the filter turns into a dG evaluator. If repack=false, repeats should be set to 1, because the energy evaluations converge very well with repack=false
Computes the number of residues in the interface specified by jump_number and returns true if it is above the threshold, otherwise false. Useful as a quick and ugly filter after docking for making sure that the partners make contact.
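The two modes described above might be set up as in the following sketch. The filter names are hypothetical; the repeats value follows the 3-5 range recommended above for the repacked case.

```xml
<FILTERS>
  ddG with repacking, averaged over three repeats to damp the noise.
  <Ddg name="ddg_avg" scorefxn="score12" threshold="-15" jump="1" repeats="3" repack="1"/>

  dG without repacking; a single evaluation is enough.
  <Ddg name="dg_norepack" scorefxn="score12" threshold="-15" jump="1" repeats="1" repack="0"/>
</FILTERS>
```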
<ResInInterface name=(riif &string) residues=(20 &integer) jump_number=(1 &integer)/>
This filter checks whether residues defined by res_num/pdb_num are hbonded with as many hbonds as defined by partners, where each hbond needs to have at most energy_cutoff energy.
<HbondsToResidue name=(hbonds_filter &string) partners="how many hbonding partners are expected &integer" energy_cutoff=(-0.5 &float) backbone=(0 &bool) sidechain=(1 &bool) res_num/pdb_num=(&string - see above)/>
- backbone: should we count backbone-backbone hbonds?
- sidechain: should we count backbone-sidechain and sidechain-sidechain hbonds?
Approximates the Boltzmann probability for the occurrence of a rotamer. Residues to be tested are defined using a task_factory (set all inert residues to no repack). A first-pass alanine scan looks at which residues contribute substantially to binding affinity. Then, the rotamer set for each of these residues is taken, each rotamer is imposed on the pose, the surrounding shell is repacked and minimized and the energy is summed to produce a Boltzmann probability. Can be computed in both the bound and unbound state.
This is apparently a good discriminator between designs and natives, with many designs showing high probabilities for their highly contributing rotamers in both the bound and unbound states.
The filter also reports a modified value for the complex ddG. It computes the starting ddG and then subtracts from this energy a fraction of the interaction energy of each residue whose rotamer probability is below a certain threshold. The interaction energy is computed only for the residue under study and its contacts with residues on another chain.
<RotamerBoltzmannWeight name=(&string) task_operations=(comma-delimited list) radius=(6.0 &Real) jump=(1 &Integer) unbound=(1 &bool) ddG_threshold=(1.5 &Real) scorefxn=(score12 &string) temperature=(0.8 &Real) energy_reduction_factor=(0.5 &Real) repack=(1&bool) skip_ala_scan=(0 &bool)> <??? threshold_probability=(&Real)/> . . . </RotamerBoltzmannWeight>
- task_operations: define what residues to work on. Set all residues not to be tested to no repack.
- radius: repacking radius around the rotamer under consideration. These residues will be repacked and minimized for each rotamer tested
- jump: what jump to look at
- unbound: test the bound or unbound state?
- ddG_threshold: a further filter on which designs to test. Only residues that contribute more than the stated amount to binding will be tested.
- temperature: the scaling factor for the Boltzmann calculations. This is actually kT rather than just T.
- energy_reduction_factor: by what factor of the interaction energy to reduce the ddG.
- repack: repack in the bound and unbound states before reporting binding energy values (ddG). If false, don't repack (dG).
- skip_ala_scan: do not conduct first-pass ala scan. Instead compute only for residues that are allowed to repack in the task factory.
- ??? any of the three-letter codes for residues (TRP, PHE, etc.)
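In a concrete definition, the ??? placeholder is replaced by residue three-letter codes, as in the sketch below. The filter name, the task operation name, and the threshold_probability values are hypothetical; the other values are the documented defaults.

```xml
<FILTERS>
  <RotamerBoltzmannWeight name="rot_boltz" task_operations="hotspot_task" radius="6.0" jump="1" unbound="1" ddG_threshold="1.5" temperature="0.8" energy_reduction_factor="0.5">
    Per-residue-type probability thresholds.
    <TRP threshold_probability="0.1"/>
    <PHE threshold_probability="0.1"/>
  </RotamerBoltzmannWeight>
</FILTERS>
```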
Computes the interface sasa and if it's **higher** than threshold passes.
<Sasa name=(sasa_filter &string) threshold=(800 &float) hydrophobic=(0&bool) polar=(0&bool) jump=(1 &integer)/>
- hydrophobic: compute hydrophobic-only SASA?
- polar: compute polar-only SASA?
- jump: across which jump to compute total SASA?
hydrophobic/polar are computed by discriminating each atom into polar (acceptor/donor or polar hydrogen) or hydrophobic (all else) and summing the delta SASA over each category. Notice that at this point only total sasa can be computed across jumps other than 1. Trying to compute hydrophobic or polar sasa across any other jump will cause an exit during parsing.
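A short sketch of the total and hydrophobic-only variants follows. The filter names and the second threshold are hypothetical. Both use jump 1, since the restriction above rules out hydrophobic or polar SASA across other jumps.

```xml
<FILTERS>
  Total interface SASA must exceed 800.
  <Sasa name="sasa_total" threshold="800" jump="1"/>

  Hydrophobic-only interface SASA (jump 1 only).
  <Sasa name="sasa_hphob" threshold="400" hydrophobic="1" jump="1"/>
</FILTERS>
```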
Filter for poses that place a neighbour of the types specified around a target residue in the partner protein.
<NeighborType name=(neighbor_filter &string) "res_num/pdb_num see above" distance=(8.0 &Real)> <Neighbor type=(&3-letter aa code)/> </NeighborType>
How many residues are within an interaction distance of target_residue across the interface. When used with neighbors=1 this degenerates to just checking whether or not a residue is at the interface.
<ResidueBurial name=(&string) "res_num/pdb_num see above" distance=(8.0 &Real) neighbors=(1 &Integer)/>
Maximum number of buried unsatisfied H-bonds allowed. If a jump number is specified (default=1), then this number is calculated across the interface of that jump. If jump_number=0, then the filter is calculated for a monomer. Note that #unsat for monomers is often much higher than 20. Notice that water is not assumed in these calculations.
<BuriedUnsatHbonds name=(&string) jump_number=(1 &Size) cutoff=(20 &Size)/>
What is the distance between two residues? Based on each residue's neighbor atom (usually Cbeta)
<ResidueDistance name=(&string) res1_"res_num/pdb_num see above" res2_"res_num/pdb_num" distance=(8.0 &Real)/>
Tests the energy of a particular residue. If whole_interface is set to 1, it computes all the energies for the interface residues defined by the jump_number and the interface_distance_cutoff. Helpful for post-design analyses.
<EnergyPerResidue name=(energy_per_res_filter &string) scorefxn=(score12 &string) score_type=(total_score &string) pdb_num/res_num(see above) energy_cutoff=(0.0 &float) whole_interface=(0 &bool) jump_number=(1 &int) interface_distance_cutoff=(8.0 &float)/>
Computes the energy of a particular score type for the entire pose and if that energy is lower than threshold, returns true.
<ScoreType name=(score_type_filter &string) scorefxn=(score12 &string) score_type=(&string) threshold=(&float)/>
Substitutes Ala for each interface position separately and measures the difference in ddg compared to the starting structure. The filter always returns true. The output is only placed in the .report file. Repeats causes multiple ddg calculations to be averaged, giving better converged values.
<AlaScan name=(&string) scorefxn=(score12 &string) jump=(1 &Integer) interface_distance_cutoff=(8.0 &Real) partner1=(0 &bool) partner2=(1 &bool) repeats=(1 &Integer) repack=(1 &bool)/>
- scorefxn: scorefxn to use for ddg calculations
- jump: which jump to use for ddg calculations. If jump=0 the complex is not taken apart and only the dG of the mutation is computed.
- interface_distance_cutoff: how far apart counts as an interface (in angstroms)
- partner1: report ddGs for everything upstream of the jump
- partner2: report ddGs for everything downstream of the jump
- repack: repack in the bound and unbound states before reporting the energy (ddG). When false, don't repack (dG).
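For instance, an alanine scan reporting both sides of the interface, averaged over several repeats, could be written as below. The filter name and the repeats value are only illustrative.

```xml
<FILTERS>
  Report ddGs for residues on both sides of jump 1, averaging three calculations.
  <AlaScan name="ala_scan" scorefxn="score12" jump="1" interface_distance_cutoff="8.0" partner1="1" partner2="1" repeats="3" repack="1"/>
</FILTERS>
```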
Require a disulfide bond between the interfaces to be possible. 'Possible' is taken fairly loosely; a reasonable centroid disulfide score is required (fairly close CB atoms without too much angle strain).
Residues from targets are considered when searching for a disulfide bond. As for DisulfideMover, if no residues are specified from one interface partner, all residues on that partner will be considered.
<DisulfideFilter name="&string" targets=(&string)/>
- targets: A comma-separated list of residue numbers. These can be either with rosetta numbering (raw integer) or pdb numbering (integer followed by the chain letter, eg '123A'). Targets are required to be located in the interface. Default: All residues in the interface. Optional
Computes the fractional interface delta_sasa for a ligand on a ligand-protein interface and checks to see if it is *between* the lower and upper threshold. A DSasa of 1 means ligand is totally buried (loses all its accessible surface area), 0 means totally accessible (loses none upon interface formation).
<LigDSasa name=(&string) lower_threshold=(0.0 &float) upper_threshold=(1.0 &float)/>
Compares the DSasa of two specified atoms and checks to see if one is greater or less than the other. This is useful for figuring out whether a ligand is oriented in the correct way (i.e. whether in the designed interface one atom is more/less exposed than another)
<DiffAtomBurial name=(&string) res1_res_num/res1_pdb_num=(0, see res_num/pdb_num convention) res2_res_num/res2_pdb_num=(0, see convention) atomname1=(&string) atomname2=(&string) sample_type=(&string)/>
- res1_res_num/res2_res_num: conventional pose numbering of rosetta, res_num=0 will mean ligand (Assuming there is only one ligand)
- res1_pdb_num/res2_pdb_num: conventional pdb_numbering such as 100A (residue 100 chain A), 1X (residue 1 chain X e.g. of ligand)
- atomname1/atomname2: atomnames of the respective atoms
- sample_type: "more" or "less". "more" means Dsasa1>Dsasa2 (atom1 is more buried than atom2); "less" means Dsasa1<Dsasa2 (atom1 is less buried than atom2)
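As an illustration, the sketch below checks that one ligand atom is more buried than another. The filter name and both atom names are hypothetical; the residue identifier 1X follows the pdb numbering example given above for a ligand.

```xml
<FILTERS>
  Passes when atom C1 of the ligand is more buried than atom O4 of the same ligand.
  <DiffAtomBurial name="c1_more_buried" res1_pdb_num="1X" res2_pdb_num="1X" atomname1="C1" atomname2="O4" sample_type="more"/>
</FILTERS>
```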
Calculates interface energy across a ligand-protein interface taking into account (or not) enzdes style cst_energy.
<LigInterfaceEnergy name=(&string) scorefxn=(&string) include_cstE=(0 &bool) jump_number=(last_jump &integer) energy_cutoff=(0.0 &float)/>
include_cstE=1 will *not* subtract out the cst energy from interface energy. jump_number defaults to last jump in the pose (assumed to be associated with ligand). energy should be less than energy_cutoff to pass.
Calculates scores of a pose e.g. a ligand-protein interface taking into account (or not) enzdes style cst_energy. Residues can be accessed by res_num/pdb_num or their constraint id. One and only one of res/pdb_num, cstid, and whole_pose tags can be specified. energy should be less than cutoff to pass.
<EnzScore name=(&string) scorefxn=(&string, score12) whole_pose=(&bool,0) score_type=(&string) res_num/pdb_num=(see convention) cstid=(&string) energy_cutoff=(0.0 &float)/>
- cstid: string corresponding to cst_number+template (A or B, as in remarks and cstfile blocks). each enzdes cst is between two residues; A or B allows access to the corresponding residue in a given constraint e.g. cstid=1A means cst #1 template A (i.e. for the 1st constraint, the residue corresponding to the block that is described first in the cstfile and its corresponding REMARK line in header), cstid=4B (for the 4th constraint, the residue that is described second in the cstfile block and its REMARK line in header).
- score_type: usual rosetta score_types; cstE will calculate enzdes style constraint energy
- whole_pose: calculate total scores for whole pose
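Since exactly one of res_num/pdb_num, cstid, and whole_pose may be given, each definition in the sketch below uses a different selector. Filter names and cutoff values are hypothetical.

```xml
<FILTERS>
  Constraint energy summed over the whole pose.
  <EnzScore name="cst_whole" scorefxn="score12" whole_pose="1" score_type="cstE" energy_cutoff="10.0"/>

  Total score of the residue behind template A of constraint 1.
  <EnzScore name="res_1A" scorefxn="score12" score_type="total_score" cstid="1A" energy_cutoff="0.0"/>

  Total score of a residue addressed by pdb numbering.
  <EnzScore name="res_100A" scorefxn="score12" score_type="total_score" pdb_num="100A" energy_cutoff="0.0"/>
</FILTERS>
```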
Calculates delta_energy or RMSD of protein residues in a protein-ligand interface when the ligand is removed and the interface repacked. RMSD of a subset of these repacked residues (such as catalytic residues) can be accessed by setting the appropriate tags.
<RepackWithoutLigand name=(&string) scorefxn=(&string, score12) target_res = (&string) target_cstids = (&string) energy_threshold=(0.0 &float) rms_threshold=(0.5 &float)/>
- target_cstids: comma-separated list corresponding to cstids (see EnzScore for cstid format)
- target_res: comma-separated list corresponding to res_nums/pdb_nums (following usual convention) OR "all_repacked" which will include all repacked neighbors of the ligand (the repack shell).
- rms_threshold: maximum allowed RMS of repacked region; (i.e. RMSD<rms_threshold filter passes, else fails)
- energy_threshold: delta_Energy allowed (i.e. if E(with_ligand)-E(no_ligand) < threshold, filter passes else fails)
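The following sketch shows one energy-based and two RMSD-based uses. Which check is applied is assumed here to follow from which target tags are present, in line with the description above; the filter names, cstids, and threshold values are hypothetical.

```xml
<FILTERS>
  Energy difference on removing the ligand and repacking.
  <RepackWithoutLigand name="rwl_energy" scorefxn="score12" energy_threshold="1.0"/>

  RMSD of the residues behind constraints 1A and 2B after repacking without ligand.
  <RepackWithoutLigand name="rwl_rms_cat" scorefxn="score12" target_cstids="1A,2B" rms_threshold="0.5"/>

  RMSD over the whole repack shell.
  <RepackWithoutLigand name="rwl_rms_all" scorefxn="score12" target_res="all_repacked" rms_threshold="0.5"/>
</FILTERS>
```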
<HeavyAtom name="&string" chain="&string" heavy_atom_limit=(&int)/>
Stop growing this designed ligand once we reach this heavy atom limit
<CompleteConnections name="&string" chain="&string"/>
Are there any connections left to fulfill? If not, stop growing ligand
<[name_of_this_ligand_area] chain="&string" cutoff=(float) add_nbr_radius=[true|false] all_atom_mode=[true|false] minimize_ligand=[float] Calpha_restraints=[float] high_res_angstroms=[float] high_res_degrees=[float] tether_ligand=[float] />
LIGAND_AREAS describe parameters specific to each ligand, useful for multiple ligand docking studies. "cutoff" is the distance in angstroms from the ligand an amino-acid's C-beta atom can be and that residue still be part of the interface. "all_atom_mode" can be true or false. If all atom mode is true, then if any ligand atom is within "cutoff" of the C-beta atom, that residue becomes part of the interface. If false, only the ligand neighbor atom is used to decide if the protein residue is part of the interface. "add_nbr_radius" increases the cutoff by the size of the ligand neighbor atom's radius specified in the ligand .params file. This size can be adjusted to represent the size of the ligand, without entering all_atom_mode. Thus all_atom_mode should not be used with add_nbr_radius.
Ligand minimization can be turned on by specifying a minimize_ligand value greater than 0. This value represents the size of one standard deviation of ligand torsion angle rotation (in degrees). By setting Calpha_restraints greater than 0, backbone flexibility is enabled. This value represents the size of one standard deviation of Calpha movement, in angstroms.
During high resolution docking, small amounts of ligand translation and rotation are coupled with cycles of rotamer trials or repacking. These values can be controlled by the 'high_res_angstroms' and 'high_res_degrees' values respectively. A tether_ligand value (in angstroms) will constrain the ligand so that multiple cycles of small translations don't add up to a large translation.
<[name_of_this_interface_builder] ligand_areas=(comma separated list of predefined ligand_areas) extension_window=(int)/>
An interface builder describes how to choose residues that will be part of a protein-ligand interface. These residues are chosen for repacking, rotamer trials, and backbone minimization during ligand docking. The initial XML parameter is the name of the interface_builder (for later reference). "ligand_areas" is a comma separated list of strings matching LIGAND_AREAS described previously. Finally 'extension_window' surrounds interface residues with residues labeled as 'near interface'. This is important for backbone minimization, because a residue's backbone can't really move unless it is part of a stretch of residues that are flexible.
<[name_of_this_movemap_builder] sc_interface=(string) bb_interface=(string) minimize_water=[true|false]/>
A movemap builder constructs a movemap. A movemap is a 2xN table of true/false values, where N is the number of residues in your protein/ligand complex. The two rows are for backbone and side-chain movements. The MovemapBuilder combines previously constructed backbone and side-chain interfaces (see previous section). Leave out bb_interface if you do not want to minimize the backbone. The minimize_water option is a global option. If you are docking water molecules as separate ligands (multi-ligand docking) these should be described through LIGAND_AREAS and INTERFACE_BUILDERS.
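The chain from ligand areas to interface builders to movemap builders can be sketched as below. All element names chosen by the user (docking_sidechain, final_backbone, and so on), the chain letter, and the numeric values are hypothetical. The section name MOVEMAP_BUILDERS is an assumption made by analogy with LIGAND_AREAS and INTERFACE_BUILDERS. Note that no ligand area sets both all_atom_mode and add_nbr_radius to true.

```xml
<dock_design>
  <LIGAND_AREAS>
    Side-chain interface: neighbor-atom based, ligand torsions allowed to move.
    <docking_sidechain chain="X" cutoff="6.0" add_nbr_radius="true" all_atom_mode="false" minimize_ligand="10"/>
    Backbone interface: all-atom based, backbone flexibility enabled.
    <final_backbone chain="X" cutoff="7.0" add_nbr_radius="false" all_atom_mode="true" Calpha_restraints="0.3"/>
  </LIGAND_AREAS>
  <INTERFACE_BUILDERS>
    <side_chain_for_docking ligand_areas="docking_sidechain"/>
    <backbone_for_final ligand_areas="final_backbone" extension_window="3"/>
  </INTERFACE_BUILDERS>
  <MOVEMAP_BUILDERS>
    No bb_interface: the backbone is not minimized during docking.
    <docking_movemap sc_interface="side_chain_for_docking" minimize_water="true"/>
    Both interfaces: side chains and backbone are free in the final minimization.
    <final_movemap sc_interface="side_chain_for_docking" bb_interface="backbone_for_final" minimize_water="true"/>
  </MOVEMAP_BUILDERS>
</dock_design>
```

The movemap builder names defined here are the values a HighResDocker or FinalMinimizer definition would pass as movemap_builder.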