TaskOperations (RosettaScripts)
This section defines instances of the TaskOperation class hierarchy when used in the context of the Parser/RosettaScripts. They become available in the DataMap.
TaskOperation classes are used by TaskFactory to configure the behavior of PackerTask when it is generated on-demand for routines that use the "packer" to reorganize/mutate sidechains. When used by certain Movers (at present, the PackRotamersMover and its subclasses), the TaskOperations control what happens during packing, usually by restriction "masks."
Example
...
<TASKOPERATIONS>
  <ReadResfile name=rrf/>
  <ReadResfile name=rrf2 filename=resfile2/>
  <PreventRepacking name=NotBeingUsedHereButPresenceOkay/>
  <RestrictResidueToRepacking name=restrict_Y100 resnum=100/>
  <RestrictToRepacking name=rtrp/>
  <OperateOnCertainResidues name=NoPackNonProt>
    <PreventRepackingRLT/>
    <ResidueLacksProperty property=PROTEIN/>
  </OperateOnCertainResidues>
</TASKOPERATIONS>
...
<MOVERS>
  <PackRotamersMover name=packrot scorefxn=sf task_operations=rrf,NoPackNonProt,rtrp,restrict_Y100/>
</MOVERS>
...
In the Rosetta code, TaskOperation instances are registered with, and later created by, a TaskOperationFactory. The factory calls the virtual function parse_tag() on each new instance; the base-class implementation does nothing. However, some TaskOperation classes (e.g. OperateOnCertainResidues and ReadResfile above) do implement parse_tag(), and therefore their behavior can be configured using additional options in the "XML"/Tag definition.
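The idea of restriction "masks" can be illustrated with a small sketch. This is not Rosetta code: the dictionary-based task and the function names fresh_task, restrict_to_repacking and prevent_repacking are hypothetical stand-ins for PackerTask and two of the operations listed below. It assumes each operation can only remove options, which is why the order of application does not matter here.

```python
CANONICAL = frozenset("ACDEFGHIKLMNPQRSTVWY")

def fresh_task(sequence):
    """Start with every residue fully designable and packable."""
    return [{"native": aa, "allowed": set(CANONICAL), "packable": True}
            for aa in sequence]

def restrict_to_repacking(task, resnum=None):
    """Keep only the native identity at one residue (1-based) or at all residues."""
    for i, res in enumerate(task, start=1):
        if resnum is None or i == resnum:
            res["allowed"] &= {res["native"]}
    return task

def prevent_repacking(task, resnum):
    """Freeze one residue: no design and no repacking."""
    res = task[resnum - 1]
    res["allowed"] &= {res["native"]}
    res["packable"] = False
    return task

# Apply the same two restrictions in both orders to a toy 3-residue sequence.
a = prevent_repacking(restrict_to_repacking(fresh_task("MKY"), 3), 1)
b = restrict_to_repacking(prevent_repacking(fresh_task("MKY"), 1), 3)
```

In this sketch a and b describe the same task, because each operation only intersects the allowed set or clears a flag.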
General TaskOperations
List of current TaskOperation classes in the core library (* indicates use-at-own-risk/not sufficiently tested/still under development):
Position/Identity Specification
ReadResfile
Read a resfile. If a filename is given, read from that file. Otherwise, read the file specified on the commandline with -packing:resfile.
<ReadResfile name=(&string) filename=(&string) />
ReadResfileFromDB
Look up the resfile in the supplied relational database. This is useful for processing different structures with different resfiles in the same protocol. The database db should have a table table_name with the following schema:
CREATE TABLE <table_name> ( tag TEXT, resfile TEXT, PRIMARY KEY(tag));
When this task operation is applied, it tries to look up the resfile string associated with the tag defined by
JobDistributor::get_instance()->current_job()->input_tag()
This task operation takes the following parameters:
- db=("resfiles.db3" &string)
- table=("resfiles" &string)
When db_mode is "sqlite3", db is the name of the sqlite3 database file. When db_mode is "mysql", compile with extras=mysql and supply -mysql:host, -mysql:password, and -mysql:port on the command line; db is then the name of the database on the MySQL server.
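A minimal sketch of the table layout and lookup described above, using Python's sqlite3 module rather than Rosetta itself. The function name lookup_resfile, the tag "1abc_0001" and the resfile text are made up for illustration; only the schema and the default table name come from this section.

```python
import sqlite3

def lookup_resfile(conn, table, tag):
    """Return the resfile text stored for `tag`, or None if there is no row."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    row = conn.execute(
        f"SELECT resfile FROM {table} WHERE tag = ?", (tag,)
    ).fetchone()
    return row[0] if row else None

# In-memory database with the schema given above (table name "resfiles").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resfiles (tag TEXT, resfile TEXT, PRIMARY KEY(tag))")
conn.execute(
    "INSERT INTO resfiles VALUES (?, ?)",
    ("1abc_0001", "NATAA\nstart\n10 A PIKAA AG\n"),  # hypothetical input tag
)
conn.commit()
```

Each input structure gets one row keyed by its input tag, so a single protocol can apply a different resfile to each structure.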
RestrictChainToRepacking
Do not allow design in a particular chain
<RestrictChainToRepacking name=(&string) chain=(1 &int)/>
RestrictToRepacking
Only allow residues to repack. No design.
<RestrictToRepacking name=(&string) />
RestrictResidueToRepacking
Restrict a single residue to repacking. No design.
<RestrictResidueToRepacking name=(&string) resnum=(0 &integer)/>
RestrictResiduesToRepacking
Restrict a string of comma-delimited residues to repacking. No design.
<RestrictResiduesToRepacking name=(&string) residues=(0 &integer "," separated)/>
PreventRepacking
Do not allow repacking at all for the specified residue. Freezes residues.
<PreventRepacking name=(&string) resnum=(0 &int) />
PreventResiduesFromRepacking
Do not allow repacking at all for the specified residues. Freezes residues. Use a comma-delimited list of residues.
<PreventResiduesFromRepacking name=(&string) residues=(0 &integer,"," separated)/>
NoRepackDisulfides
Do not allow disulfides to repack.
<NoRepackDisulfides name=(&string) />
DesignAround
Designs in shells around a user-defined list of residues. Restricts all other residues to repacking.
<DesignAround name=(&string) design_shell=(8.0 &real) resnums=(comma-delimited list) repack_shell=(8.0&Real) allow_design=(1 &bool)/>
- resnums can be a list of pdb numbers, such as 291B,101A.
- repack_shell: radius of the sphere to repack around the target residues. Must be at least as large as design_shell.
- allow_design: allow design in the sphere, else restrict to repacking.
RestrictToTermini
Restrict to repack only one or both termini on the specified chain.
<RestrictToTermini chain=(1 &size) repack_n_terminus=(1 &bool) repack_c_terminus=(1 &bool) />
LayerDesign
Design residues with selected amino acids depending on the environment (layer) of each residue. Each residue is assigned to the core, boundary, or surface layer according to the accessible surface area of its mainchain + CB. If a resfile is read before this operation is called, the operation is not applied to the residues defined by PIKAA. Note that this task is not ligand compatible (remove the ligand prior to use).
Selected amino acid types for each layer:
- Core
  - Loop: AFILPVWY
  - Strand: FILVWY
  - Helix: AFILVWY ( P only at the beginning of helix )
  - HelixCapping: DNST
- Boundary
  - Loop: ADEFGIKLNPQRSTVWY
  - Strand: DEFIKLNQRSTVWY
  - Helix: ADEIKLNQRSTVWY ( P only at the beginning of helix )
  - HelixCapping: DNST
- Surface
  - Loop: DEGHKNPQRST
  - Strand: DEHKNQRST
  - Helix: DEHKNQRST ( P only at the beginning of helix )
  - HelixCapping: DNST
Option list
- layer ( default "core_boundary_surface" ) : layers to be designed; for example, core_surface designs only the core and surface layers
- use_original_non_designed_layer ( default 0 ) : if set, keep the original sequence in non-designed layers; otherwise the residues in those layers are turned into Ala
- pore_radius ( default 2.0 ) : pore radius for calculating accessible surface area
- core ( default 20.0 ) : residues with ASA < core are defined as core
- surface ( default 40.0 ) : residues with ASA > surface are defined as surface
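The layer assignment and amino-acid lookup above can be sketched as follows. This is an illustration of the documented thresholds and table only, not the LayerDesign implementation; the function names are hypothetical, the ASA value is assumed to be computed elsewhere, and the helix-start proline and HelixCapping rules are left out.

```python
# Allowed amino acids per layer and secondary structure, copied from the table above.
LAYER_AAS = {
    "core":     {"Loop": "AFILPVWY", "Strand": "FILVWY", "Helix": "AFILVWY"},
    "boundary": {"Loop": "ADEFGIKLNPQRSTVWY", "Strand": "DEFIKLNQRSTVWY",
                 "Helix": "ADEIKLNQRSTVWY"},
    "surface":  {"Loop": "DEGHKNPQRST", "Strand": "DEHKNQRST", "Helix": "DEHKNQRST"},
}

def assign_layer(asa, core=20.0, surface=40.0):
    """Classify a residue by accessible surface area using the two cutoffs."""
    if asa < core:
        return "core"
    if asa > surface:
        return "surface"
    return "boundary"

def allowed_aas(asa, sec_struct, core=20.0, surface=40.0):
    """Look up the amino acids allowed for a residue with this ASA and structure type."""
    return LAYER_AAS[assign_layer(asa, core, surface)][sec_struct]

def parse_layers(layer_option="core_boundary_surface"):
    """Split the `layer` option into the list of layers to design."""
    return layer_option.split("_")
```

With the default cutoffs, a residue whose ASA is exactly 20.0 or 40.0 falls in the boundary layer, since the comparisons are strict.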
RestrictToInterface
Restricts to interface between two protein chains along a specified jump and with a given radius.
<RestrictToInterface name=(&string) rb_jump=(&integer, 1) distance=(&Real, 8.0) />
RestrictToInterfaceVector
Restricts the task to residues defined as interface by core/pack/task/operation/util/interface_vector_calculate.cc, which calculates the residues at an interface between two protein chains or across a jump. The calculation proceeds as follows. First, the point graph is used to find all residues within a large cutoff (CB_dist_cutoff) of residues on the other chain. For these residues near the interface, two metrics decide whether they are actual interface residues. The first metric iterates through all the side-chain atoms of the residue of interest and checks whether their distance to atoms on the other chain is less than the nearby-atom cutoff (nearby_atom_cutoff); if so, the residue is an interface residue. If a residue does not pass that check, two vectors are drawn: a CA-CB vector and a vector from its CB to a CB atom on the neighboring chain. The angle between these two vectors is found from their dot product, and if it is less than vector_angle_cutoff the residue is classified as interface. The CB-CB vector cannot be longer than vector_dist_cutoff.
There are two ways of using this task. The first is to use jumps:
<RestrictToInterfaceVector name=(& string) jump=(1 & int,int,int... ) CB_dist_cutoff=(10.0 & Real) nearby_atom_cutoff=(5.5 & Real) vector_angle_cutoff=(75.0 & Real) vector_dist_cutoff=(9.0 & Real)/>
- jump - takes a comma separated list of jumps to find the interface between, will find the interface across all jumps defined
Alternatively, you can use chains instead:
<RestrictToInterfaceVector name=(& string) chain1_num=(1 & int) chain2_num=(2 & int) CB_dist_cutoff=(10.0 & Real) nearby_atom_cutoff=(5.5 & Real) vector_angle_cutoff=(75.0 & Real) vector_dist_cutoff=(9.0 & Real)/>
- chain1_num - chain number of the chain on one side of the interface
- chain2_num - chain on the other side of the interface from chain1
Common tags, see descriptions above:
- CB_dist_cutoff - distance, should be kept between 8.0 and 15.0
- nearby_atom_cutoff - distance, should be between 4.0 and 8.0
- vector_angle_cutoff - angle in degrees, should be between 60 and 90
- vector_dist_cutoff - distance, should be between 7.0 and 12.0
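The two-metric test described above can be sketched for a single residue as follows. This is a simplified reading of the prose, not the code in interface_vector_calculate.cc: the dictionary layout and the name is_interface are hypothetical, the partner residues are assumed to have already passed the CB_dist_cutoff pre-filter, and the handling of exact boundary values is an assumption.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def _angle_deg(u, v):
    # Angle from the dot product, clamped against rounding error.
    cos = sum(a * b for a, b in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def is_interface(res, partners, nearby_atom_cutoff=5.5,
                 vector_angle_cutoff=75.0, vector_dist_cutoff=9.0):
    """res: {"CA", "CB", "sidechain": [xyz, ...]}; partners: [{"CB", "atoms": [xyz, ...]}]."""
    # Metric 1: any side-chain atom close to any atom of a partner residue.
    for atom in res["sidechain"]:
        for partner in partners:
            if any(_norm(_sub(atom, q)) < nearby_atom_cutoff for q in partner["atoms"]):
                return True
    # Metric 2: the CA->CB vector points toward a partner CB that is not too far away.
    ca_cb = _sub(res["CB"], res["CA"])
    for partner in partners:
        cb_cb = _sub(partner["CB"], res["CB"])
        length = _norm(cb_cb)
        if 0.0 < length <= vector_dist_cutoff and \
                _angle_deg(ca_cb, cb_cb) < vector_angle_cutoff:
            return True
    return False

# Toy residue whose CA->CB vector points along +x.
residue = {"CA": (0.0, 0.0, 0.0), "CB": (1.5, 0.0, 0.0), "sidechain": [(2.5, 0.0, 0.0)]}
```

The second metric is what lets a residue count as interface when its side chain points at the partner chain even though no side-chain atom is within nearby_atom_cutoff.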
ProteinInterfaceDesign
Restricts to the task that is the basis for protein-interface design.
- repack_chain1=(1, &bool)
- repack_chain2=(1, &bool)
- design_chain1=(0, &bool)
- design_chain2=(1, &bool)
- allow_all_aas=(0 &bool)
- design_all_aas=(0 &bool)
- interface_distance_cutoff=(8.0, &Real)
- jump=(1&integer) the chains below and above the jump are the ones referred to as chain1 and chain2 in the options above.
DetectProteinLigandInterface
Setup packer task based on the detect design interface settings from enzyme design.
<DetectProteinLigandInterface name=(&string) cut1=(6.0 &Real) cut2=(8.0 &Real) cut3=(10.0 &Real) cut4=(12.0 &Real) design=(1 &bool) resfile=("" &string)/>
The task sets to design all residues with a Calpha within cut1 of the ligand (specifically, the last ligand), as well as those within cut2 of the ligand whose Calpha-Cbeta vector points toward the ligand. Residues within cut3, or within cut4 and pointing toward the ligand, are set to repack. All others are fixed. Setting design to false turns off design at all positions.
If resfile is specified, the listed resfile will be read and the settings therein applied to the task. Any positions set to "AUTO" (and only those set to AUTO) will be subjected to the detect design interface procedure as described above. Note that design=0 will turn off design even for positions where it is permitted in the resfile (use "cut1=0.0 cut2=0.0 design=1" to allow design at resfile-permitted positions while disabling design at all AUTO positions).
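The cut1 to cut4 logic for an AUTO position can be sketched like this. It is only a reading of the description above, not the enzyme design code: the function name classify is hypothetical, the Calpha-to-ligand distance and the "points toward the ligand" flag are assumed to be computed elsewhere, the inclusive comparisons are an assumption, and so is the choice to demote design positions to repack when design is off.

```python
def classify(ca_dist, points_toward_ligand,
             cut1=6.0, cut2=8.0, cut3=10.0, cut4=12.0, design=True):
    """Return "design", "repack" or "fixed" for one residue."""
    if ca_dist <= cut1 or (ca_dist <= cut2 and points_toward_ligand):
        # Assumption: with design off, would-be design positions still repack.
        return "design" if design else "repack"
    if ca_dist <= cut3 or (ca_dist <= cut4 and points_toward_ligand):
        return "repack"
    return "fixed"
```

With cut1=0.0 and cut2=0.0 no residue at a positive distance reaches the design branch, which matches the "cut1=0.0 cut2=0.0 design=1" recipe for disabling design at AUTO positions.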
SetCatalyticResPackBehavior
Ensures that catalytic residues as specified in a match/constraint file do not get designed. If no option is specified the constrained residues will be set to repack only (not design).
If the option fix_catalytic_aa=1 is set in the tag (or on the commandline), catalytic residues will be set to non-repacking.
If the option -enzdes::ex_catalytic_rot <number> is active, the extra_sd sampling for every chi angle of the catalytic residues will be according to <number>, i.e. one can selectively oversample the catalytic residues.
RestrictAbsentCanonicalAAS
Restrict design to user-specified residues. If resnum is left as 0, the restriction will apply throughout the pose.
<RestrictAbsentCanonicalAAS name=(&string) resnum=(0 &integer) keep_aas=(&string) />
DisallowIfNonnative
Restrict design so that a residue type is not a possibility in the task at a position unless it is the starting residue there. If resnum is left as 0, the restriction will apply throughout the pose.
<DisallowIfNonnative name=(&string) resnum=(0 &integer) disallow_aas=(&string) />
- disallow_aas takes a string of one letter amino acid codes, no separation needed. For example disallow_aas=GCP would prevent Gly, Cys, and Pro from being designed unless they were the native amino acid at a position.
This task is useful when you are designing in a region that has Gly and Pro and you do not want to include them at other positions that aren't already Gly or Pro.
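The effect of disallow_aas at a single position can be sketched as a set filter. The function below is a hypothetical illustration, not the Rosetta implementation; it assumes the position starts with all 20 canonical amino acids allowed.

```python
CANONICAL = "ACDEFGHIKLMNPQRSTVWY"

def allowed_after_disallow(native, disallow_aas, currently_allowed=CANONICAL):
    """Drop each disallowed amino acid unless it is the native one at this position."""
    return "".join(
        aa for aa in currently_allowed
        if aa not in disallow_aas or aa == native
    )
```

For example, with disallow_aas=GCP a native Gly position keeps Gly but loses Cys and Pro, while a native Ala position loses all three.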
ThreadSequence
Threads a fasta-formatted sequence onto the source pdb. Options: target_sequence=(&string), start_res=(1&int)
Use this task operation in a call to PackRotamersMover. Notice that this only packs the threaded sequence, holding everything else constant. The target sequence can contain 'wildcard' positions that are then designed. For instance:
target_sequence="TFYxxxHFS" will thread the two specified tripeptides and allow design in the intervening tripeptide. The string "TFY   HFS" (three spaces between the two tripeptides) has the same effect as the one above.
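How a target sequence with wildcards maps onto residue positions can be sketched as below. This is an illustration only: the function name thread_plan and the "design" marker are hypothetical, and it assumes that both a lowercase 'x' and a space mark wildcard positions, as the example above suggests.

```python
def thread_plan(target_sequence, start_res=1):
    """Map each residue number to a fixed identity or to "design" for wildcards."""
    plan = {}
    for offset, ch in enumerate(target_sequence):
        # Assumption: 'x' and ' ' are the wildcard characters.
        plan[start_res + offset] = "design" if ch in ("x", " ") else ch
    return plan

plan = thread_plan("TFYxxxHFS", start_res=10)
```

Positions outside the range start_res to start_res + len(target_sequence) - 1 do not appear in the plan, mirroring the note that everything else is held constant.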
JointSequence
<JointSequence use_current=(true &bool) use_native=(false &bool) filename=(&string) native=(&string) use_natro=(false &bool) />
Prohibit designing to residue identities that aren't found at that position in any of the listed structures:
- use_current - Use residue identities from the current structure (input pose to apply() of the taskoperation)
- use_native - Use residue identities from the structure listed with -in:file:native
- filename - Use residue identities from the listed file
- native - Use residue identities from the listed file
If use_natro is true, the task operation also adds the rotamers from the native structures (use_native/native) in the rotamer library.
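The per-position restriction can be pictured as a union of the identities seen across the listed structures. The sketch below is hypothetical and assumes the structures have already been reduced to aligned one-letter sequences of equal length.

```python
def joint_identities(sequences):
    """Return, for each position, the set of identities found in any sequence."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    return [set(column) for column in zip(*sequences)]

# Current structure plus one native structure (toy sequences).
allowed = joint_identities(["ACD", "AGD"])
```

Design at each position would then be limited to the identities in the corresponding set.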
RestrictDesignToProteinDNAInterface
Restrict Design and repacking to protein residues around the defined DNA bases
<RestrictDesignToProteinDNAInterface name=(&string) dna_defs=(chain.pdb_num.base) base_only=(1, &bool) z_cutoff=(0.0, &real) />
- dna_defs: dna positions to design around, separated by commas (e.g. C.405.THY,C.406.GUA). The definitions should refer to only one DNA chain; the complementary bases are retrieved automatically. Bases are ADE, CYT, GUA, THY. The base (and its complement) in the starting structure will be mutated according to the definition, unless prevented by another task operation.
- base_only: only residues within reach of the DNA bases are considered
- z_cutoff: limit the protein interface positions to the ones that have a projection of their distance vector on DNA axis lower than this threshold. It prevents designs that are too far away along the helical axis
Rotamer Specification
InitializeFromCommandline
Reads command line options, for example -ex1 -ex2 (but does not read a resfile from the command line options). This TaskOperation will complain about an unimplemented method, but you can safely ignore the message.
<InitializeFromCommandline name=(&string) />
IncludeCurrent
Includes current rotamers (e.g. from the input pdb) in the rotamer set. These rotamers will be lost after a packing run, so they are only effective upon initial loading of a pdb!
<IncludeCurrent name=(&string) />
ExtraRotamersGeneric
During packing, extra rotamers can be used to increase sampling. Use this TaskOperation to specify for all residues at once what extra rotamers should be used. Note: The extrachi_cutoff is used to determine how many neighbors a residue must have before the extra rotamers are applied. For example, if you want to apply extra rotamers to all residues, set extrachi_cutoff=0. See the Extra Rotamer Commands section on the resfile syntax and convention (http://graylab.jhu.edu/Rosetta.Developer.Documentation/all_else/d1/d97/resfiles.html) page for additional details.
<ExtraRotamersGeneric name=(&string) ex1=(0 &boolean) ex2=(0 &boolean) ex3=(0 &boolean) ex4=(0 &boolean) ex1aro=(0 &boolean) ex2aro=(0 &boolean) ex1aro_exposed=(0 &boolean) ex2aro_exposed=(0 &boolean) ex1_sample_level=(7 &Size) ex2_sample_level=(7 &Size) ex3_sample_level=(7 &Size) ex4_sample_level=(7 &Size) ex1aro_sample_level=(7 &Size) ex2aro_sample_level=(7 &Size) ex1aro_exposed_sample_level=(7 &Size) ex2aro_exposed_sample_level=(7 &Size) exdna_sample_level=(7 &Size) extrachi_cutoff=(18 &Size)/>
RotamerExplosion
Sample residue chi angles much more finely during packing. Currently hardcoded to use three 1/3 step standard deviation.
Note: This might actually need to be called as RotamerExplosionCreator in the xml
<RotamerExplosionCreator name=(&string) resnum=(&Integer) chi=(&Integer) />
LimitAromaChi2
Prevents the use of rotamers of PHE, TYR and HIS whose chi2 is far from 90 degrees.
- chi2max ( default 110.0 ) : max value of chi2 to be used
- chi2min ( default 70.0 ): min value of chi2 to be used
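The chi2 window can be sketched as a simple filter. This is a hypothetical illustration, not the LimitAromaChi2 code: it compares the chi2 value as given (no handling of angle periodicity), treats the bounds as inclusive, and leaves rotamers of other residue types untouched.

```python
def keep_rotamer(resname, chi2, chi2min=70.0, chi2max=110.0):
    """Return True if a rotamer with this chi2 (in degrees) should be kept."""
    if resname not in ("PHE", "TYR", "HIS"):
        return True  # the restriction only targets these three residue types
    return chi2min <= chi2 <= chi2max
```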
AddLigandMotifRotamers
Using a library of protein-ligand interactions, identify possible native-like interactions to the ligand and add those rotamers to the packer, possibly with a bonus.
Required command line flags:
- A
- B
- C
- D
- E
- F
- G
Example:
<AddLigandMotifRotamers name=(&string)/>
Since it only makes sense to run AddLigandMotifRotamers once (it takes a very long time), I have not made the options parseable. I can do that if there's interest--the code would take a minute--but I can't see an advantage. You can however read in multiple weight files in order to do motif weight ramping. -Matt
Packer Behavior Modification
ProteinLigandInterfaceUpweighter
Specifically upweight the strength of the protein-ligand interaction energies by a given factor.
<ProteinLigandInterfaceUpweighter name=(&string) interface_weight=(1.0 &Real)/>
Development/Testing
InitializeExtraRotsFromCommandline
Under development and untested. Use at your own risk.
SetRotamerCouplings
Under development and untested. Use at your own risk.
AppendRotamer
Under development and untested. Use at your own risk.
AppendRotamerSet
Under development and untested. Use at your own risk.
PreserveCBeta
Under development and untested. Use at your own risk.
RestrictYSDesign
Restrict amino acid choices during design to Tyr and Ser. This is similar to the restricted YS alphabet used by Sidhu's group during in vitro evolution experiments. Under development and untested. Use at your own risk.
Per Residue Specification
OperateOnCertainResidues Operation
Allows specification of Residue Level Task Operations based on residue properties specified with ResFilters.
Example:
<OperateOnCertainResidues name=PROTEINnopack>
  <PreventRepackingRLT/> //Only one Residue level task per OperateOnCertainResidues block
  <ResidueHasProperty property=PROTEIN/> //Only one ResFilter per OperateOnCertainResidues block
</OperateOnCertainResidues>
Residue Level TaskOperations
Use these as a subtag of the special OperateOnCertainResidues TaskOperation. Only one may be used per OperateOnCertainResidues block.
RestrictToRepackingRLT
PreventRepackingRLT
AddBehaviorRLT
RestrictAbsentCanonicalAASRLT*
ResFilters
Use these as a subtag of the special OperateOnCertainResidues TaskOperation. Only one may be used per OperateOnCertainResidues block.
ResidueHasProperty
settings:
- property. e.g. DNA, PROTEIN, POLAR, CHARGED (one only)
ResidueLacksProperty
ChainIs
Selects a set of residues based on their chain letter in the original PDB.
- chain: defaults to "A"
ChainIsnt
Excludes a set of residues based on their chain letter in the original PDB.
- chain: defaults to "A"
ResidueName3Is
- name3: e.g. arg, lys, gua (one only)
ResidueName3Isnt
ResidueIndexIs
- indices: comma-separated list of rosetta residue indices (1 to nres) e.g. indices=1,2,3,4,33
ResidueIndexIsnt
ResiduePDBIndexIs
- indices: comma-separated list of chain.pos identifiers e.g. indices=A.2,C.100,D.-10
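The chain.pos format can be parsed as sketched below. The function name is hypothetical and this is not the Rosetta parser; it assumes single identifiers of the form chain, a dot, then a possibly negative integer, and it does not handle insertion codes.

```python
def parse_pdb_indices(indices):
    """Turn "A.2,C.100,D.-10" into a list of (chain, position) pairs."""
    parsed = []
    for token in indices.split(","):
        # Split on the first dot only, so negative positions stay intact.
        chain, pos = token.strip().split(".", 1)
        parsed.append((chain, int(pos)))
    return parsed
```

Splitting on the first dot keeps a negative PDB number such as -10 attached to its sign.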
ResiduePDBIndexIsnt
Currently Undocumented
The following TaskOperations are available through RosettaScripts, but are not currently documented. See the code (particularly the respective parse_tag() and apply() functions) for details. (Some may be undocumented as they are experimental/not fully functional.)
- AddRigidBodyLigandConfs
- OptCysHG
- OptH
- PreventChainFromRepacking
- ReadResfileAndObeyLengthEvents
- ReplicateTask
- RestrictByCalculators
- RestrictConservedLowDdg
- RestrictNonSurfaceToRepacking
- RestrictToNeighborhood
- SeqprofConsensus
- WatsonCrickRotamerCouplings
Residue Level TaskOperations:
- DisallowIfNonnativeRLT