Each filter definition has the following format:
<"filter_name" name="&string" ... confidence=(1 &Real)/>
where "filter_name" belongs to a predefined set of filters that the parser recognizes (listed below), name is a unique identifier for this filter definition, and the remaining attributes are whatever parameters the filter needs.
If confidence is 1.0, then the filter is evaluated as in predicate logic (T/F). If the value is less than 0.999, then the filter is evaluated as fuzzy, so that it will return True in (1.0 - confidence) fraction of times it is probed. This should be useful for cases in which experimental data are ambiguous or uncertain.
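The effect of confidence can be illustrated with a small Python sketch. This is not Rosetta code. It assumes that a fuzzy filter still returns True whenever the underlying filter passes and otherwise lets the pose through in a (1.0 - confidence) fraction of calls, and that values of 0.999 or above behave like 1.0. The function name and the rng parameter are hypothetical.

```python
import random


def fuzzy_result(filter_passed, confidence, rng=random.random):
    """Combine a filter's strict result with a random pass.

    Assumption (not taken from the Rosetta source): a failing filter is
    still reported as True in a (1.0 - confidence) fraction of calls.
    """
    if confidence >= 0.999:
        # Plain predicate logic: the strict result is returned unchanged.
        return filter_passed
    # rng() is expected to return a float drawn uniformly from [0, 1).
    return filter_passed or rng() < (1.0 - confidence)
```

With confidence=0.5, a failing filter would be let through in roughly half of the calls under these assumptions.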
Returns true. Useful for defining a mover without using a filter. Can be explicitly specified with the name "true_filter".
Always returns false. Can be explicitly specified with the name "false_filter".
Filters which are useful for combining, modifying or working with other filters and movers.
This is a special filter that uses previously defined filters to construct a compound logical statement with AND, OR, XOR, NAND and NOR operations. By making compound statements of compound statements, essentially all logical statements can be defined.
<CompoundStatement name=(&string)> <OPERATION filter_name=(true_filter &string)/> <.... </CompoundStatement>
where OPERATION is any of the operations defined in CAPS above. Note that the operations are performed in the order in which they are defined. No precedence rules are enforced, so any precedence has to be written explicitly by making compound statements of compound statements. Also note that the first OPERATION is ignored, and the value of the first filter is simply assigned to the filter's result.
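The left-to-right evaluation described above can be sketched in Python. This is an illustration of the stated rules, not the Rosetta implementation. The function name is hypothetical, and each boolean stands in for the result of a previously defined filter.

```python
def evaluate_compound(statements):
    """Evaluate (operation, value) pairs strictly in the order given.

    statements: list of (operation, bool) tuples; each bool stands in for
    the result of the filter named in that statement.
    """
    ops = {
        "AND": lambda acc, v: acc and v,
        "OR": lambda acc, v: acc or v,
        "XOR": lambda acc, v: acc != v,
        "NAND": lambda acc, v: not (acc and v),
        "NOR": lambda acc, v: not (acc or v),
    }
    # The first operation is ignored; the first filter's value seeds the result.
    _, result = statements[0]
    for op, value in statements[1:]:
        # No precedence: each operation acts on the running result.
        result = ops[op](result, value)
    return result
```

For example, `[("AND", True), ("OR", False), ("AND", False)]` is evaluated as (True OR False) AND False, which is False. Conventional precedence would read it as True OR (False AND False), which is True, so the grouping has to be made explicit with nested compound statements when that reading is intended.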
This is a special filter that calculates a weighted sum based on previously defined filters.
<CombinedValue name=(&string) threshold=(0.0 &Real)> <Add filter_name=(&string) factor=(1.0 &Real) temp=(&Real)/> <.... </CombinedValue>
By default, the value is a straight sum of the calculated values (not the logical results) of the listed filters. A multiplicative weighting factor for each filter can be specified with the factor parameter. (As a convenience, temp can be given instead of factor, which will divide the filter value, rather than multiply it.)
For truth value contexts, the filter evaluates to true if the weighted sum is less than or equal to the given threshold.
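As a minimal sketch of the arithmetic (not Rosetta code), the weighted sum and its truth value could be computed as follows. The dictionary keys mirror the factor and temp attributes; the function name and the data layout are assumptions made for illustration.

```python
def combined_value(terms, threshold=0.0):
    """Return (weighted_sum, passes) for a list of filter terms.

    terms: list of dicts with a 'value' key (the filter's calculated value)
    and optionally 'factor' (multiplies the value) or 'temp' (divides it).
    """
    total = 0.0
    for term in terms:
        if "temp" in term:
            # temp divides the filter value instead of multiplying it.
            total += term["value"] / term["temp"]
        else:
            # factor defaults to 1.0, which gives a straight sum.
            total += term["value"] * term.get("factor", 1.0)
    # True when the weighted sum is less than or equal to the threshold.
    return total, total <= threshold
```

A term with value 4 and temp=2 contributes 2 to the sum, exactly as factor=0.5 would.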
Applies a given mover to the pose before calculating the results from another filter.
<MoveBeforeFilter name=(&string) mover=(&string) filter=(&string)/>
Note that, like all filters, MoveBeforeFilter cannot change the input pose - the results of the submover will only be used for the subfilter calculation and then discarded.
Also note that caution must be exercised when using a computationally expensive mover or a mover/filter pair which yields stochastic results. The result of the mover is not cached, and will be recomputed for each call to apply(), report() or report_sm().
A special filter that allows movers to set its value (pass/fail). This value can then be used in the protocol together with IfMover to control the flow of execution depending on the success of the mover. Currently, none of the movers uses this filter.
Computes the energetic strain in a bound monomer.
<BindingStrain name=(&string) threshold=(3.0 &Real) task_operations=(comma-delimited list of operations &string) scorefxn=(score12 &string) relax_mover=(null &string) jump=(1 &Int)/>
- threshold: how much strain to allow.
- task_operations: define the repacked region. Whatever you choose, the filter will make sure you don't design and that the packer task is initialized from the commandline.
- scorefxn: what scorefxn to use for repacking and total-score evaluations.
- relax_mover: after repacking in the unbound state, what mover (if at all) to use to further relax the structure (MinMover?)
- jump: along which jump to dissociate the complex?
Dissociates the complex and takes the unbound energy. Then, repacks and calls the relax mover, and measures the unbound relaxed energy. Reports the strain as unbound - unbound_relaxed. Potentially useful to relieve strain in binding.
Computes the difference in a filter's value compared to the input structure.
<Delta name=(&string) upper=(1 &bool) lower=(0 &bool) range=(0 &Real) filter=(&string) unbound=(0 &bool) jump=(see below &Int)/>
- upper/lower: is the threshold an upper or a lower bound? Set both to treat the threshold as a range bounded on both sides.
- range: how much above/below the baseline to allow?
- filter: the name of a predefined filter for evaluation.
- unbound: translates the partners by 10000A before evaluating the baseline and the filters. Allows evaluation of the unbound pose.
- jump: if unbound is set, this can be used to set the jump along which to translate.
The filter is evaluated at parsetime and its internal value (through report_sm) is saved. At apply time, the filter's report_sm is called again, and the delta is evaluated.
Tests the energy of a particular residue. If whole_interface is set to 1, it computes all the energies for the interface residues defined by the jump_number and the interface_distance_cutoff. Helpful for post-design analyses.
<EnergyPerResidue name=(energy_per_res_filter &string) scorefxn=(score12 &string) score_type=(total_score &string) pdb_num/res_num(see above) energy_cutoff=(0.0 &float) whole_interface=(0 &bool) jump_number=(1 &int) interface_distance_cutoff=(8.0 &float)/>
Computes the energy of a particular score type for the entire pose and if that energy is lower than threshold, returns true.
<ScoreType name=(score_type_filter &string) scorefxn=(score12 &string) score_type=(total_score &string) threshold=(&float)/>
What is the distance between two residues? Based on each residue's neighbor atom (usually Cbeta)
<ResidueDistance name=(&string) res1_"res_num/pdb_num see above" res2_"resnum/pdb_num" distance=(8.0 &Real)/>
Do two residues have any pair of atoms within a cutoff distance? Somewhat more subtle than ResidueDistance (which works by neighbour atoms). Iterates over all atom types of a residue, according to the user specified restrictions (sidechain, backbone, protons)
<AtomicContact name=(&string) residue1=(&integer) residue2=(&integer) sidechain=1 backbone=0 protons=0 distance=(4.0 &Real)/>
Some movers (e.g., PlaceSimultaneously) can set a filter's internal residue on-the-fly during protocol operation. To get this behaviour, do not specify residue2.
Are two specified atoms within a cutoff distance? More specific than AtomicContact (which reports if any atom is within the cutoff) or ResidueDistance (which works by neighbor atoms only). Residues can be specified either with pose numbering, or with PDB numbering, with the chain designation (e.g. 34B). One of atomname or atomtype (but not both) needs to be specified for each partner. If atomtype is specified for one or both atoms, the closest distance of all relevant combinations is used.
<AtomicDistance name=(&string) residue1=(&string) atomname1=(&string) atomtype1=(&string) residue2=(&string) atomname2=(&string) atomtype2=(&string) distance=(4.0 &Real)/>
True if all residues in the interface are more than <distance> residues from the N or C terminus. If the filter fails, it reports how far the failing residue was from the terminus. If it passes, it returns "1000".
<TerminusDistance name=(&string) jump_number=(1 &integer) distance=(5 &integer)/>
- jump_number: Which jump to use for calculating the interface?
- distance: how many residues must each interface residue be from a terminus? (sequence distance)
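The pass/fail logic and the reported value can be sketched in Python. This is not Rosetta code. It assumes a single chain numbered 1 to chain_length and counts sequence distance as the number of residues separating a position from the nearer terminus; both are assumptions made for illustration, as is the function name.

```python
def terminus_distance(interface_residues, chain_length, distance=5):
    """Return (passes, reported_value) for a list of interface positions.

    Assumes a single chain numbered 1..chain_length (an illustrative
    simplification, not the Rosetta implementation).
    """
    for res in interface_residues:
        # Sequence distance to whichever terminus is closer.
        to_terminus = min(res - 1, chain_length - res)
        if to_terminus <= distance:
            # Fail: report how far the failing residue is from the terminus.
            return False, to_terminus
    # Pass: the filter reports 1000.
    return True, 1000
```

Under these assumptions, interface residues 10 and 98 in a 100-residue chain fail with a reported value of 2, because residue 98 is two positions from the C terminus.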
What is the average degree connectivity of a subset of residues? Found to be useful for discriminating non-interacting designs from natural complexes. Apparently, many non-interacting designs use surfaces that are poorly embedded in the designed monomer, a feature that can be easily captured by this simple metric.
<AverageDegree name=(&string) threshold=(0 &Real) distance_threshold=(10.0 &Real) task_operations=(comma-delimited list)/>
- threshold: how many residues need to be on average in the sphere of each of the residues under scrutiny.
- distance_threshold: Size of sphere around each residue under scrutiny.
- task_operations: define residues under scrutiny (all repackable residues).
Looks for voids at protein/protein interfaces using Will Sheffler's packstat. The number reported is the difference in the holes score between bound/unbound conformations. Be sure to set the -holes:dalphaball option!
<InterfaceHoles name=(&string) jump=(1 &integer) threshold=(200 &integer)/>
- jump: Which jump to calculate InterfaceHoles across?
- threshold: return false if above this number
Computes the number of residues in the interface specified by jump_number and returns true if it is above threshold, otherwise false. Useful as a quick and ugly filter after docking for making sure that the partners make contact.
<ResInInterface name=(riif &string) residues=(20 &integer) jump_number=(1 &integer)/>
Filter for poses that place a neighbour of the types specified around a target residue in the partner protein.
<NeighborType name=(neighbor_filter &string) "res_num/pdb_num see above" distance=(8.0 &Real)> <Neighbor type=(&3-letter aa code)/> </NeighborType>
Computes the interface sasa and if it's **higher** than threshold passes.
<Sasa name=(sasa_filter &string) threshold=(800 &float) hydrophobic=(0&bool) polar=(0&bool) jump=(1 &integer)/>
- hydrophobic: compute hydrophobic-only SASA?
- polar: compute polar-only SASA?
- jump: across which jump to compute total SASA?
hydrophobic/polar are computed by discriminating each atom into polar (acceptor/donor or polar hydrogen) or hydrophobic (all else) and summing the delta SASA over each category. Notice that at this point only total sasa can be computed across jumps other than 1. Trying to compute hydrophobic or polar sasa across any other jump will cause an exit during parsing.
How many residues are within an interaction distance of target_residue across the interface. When used with neighbors=1 this degenerates to just checking whether or not a residue is at the interface.
<ResidueBurial name=(&string) "res_num/pdb_num see above" distance=(8.0 &Real) neighbors=(1 &Integer)/>
Compute a filter's value relative to a different pose's structure. This is useful for cases in which you want to know the effects of a mutation on different poses. The pose read from disk is aligned to the currently active pose (through the user-defined alignment), and any sequence changes are applied to the pose read from disk while repacking a shell around each mutation. The filter can then apply a relax mover, report a filter's evaluation and dump a scored pose to disk.
<RelativePose name=(&string) pdb_name=(&string) filter=(&string) relax_mover=(null &string) dump_pose=("" &string) alignment=(&string; see below) scorefxn=(score12 &string) packing_shell=(8.0 &Real) thread=(1 &bool)>
- pdb_name: which is the reference pose to read from disk.
- filter: which filter to apply.
- relax_mover: which relax mover to apply after threading.
- dump_pose: optional- should we dump the pose after threading?
- alignment: what segments to align between the disk-pose and the current pose. defaults to aligning from 1->nres. To specify something different use the following format: 3A:1B,4A:2B,5A:6B, meaning align disk pose's 3A-5A to 1B,2B, and 6B on the current pose. Only the aligned segments are searched for mutations between the disk and current pose for threading. All else is ignored.
- scorefxn: used for packing during threading and for scoring the dumped pose.
- packing_shell: radius of shell around each residue to repack after threading. The larger the shell, the longer the simulation.
- thread: Normally you'd want this to be true. This is not the case only if you're estimating baselines for the disk pose before doing an actual run.
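The alignment string format can be illustrated with a short Python sketch that splits it into residue pairs. This only demonstrates the format described above. The function name is hypothetical, and Rosetta's own parsing may differ.

```python
def parse_alignment(spec):
    """Split an alignment string into (disk_pose_residue, current_pose_residue) pairs."""
    pairs = []
    for item in spec.split(","):
        # Each item has the form <disk residue>:<current residue>, e.g. 3A:1B.
        disk_res, current_res = item.strip().split(":")
        pairs.append((disk_res, current_res))
    return pairs
```

For the example above, `parse_alignment("3A:1B,4A:2B,5A:6B")` yields the pairs 3A with 1B, 4A with 2B and 5A with 6B, where the first member of each pair refers to the disk pose and the second to the current pose.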
Calculates the Calpha RMSD over a user-specified set of residues. Superimposition is optional. Selections are additive, so choosing a chain, an individual residue, and a span will result in an RMSD calculation over all residues selected. If no residues are selected, the filter uses all residues in the pose. Use -in:file:native <filename> to choose an alternate reference pose.
<Rmsd name=(&string) chains=("" &string) threshold=(5 &integer) superimpose=(1 &bool)> <residue res/pdb_num=(see above) /> <span begin_(res/pdb_num)=("" &integer) end_(res/pdb_num)=(""&integer)/> </Rmsd>
- chains: list of chains (eg - "AC") to use for RMSD calculation
- residue: add a new leaf for each residue to include (can use rosetta index or pdb number)
- span: contiguous span of residues to include (rosetta index or pdb number)
- threshold: accept at this rmsd or lower
- superimpose: perform superimposition before rmsd calculation?
Calculates the fraction sequence recovery of a pose compared to a reference pose. This is similar to the InterfaceRecapitulation mover, but does not require a design mover. Instead, the user can provide a list of task operations that describe which residues are designable in the pose.
<SequenceRecovery name=(&string) rate_threshold=(0.0 &Real) task_operations=(comma-delimited list of task_operations) />
- rate_threshold: what is an acceptable recovery rate?
The reference pose against which the recovery rate will be computed can be defined using the -in:file:native command-line flag. If that flag is not defined, the starting pose will be used as a reference.
This filter checks whether residues defined by res_num/pdb_num are hbonded with as many hbonds as defined by partners, where each hbond needs to have at most energy_cutoff energy.
<HbondsToResidue name=(hbonds_filter &string) partners="how many hbonding partners are expected &integer" energy_cutoff=(-0.5 &float) backbone=(0 &bool) sidechain=(1 &bool) res_num/pdb_num=(&string - see above)>
- backbone: should we count backbone-backbone hbonds?
- sidechain: should we count backbone-sidechain and sidechain-sidechain hbonds?
Maximum number of buried unsatisfied H-bonds allowed. If a jump number is specified (default=1), then this number is calculated across the interface of that jump. If jump_num=0, then the filter is calculated for a monomer. Note that #unsat for monomers is often much higher than 20. Notice that water is not assumed in these calculations.
<BuriedUnsatHbonds name=(&string) jump_number=(1 &Size) cutoff=(20 &Size)/>
Require a disulfide bond between the interfaces to be possible. 'Possible' is taken fairly loosely; a reasonable centroid disulfide score is required (fairly close CB atoms without too much angle strain).
The residues listed in targets are considered when searching for a disulfide bond. As for DisulfideMover, if no residues are specified from one interface partner, all residues on that partner will be considered.
<DisulfideFilter name="&string" targets=(&string)/>
- targets: Optional. A comma-separated list of residue numbers. These can be given either in rosetta numbering (raw integer) or in pdb numbering (integer followed by the chain letter, e.g. '123A'). Targets are required to be located in the interface. Default: all residues in the interface.
These filters are used primarily for the reports they generate in the log and/or score and silent files, more so than their ability to end a run.
Returns the Boltzmann weighted sum of a set of positive and negative filters. The fitness is actually defined as -F, with a range of [-1, 0] (-1 most optimal, 0 least).
<Boltzmann name=(&string) fitness_threshold=(0&real) temperature=(0.6 &real) positive_filters=(&comma delimited list) negative_filters=(&comma delimited list)/>
- fitness_threshold: above which fitness threshold to allow?
- temperature: the Boltzmann weighting factor (in fact, kT rather than T).
- positive_filters: a list of predefined filters to use as the positive states. The filters' report_sm methods will be invoked, so there's no need to fret about their thresholds.
- negative_filters: as above, only negative.
Useful for balancing counteracting objectives.
Reports to tracers which residues are repackable/designable according to user-defined task_operations. Useful for automatic interface detection (use the ProteinInterfaceDesign task operation for that). The reported residue numbers use pdb numbering.
<DesignableResidues name=(&string) task_operations=(comma-separated list) designable=(1 &bool) packable=(0 &bool)/>
Approximates the Boltzmann probability for the occurrence of a rotamer. The method, usage examples, and analysis scripts are published in Fleishman et al. (2011) Protein Sci. 20:753.
Residues to be tested are defined using a task_factory (set all inert residues to no repack). A first-pass alanine scan looks at which residues contribute substantially to binding affinity. Then, the rotamer set for each of these residues is taken, each rotamer is imposed on the pose, the surrounding shell is repacked and minimized and the energy is summed to produce a Boltzmann probability. Can be computed in both the bound and unbound state.
This is apparently a good discriminator between designs and natives, with many designs showing high probabilities for their highly contributing rotamers in both the bound and unbound states.
The filter also reports a modified value for the complex ddG. It computes the starting ddG and then subtracts from this energy a fraction of the interaction energy of each residue whose rotamer probability is below a certain threshold. The interaction energy is computed only for the residue under study and its contacts with residues on another chain.
<RotamerBoltzmannWeight name=(&string) task_operations=(comma-delimited list) radius=(6.0 &Real) jump=(1 &Integer) unbound=(1 &bool) ddG_threshold=(1.5 &Real) scorefxn=(score12 &string) temperature=(0.8 &Real) energy_reduction_factor=(0.5 &Real) repack=(1&bool) skip_ala_scan=(0 &bool)> <??? threshold_probability=(&Real)/> . . . </RotamerBoltzmannWeight>
- task_operations: define what residues to work on. Set all residues not to be tested to no repack.
- radius: repacking radius around the rotamer under consideration. These residues will be repacked and minimized for each rotamer tested
- jump: what jump to look at
- unbound: test the bound or unbound state?
- ddG_threshold: a further filter on which designs to test. Only residues that contribute more than the stated amount to binding will be tested.
- temperature: the scaling factor for the Boltzmann calculations. This is actually kT rather than just T.
- energy_reduction_factor: by what factor of the interaction energy to reduce the ddG.
- repack: repack in the bound and unbound states before reporting binding energy values (ddG). If false, don't repack (dG).
- skip_ala_scan: do not conduct first-pass ala scan. Instead compute only for residues that are allowed to repack in the task factory.
- ??? any of the three-letter codes for residues (TRP, PHE, etc.)
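To make the idea of a rotamer's Boltzmann probability concrete, here is a minimal Python sketch of the standard Boltzmann weighting over a set of per-rotamer energies. It is not the Rosetta implementation. The function name is hypothetical, and the energies are assumed to be the totals obtained after imposing each rotamer and repacking and minimizing its shell. The default kT of 0.8 mirrors the default of the temperature option.

```python
import math


def rotamer_probability(energies, index, kT=0.8):
    """Boltzmann probability of the rotamer at `index` among all rotamers.

    energies: one energy per rotamer (assumed already repacked/minimized).
    kT: the scaling factor, i.e. kT rather than T.
    """
    # Shift by the minimum energy for numerical stability; the ratio is unchanged.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    return weights[index] / sum(weights)
```

A rotamer whose energy is well below those of all alternatives gets a probability close to 1, while a rotamer that is only one of several equally good options gets a correspondingly lower value.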
Substitutes Ala for each interface position separately and measures the difference in ddg compared to the starting structure. The filter always returns true. The output is only placed in the .report file. Setting repeats above 1 causes multiple ddg calculations to be averaged, giving better converged values.
<AlaScan name=(&string) scorefxn=(score12 &string) jump=(1 &Integer) interface_distance_cutoff=(8.0 &Real) partner1=(0 &bool) partner2=(1 &bool) repeats=(1 &Integer) repack=(1 &bool)/>
- scorefxn: scorefxn to use for ddg calculations
- jump: which jump to use for ddg calculations. If jump=0 the complex is not taken apart and only the dG of the mutation is computed.
- interface_distance_cutoff: how far apart counts as an interface (in angstroms)
- partner1: report ddGs for everything upstream of the jump
- partner2: report ddGs for everything downstream of the jump
- repack: repack in the bound and unbound states before reporting the energy (ddG). When false, don't repack (dG).
Scan all mutations allowed by task_operations and test against a filter. Produces a report on the filter's values for each mutation as well as a resfile that says which mutations are allowed.
<FilterScan name=(&string) scorefxn=(score12 &string) task_operations=(comma separated list) triage_filter=(true_filter &string) filter=(&string) report_all=(0 &bool) relax_mover=(null &string) resfile_name=(<PDB>.resfile &string) resfile_general_property=("nataa" &string) delta=(0 &bool) unbound=(0 &bool) jump=(1 &int) />
- triage_filter: If this filter evaluates to false, don't include the mutation in the resulting resfile
- filter: Used only for reporting the value for the pose in the tracer report
- report_all: By default, only attempted mutations which pass triage_filter will be evaluated by filter and reported in the tracer report. If report_all is true, report the value of filter for all evaluated mutations. (Note this will increase the number of calls to filter/the computational cost.)
- relax_mover: After mutation, what mover to use for relax (minimization may be a good idea). This mover is in addition to the repacking done as part of the mutation. (Repacking is done according to task_operations, but is limited to an 8 Angstrom shell around each mutated residue.)
- scorefxn: The scorefunction to use with the mutation repacking
- resfile_name: the output resfile name. Defaults to the value passed to -s on the command line plus ".resfile".
- resfile_general_property: What to do with all other residues in the resfile
- delta: Test the filter against a baseline which is the filter's value at the start of the run?
- unbound: Test the filter in the unbound state?
- jump: If unbound, which jump?
Filter and triage_filter are potentially confusing. You can use the same filter for both. Triage_filter can be more involved, including compound filter statements, whereas the filter option is reserved to filters that have meaningful report_sm methods (ddG, energy...).
The reported values from filter will appear in a Tracer called ResidueScan, so -mute all -unmute ResidueScan will only output the necessary information.
Use the unbound option only on a prepacked structure with jump_number=1, otherwise the reference energy (baseline) won't make any sense.
Simple filter for reporting the time a sequence of movers/filters takes.
Within the protocol, you need to call time at least twice: once when you want to start the timer, and again when you want to stop it. The reported time is that between the first and last calls.
Special Application Filters
Computes the binding energy for the complex and returns true if it is below the threshold, otherwise false. Useful for identifying complexes that have poor binding energy and killing their trajectory.
<Ddg name=(ddg &string) scorefxn=(score12 &string) threshold=(-15 &float) jump=(1 &Integer) repeats=(1 &Integer) repack=(true &bool)/>
- jump: specifies which chains to separate. jump=1 would separate the chains interacting across the first chain termination, jump=2 the second, etc.
- repeats: averages the calculation over the number of repeats. Note that ddg calculations show noise of about 1-1.5 energy units, so averaging over 3-5 repeats is recommended for many applications.
- repack: Should the complex be repacked in the bound and unbound states prior to taking the energy difference? If false, the filter turns into a dG evaluator. If repack=false, repeats should be set to 1, because the energy evaluations converge very well with repack=false.
Ligand docking and enzyme design
(Formerly known as LigDSasa)
Computes the fractional interface delta_sasa for a ligand on a ligand-protein interface and checks to see if it is *between* the lower and upper threshold. A DSasa of 1 means the ligand is totally buried (loses all its accessible surface area), 0 means totally accessible (loses none upon interface formation).
<DSasa name=(&string) lower_threshold=(0.0 &float) upper_threshold=(1.0 &float)/>
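As a small worked example of the fractional value (not Rosetta code), the Python sketch below computes the fraction of the ligand's accessible surface area lost on binding and checks it against the two thresholds. The function and argument names are hypothetical, and treating both thresholds as inclusive is an assumption.

```python
def dsasa_filter(sasa_unbound, sasa_bound, lower_threshold=0.0, upper_threshold=1.0):
    """Return (fractional_dsasa, passes) for a ligand.

    sasa_unbound / sasa_bound: ligand accessible surface area without and
    with the protein present. Inclusive thresholds are an assumption.
    """
    # Fraction of accessible surface lost on interface formation:
    # 1 means fully buried, 0 means fully accessible.
    fraction = (sasa_unbound - sasa_bound) / sasa_unbound
    return fraction, lower_threshold <= fraction <= upper_threshold
```

A ligand with 200 square Angstroms of accessible surface when free and 50 when bound has a DSasa of 0.75, which would pass with lower_threshold=0.5 and upper_threshold=0.9.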
Compares the DSasa of two specified atoms and checks to see if one is greater or less than the other. This is useful for figuring out whether a ligand is oriented in the correct way (i.e. whether in the designed interface one atom is more/less exposed than another).
<DiffAtomBurial name=(&string) res1_res_num/res1_pdb_num=(0, see res_num/pdb_num convention) res2_res_num/res2_pdb_num=(0, see convention) atomname1=(&string) atomname2=(&string) sample_type=(&string)/>
- res1_res_num/res2_res_num: conventional pose numbering of rosetta, res_num=0 will mean ligand (Assuming there is only one ligand)
- res1_pdb_num/res2_pdb_num: conventional pdb_numbering such as 100A (residue 100 chain A), 1X (residue 1 chain X e.g. of ligand)
- atomname1/atomname2: atomnames of the respective atoms
- sample_type: "more" or "less". "more" means Dsasa1>Dsasa2 (atom1 is more buried than atom2); "less" means Dsasa1<Dsasa2 (atom1 is less buried than atom2)
Calculates interface energy across a ligand-protein interface taking into account (or not) enzdes style cst_energy.
<LigInterfaceEnergy name=(&string) scorefxn=(&string) include_cstE=(0 &bool) jump_number=(last_jump &integer) energy_cutoff=(0.0 &float)/>
include_cstE=1 will *not* subtract out the cst energy from interface energy. jump_number defaults to last jump in the pose (assumed to be associated with ligand). energy should be less than energy_cutoff to pass.
Calculates scores of a pose e.g. a ligand-protein interface taking into account (or not) enzdes style cst_energy. Residues can be accessed by res_num/pdb_num or their constraint id. One and only one of res/pdb_num, cstid, and whole_pose tags can be specified. energy should be less than cutoff to pass.
<EnzScore name=(&string) scorefxn=(&string, score12) whole_pose= (&bool,0) score_type = (&string) res_num/pdb_num = (see convention) cstid = (&string) energy_cutoff=(0.0 &float)/>
- cstid: string corresponding to cst_number+template (A or B, as in remarks and cstfile blocks). each enzdes cst is between two residues; A or B allows access to the corresponding residue in a given constraint e.g. cstid=1A means cst #1 template A (i.e. for the 1st constraint, the residue corresponding to the block that is described first in the cstfile and its corresponding REMARK line in header), cstid=4B (for the 4th constraint, the residue that is described second in the cstfile block and its REMARK line in header).
- score_type: usual rosetta score_types; cstE will calculate enzdes style constraint energy
- whole_pose: calculate total scores for whole pose
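The cstid format can be illustrated with a short Python sketch that splits a cstid into its constraint number and template letter. This only demonstrates the naming convention described above. The function name is hypothetical, and Rosetta's own parsing may differ.

```python
def parse_cstid(cstid):
    """Split a cstid such as '1A' into (constraint_number, template)."""
    number, template = cstid[:-1], cstid[-1]
    if template not in ("A", "B"):
        raise ValueError("template must be A or B, got %r" % template)
    # Constraint numbers are 1-based, following the order in the cstfile.
    return int(number), template
```

For example, `parse_cstid("4B")` gives constraint 4, template B, i.e. the residue described second in the fourth cstfile block.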
Calculates delta_energy or RMSD of protein residues in a protein-ligand interface when the ligand is removed and the interface repacked. RMSD of a subset of these repacked residues (such as catalytic residues) can be accessed by setting the appropriate tags.
<RepackWithoutLigand name=(&string) scorefxn=(&string, score12) target_res = (&string) target_cstids = (&string) energy_threshold=(0.0 &float) rms_threshold=(0.5 &float)/>
- target_cstids: comma-separated list corresponding to cstids (see EnzScore for cstid format)
- target_res: comma-separated list corresponding to res_nums/pdb_nums (following usual convention) OR "all_repacked" which will include all repacked neighbors of the ligand (the repack shell).
- rms_threshold: maximum allowed RMS of repacked region; (i.e. RMSD<rms_threshold filter passes, else fails)
- energy_threshold: delta_Energy allowed (i.e. if E(with_ligand)-E(no_ligand) < threshold, filter passes else fails)
<HeavyAtom name="&string" chain="&string" heavy_atom_limit=(&int)/>
Stop growing this designed ligand once it reaches this heavy atom limit.
<CompleteConnections name="&string" chain="&string"/>
Are there any connections left to fulfill? If not, stop growing the ligand.
The following Filters are available through RosettaScripts, but are not currently documented. See the code (particularly the respective parse_my_tag() and apply() functions) for details. (Some may be undocumented as they are experimental/not fully functional.)