LigandBindingAssemblyMover is derived from AppendAssemblyMover. It builds new contacts to a partially coordinated ligand while simultaneously building new protein backbone using SEWING.

## Description

LigandBindingAssemblyMover takes a partial ligand binding site and its ligand and, given user-provided information on how the ligand should be coordinated, adds new structural elements that complete the ligand's coordination with the correct geometry. It also makes any mutations needed for this coordination and outputs PDBInfoRemarks listing all coordinating residues.

## Usage

NOTE: This is currently unavailable but will be automatically updated when the new version of SEWING is merged.

Autogenerated Tag Syntax Documentation:

AssemblyMover designed to add contacts to specified ligand atoms and then build an assembly around the ligand.

<LigandBindingAssemblyMover name="(&string;)" start_temperature="(0.6 &real;)"
end_temperature="(&real;)" add_probability="(&real;)"
delete_probability="(0.005 &real;)"
conformer_switch_probability="(0 &real;)"
window_width="(4 &positive_integer;)"
minimum_cycles="(10000 &non_negative_integer;)"
maximum_cycles="(100000 &non_negative_integer;)"
model_file_name="(&string;)" hashed="(false &bool;)"
edge_file_name="(&string;)" max_segments="(100 &non_negative_integer;)"
max_segment_length="(100 &non_negative_integer;)"
output_pose_per_move="(false &bool;)"
recover_lowest_assembly="(true &bool;)"
recursive_depth="(1 &non_negative_integer;)"
pose_segment_starts="(&int_cslist;)" pose_segment_ends="(&int_cslist;)"
pose_segment_dssp="(&string;)" strict_dssp_changes="(false &bool;)"
set_segments_from_dssp="(false &bool;)" match_segments="(&int_cslist;)"
partner_pdb="(&string;)"
required_resnums="(&refpose_enabled_residue_number_cslist;)"
max_recursion="(1 &non_negative_integer;)"
modifiable_terminus="(B &string;)" output_partner="(true &bool;)"
extend_mode="(false &bool;)" start_node_vital_segments="(all &string;)"
required_selector="(&string;)" distance_cutoff="(10.0 &real;)"
segment_distance_cutoff="(1 &non_negative_integer;)"
binding_cycles="(1000 &non_negative_integer;)"
build_site_only="(false &bool;)" >
<AssemblyScorers >
<InterModelMotifScorer name="(&string;)" weight="(1.0 &real;)" />
<IntraDesignTerminusMotifScorer name="(&string;)" weight="(1.0 &real;)"
optimum_distance="(&real;)" maximum_unpenalized_variance="(&real;)" />
<LigandScorer name="(&string;)" weight="(1.0 &real;)"
ligand_interaction_cutoff_distance="(5.0 &real;)" />
<MotifScorer name="(&string;)" weight="(1.0 &real;)" />
<PartnerMotifScorer name="(&string;)" weight="(1.0 &real;)" />
<SegmentContactOrderScorer name="(&string;)" weight="(1.0 &real;)" />
<StartingNodeMotifScorer name="(&string;)" weight="(1.0 &real;)" />
<SubsetPartnerMotifScorer name="(&string;)" weight="(1.0 &real;)"
region_start="(1 &non_negative_integer;)"
region_end="(2 &non_negative_integer;)" />
<TerminusMotifScorer name="(&string;)" weight="(1.0 &real;)"
partner_residue="(&non_negative_integer;)" optimum_distance="(&real;)"
maximum_unpenalized_variance="(&real;)" terminus="(&string;)" />
<TopNMotifScorer name="(&string;)" weight="(1.0 &real;)"
scores_to_keep="(1 &non_negative_integer;)" />
</AssemblyScorers>
<AssemblyRequirements >
<ClashRequirement name="(&string;)"
maximum_clashes_allowed="(0 &non_negative_integer;)"
clash_radius="(&real;)" />
<DsspSpecificLengthRequirement name="(&string;)" dssp_code="(X &dssp_enum;)"
maximum_length="(100 &non_negative_integer;)"
minimum_length="(0 &non_negative_integer;)" />
<KeepLigandContactsRequirement name="(&string;)"
contact_distance_cutoff="(2.5 &real;)" />
<LengthInResiduesRequirement name="(&string;)"
maximum_length="(10000 &non_negative_integer;)"
minimum_length="(0 &non_negative_integer;)" />
<LigandClashRequirement name="(&string;)"
maximum_clashes_allowed="(0 &non_negative_integer;)"
clash_radius="(&real;)" />
<NonTerminalStartingSegmentRequirement />
<SizeInSegmentsRequirement name="(&string;)"
maximum_size="(10000 &non_negative_integer;)"
minimum_size="(0 &non_negative_integer;)" />
</AssemblyRequirements>
<Ligands >
<Ligand partner_ligand="(false &bool;)" pdb_conformers="(&string;)"
alignment_atoms="(&string;)" auto_detect_contacts="(true &bool;)"
ligand_resnum="(&refpose_enabled_residue_number;)"
ligand_selector="(&string;)" >
<Contact partner_contact="(false &bool;)"
contact_resnum="(&refpose_enabled_residue_number;)"
ligand_atom_name="(&string;)" contact_atom_name="(&string;)" />
<Coordination coordination_files="(&string;)"
geometry_score_threshold="(1 &real;)" >
<IdealContacts distance="(&real;)" angle="(109.5 &real;)"
dihedral_1="(30 &real;)" dihedral_2="(30 &real;)"
max_coordinating_atoms="(&non_negative_integer;)"
ligand_atom_name="(&string;)" />
</Coordination>
</Ligand>
</Ligands>
</LigandBindingAssemblyMover>
• start_temperature: Temperature at start of simulated annealing
• end_temperature: Temperature at end of simulated annealing
• add_probability: Probability of adding a triplet of segments at any given step during assembly
• delete_probability: Probability of deleting a terminal triplet of segments at any given step during assembly
• conformer_switch_probability: Probability of switching ligand conformers during assembly. This should only be used if a ligand is present AND if you have provided conformers for that ligand.
• window_width: Required number of overlapping residues for two segments to be considered a match. Used in hashless SEWING only (for hashed SEWING, this is determined by the hasher settings used when generating the edge file).
• minimum_cycles: Minimum number of Monte Carlo cycles for assembly before completion requirements are checked.
• maximum_cycles: Maximum number of Monte Carlo cycles for assembly before forced termination.
• model_file_name: (REQUIRED) Path to file defining segments to use during assembly
• hashed: Use the hasher during assembly to check overlap of all atoms? Requires an input edge file.
• edge_file_name: Path to edge file to use during assembly (only used if hashed is set to true)
• max_segments: Maximum number of segments to include in the final assembly
• max_segment_length: Maximum number of residues to include in a segment
• output_pose_per_move: Setting to true will output a pose after each move/revert.
• recover_lowest_assembly: Setting to true will output the lowest-scoring assembly in the final pose
• recursive_depth: How many nodes after the terminal node should we keep track of alignments for?
• pose_segment_starts: Residue numbers of the first residue in each segment in the input pose
• pose_segment_ends: Residue numbers of the last residue in each segment in the input pose. Length must match that of pose_segment_starts.
• pose_segment_dssp: String indicating the secondary structure of user-specified segments, one character per segment (e.g. HLH for a helix-loop-helix motif). Length should match that of pose_segment_starts and pose_segment_ends if specified.
• strict_dssp_changes: Segments require at least a 2-residue change in DSSP to specify a new segment
• set_segments_from_dssp: Determine segment boundaries based on pose secondary structure
• match_segments: Which segments from the input pose should we be able to append onto? Defaults to exterior segments.
• partner_pdb: Name of PDB file containing binding partner for this assembly
• required_resnums: Residue numbers of residues in the input structure that must be preserved
• max_recursion: How many alignments from the end nodes should be stored in memory?
• modifiable_terminus: Which terminus of the starting node may be modified.
• output_partner: Should the output pdb contain the partner?
• extend_mode: Should SEWING append only a single helix?
• start_node_vital_segments: Which segments from starting node are vital? (terminal or all)
• required_selector: Residue selector specifying residues in the input structure that must be preserved. The name of a previously declared residue selector or a logical expression of AND, NOT (!), OR, parentheses, and the names of previously declared residue selectors. Any capitalization of AND, NOT, and OR is accepted. An exclamation mark can be used instead of NOT. Boolean operators have their traditional priorities: NOT then AND then OR. For example, if selectors s1, s2, and s3 have been declared, you could write: 's1 or s2 and not s3' which would select a particular residue if that residue were selected by s1 or if it were selected by s2 but not by s3.
• distance_cutoff: Cutoff for distance in Angstroms between segment and ligand to check for contacts
• segment_distance_cutoff: Maximum number of segments away from the starting structure to look for contacts
• binding_cycles: How many cycles to check for ligand contacts before giving up
• build_site_only: Should we stop after finding all of the desired contacts?
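
As an illustration of how the segment-related attributes above fit together, the following is a minimal sketch of a mover tag that defines the input pose's segments by hand. The file name, mover name, residue numbers, and ligand position are all hypothetical; the only point is that pose_segment_starts, pose_segment_ends, and pose_segment_dssp each carry one entry per segment. It is a fragment, not a complete working setup (no Coordination information is given).

```xml
<!-- Sketch only: file name, residue numbers, and ligand position are hypothetical -->
<LigandBindingAssemblyMover name="assemble_manual_segments"
    model_file_name="my_models.segments"
    hashed="false" window_width="4"
    pose_segment_starts="1,15,21"
    pose_segment_ends="14,20,35"
    pose_segment_dssp="HLH" >
  <!-- Three segments: helix 1-14, loop 15-20, helix 21-35 -->
  <Ligands>
    <Ligand ligand_resnum="36" auto_detect_contacts="true" />
  </Ligands>
</LigandBindingAssemblyMover>
```

Because hashed is false here, no edge_file_name is supplied and window_width sets the required overlap, as described in the attribute list.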

Subtag AssemblyScorers: The subtags of this tag define the AssemblyScoreFunction that will be used to evaluate assemblies

Subtag InterModelMotifScorer: Basic Motif score among non-adjacent helices

• weight: How heavily will this term be weighted during scoring?

Subtag IntraDesignTerminusMotifScorer: Motif score to measure packing of assembly against partner PDB

• weight: How heavily will this term be weighted during scoring?
• optimum_distance: How far apart should that residue optimally be from the terminus?
• maximum_unpenalized_variance: How far off from that can it be before it should be penalized?

Subtag LigandScorer: Scores how well the ligand is buried based on the orientation of nearby C-alpha atoms

• weight: How heavily will this term be weighted during scoring?
• ligand_interaction_cutoff_distance: The distance cutoff between a ligand atom and a C-alpha atom that is considered an interaction.

Subtag MotifScorer: Basic Motif score among all helices

• weight: How heavily will this term be weighted during scoring?

Subtag PartnerMotifScorer: Motif score to measure packing of assembly against partner PDB

• weight: How heavily will this term be weighted during scoring?

Subtag SegmentContactOrderScorer: Favors assemblies whose segments form contacts with segments distant in the assembly

• weight: How heavily will this term be weighted during scoring?

Subtag StartingNodeMotifScorer: Specifically scores packing against the starting node

• weight: How heavily will this term be weighted during scoring?

Subtag SubsetPartnerMotifScorer: Motif score to measure packing of assembly against partner PDB

• weight: How heavily will this term be weighted during scoring?
• region_start: What is the first residue of the scored subset?
• region_end: What is the last residue of the scored subset?

Subtag TerminusMotifScorer: Motif score to measure packing of assembly against partner PDB

• weight: How heavily will this term be weighted during scoring?
• partner_residue: Which residue of the partner should this scorer calculate distance to?
• optimum_distance: How far apart should that residue optimally be from the terminus?
• maximum_unpenalized_variance: How far off from that can it be before it should be penalized?
• terminus: Which terminus should be scored?

Subtag TopNMotifScorer: Basic Motif score among all helices

• weight: How heavily will this term be weighted during scoring?
• scores_to_keep: How many scores from each pair should be counted?
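
For orientation, a score function built from the scorers listed above might look like the sketch below. The choice of scorers and the weights are hypothetical and are not recommendations from this page; the name attributes are omitted on the assumption that they are optional for scorers, as they appear to be for the requirements in the example script at the end of this page.

```xml
<!-- Sketch only: scorer selection and weights are illustrative assumptions -->
<AssemblyScorers>
  <MotifScorer weight="1.0" />
  <InterModelMotifScorer weight="10.0" />
  <LigandScorer weight="1.0" ligand_interaction_cutoff_distance="5.0" />
</AssemblyScorers>
```

This block would be placed directly inside the LigandBindingAssemblyMover tag, alongside AssemblyRequirements and Ligands.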

Subtag AssemblyRequirements: Subtags of this tag define the set of requirements that will be used when evaluating SEWING assemblies

Subtag ClashRequirement: Checks for clashes between segments in the assembly

• maximum_clashes_allowed: Maximum number of clashes to allow in the assembly
• clash_radius: Radius in Angstroms within which two residues are considered to be clashing

Subtag DsspSpecificLengthRequirement: Restricts the number of residues in segments with the specified DSSP

• dssp_code: DSSP code whose length the requirement is restricting
• maximum_length: Maximum number of residues in a segment with the given secondary structure
• minimum_length: Minimum number of residues in a segment with the given secondary structure

Subtag KeepLigandContactsRequirement: Fails if an assembly's ligands lose more than a set number of contacts

• contact_distance_cutoff: Maximum distance between two contact atoms before the contact is considered broken

Subtag LengthInResiduesRequirement: Checks the number of residues in the assembly

• maximum_length: Maximum number of residues to allow in the assembly
• minimum_length: Minimum number of residues in the final assembly

Subtag LigandClashRequirement: Checks for clashes between the assembly and its ligands

• maximum_clashes_allowed: Maximum number of clashes to allow in the assembly
• clash_radius: Radius in Angstroms within which two residues are considered to be clashing

Subtag SizeInSegmentsRequirement: Checks the number of segments in the assembly

• maximum_size: Maximum number of secondary structure elements (including loops) to allow in the assembly
• minimum_size: Minimum number of secondary structure elements (including loops) in the final assembly

Subtag Ligands: Subtags of this tag specify the ligands present in the input pose and their respective protein contacts.

Subtag Ligand: Specifies the position of a ligand and the contacts that it forms with the input pose

• partner_ligand: Is this ligand found in the partner PDB?
• pdb_conformers: Name of file containing a list of PDBs (or other Rosetta-compatible input files) containing alternate ligand conformations to sample
• alignment_atoms: Comma-separated list of atom names to use when aligning ligand conformers to one another
• auto_detect_contacts: Should we automatically detect contacts that are joined to the ligand by inter-residue chemical bonds?
• ligand_resnum: Residue number of ligand in either PDB or Rosetta numbering
• ligand_selector: Residue selector indicating ligand(s) covered in this tag. The name of a previously declared residue selector or a logical expression of AND, NOT (!), OR, parentheses, and the names of previously declared residue selectors. Any capitalization of AND, NOT, and OR is accepted. An exclamation mark can be used instead of NOT. Boolean operators have their traditional priorities: NOT then AND then OR. For example, if selectors s1, s2, and s3 have been declared, you could write: 's1 or s2 and not s3' which would select a particular residue if that residue were selected by s1 or if it were selected by s2 but not by s3.

Subtag Contact:

• partner_contact: Does this tag specify a contact with the partner PDB?
• contact_resnum: (REQUIRED) Number of residue participating in this contact in PDB or Rosetta numbering
• ligand_atom_name: Rosetta name for the ligand atom participating in the contact
• contact_atom_name: Rosetta name for the protein atom participating in the contact
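
When contacts are listed by hand rather than detected automatically, the Ligand tag might be written as in the sketch below. The residue numbers are hypothetical, and the atom names assume a zinc ion ("ZN") held by the NE2 atoms of two histidines; adjust them to match the input pose.

```xml
<!-- Sketch only: residue numbers and atom names are hypothetical -->
<Ligands>
  <Ligand ligand_resnum="100" auto_detect_contacts="false" >
    <Contact contact_resnum="25" ligand_atom_name="ZN" contact_atom_name="NE2" />
    <Contact contact_resnum="29" ligand_atom_name="ZN" contact_atom_name="NE2" />
  </Ligand>
</Ligands>
```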

Subtag Coordination: Contains subtags defining ideal coordination environments for atoms in the ligand

• coordination_files: Comma-separated list of coordination file names for this ligand
• geometry_score_threshold: Maximum geometry score to allow when forming a contact

Subtag IdealContacts:

• distance: (REQUIRED) Ideal distance between ligand and contact atom
• angle: Ideal angle between this atom's contacts
• dihedral_1: Ideal dihedral angle: contact_base - contact - ligand_atom - other_contact
• dihedral_2: Ideal dihedral angle: contact - ligand_atom - other_contact - other_base
• max_coordinating_atoms: (REQUIRED) Maximum number of contacts that this atom can form. Note that IdealContacts tags do not need to be defined for atoms with no contacts.
• ligand_atom_name: (REQUIRED) Rosetta name of the ligand atom to which this tag applies

### Defining the ligand coordination environment

The Coordination subtag of the Ligand tag (documented above) contains two types of ligand coordination information. First, its "coordination_files" attribute provides a comma-separated list of all ligand coordination files (described below) that will be used to identify ligand contacts. Second, its IdealContacts subtags describe each atom's preferred geometry. Each ligand atom that will form new contacts has its own IdealContacts subtag.

#### IdealContacts

Each atom's IdealContacts tag names the atom to which it applies ("ligand_atom_name") and defines its preferred bond length ("distance"), the bond angles about the atom ("angle"), and the ideal dihedral angles about the atom ("dihedral_1" and "dihedral_2") as defined in the tag syntax above. Note that angles and dihedral angles are accepted if they are any multiple of the entered value; for example, if the user requests a 90 degree angle, then bond angles of 180 degrees will also be accepted. The tag also sets the maximum number of contacts/bonds that the ligand atom should form ("max_coordinating_atoms"), including existing contacts.
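
To make the multiples rule concrete, here is a sketch of an IdealContacts tag for a hypothetical six-coordinate metal atom. The atom name, distance, and coordination file name are placeholders. With angle set to 90, contacts at 180 degrees to an existing contact would also be accepted under the rule described above. The dihedral attributes are left out, so the defaults shown in the tag syntax (30) would apply.

```xml
<!-- Sketch only: atom name, distance, and file name are hypothetical -->
<Coordination coordination_files="my_coordination_stats.txt" geometry_score_threshold="5" >
  <IdealContacts ligand_atom_name="FE" distance="2.1" angle="90"
      max_coordinating_atoms="6" />
</Coordination>
```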

The geometry score threshold ("geometry_score_threshold"), which is set on the enclosing Coordination tag rather than on IdealContacts, indicates the tolerance that LigandBindingAssemblyMover should apply when scoring the geometry of newly added ligand contacts. This score is calculated as follows:

geometry_score = (delta_distance)^2 + 10*(mod(angle, ideal_angle))^2 + 5*(mod(dihedral_1, ideal_dihedral_1))^2 + 5*(mod(dihedral_2, ideal_dihedral_2))^2

Note that all angles are converted to radians for this calculation.
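
A worked example may help in judging what a given threshold means. The ideal values below are taken from the example script at the end of this page (distance 2.2, angle 109.5, dihedral_1 30, dihedral_2 120); the measured values (2.3, 115, 40, and 125) are invented for illustration. The calculation reads mod literally as the remainder after division by the ideal value and takes distances to be in Angstroms; both are assumptions, and this page does not say how measured values just below a multiple of the ideal are handled.

```latex
\begin{aligned}
\Delta d &= 2.3 - 2.2 = 0.1 \\
\operatorname{mod}(115^\circ,\ 109.5^\circ) &= 5.5^\circ \approx 0.0960\ \text{rad} \\
\operatorname{mod}(40^\circ,\ 30^\circ) &= 10^\circ \approx 0.1745\ \text{rad} \\
\operatorname{mod}(125^\circ,\ 120^\circ) &= 5^\circ \approx 0.0873\ \text{rad} \\
\text{geometry\_score} &\approx (0.1)^2 + 10\,(0.0960)^2 + 5\,(0.1745)^2 + 5\,(0.0873)^2 \\
&\approx 0.010 + 0.092 + 0.152 + 0.038 \approx 0.29
\end{aligned}
```

Under these assumptions the contact scores about 0.29, well below a threshold of 5, so it would be accepted.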

The recommended geometry score threshold will vary depending on the coordination environment of each ligand atom. For example, for a zinc ion with two initial contacts, a threshold of 5 provides the best balance of good geometry and high success rate for the addition of a third contact whereas a threshold of 20 provides the best results for a fourth contact.
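
The two zinc cases described above could be expressed as the following Coordination fragments, one per run. The geometry values are copied from the example script at the end of this page, the thresholds are the ones quoted above, and the coordination file name is a placeholder.

```xml
<!-- Sketch only: file name is hypothetical; use one fragment per run -->

<!-- Zinc with two existing contacts, adding a third -->
<Coordination coordination_files="my_zinc_stats.txt" geometry_score_threshold="5" >
  <IdealContacts ligand_atom_name="ZN" max_coordinating_atoms="3"
      angle="109.5" distance="2.2" dihedral_1="30" dihedral_2="120" />
</Coordination>

<!-- Zinc with three existing contacts, adding a fourth -->
<Coordination coordination_files="my_zinc_stats.txt" geometry_score_threshold="20" >
  <IdealContacts ligand_atom_name="ZN" max_coordinating_atoms="4"
      angle="109.5" distance="2.2" dihedral_1="30" dihedral_2="120" />
</Coordination>
```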

#### Ligand Coordination Files

The format for ligand coordination files can be found here.

### Post-Processing

In many cases, users will want to add additional structural elements using AppendAssemblyMover after their ligands are fully coordinated. This can be done with a separate mover, which allows for additional filtering/inspection beforehand, or automatically within LigandBindingAssemblyMover by setting the "build_site_only" option to false. In the latter case, the minimum_cycles and maximum_cycles options set the length of the AppendAssemblyMover run (whereas "binding_cycles" sets the length of the LigandBindingAssemblyMover run), and the start_temperature and end_temperature options control temperature ramping. The automatic option is not recommended in most cases because AppendAssemblyMover generally performs better with different add/delete probabilities and temperatures than LigandBindingAssemblyMover. For refinement after SEWING, please see the refinement of SEWING assemblies page.
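
The separate-mover route could be laid out roughly as follows. This is a sketch under stated assumptions: it assumes AppendAssemblyMover accepts the attributes that LigandBindingAssemblyMover inherits from it, the mover names, file name, and ligand position are hypothetical, and the end_temperature and add_probability values are placeholders. Any ligand or requirement information that AppendAssemblyMover needs is omitted.

```xml
<!-- Sketch only: names, file paths, and several values are hypothetical -->
<MOVERS>
  <!-- Stage 1: complete the binding site and stop -->
  <LigandBindingAssemblyMover name="build_site" model_file_name="my_models.segments"
      binding_cycles="20000" build_site_only="true" >
    <Ligands>
      <Ligand ligand_resnum="100" auto_detect_contacts="true" />
    </Ligands>
  </LigandBindingAssemblyMover>
  <!-- Stage 2: extend the assembly with its own cycles, temperatures, and probabilities -->
  <AppendAssemblyMover name="extend" model_file_name="my_models.segments"
      minimum_cycles="10000" maximum_cycles="100000"
      start_temperature="0.6" end_temperature="0.6"
      add_probability="0.05" delete_probability="0.005" />
</MOVERS>
<PROTOCOLS>
  <Add mover="build_site" />
  <Add mover="extend" />
</PROTOCOLS>
```

Keeping the two stages apart lets each mover use its own temperatures and add/delete probabilities, and leaves room for filters between them.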

## Example

The following is an example RosettaScript using LigandBindingAssemblyMover:

<ROSETTASCRIPTS>
<SCOREFXNS>
</SCOREFXNS>
<RESIDUE_SELECTORS>
<ResidueName name="select_zn" residue_name3=" ZN" />
</RESIDUE_SELECTORS>
<FILTERS>
</FILTERS>
<MOVERS>
<LigandBindingAssemblyMover name="assemble" binding_cycles="20000" model_file_name="/nas02/home/g/u/guffy/netscr/sewing_with_zinc/input_files/smotifs_H_5_40_L_1_6_H_5_40.segments" add_probability="0.05" delete_probability="0.05" hashed="false" segment_distance_cutoff="2" distance_cutoff="8.0" start_temperature="2.0" build_site_only="true" window_width="4" >
<Ligands>
<Ligand ligand_selector="select_zn" auto_detect_contacts="true" >
<Coordination coordination_files="/nas02/home/g/u/guffy/netscr/sewing_with_zinc/input_files/H_NEW_stats.txt" geometry_score_threshold="5" >
<IdealContacts ligand_atom_name="ZN" max_coordinating_atoms="3" angle="109.5" distance="2.2" dihedral_1="30" dihedral_2="120" />
</Coordination>
</Ligand>
</Ligands>
<AssemblyRequirements>
<DsspSpecificLengthRequirement dssp_code="L" maximum_length="6" /> <!-- Prevents super-long loops -->
<DsspSpecificLengthRequirement dssp_code="H" minimum_length="10" /> <!-- Prevents super-short helices -->
<ClashRequirement />
<SizeInSegmentsRequirement maximum_size="9" minimum_size="5" />
<LigandClashRequirement />
</AssemblyRequirements>
</LigandBindingAssemblyMover>
</MOVERS>
<APPLY_TO_POSE>
</APPLY_TO_POSE>
<PROTOCOLS>
<Add mover="assemble" />
</PROTOCOLS>
</ROSETTASCRIPTS>