Last updated 2011-02-18. Corresponding authors are Justin Ashworth and David Baker.
To run the demos (integration tests), obtain the source package, enter directory main/tests/integration, and run the command line:
(note: run ./integration.py -h to set additional options and paths that may be necessary depending on your system)
This application provides support for biophysical modeling of protein-DNA interactions. It provides sampling and energetic estimates of structural and mutational degrees of freedom in the protein-DNA interface.
This application provides access to the full-atom, primarily fixed-backbone protocols in the protocols/dna directory through the RosettaScripts and JobDistributor schemes. These schemes are highly extensible: this application also provides access to any other class or procedure that is implemented under the RosettaScripts scheme. Currently, the primary difference between this application and the RosettaScripts application is that this one uses a custom JobOutputter class to prepend protein-DNA-specific information to output PDB files.
This application is not officially supported for anything that is not described in the referenced literature citations.
Please see the demos (integration tests) for examples of the three modes this application provides: automated basepair-specific design, resfile-directed design, and basepair specificity prediction.
Most protocol-specific behavior is specified through the RosettaScripts interface. Here are examples of the primary class options:
<RestrictDesignToProteinDNAInterface name="DnaInt" base_only="1" z_cutoff="3.0" dna_defs="C.-10.GUA"/>
This class automatically detects the relevant amino acids to move and mutate (those near DNA). base_only: should backbone contacts be considered?; z_cutoff: distance cutoff for contact to a certain basepair, measured along the DNA helical axis; dna_defs: which base pairs to consider (uses PDB chain id and numbering).
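The z_cutoff criterion can be pictured as a projection onto the helical axis. The following is a minimal sketch of that geometry only, not the Rosetta implementation: the function names, the choice of basepair origin, and the inclusive comparison are all assumptions made for illustration.

```python
import math


def axial_offset(point, bp_origin, axis):
    """Signed distance of `point` from `bp_origin`, measured along `axis`.

    `axis` is a vector along the DNA helical axis (any non-zero length).
    All names here are hypothetical; Rosetta's own code may differ.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("axis must be a non-zero vector")
    # Dot product of the displacement with the unit axis vector.
    return sum((p - o) * a for p, o, a in zip(point, bp_origin, axis)) / norm


def within_z_cutoff(point, bp_origin, axis, z_cutoff=3.0):
    """True if `point` lies within `z_cutoff` of the basepair along the axis.

    Assumption: the comparison is inclusive and uses the absolute offset.
    """
    return abs(axial_offset(point, bp_origin, axis)) <= z_cutoff


if __name__ == "__main__":
    # A point 5 units off-axis but only 2 units along the axis still passes.
    print(within_z_cutoff((5.0, 0.0, 2.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

Note that lateral distance from the axis does not enter this check; only displacement along the axis does, which is what distinguishes it from a plain distance cutoff.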
<DNA weights="dna"/>
Sets the weights file for the internal score function.
<DnaInterfaceMultiStateDesign name="msd" scorefxn="DNA" task_operations="IFC,IC,AUTOprot,DnaInt" pop_size="20" num_packs="1" numresults="0" boltz_temp="2" anchor_offset="15" mutate_rate="0.8" generations="5"/>
This class performs multi-state design of amino acids for improved discrimination between sequences. This functionality is not officially supported. pop_size: number of different protein sequences; num_packs: number of packing trials per sequence; numresults: number of results to return; boltz_temp: the Boltzmann temperature factor to use when effectively comparing energy fitnesses; anchor_offset: approximate number of energy units that can be traded in order to gain specificity; mutate_rate: rate of mutation for a single sequence; generations: number of times to "evolve" the set of sequences.
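To give a feel for what a Boltzmann temperature factor does when energies are compared, here is a generic Boltzmann-weighting sketch. It is not the fitness function used by DnaInterfaceMultiStateDesign: the function name is hypothetical, and anchor_offset is deliberately not modeled because its exact role is not described here.

```python
import math


def boltzmann_fraction(target_energy, competitor_energies, boltz_temp=2.0):
    """Boltzmann-weighted share of the target state among all states.

    Generic illustration only (an assumption, not Rosetta's fitness function).
    Lower energy is better; a higher `boltz_temp` flattens the differences.
    """
    if boltz_temp <= 0:
        raise ValueError("boltz_temp must be positive")
    energies = [target_energy] + list(competitor_energies)
    # Shift by the minimum energy so the exponentials cannot overflow.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / boltz_temp) for e in energies]
    return weights[0] / sum(weights)


if __name__ == "__main__":
    # Target sequence is 2 energy units better than a single competitor.
    print(boltzmann_fraction(-12.0, [-10.0], boltz_temp=2.0))
    # The same gap counts for less at a higher temperature.
    print(boltzmann_fraction(-12.0, [-10.0], boltz_temp=10.0))
```

In this picture a small boltz_temp rewards even modest energy gaps strongly, while a large one treats states with similar energies as nearly equivalent.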
<DesignProteinBackboneAroundDNA name="bb" scorefxn="DNA" task_operations="IFC,IC,AUTOprot,DnaInt" type="ccd" gapspan="4" spread="3" cycles_outer="3" cycles_inner="1" temp_initial="2" temp_final="0.6"/>
This class tries to introduce small local changes in backbone structure. gapspan and spread determine how much of the backbone is made flexible (spread refers to the number of backbone residues on either side of each DNA-contacting residue, and gapspan sets the number of amino acids between spreads that results in their concatenation). cycles_outer and cycles_inner set the number of times to sample possible backbone conformations. temp_initial and temp_final affect the starting and ending acceptance rates of backbone changes that increase free energy (higher numbers: higher acceptance rates and thus more aggressive and potentially destabilizing changes).
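The interplay of spread and gapspan described above can be sketched as interval expansion followed by merging. This is an illustrative reading of the description, not the class's actual code: the function name, 1-based residue numbering, and the assumption that a gap of at most gapspan residues triggers concatenation are all hypothetical.

```python
def flexible_segments(contact_positions, spread=3, gapspan=4, n_residues=None):
    """Return (start, end) residue ranges to be made flexible.

    Each DNA-contacting residue is widened by `spread` residues on either
    side; neighbouring ranges separated by `gapspan` or fewer residues are
    concatenated. Boundary handling here is an assumption for illustration.
    """
    segments = []
    for pos in sorted(set(contact_positions)):
        start = max(1, pos - spread)
        end = pos + spread
        if n_residues is not None:
            end = min(n_residues, end)
        # Residues lying strictly between the previous range and this one.
        if segments and start - segments[-1][1] - 1 <= gapspan:
            segments[-1][1] = max(segments[-1][1], end)
        else:
            segments.append([start, end])
    return [tuple(seg) for seg in segments]


if __name__ == "__main__":
    # Ranges 7-13 and 17-23 are 3 residues apart, so they are concatenated.
    print(flexible_segments([10, 20], spread=3, gapspan=4))
    # Ranges 7-13 and 27-33 are too far apart and stay separate.
    print(flexible_segments([10, 30], spread=3, gapspan=4))
```

Larger values of either parameter make more of the backbone flexible: spread widens each range, and gapspan decides how readily nearby ranges fuse into one.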
<DnaInterfacePacker name="DnaPack" scorefxn="DNA" task_operations="IFC,IC,AUTOprot,ProtNoDes,DnaInt" binding="1" probe_specificity="1"/>
This class controls the fixed-backbone packing stage of the protocol. task_operations: lists the TaskOperations, defined in the RosettaScripts file, that control the behavior of the packer (see the full script in the demos (integration tests) for examples of these). binding: sets whether binding energies are calculated (0: no, 1: yes). probe_specificity: sets whether the basepair specificity of the protein is calculated after the packer has finished altering the interface. Please see the demos and RosettaScripts documentation for more details.
The following general Rosetta command-line options are currently used in demos:
-adducts dna_major_groove_water: activates hydrated nucleotides
-sparse_pdb_output: only write PDB lines for residues that differ from the input structure (to reduce disk space and memory usage)
-file:s [your.pdb]: specifies the input structure file
-in:ignore_unrecognized_res: this option toggles failure upon encountering an unknown residue in the input file
-score:weights dna: sets scorefunction weights for the output structure (separate from, but preferably the same as, the scorefunction weights specified in the RosettaScripts file)
-use_input_sc: toggles inclusion of original sidechain orientations in conformational sampling
-run:output_hbond_info: adds hydrogen bonding information to the output file
-score:output_residue_energies: adds residue energies to the output file
-jd2:dd_parser: required option for RosettaScripts
-parser:protocol design.script: specifies the protocol file for RosettaScripts
-overwrite: overwrites old output files when re-run
-out:prefix design_: output file prefix
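For scripted runs, the options listed above can be collected into an argument list. The sketch below only assembles the flags named in this documentation; the helper name is hypothetical, and the executable itself is deliberately left out because its name and path depend on your build.

```python
import shlex


def build_demo_args(pdb_path, script_path="design.script", prefix="design_",
                    extra_rotamers=False):
    """Assemble the command-line options used in the demos.

    Hypothetical helper: returns the options only. Prepend the path to your
    own application executable before running.
    """
    args = [
        "-adducts", "dna_major_groove_water",
        "-sparse_pdb_output",
        "-file:s", pdb_path,
        "-in:ignore_unrecognized_res",
        "-score:weights", "dna",
        "-use_input_sc",
        "-run:output_hbond_info",
        "-score:output_residue_energies",
        "-jd2:dd_parser",
        "-parser:protocol", script_path,
        "-overwrite",
        "-out:prefix", prefix,
    ]
    if extra_rotamers:
        # Additional rotamer sampling, as suggested for higher quality results.
        args += ["-ex1", "-ex2"]
    return args


if __name__ == "__main__":
    # Print a shell-quoted option string for inspection (Python 3.8+).
    print(shlex.join(build_demo_args("your.pdb", extra_rotamers=True)))
```

Keeping the options in a list rather than a single string avoids quoting problems if the list is later passed to a process launcher together with the executable path.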
The demos (integration tests) are designed to run quickly. For higher quality results, additional rotamer sampling can be enabled using the -ex1 and -ex2 command line options. Also, see the scientific tests (mentioned earlier in this documentation under Files) for more useful parameters.
Optimal performance depends upon a few minor modifications of the code and database files. For this reason, alternative database files are provided in test/scientific/cluster/dna_interface_design/rosetta_database_sparse/. The modifications are summarized below:
These protocols output files through the JobDistributor scheme. Normally, this results in a PDB file. Often, this PDB file will contain header information that describes the results of the analysis.