Rosetta 3.4
Documentation for RosettaRemodel

Metadata

This document was last edited on January 20, 2012 by Ron Jacak. The code was written by Possu Huang and Andrew Ban. The corresponding PI is William Schief (schief@scripps.edu).

Code and Demos

The remodel application's entry point is rosetta/rosetta_source/src/apps/pilot/possu/remodel.cc. It uses classes from the protocols library, which can be found in the directory rosetta/rosetta_source/src/protocols/forge/remodel.

References

We recommend the following articles for further studies of the RosettaRemodel methodology and applications:

Purpose

The remodel application is an alternative way to use the loop modeling tools in Rosetta, tailor-made for design. The basic components of the tool consist of a building stage (at the centroid level) and a design stage. Generally there is a "partial design" stage to direct the simulation to satisfy packing requirements. Instructions are given using an input PDB file, a blueprint description of the task, and, if needed, a constraint definition and a PDB containing a segment of a structure to be inserted into the starting PDB.

Algorithm

Remodel operates in various stages. The protocol begins in the building stage, where it reads in the blueprint file (discussed below) which describes what will be done to the input structure. A blueprint file consists of three columns:

1 N .

where the first column is the residue number in the starting PDB, the second column is a single-letter (usually amino acid) code, and the last column specifies the building instruction. By replacing the "." with "E", "L", "H", or "D", Remodel will build residues with secondary structure of extended strand, loop, helix, or random, respectively. This notation controls the secondary structure used in harvesting fragments from the fragment database (known as "vall"). A default blueprint file can be generated by running the following command:

/src/apps/pilot/possu/getBlueprintFromCoords.pl -pdbfile [starting pdb] > [blueprint file]
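The Perl script's exact behavior isn't shown here, but the idea it implements -- one blueprint line per residue, renumbered from 1, with the "." placeholder in the third column -- can be sketched in Python (the column logic below is an illustration, not the actual script):

```python
# Sketch of default blueprint generation (not the actual Perl script):
# emit "resnum one-letter-code ." for every CA atom in a PDB file.
THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}

def default_blueprint(pdb_lines):
    """Yield one blueprint line per residue, renumbered from 1."""
    out = []
    for line in pdb_lines:
        # PDB fixed columns: atom name in 13-16, residue name in 18-20
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            res3 = line[17:20].strip()
            out.append(f"{len(out) + 1} {THREE_TO_ONE.get(res3, 'X')} .")
    return "\n".join(out)
```

The output can then be edited by hand to replace "." with build instructions as described above.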

Below are some examples of blueprint files and what they will do to the starting structure.

Fixed backbone design

4 I . PIKAA A
5 L . ALLAA
6 N . POLAR
7 G .

This blueprint will perform design on positions 4, 5, and 6 and leave position 7 fixed.

Remodel (no length change)

4 I . PIKAA A
5 L H
6 N H
7 G .

This will perform design on position 4, and rebuild residues 5 and 6 using helical fragments.

Extension

4 I . PIKAA A
5 L H 
0 x H
0 x H
6 N H 
7 G . 

Inserts two residues between residues 5 and 6, still using helical fragments. The sidechains of this newly built helical segment will be designed automatically. Note: You must allow remodelling of the residues flanking the extension. In this example, residues 5 and 6 need to be remodelled for the extension to work correctly.

Deletion

4 I . PIKAA A
5 L H ALLAA 
7 G H APOLAR
8 A .

Simply removing a line while giving build assignments to the positions before and after the deletion will shrink the PDB and reconnect (if possible, given the slack in the structure) the rest of the structure. One might need to include more positions in the rebuild assignment to ensure enough elements can move to reconnect the junction.

De novo structure

1 A .
2 A H
3 A H
4 A H
5 A H

Build off of a single amino acid stub (residue 1) any secondary structure desired. In the blueprint above, a 4-residue helix is created and its sidechains are automatically designed. Although the second column has alanines, it will not affect the final design. In the blueprint file, the second column is only used for extensions ("x") or if sequence-biased fragments are desired (using the flag -remodel:use_blueprint_sequence). In this case, all positions will be automatically designed because no manual resfile assignments were made for the positions.

Disulfide design

Remodel can be used to design disulfides with the flag "-remodel:build_disulf". The protocol is designed to rebuild (using backbone changes) regions of a structure such that a good disulfide can be formed with another residue in the region not being remodelled. It will try to build and scan for all possible disulfides between the build region and the "landing" range (but not within the build region). The build region is assigned in the blueprint file (by specifying one or more positions to be remodelled), and, by default, the landing range is all non-remodelled positions. A custom landing range can be specified with the flag "-remodel:disulf_landing_range". The protocol is not intended for scanning a fixed-backbone structure for all pairs of positions that could make a disulfide, but it can be used to do something close to that. By using the "-remodel:bypass_fragments" option, no backbone changes will be made to the build region and the protocol will look for good disulfides between the build region residues and the landing residues. To make the protocol search for all possible disulfide pairs, some small changes to the code are necessary (and described in the FAQs). Alternatively, one can also use the 'design_disulfide' app for scanning for all possible position pairs suitable for a disulfide in a given structure.

One can also tighten the match_rt_limit setting to make "better" disulfides, but this runs the risk of not being able to build any. In Rob's disulfide potential, this setting corresponds to an RMSD measure from your backbone to a real disulfide in the database. The disulfides in the database were harvested with idealized structures, so the precision isn't sub-angstrom. Around 1 A is pretty good.
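As a rough illustration of the geometry being matched (not Remodel's actual match_rt machinery, which compares rigid-body transforms against database disulfides), one can pre-screen build/landing residue pairs by CB-CB distance, since disulfide-bonded residues typically sit roughly 3.5-4.5 A apart at CB:

```python
import math

def candidate_disulfide_pairs(cb_coords, build, landing, lo=3.0, hi=5.0):
    """Rough geometric pre-screen (illustrative only): return
    (build, landing, distance) triples whose CB-CB distance falls in a
    disulfide-like range. cb_coords maps residue number -> (x, y, z)."""
    pairs = []
    for i in build:
        for j in landing:
            d = math.dist(cb_coords[i], cb_coords[j])
            if lo <= d <= hi:
                pairs.append((i, j, round(d, 2)))
    return pairs
```

Pairs that pass a screen like this are the ones worth offering to Remodel's build-and-scan procedure.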

It is highly recommended that you use the pose-relax refinement scheme for disulfide refinement (by including the option "-remodel:use_pose_relax true" on the command line).

Disulfide design (auto)

1 M .
2 A .
3 R .
4 I .
5 L L
6 N L
7 G L
8 E .
9 T .
10 Y .

Try to find disulfides within an RMSD cutoff to structurally observed disulfides ("-remodel:match_rt_limit", default 0.4) between residues 5, 6, and 7 and the non-remodelled residues 1, 2, 3, 4, 8, 9, and 10 by remodelling residues 5, 6, and 7 using loop fragments. The "landing range" by default is all non-remodelled residues, but a particular region can be specified with the -remodel:disulf_landing_range option.

Disulfide design (limited region)

4 I .
5 L L
6 N L
7 G L
8 E .
...
88 K . DS_start
89 N .
90 E . DS_stop

Find disulfides between residues 5, 6, 7 and residues 88, 89, and 90. Same as the example above in combination with the flag -remodel:disulf_landing_range 88-90.

Disulfide design (specific pair)

4 I .
5 L L DM_start DM_stop
6 N L
7 G L
8 E .
...
88 K . DS_start DS_stop
89 N .

Build a disulfide between residue 5 and 88, if possible.

Domain insertion

4 I .
5 L L PIKAA L
0 x I NATAA
0 x I NATAA
0 x I NATAA
0 x I NATAA
6 N L PIKAA L
7 G .

Inserts the 4-residue segment contained in the file specified with the option -remodel:domainFusion:insert_segment_from_pdb between residues 5 and 6 and remodels residues 5 and 6 to accommodate the inserted sequence. The PDB file containing the segment to be inserted does not need to be renumbered. In the example above, Remodel will preserve the identities of the amino acids in the inserted segment because of the NATAA specifications in the fourth column. Similarly, Remodel will preserve the identities of the amino acids flanking the inserted segment. Note, it is necessary to allow the residues flanking the insert to be remodelled or errors will occur. The identities of these positions can be fixed to their native types (as in the example above), but their backbones must be allowed to be remodelled to accommodate the inserted segment. Depending on the length of the insert, more than one residue on each side of the inserted segment may need to be remodelled for loop closure to succeed. In other words, if only one residue on each side of the insert is allowed to remodel, Remodel may not be able to "fold" the inserted segment in such a way that the chain can be connected. The protocol will keep trying indefinitely until a "loop closed" solution is found, and the process may need to be killed manually.

Special pair linker setup scheme

I have recently been asked to provide a setup where multiple linkers can be built on two different chains simultaneously; more specifically, a domain-assembly type setup where a dimer interface is held constant while two extra domains are linked to each of the binding partners. This special case requires Remodel to recognize that the input pose consists of two chains and to allow fragment insertion without holding a jump across the segment. The starting PDB already contains all the domains; Remodel just builds linkers to assemble them in a plausible configuration. Use two flags to achieve this:

-remodel:two_chain_tree [start of the second chain]  
-remodel:no_jumps

Tethered docking

A segment of the structure can be rebuilt without restricting it to closure criteria. This is useful for things like docking a helix onto a backbone, after which one can build a new connection once the helix is in place. Normally only "-remodel:bypass_closure" is needed, and a cut will be introduced at the C-terminus of the blueprint assignment range, but it can be switched to the N-terminus should the helix, or any structure, need to be built backwards. Recommended for use only when rebuilding a single segment.

-remodel:RemodelLoopMover:bypass_closure
-remodel:RemodelLoopMover:force_cutting_N

Remodel with constraints

Remodel uses the constraint setup from the EnzDes protocol because it separates the definition of residue positions from the constraints to be applied to them. This setup allows the blueprint to double as a constraint position definition file and lets all extensions and deletions be handled elegantly. Note that the build constraints defined here will only be used in the centroid stage.

To set up build constraints, a constraint definition text file has to be created first, following the enzdes format. Note that backbone atoms need the extra manual declaration "is_backbone", otherwise the constraints won't get applied in the build stage. Also, the atom_type fields expect Rosetta atom types.

  CST::BEGIN
    TEMPLATE::   ATOM_MAP: 1 atom_type: Nbb
    TEMPLATE::   ATOM_MAP: 1 is_backbone
    TEMPLATE::   ATOM_MAP: 1 residue3: ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR
  
    TEMPLATE::   ATOM_MAP: 2 atom_type: OCbb
    TEMPLATE::   ATOM_MAP: 2 is_backbone
    TEMPLATE::   ATOM_MAP: 2 residue3: ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR
  
    CONSTRAINT:: distanceAB:    2.80   0.20 100.00  0
  CST::END

This block describes a constraint setup between a backbone nitrogen and a backbone oxygen atom, requiring them to fall within a hydrogen bonding distance (2.8 Å). The residue3 tag lists all the amino acid types, so when the constraint is applied, it simply ignores the amino acid identity. Single-letter codes can be used instead if "residue1" is used.

Notes on the cst file: the "atom_type" designation requires only one name, and the connectivity will be automatically looked up in the params file. But if "atom_name" is used, one has to explicitly specify three atoms so the torsions and angles can be set correctly. For example:

ATOM_MAP: 1
atom_name N CA C

and

ATOM_MAP: 2
atom_type OCbb

Given a TORSION_A constraint, it will restrain the torsion between O-N-CA-C.

The terms that can be used are: distanceAB:, torsion_A:, torsion_B:, torsion_AB:, angle_A:, angle_B:

The atom_map residue1 (or residue3) assignment should match the kind of atoms being defined. In the example above, a backbone constraint can be applied to all amino acid types, so we list all 20 amino acids. But if a sidechain designation is required, be sure to use only the residues that can satisfy those descriptions (e.g. NE2 exists in GLN, but not in ALA).

Secondly, the positions corresponding to the residues defined in the constraint file should be tagged in the blueprint:

  10 K .
  11 T .
  12 L . CST1A
  13 K .
  14 G H
  15 E H
  16 T H
  17 T H CST1B
  18 T H

In this example, the Nbb atom on position 12 will have an H-bond constraint set up with the OCbb atom on position 17. The residue tagged "A" corresponds to "ATOM_MAP: 1" in the constraint definition block, and "B" corresponds to "ATOM_MAP: 2" (unfortunately, this conversion between A, B and 1, 2 is needed so the number of constraints is not limited to 26, A...Z). If you have more than one constraint to set up for a position, just continue with the definition (but be sure to duplicate the CST entry in the constraint file; each block corresponds to CST1A/B, CST2A/B, ..., etc.).

  10 K .
  11 T .
  12 L . CST1A CST2B
  13 K .
  14 G H
  15 E H CST2A
  16 T H
  17 T H CST1B
  18 T H

In this second case, the Nbb and OCbb on position 12 are constrained to the OCbb on position 15 and the Nbb on position 17, respectively.

Once the constraint and blueprint files are tagged correctly, one simply adds "-enzdes:cstfile [constraint file name]" to the command line to use them.

4 I . PIKAA A CST1A
5 L H ALLAA
6 N H POLAR CST1B
7 A .

Rebuild residues 5 and 6 with helical fragments and apply constraints between residues 4 and 6 during the build process, followed by manual design. The desired geometry of the constrained pair is specified in the separate constraint file.
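Since a mistagged blueprint is an easy way to get a silently mis-applied constraint, a small sanity check can be run over the blueprint before launching. This is an illustrative helper, not part of Remodel:

```python
import re

def check_cst_tags(blueprint_lines, n_cst_blocks):
    """Sanity-check CST tags in a blueprint: every CST<n>A needs a
    matching CST<n>B, and <n> must not exceed the number of CST::BEGIN
    blocks in the constraint file. Returns a list of problem strings."""
    tags = re.findall(r"CST(\d+)([AB])", "\n".join(blueprint_lines))
    seen = {}
    for n, ab in tags:
        seen.setdefault(int(n), set()).add(ab)
    problems = []
    for n, halves in sorted(seen.items()):
        if halves != {"A", "B"}:
            problems.append(f"CST{n} missing {'B' if 'B' not in halves else 'A'}")
        if n > n_cst_blocks:
            problems.append(f"CST{n} has no block in the constraint file")
    return problems
```

An empty result means every tagged pair is complete and has a corresponding block in the cst file.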

Constraint filter

When running a build with a constraint definition, Remodel automatically uses a constraint filter and only generates decoys that satisfy your constraints, within 10 energy units. The stringency can be adjusted by changing the weight on the constraint term. For example, in the cstfile defined above:

    CONSTRAINT:: distanceAB:    2.80   0.20 100.00  0

a constraint weight of 100.00 was defined. Effectively, any violation of this distance constraint will exceed the 10 energy units allowed. To soften this filter, one can simply change the line:

    CONSTRAINT:: distanceAB:    2.80   0.20   1.00  0

The change in the constraint weight effectively gives 1/100th the penalty. After each round of building, the log will indicate what the atom_pair violation is, and the user can tweak the weight based on this.
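To see why the weight dominates the filter, assume the penalty behaves roughly like a harmonic term, weight * ((d - d0)/sd)^2. This is an assumed functional form for illustration; the exact scoring lives in the enzdes constraint code:

```python
def distance_penalty(d, d0=2.80, sd=0.20, weight=100.0):
    """Rough harmonic model of the distanceAB penalty (assumed form:
    weight * ((d - d0) / sd)**2; the real scoring details live in the
    enzdes constraint machinery, not here)."""
    return weight * ((d - d0) / sd) ** 2
```

Under this model, a 0.1 A violation already costs 25 energy units at weight 100.00 (failing the 10-unit filter), but only 0.25 units at weight 1.00.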

Advanced Uses

Design

There are different stages in designing a newly built structure. In an accumulation stage (the "partial structure"), Remodel counts neighbors to identify which residues are buried, and only uses the buried positions in this stage. There are 6 different design modes. The choice of mode is controlled by whether a 4th column is added to the blueprint and by two extra flags:

If there is no resfile command assignment in the 4th column (other than CST assignments), Remodel uses automatic design. (Auto design will pick core positions automatically and use only hydrophobic amino acids.) To control whether neighboring residues are included, repacked or redesigned:

-find_neighbors alone will repack all the neighbors, in either auto or manual mode; otherwise only the rebuilt positions described in the blueprint file will be redesigned. -find_neighbors combined with -design_neighbors will redesign all the neighbors.

For manual design, the blueprint file also handles designing positions using resfile commands, so one can specify:

  1 N .
  2 K H PIKAA K
  0 x H ALLAA
  0 x H APOLAR
  3 G H ALLAA
  4 W .
  5 K .

If you are using manual mode, it is recommended that you assign all the positions included in the rebuilt segment (but not necessarily elsewhere), otherwise they will be turned into valines (the default building residue at the centroid level). This assignment, however, is not limited to the residues to be rebuilt. It works just like a resfile: you can change any residue on the fly.

Additionally, the normal design flags such as -ex1 -ex2 should also be included to control this design behavior.

KIC Confirmation

[This feature is to be tested] After building the structure using CCD refinement, one still does not know what would be the best sequence to produce experimentally. A new feature is to check those sections with an orthogonal method; if the results converge, there is a higher chance that the loop is plausible. Currently we use CCD to build and KIC for the refinement step. If "-remodel:run_confirmation" is issued, structures generated by CCD will be tested in a KIC refinement step using the sequences designed in the CCD step. The build region will be expanded by 2 residues (to reduce bias) on both sides, and the entire segment will be rebuilt by the KIC_refine protocol. The confirmation stage is currently set up as an evaluation check instead of a filter, so structures that fail the test will still be produced. The RMSD of the loop CA atoms to the CCD-refined structure will be reported in both the log file and the REMARKs section (if "-preserve_header true" is on). One can post-filter the structures based on this info.

One can also swap the methods -- building with KIC and refining with CCD -- with "-swap_refine_confirm_protocols".

Clustering ligand poses

When -remodel:num_trajectory is specified, Remodel can make decisions about the refinement step by pursuing only the unique structures that score well coming out of the partial design stage. It collects all the structures, clusters them, and only uses the lowest-energy (presumably having the best contacts in the core) models to carry out full-atom refinement. By default, no clustering is performed.

-use_clusters false                                           This asks the program to refine all the structures generated at the centroid level.

One can turn this feature on with

-use_clusters true

But be sure to set a trajectory number greater than one. A single structure can't form a cluster, and Remodel will generate no output.

If using the clustering strategy, one can use "-cluster_radius [real number]" to set how different the clusters should look.
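The cluster-then-refine idea -- greedily group models that fall within the cluster radius and keep the lowest-energy member of each group -- can be sketched as follows (a toy Euclidean distance stands in for CA RMSD; this is not Remodel's clustering code):

```python
def pick_cluster_representatives(models, radius):
    """Greedy clustering sketch: models is a list of (score, coords)
    tuples. Sorting by score first means each cluster is represented by
    its lowest-energy member. 'dist' is a stand-in for CA RMSD."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    reps = []
    for score, coords in sorted(models):
        # keep this model only if it is farther than 'radius' from
        # every representative chosen so far
        if all(dist(coords, r[1]) > radius for r in reps):
            reps.append((score, coords))
    return reps
```

Only the representatives would then be carried into full-atom refinement; a larger radius means fewer, more distinct clusters.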

Checkpointing

Remodel has its own checkpointing scheme, in which structures are written out to disk and can be re-introduced to the program when the process is restarted -- as in the case of being pre-empted from a cluster but reinserted later on.

Checkpointing can be switched on by issuing the flag: "-checkpoint".

It is related to the "-save_top [number]" option, as checkpointing recovers the same number of structures written out by the -save_top command. (If your first pass of the protocol didn't generate the number of structures specified in -save_top, only the ones generated will be recovered.) This scheme is only used to make sure that the best structures from a simulation are not lost when a process is terminated. It also generates a text file recording the number of trajectories already finished, so the run can pick up where num_trajectory left off, so to speak.
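The trajectory-counter part of the scheme amounts to a simple counter file; here is a sketch of the idea (the file name and format are illustrative, not Remodel's actual checkpoint format):

```python
import os

def load_progress(path="remodel.checkpoint"):
    """Return the number of trajectories already finished,
    or 0 if no checkpoint file exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return int(f.read().strip())
    return 0

def save_progress(n, path="remodel.checkpoint"):
    """Record that n trajectories have completed."""
    with open(path, "w") as f:
        f.write(str(n))
```

On restart, a scheme like this lets the run loop skip the trajectories already counted instead of starting from zero.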

Mock Prediction (not yet available)

Although it does not have all the bells and whistles for a thorough prediction run when a sequence is known, the Remodel protocol can be used for a "mock prediction" where the amino acid sequence is given and one just wants to predict its final structure, using sequence-biased fragments (still generic or hand-assigned, with no secondary structure prediction from multiple sequence alignment) and relying largely on full-atom refinement. In this case, one would switch on "-mock_prediction", which swaps out the design scorefunction for the one (score4L) used for general loop prediction tasks. "-remodel:use_blueprint_sequence" should also be used, so fragments chosen from the vall database will be biased towards the known sequence. When doing so, be sure to put the amino acids of your target sequence in the *SECOND* column of the blueprint file, and use PIKAA to manually force the amino acids used in refinement to match the sequence of the structure to be predicted.

Limitations

PDB files must be numbered starting from 1.

If a starting PDB has multiple chains, Remodel works on the first chain. Leave the chain field empty for this region, or use the flag "-chain" followed by a chain name (e.g. A) to indicate the target chain. The other parts can still carry their chain designator, and they will not be touched by the protocol -- they stay present throughout the simulation. If DNA is present, it will be considered for scoring but will not move during the run. Also, if you have anything other than protein, clustering on CA atoms will not work, so clustering should be turned off by adding "-use_clusters false" to the command line.

Input Files

Structures - REQUIRED

At least one input PDB file must always be given. A single PDB, or a list of PDBs, must be specified on the command line. The remodel code is also compatible with PDBs containing DNA.

-s <pdb1> <pdb2>                                              A list of one or more PDBs to run remodel on
-l <listfile>                                                 A file that lists one or more PDBs to run remodel on, one PDB per line

Database location - REQUIRED

-database <path/to/rosetta/rosetta_database>                  Specifies the location of the rosetta_database

Blueprint - REQUIRED

-remodel:blueprint <blueprint>                                A file that specifies what operations should be performed to the input structure(s)

Options

Remodel options

-run:chain <letter>                                           Chain id of the chain to remodel, if a multichain structure is given
-remodel:num_trajectory <int>                                 The number of centroid level sampling trajectories to run (default: 10)
-remodel:dr_cycles <int>                                      Number of design/relax cycles to run (default: 3)
-remodel:save_top <int>                                       The number of structures to be processed in accumulators/clusters
-remodel:quick_and_dirty                                      Switch off CCD_refine. Useful in early stages of design when one wants to sample different loop lengths to find appropriate setup
-remodel:bypass_fragments                                     Skip creation of fragments for remodelling. (default: false)
-remodel:use_same_length_fragments                            Use same length fragments during centroid chain closure
-remodel:use_clusters                                         Specifies whether to perform clustering during structure aggregation (default: true)
-remodel:generic_aa <letter>                                  Instead of valine, use specified amino acid for generic centroid side chain during centroid phase
-remodel:use_blueprint_sequence                               Find fragments which have the same amino acid sequence as what is specified in the second column of the blueprint file
-remodel:build_disulf                                         Use Remodel to find residue pairs - between the "build" and "landing" residues - which would make good disulfide bonds
-remodel:match_rt_limit <float>                               The RMSD cutoff to use for determining how closely a potential disulfide must match observed disulfide distributions (default: 0.4)
-remodel:disulf_landing_range <range>                         The range within which remodel attempts to find disulfides for positions in the "build" region
-remodel:use_pose_relax                                       Perform Rosetta FastRelax during the protocol
-remodel:run_confirmation                                     Run kinematic loop closure on the rebuilt segments to see how well they match what was built with CCD
-remodel:swap_refine_confirm_protocols                        Use the kinematic loop closure algorithm for building and CCD closure for the confirmation
-remodel:checkpoint                                           Turn on checkpointing: the best structures (per -save_top) and the trajectory count are saved and recovered on restart
-remodel:repeat_structure                                     number of times to run RemodelLoopMover during centroid chain closure?
-symmetry:symmetry_definition                                 ?
-remodel:domainFusion:insert_segment_from_pdb <segment.pdb>   Insert the segment in the given PDB file into the input structure
-run::show_simulation_in_pymol                                show the trajectory in PyMOL (what else does this require?)
-enzdes:cstfile <constraint file>                             Use constraints specified in given file during remodelling
-vall <fragment_database_file>                                Database file of fragments, e.g. vall.dat.2006-05-05

Remodel allows switching between different builders, but the default is to use the RemodelLoopMover as the builder protocol. It is slightly different from the other builders because it assumes no knowledge of amino acid types at the centroid level. Only "vdw", "rama" and "rg" are used to score centroid-level structures. Since the identity of an amino acid is not known until the backbone is built, VALINE centroids are used as the generic amino acid type, as a placeholder for sidechain volume. One can change this assignment with the option "-generic_aa A", which changes the generic amino acid to ALANINE. For different types of building tasks (such as building helices or sheets), one will also want to turn on the relevant secondary structure flags so those terms are scored during the centroid phase.

For helices:

-remodel:hb_srbb 1.0                                          Turn on short-range backbone-backbone hydrogen bond term during centroid scoring

For sheets:

-remodel:hb_lrbb 1.0                                          Turn on long-range backbone-backbone hydrogen bond term during centroid scoring
-remodel:rsigma 1.0                                           Turn on rsigma scoring term during centroid scoring
-remodel:ss_pair 1.0                                          Turn on secondary structure pair term during centroid scoring

The values for the weights can be adjusted freely, but the scores are only used if the weight is nonzero.

Rotamers

-ex1                                                          Increase chi1 rotamer sampling for buried* residues +/- 1 standard deviation - RECOMMENDED
-ex2                                                          Increase chi2 rotamer sampling for buried* residues +/- 1 standard deviation - RECOMMENDED
-ex3                                                          Increase chi3 rotamer sampling for buried* residues +/- 1 standard deviation
-ex4                                                          Increase chi4 rotamer sampling for buried* residues +/- 1 standard deviation

-ex1:level <int>                                              Increase chi1 sampling for buried* residues to the given sampling level
-ex2:level <int>                                              Increase chi2 sampling for buried* residues to the given sampling level
-ex3:level <int>                                              Increase chi3 sampling for buried* residues to the given sampling level
-ex4:level <int>                                              Increase chi4 sampling for buried* residues to the given sampling level

-extrachi_cutoff <int>                                        Set the number of Cbeta neighbors (counting its own) at which a residue is considered buried.
                                                              A value of "1" will mean that all residues are considered buried for the purpose of rotamer building.
                                                              Use this option when you want to use extra rotamers for less buried positions.

-use_input_sc                                                 Include the side chain from the input PDB.  Default: false
                                                              Including the input sidechain is "cheating" if your goal is to measure sequence recovery,
                                                              but a good idea if your goal is to eventually synthesize the designed sequence

\* Buried residues are those with >= threshold (default: 18) neighbors within 10 Angstroms (Cbeta-distance). This threshold can be controlled by the -extrachi_cutoff flag.

\** Aromatic residues are HIS, TYR, TRP, and PHE. Note: Including both -ex1 and -ex1_aro does not increase the sampling for aromatic residues any more than including only the -ex1 flag. If however, both -ex1 and -ex1_aro:level 4 are included on the command line, then aromatic residues will have more chi1 rotamer samples than non aromatic residues. Note also that -ex1_aro can *only increase* the sampling for aromatic residues beyond that for non-aromatic residues. -ex1:level 4 and -ex1_aro:level 1 together will have the same effect as -ex1:level 4 alone.

Energy Function

-packing:soft_rep_design                                      use soft_rep_design energy function weights which linearize vdW repulsive energy
-correct                                                      Use modified Rosetta energy function which has new terms and altered weights - RECOMMENDED

Other options

-nstruct <int>                                                The number of iterations to perform per input structure; e.g. with 10 input structures and an -nstruct
                                                              of 10, 100 trajectories will be performed. Default: 1.
-overwrite                                                    Overwrite the output files, if they already exist. Not used by default.

-min_type <string>                                            When combined with the -minimize_sidechains flag, specifies the line-search algorithm to use in the
                                                              gradient-based minimization. "dfpmin" by default.

-constant_seed                                                Fix the random seed
-jran <int>                                                   Specify the random seed; if unspecified, and -constant_seed appears on the command line, then the seed 11111111 will be used

Tips

Example Command Lines

Input files for all of these examples can be found in the integration test folder for RosettaRemodel, rosetta/rosetta_tests/tests/remodel/.

~/rosetta_source/bin/remodel.macosgccrelease -database ~/rosettadatabase/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.disulfides -remodel:build_disulf -remodel:match_rt_limit 1.0 -remodel:disulf_landing_range 1 65
-remodel:bypass_fragments -remodel:use_clusters false -no_optH false -remodel:num_trajectory 1 -remodel:save_top 5 -remodel:use_pose_relax true -correct -overwrite -mute core.pack > log

Use Remodel to find disulfides in the PDB 2ci2.renumbered.pdb between residues 7-14 and the rest of the protein, with an RMSD match of 1.0 A or less to observed disulfide distributions, outputting at most 5 possible disulfided structures. The output structures will have been relaxed using Rosetta FastRelax, so they will have minor deviations from the starting structure. However, because the "-remodel:bypass_fragments" option is present, no actual backbone changes will be made to the remodelled residues.

NOTE: In this case num_trajectory is 1, but we have requested outputting of 5 structures. This is because there is more than one possible disulfide in this one structure, and Remodel will make them all. (If only 4 are possible, only 4 will be generated.)

In the log, you will find lines like:

protocols.forge.remodel.RemodelDesignMover: DISULF possible 22.8416
protocols.forge.remodel.RemodelDesignMover: DISULF 30x5
protocols.forge.remodel.RemodelDesignMover: match_rt 1.04513

These lines tell you the quality of the disulfide bond. The first is the squared CA distance; pairs are only evaluated if within 5 A. The second tells you the disulfide is between residues 30 and 5, and the third line is how different this pair is from the closest disulfide in the database.
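These log lines are easy to post-process when screening many candidates. A sketch of such a filter, assuming the "DISULF AxB" and "match_rt" lines appear as consecutive pairs, as in the excerpt above:

```python
import re

def parse_disulf_log(log_text, rt_cutoff=1.0):
    """Pull (res_i, res_j, match_rt) triples out of Remodel log output
    and keep pairs under a match_rt cutoff. Assumes each 'DISULF AxB'
    line is paired, in order, with a 'match_rt' line."""
    pairs = re.findall(r"DISULF (\d+)x(\d+)", log_text)
    rts = [float(m) for m in re.findall(r"match_rt (\d+\.?\d*)", log_text)]
    return [(int(a), int(b), rt)
            for (a, b), rt in zip(pairs, rts) if rt <= rt_cutoff]
```

Running this over a log and sorting by match_rt gives a quick shortlist of the most disulfide-like pairs to inspect.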

WARNING: Currently, the disulfide building scheme can lead to an infinite loop. It works very well if you give it positions that are plausible, but if it cannot find a disulfide to build, it will retry in an infinite loop. Building perfect disulfides can be very difficult, thus this setup was used. This is especially bad for scanning fixed backbones: since these backbones can't move, failure on the first try guarantees failure on the second, too. Be sure to monitor disulfide scanning runs and kill them by hand if necessary.

~/rosetta_source/bin/remodel.macosgccrelease -database ~/rosettadatabase/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.disulfides -remodel:build_disulf -remodel:match_rt_limit 1.0 -remodel:disulf_landing_range 1 65
-remodel:use_clusters false -no_optH false -remodel:num_trajectory 1 -remodel:save_top 5 -remodel:use_pose_relax true -correct -overwrite -mute core.pack > log

Use Remodel to build disulfides in the PDB 2ci2.renumbered.pdb, with an RMSD match of 1.0 A or less to observed disulfide distributions, outputting at most 5 possible disulfide-bonded structures, and use FastRelax to minimize them. Note that in this example "-remodel:bypass_fragments" is not present, so Remodel will change the backbone of residues 7-14 to find better disulfides.

~/rosetta_source/bin/remodel.macosgccrelease -database ~/rosettadatabase/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.domaininsertion -remodel:domainFusion:insert_segment_from_pdb 2ci2.insert.pdb
-remodel:quick_and_dirty -run:chain A -remodel:num_trajectory 3 -overwrite

blueprint.2ci2.domaininsertion (excerpt):
39 V .
40 T L PIKAA T
41 M L PIKAA M
42 E L PIKAA E
43 Y L PIKAA Y
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
44 R L PIKAA R
45 I .

Insert the residues in 2ci2.insert.pdb between residues 43 and 44, remodelling the flanking residues 40-43 and 44 (all specified in the blueprint file, excerpt shown) of the starting structure 2ci2.renumbered.pdb using the quick-and-dirty mode, and output 3 putative models. Note in the blueprint excerpt that NATRO has been specified for the insert residues. If this tag were left off, the inserted residues would be remodelled and could change identity. Also, because NATRO has been specified for the inserted residues, Remodel is put into manual design mode. This means that all other remodelled residues will become valines (or whatever amino acid is specified by the "-remodel:generic_aa" flag) unless corresponding residue behaviours are added for those residues too. In the example above, we want to rebuild the residues flanking the insert but preserve the wild-type residue identity, hence the PIKAA tokens.
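In the excerpt above, one "0 x I NATRO" line is written per inserted residue (ten lines for the ten-residue insert). A small hypothetical Python helper to emit that run of lines, assuming the convention shown in the excerpt:

```python
def insert_lines(n_insert):
    """Return the blueprint lines marking an n_insert-residue domain
    insertion: one '0 x I NATRO' line per inserted residue.

    Hypothetical convenience helper; the line format follows the
    blueprint excerpt shown above.
    """
    return ["0 x I NATRO" for _ in range(n_insert)]

# The 10-residue insert in the excerpt:
print("\n".join(insert_lines(10)))
```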

Expected Outputs

REMODEL HANDLES ITS OWN FILE I/O, and only uses the job_distributor to launch the process. Normally the job distributor writes out a file at the end of a run, usually named XXXX_0001.pdb; this file is NOT TO BE TRUSTED IF YOU USE -num_trajectory GREATER THAN 1!! Instead, look for files that are simply 1.pdb, 2.pdb, etc. Due to internal caching of structures in both the clustering and structure accumulation stages, the Remodel protocol generates more structures internally than the job_distributor expects. If only one trajectory was used, then 1.pdb will contain the same information as the XXXX_0001.pdb from the job_distributor. Once the Accumulator/Clustering stage is done, the structures are sorted internally and the one with the lowest energy, according to score12, is output as XXXX_0001.

Post Processing

Troubleshooting

If you get a seg fault right after the blueprint is read, it may be because you have an empty line at the end of the file. The blueprint file should end at the last definition and have no trailing blank lines.
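A quick way to sanitize a blueprint before a run is to strip any trailing blank lines. A minimal Python sketch (hypothetical helper, not part of Rosetta):

```python
def strip_trailing_blanks(text):
    """Return blueprint text with all trailing blank lines removed,
    ending in exactly one newline. Trailing blank lines can cause
    Remodel to segfault right after the blueprint is read."""
    lines = text.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines) + "\n"

# Usage: rewrite a blueprint file in place.
# with open("blueprint.2ci2") as fh:
#     cleaned = strip_trailing_blanks(fh.read())
# with open("blueprint.2ci2", "w") as fh:
#     fh.write(cleaned)
```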

If you see errors complaining that a residue doesn't exist, or that the ResfileReader encountered problems, it is very likely that your PDB file contains a chain definition and you are running in manual design mode. The easiest solution is to delete the chain ID from your starting PDB or checkpointed PDBs. The checkpointed PDBs carry over whatever chain ID the starting PDB has, so they may also need to be corrected to remedy this problem.
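In the standard PDB format the chain ID occupies column 22 of ATOM/HETATM records, so deleting it means blanking that column. A hypothetical Python helper (not part of Rosetta) that does this:

```python
def blank_chain_ids(pdb_text):
    """Blank the chain ID column (column 22, 0-based index 21) of
    ATOM/HETATM records so the PDB carries no chain definition.

    Hypothetical helper for preparing starting or checkpointed PDBs
    before a manual-design Remodel run."""
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            line = line[:21] + " " + line[22:]
        out.append(line)
    return "\n".join(out) + "\n"
```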

For extensions, it is important to assign secondary structure to the flanking residues; otherwise errors will occur.

Why are some of my positions turned into alanine? This can occur when you manually assign some positions to build but forget to assign all rebuilding positions. Currently, once you are in manual mode, everything is controlled manually (except neighboring residues, which can be picked automatically).

New things since last release

RosettaRemodel is being released for the first time with Rosetta v3.3.
