• See also the Remodel documentation.

Metadata

This document was last edited on January 14th, 2016 by Jared Adolf-Bryfgogle. The code was written by Possu Huang and Andrew Ban. The corresponding PI is William Schief (schief@scripps.edu).

An introductory loop modeling tutorial can be found here.

Code and Demos

The remodel application is launched from rosetta/main/source/src/apps/public/remodel.cc. It uses classes from the protocols library, which can be found in the directory rosetta/main/source/src/protocols/forge/remodel.

References

We recommend the following articles for further studies of RosettaDesign methodology and applications:

Purpose

The remodel application is an alternative way to use the loop modeling tools in Rosetta, tailor-made for design. The tool consists of a building stage (at the centroid level) and a design stage. Generally there is also a "partial design" stage that directs the simulation toward satisfying packing requirements. Instructions are given in a blueprint file that describes the task, along with any other files that are needed (e.g. a constraints definition file, or a PDB containing a segment of a structure to be inserted into the starting PDB).

Algorithm

Remodel operates in various stages. The protocol begins in the building stage, where it reads in the blueprint file (discussed below) which describes what will be done to the input structure. A blueprint file consists of three columns:

1 N .

where the first column is the residue number in the starting PDB, the second column is a single-letter code (usually the amino acid), and the last column specifies the building instruction. When the "." is replaced with "E", "L", "H", "I" or "D", Remodel will build residues with a secondary structure of extended strand, loop, helix, inserted, or random, respectively. This notation controls the secondary structure used in harvesting fragments from the fragment database (known as "vall"). A default blueprint file can be generated by running the following command:

rosetta/tools/remodel/getBlueprintFromCoords.pl -pdbfile [starting pdb] > [blueprint file]

Below are some examples of blueprint files and what they will do to the starting structure.

Basic remodelling tasks

Fixed backbone design

4 I . PIKAA A
5 L . ALLAA
6 N . POLAR
7 G .

This blueprint will perform design on positions 4, 5, and 6 and leave position 7 fixed.
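The column layout described above can be sketched as a small parser. This is a minimal illustration in Python and is not part of Rosetta: the names `BlueprintLine` and `parse_blueprint` are hypothetical, and the sketch assumes whitespace-separated columns, with any resfile commands or tags following the third column.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlueprintLine:
    resnum: int   # residue number in the starting PDB (0 marks a new residue)
    aa: str       # single-letter code from the second column
    build: str    # building instruction: ".", "E", "L", "H", "I" or "D"
    extras: List[str] = field(default_factory=list)  # resfile commands and tags


def parse_blueprint(text: str) -> List[BlueprintLine]:
    """Parse blueprint text into records, skipping blank lines."""
    records = []
    for raw in text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        if len(parts) < 3:
            raise ValueError(f"expected at least three columns: {raw!r}")
        records.append(BlueprintLine(int(parts[0]), parts[1], parts[2], parts[3:]))
    return records


example = """\
4 I . PIKAA A
5 L . ALLAA
6 N . POLAR
7 G .
"""
records = parse_blueprint(example)
for rec in records:
    print(rec.resnum, rec.aa, rec.build, rec.extras)
```

Positions whose `extras` list is empty (position 7 here) carry no design instruction.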

Remodel (no length change)

4 I . PIKAA A
5 L H
6 N H
7 G .

This will perform design on position 4, and rebuild residues 5 and 6 using helical fragments.

Extension

4 I . PIKAA A
5 L H 
0 x H
0 x H
6 N H 
7 G . 

Inserts two residues between residues 5 and 6, still using helical fragments. The sidechains of this newly built helical segment will be designed automatically. Note: You must allow remodelling of the residues flanking the extension. In this example, residues 5 and 6 need to be remodelled for the protocol to work correctly.
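Editing a blueprint for an extension can be scripted. The sketch below is a hypothetical helper (`insert_extension` is not a Rosetta tool) that adds new "0 x" rows after a chosen residue and assigns the same build code to both flanking residues if they are still marked ".", following the note above.

```python
from typing import List


def insert_extension(lines: List[str], after: int, count: int, ss: str = "H") -> List[str]:
    """Insert `count` new residues after residue `after` and remodel both flanks."""
    numbers = [int(line.split()[0]) for line in lines]
    idx = numbers.index(after)
    if idx + 1 >= len(lines):
        raise ValueError("no residue follows the insertion point")
    out = []
    for i, line in enumerate(lines):
        cols = line.split()
        # The residues on either side of the insertion must be rebuilt.
        if i in (idx, idx + 1) and cols[2] == ".":
            cols[2] = ss
        out.append(" ".join(cols))
        if i == idx:
            out.extend(f"0 x {ss}" for _ in range(count))
    return out


start = ["4 I . PIKAA A", "5 L .", "6 N .", "7 G ."]
extended = insert_extension(start, after=5, count=2)
print("\n".join(extended))
```

With these inputs the result reproduces the extension blueprint shown above.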

Deletion

4 I . PIKAA A
5 L H ALLAA 
7 G H APOLAR
8 A .

Simply removing a line while giving build assignments to the positions before and after the deletion will shrink the PDB and reconnect (if possible, given the slack in the structure) the rest of the structure. One might need to include more positions in the rebuild assignment to ensure enough elements can move to reconnect the junction.
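A quick sanity check for deletions can be sketched as follows. The helper `find_deletions` is hypothetical; it assumes that a deletion shows up as a gap in the residue numbering and only reports whether the two residues next to the gap carry a build assignment. It cannot tell whether enough positions are free to move for the chain to reconnect.

```python
from typing import List, Tuple


def find_deletions(lines: List[str]) -> List[Tuple[int, int, bool]]:
    """Return (before, after, flanks_rebuilt) for each gap in residue numbering."""
    rows = [line.split() for line in lines if line.split()]
    # Newly built residues are numbered 0 and are ignored here.
    rows = [r for r in rows if int(r[0]) != 0]
    gaps = []
    for prev, nxt in zip(rows, rows[1:]):
        if int(nxt[0]) - int(prev[0]) > 1:
            rebuilt = prev[2] != "." and nxt[2] != "."
            gaps.append((int(prev[0]), int(nxt[0]), rebuilt))
    return gaps


deletion = ["4 I . PIKAA A", "5 L H ALLAA", "7 G H APOLAR", "8 A ."]
print(find_deletions(deletion))
```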

De novo structure

C-terminal extension

1 A .
2 A .
0 X H
0 X H
0 X H
0 X H

Build any desired secondary structure off of a 2-amino-acid stub (residues 1 and 2). In the blueprint above, a 4-residue helix is created and its sidechains are automatically designed. Although the second column contains alanines, this will not affect the final design. In the blueprint file, the second column is only used for extensions ("x") or if sequence-biased fragments are desired (using the flag -remodel:use_blueprint_sequence). In this case, all positions will be automatically designed because no manual resfile assignments were made for the positions. If a single manual resfile assignment is made, RosettaRemodel switches to manual design mode, and all positions that are being rebuilt should be given resfile specifications.

Disulfide design

Remodel can be used to design disulfides with the flag "-remodel:build_disulf". The protocol is designed to rebuild (using backbone changes) regions of a structure such that a good disulfide can be formed with another residue in the region not being remodelled. It will try to build and scan for all possible disulfides between the build region and the "landing" range (but not within the build region). The build region is assigned in the blueprint file (by specifying one or more positions to be remodelled), and, by default, the landing range is all non-remodelled positions. A custom landing range can be specified with the flag "-remodel:disulf_landing_range". Remodel will continue to run until it succeeds in building a good disulfide (according to its scoring criteria). If Remodel is asked to find disulfides in an input structure and no good disulfides are possible, it will loop indefinitely. Be sure to monitor disulfide scanning runs and kill them by hand if necessary.

Note: The protocol is NOT intended for scanning a fixed-backbone structure for all pairs of positions that could make a disulfide, but it can be used to do something close to that. By using the "-remodel:bypass_fragments" option, no backbone changes will be made to the build region and the protocol will look for good disulfides between the build region residues and the landing residues. An alternative Rosetta application, 'design_disulfide', can also be used to scan for all possible position pairs suitable for a disulfide in a given structure.

In the Remodel log output, there will be lines like the following:

protocols.forge.remodel.RemodelDesignMover: DISULF possible 22.8416
protocols.forge.remodel.RemodelDesignMover: DISULF 30x5
protocols.forge.remodel.RemodelDesignMover: match_rt 1.04513

These lines report the quality of the disulfide bond. The first is the squared CA distance; pairs are only evaluated if they are within 5 Å. The second says the disulfide is between residues 30 and 5, and the third is how different this pair is from the closest disulfide in the database. One can tighten the match_rt_limit setting to make "better" disulfides, but this runs the risk of not being able to build any. In Rob's disulfide potential, this setting corresponds to an RMSD measure from your backbone to a real disulfide in the database. The disulfides in the database are harvested with idealized structures, so the precision isn't sub-angstrom. Around 1 Å is pretty good.
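Because disulfide runs need to be monitored, it can help to pull these values out of the log automatically. The following is a minimal sketch; `parse_disulfide_log` is a hypothetical helper, and it assumes the three lines always appear in the order shown in the excerpt above.

```python
import math
import re

LOG = """\
protocols.forge.remodel.RemodelDesignMover: DISULF possible 22.8416
protocols.forge.remodel.RemodelDesignMover: DISULF 30x5
protocols.forge.remodel.RemodelDesignMover: match_rt 1.04513
"""


def parse_disulfide_log(text):
    """Collect one dict per 'DISULF possible' / 'DISULF AxB' / 'match_rt' group."""
    hits, current = [], {}
    for line in text.splitlines():
        m = re.search(r"DISULF possible\s+([\d.]+)", line)
        if m:
            # The logged value is a squared distance; take the square root.
            current = {"ca_distance": math.sqrt(float(m.group(1)))}
            continue
        m = re.search(r"DISULF\s+(\d+)x(\d+)", line)
        if m:
            current["pair"] = (int(m.group(1)), int(m.group(2)))
            continue
        m = re.search(r"match_rt\s+([\d.eE+-]+)", line)
        if m:
            current["match_rt"] = float(m.group(1))
            hits.append(current)
            current = {}
    return hits


hits = parse_disulfide_log(LOG)
for hit in hits:
    print(hit["pair"], round(hit["ca_distance"], 2), hit["match_rt"])
```

For the excerpt above, the squared distance of 22.8416 corresponds to a CA distance of roughly 4.78 Å.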

It's highly recommended to use the pose relax refinement scheme for disulfide refinement (by including the option "-remodel:use_pose_relax true" on the command line).

Disulfide design (auto)

1 M .
2 A .
3 R .
4 I .
5 L L
6 N L
7 G L
8 E .
9 T .
10 Y .

Try to find disulfides with a score cutoff to structurally observed disulfides ("-remodel:match_rt_limit", default 0.4) between build residues 5, 6, and 7 and the non-remodelled residues 1, 2, 3, 4, 8, 9 and 10 by remodelling residues 5, 6, and 7 using loop fragments. The "landing range" by default is all non-remodelled residues, but a particular region can be specified with the -remodel:disulf_landing_range option.

Disulfide design (limited region)

4 I .
5 L L
6 N L
7 G L
8 E .
...
88 K . DS_start
89 N .
90 E . DS_stop

Find disulfides between build residues 5, 6, 7 and landing residues 88, 89, and 90. This same behaviour can be obtained by using the blueprint file in the example above with the flag -remodel:disulf_landing_range 88-90.
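The equivalence between the DS_start/DS_stop tags and the range flag can be illustrated with a short sketch. The helper `landing_range` is hypothetical; it reads the tags from blueprint lines and produces the "start-stop" string in the form used in the flag example above.

```python
from typing import List, Optional


def landing_range(lines: List[str]) -> Optional[str]:
    """Return 'start-stop' from DS_start / DS_stop tags, or None if absent."""
    start = stop = None
    for line in lines:
        cols = line.split()
        if not cols or not cols[0].isdigit():
            continue  # skip blank lines and elided rows such as "..."
        if "DS_start" in cols[3:]:
            start = int(cols[0])
        if "DS_stop" in cols[3:]:
            stop = int(cols[0])
    if start is None or stop is None:
        return None
    return f"{start}-{stop}"


blueprint = ["4 I .", "5 L L", "6 N L", "7 G L", "8 E .", "...",
             "88 K . DS_start", "89 N .", "90 E . DS_stop"]
print(landing_range(blueprint))
```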

Disulfide design (specific pair)

4 I .
5 L L DM_start DM_stop
6 N L
7 G L
8 E .
...
88 K . DS_start DS_stop
89 N .

Build a disulfide between residue 5 and 88, if possible. If not possible, Remodel will run in an infinite loop.

Domain insertion

Remodel can also be used to do domain insertion. A PDB file containing the residues to be inserted must be specified with the option -remodel:domainFusion:insert_segment_from_pdb. The PDB file containing the segment to be inserted does not need to be renumbered.

Note: It is necessary to allow the residues flanking the insert to be remodelled; otherwise errors will occur. The identities of these positions can be fixed to their native types (as in the example below), but their backbones must be allowed to be remodelled to accommodate the inserted segment. Depending on the length of the insert, more than one residue on each side of the inserted segment may need to be remodelled for loop closure to succeed. In other words, if only one residue on each side of the insert is allowed to remodel, Remodel may not be able to "fold" the inserted segment in such a way that the chain can be connected. The protocol will continue to try indefinitely until a "loop closed" solution is found, or until the process is killed manually.

4 I .
5 L L PIKAA L
0 x I NATAA
0 x I NATAA
0 x I NATAA
0 x I NATAA
6 N L PIKAA L
7 G .

Inserts the 4-residue segment contained in the file specified with the option -remodel:domainFusion:insert_segment_from_pdb between residues 5 and 6, and remodels residues 5 and 6 to accommodate the inserted sequence. In the example above, Remodel will preserve the identities of the amino acids in the inserted segment because of the NATAA specifications in the fourth column. Similarly, Remodel will preserve the identities of the amino acids flanking the inserted segment.

Special pair linker setup scheme

I have recently been asked to provide a setup where multiple linkers can be built on two different chains simultaneously, specifically a domain-assembly setup where a dimer interface is held constant while two extra domains are linked to each of the binding partners. This special case requires Remodel to recognize that the input pose consists of two chains and to allow fragment insertion without holding a jump across the segment. The starting PDB should already contain all the domains, so that Remodel only needs to build linkers to assemble them in a plausible configuration. Use two flags to achieve this:

-remodel:two_chain_tree [start of the second chain]  
-remodel:no_jumps

Tethered docking

A segment of the structure can be rebuilt without restricting it to closure criteria. This is useful for tasks such as docking a helix onto a backbone; a new connection can be built once the helix is in place. Normally only the bypass_closure flag is needed, and a cut will be introduced at the C-terminus of the blueprint assignment range, but the cut can be switched to the N-terminus with force_cutting_N should the helix, or any other structure, need to be built backwards. Recommended for use only when rebuilding a single segment.

-remodel:RemodelLoopMover:bypass_closure
-remodel:RemodelLoopMover:force_cutting_N

Remodel with constraints

Remodel uses the constraint setup from the enzyme design ("EnzDes") protocol, because it separates the definition of residue positions from the constraints to be applied to them. This setup allows the blueprint to double as a constraint position definition file and allows all extensions and deletions to be handled elegantly. Note that the build constraints defined here will only be used in the centroid stage.

To set up build constraints, two things must be done. First, a constraint definition text file has to be created, following the enzdes constraints format. Note that backbone atoms need an extra manual declaration of "is_backbone"; otherwise they won't be applied in the build stage. Also, the atom_type fields expect Rosetta atom types.

CST::BEGIN
  TEMPLATE::   ATOM_MAP: 1 atom_type: Nbb
  TEMPLATE::   ATOM_MAP: 1 is_backbone
  TEMPLATE::   ATOM_MAP: 1 residue3: ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR

  TEMPLATE::   ATOM_MAP: 2 atom_type: OCbb
  TEMPLATE::   ATOM_MAP: 2 is_backbone
  TEMPLATE::   ATOM_MAP: 2 residue3: ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR

  CONSTRAINT:: distanceAB:    2.80   0.20 100.00  0
CST::END

This block describes a constraint setup between a backbone nitrogen and a backbone oxygen atom so that they fall within hydrogen bonding distance (2.8 Å). The residue3 tag lists all the amino acid types, so when the constraint is applied, it simply ignores the amino acid identity. A single letter code can be used instead if "residue1" is used.

Notes on cst file specification: the "atom_type" designation requires only one name, and the connectivity will be looked up automatically in the residue params file. If "atom_name" is used instead, one has to explicitly specify three atoms so that the torsions and angles can be set correctly. For example:

ATOM_MAP: 1
atom_name N CA C

and

ATOM_MAP: 2
atom_type OCbb

given a TORSION_A constraint, it will restrain the torsion between O-N-CA-C.

The terms that can be used are as follows: distanceAB, torsion_A, torsion_B, torsion_AB, angle_A, angle_B.

ATOM_MAP residue1 (or residue3) assignment should match the kind of atoms they define. In the example above, a backbone constraint can be applied to all amino acid types, so we list all 20 AAs. But if a sidechain constraint is required, be sure to use only the residues that can satisfy those descriptions (e.g. atom NE2 exists only in GLN and HIS).

The second thing that needs to be done to set up build constraints is to tag, in the blueprint file, the positions corresponding to the residues defined in the constraint file.

10 K .
11 T .
12 L . CST1A
13 K .
14 G H
15 E H
16 T H
17 T H CST1B
18 T H

In this example, the Nbb atom on position 12 will have an H-bond constraint set up with the OCbb atom on position 17. The residue tagged "A" corresponds to "ATOM_MAP: 1" in the constraint definition block, and "B" corresponds to "ATOM_MAP: 2" (unfortunately, this conversion between A, B and 1, 2 is needed so that the number of constraints is not limited to 26, A...Z). If you have more than one constraint to set up for a position, just continue with the definition, but be sure to duplicate the CST entry in the constraint file; each block corresponds to CST1A/B, CST2A/B, etc.

10 K .
11 T .
12 L . CST1A CST2B
13 K .
14 G H
15 E H CST2A
16 T H
17 T H CST1B
18 T H

In this second case, assuming the second CST block duplicates the first, the Nbb on position 12 is constrained with the OCbb on position 17 (CST1), and the OCbb on position 12 is constrained with the Nbb on position 15 (CST2).
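The tag bookkeeping can be checked with a small script. The sketch below uses a hypothetical helper, `constraint_pairs`, that collects the CSTnA/CSTnB tags from blueprint lines and reports, for each constraint block number, which residue is tagged A (ATOM_MAP: 1) and which is tagged B (ATOM_MAP: 2).

```python
import re
from collections import defaultdict
from typing import Dict, List

TAG = re.compile(r"^CST(\d+)([AB])$")


def constraint_pairs(lines: List[str]) -> Dict[int, Dict[str, int]]:
    """Map each constraint block number to the residues tagged A and B."""
    pairs = defaultdict(dict)
    for line in lines:
        cols = line.split()
        for tag in cols[3:]:
            m = TAG.match(tag)
            if m:
                pairs[int(m.group(1))][m.group(2)] = int(cols[0])
    return dict(pairs)


blueprint = [
    "10 K .", "11 T .", "12 L . CST1A CST2B", "13 K .", "14 G H",
    "15 E H CST2A", "16 T H", "17 T H CST1B", "18 T H",
]
pairs = constraint_pairs(blueprint)
for block, ends in sorted(pairs.items()):
    print(f"CST{block}: A = residue {ends['A']}, B = residue {ends['B']}")
```

A block that lists only one of A or B indicates a tagging mistake in the blueprint.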

Once the constraints and the blueprint files are tagged correctly, to use them, one simply adds "-enzdes:cstfile [constraint file name]" to the command line.

4 I . PIKAA A CST1A
5 L H ALLAA
6 N H POLAR CST1B
7 A .

Rebuild residues 5 and 6 with helical fragments and apply constraints between residues 4 and 6 in the build process, followed by manual design protocols. The desired geometry of the constrained pair is specified in the separate constraints file.

Constraint Filters

When running a build with a constraint definition, Remodel automatically uses a constraint filter and only generates decoys that satisfy your constraints to within 10 energy units. The stringency can be adjusted by changing the weight on the constraint term. For example, in the cstfile line below

  CONSTRAINT:: distanceAB:    2.80   0.30 100.00  0

a constraint weight of 100.00 is defined. Effectively, any violation of this distance constraint will exceed the 10 energy units allowed. To soften this filter, one can simply change the line to:

  CONSTRAINT:: distanceAB:    2.80   0.30   1.00  0

This change in the constraint weight gives 1/100th the penalty. After each round of building, the log will indicate what the atom_pair violation is, and the user can tweak the weight based on this.
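The effect of the weight on the filter can be illustrated numerically. The sketch below assumes, purely for illustration, a flat-bottomed harmonic penalty (zero inside the tolerance, weight times the squared excess outside it); the functional form Rosetta actually uses may differ. The only property taken from the text is that the penalty scales linearly with the weight. The cutoff of 10 is the documented default of -remodel::cstfilter, and treating a score equal to the cutoff as passing is also an assumption.

```python
def penalty(distance, target=2.80, tolerance=0.30, weight=100.0):
    """Illustrative flat-bottomed harmonic penalty (an assumed functional form)."""
    excess = max(0.0, abs(distance - target) - tolerance)
    return weight * excess ** 2


CST_FILTER = 10.0  # documented default threshold of -remodel::cstfilter

for w in (100.0, 1.0):
    p = penalty(3.60, weight=w)
    print(f"weight={w:6.2f} penalty={p:6.2f} passes={p <= CST_FILTER}")
```

Under these assumptions, a distance of 3.60 Å is rejected at weight 100 (penalty 25) but accepted at weight 1 (penalty 0.25).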

Advanced Uses

Design

There are different stages of designing a newly built structure. In an accumulation stage (the "partial structure"), Remodel counts neighbors to identify which residues are buried, and only uses the buried positions in this stage. There are 6 different design modes. The choice among them is controlled by whether a 4th column is added to the blueprint and by two extra flags:

If there's no resfile command assignment in the 4th column (except CST assignments), Remodel uses automatic design. Auto design will pick core positions automatically and use only hydrophobic amino acids. To control whether neighboring residues are included, repacked or redesigned, use the following flags:

-find_neighbors alone will repack all the neighbors, in either auto or manual mode, otherwise only the rebuilt positions described in the blueprint file will be redesigned. -find_neighbors -design_neighbors will redesign all the neighbors.

For manual design, the blueprint file also handles designing positions using resfile commands, so one can specify:

1 N .
2 K H PIKAA K
0 x H ALLAA
0 x H APOLAR
3 G H ALLAA
4 W .
5 K .

If you are using manual mode, it is recommended that you assign all the positions included in the rebuilding segment (but not necessarily elsewhere); otherwise they will be turned into valines (the default residue built during the centroid phase). This assignment, however, is not limited to the residues to be rebuilt. It works just like a resfile: you can change any residue on the fly.
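The rule for auto versus manual mode can be turned into a pre-flight check. The following sketch is a hypothetical helper (`design_mode`); it assumes the set of resfile commands is limited to those that appear in this document, and it ignores CST and other tags when deciding the mode.

```python
from typing import List, Tuple

# Resfile commands mentioned in this document; extend as needed (assumption).
RESFILE_COMMANDS = {"PIKAA", "ALLAA", "POLAR", "APOLAR", "NATAA", "NATRO"}


def design_mode(lines: List[str]) -> Tuple[str, List[str]]:
    """Return ('auto' or 'manual', rebuilt rows that lack a resfile command)."""
    rows = [line.split() for line in lines if line.split()]
    has_cmd = [any(tok in RESFILE_COMMANDS for tok in r[3:]) for r in rows]
    if not any(has_cmd):
        return "auto", []
    missing = [" ".join(r) for r, ok in zip(rows, has_cmd) if r[2] != "." and not ok]
    return "manual", missing


complete = ["1 N .", "2 K H PIKAA K", "0 x H ALLAA", "0 x H APOLAR",
            "3 G H ALLAA", "4 W .", "5 K ."]
partial = ["4 I . PIKAA A", "5 L H", "6 N H", "7 G ."]
print(design_mode(complete))
print(design_mode(partial))
```

Rows reported as missing are the rebuilt positions that would be left as the default centroid residue in manual mode.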

Additionally, the normal design flags such as -ex1 -ex2 should also be included to control this design behavior.

KIC Confirmation

[This feature is to be tested]

After building the structure using CCD_refinement, one still does not know what would be the best sequence to produce experimentally. A new feature is to check those sections with an orthogonal method; if the results converge, there is a higher chance that the loop is plausible. Currently we use CCD to build, and use KIC for the refinement step. If "-remodel:run_confirmation" is issued, structures generated by CCD will be tested in a KIC refinement step using the sequences designed in the CCD step. The build region will be expanded by 2 residues on both sides (to reduce bias), and the entire segment will be rebuilt by the KIC_refine protocol. The confirmation stage is currently set up as an evaluation check instead of a filter, so structures that fail the test will still be produced. The RMSD of the loop CA atoms to the CCD refined structure will be reported in both the log file and the REMARKs section (if "-preserve_header true" is turned on). One can post-filter the structures based on this information.

One can also swap the methods to build with KIC and refine with CCD by "-swap_refine_confirm_protocols".

Clustering

When -remodel:num_trajectory is specified, Remodel can make decisions on the refinement step by going after only the unique structures that score well coming out of the partial design stage. It collects all the structures, clusters them, and uses only the lowest energy models (presumably those with the best contacts in the core) to carry out full-atom refinement. By default, no clustering is performed.

-use_clusters false <= this will ask the program to refine all the structures generated at centroid level.

One can turn this feature on with

-use_clusters true

But be sure to set a trajectory number greater than one. A single structure can't form a cluster, and Remodel will generate no output.

If using the clustering strategy, one can use "-cluster_radius [real number]" to set how different the clusters should look.

Checkpointing

Remodel has its own checkpointing scheme, in which structures are written out to disk and can be re-introduced to the program when the process is rebooted, as in the case of being pre-empted on a cluster and restarted later on.

Checkpointing can be switched on by issuing the flag: "-checkpoint".

Checkpointing is related to the "-save_top [number]" option: checkpointing recovers the same number of structures written out by -save_top. (If the first pass of the protocol didn't generate the number of structures specified in -save_top, only the ones generated will be recovered.) This scheme is only used to make sure that the best structures from a simulation are not lost when a process is terminated. It also writes a text file recording the number of trajectories already finished, so that a restarted run picks up the num_trajectory count where it left off.

Mock Prediction (not yet available)

Although it does not have all the bells and whistles needed for a thorough prediction run when a sequence is known, the Remodel protocol can be used to run a "mock prediction": the amino acid sequence is given, and one just wants to predict its final structure using sequence-biased fragments (still generic or hand-assigned, with no secondary structure prediction from a multiple sequence alignment), relying largely on full-atom refinement. In this case, switch on "-mock_prediction", which swaps out the design scorefunction for the one (score4L) used for general loop prediction tasks. "-remodel:use_blueprint_sequence" should also be used, so that fragments chosen from the vall database are biased towards the known sequence. When doing so, be sure to put the amino acids of your target sequence in the SECOND column of the blueprint file, and use PIKAA to manually force the final amino acids used in refinement to match the sequence of the structure being predicted.

Limitations

A major limitation of the Remodel protocol is that input PDB files must be numbered starting from 1. A PDB numbered starting from 1 can be created using the fixbb application with the flag -renumber_pdb and a resfile that has NATRO as the default behaviour.

If a starting PDB has multiple chains, Remodel works on the first chain. Use the flag "-chain" followed by the chain name (e.g. A) to indicate the target chain. The other chains can keep their chain designators; they will not be touched by the protocol and will stay present throughout the simulation. Note that at the moment, one should always use the -chain flag, even if the target model has only one chain. If DNA is present, it will be considered for scoring but will not move during the run. Also, if the structure contains anything other than protein, clustering on CA atoms will not work, so clustering should be turned off by adding "-use_clusters false" to the command line.
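Whether an input file meets the numbering requirement can be checked before launching a run. The sketch below relies on the fixed-column PDB format (chain ID in column 22, residue number in columns 23-26); the helper names are hypothetical, and the sample records are truncated after the residue number for brevity.

```python
from typing import Dict


def atom_line(serial: int, name: str, resname: str, chain: str, resseq: int) -> str:
    """Build a truncated ATOM record with PDB fixed-column alignment."""
    return f"ATOM  {serial:>5d} {name:<4s} {resname:>3s} {chain}{resseq:>4d}"


def first_residue_numbers(pdb_text: str) -> Dict[str, int]:
    """Map each chain ID to the residue number of its first ATOM record."""
    first: Dict[str, int] = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            chain = line[21]           # column 22
            resseq = int(line[22:26])  # columns 23-26
            first.setdefault(chain, resseq)
    return first


pdb = "\n".join([
    atom_line(1, " N", "MET", "A", 25),
    atom_line(2, " CA", "MET", "A", 25),
    atom_line(3, " N", "ALA", "B", 1),
])
starts = first_residue_numbers(pdb)
for chain, number in starts.items():
    status = "ok" if number == 1 else "needs renumbering"
    print(chain, number, status)
```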

Input Files

Structures - REQUIRED

At least one input PDB file must always be given. A single PDB, or a list of PDBs, must be specified on the command line. If doing de novo design, this is the stub PDB. The remodel code is also compatible with PDBs containing DNA.

-s <pdb1> <pdb2>                                              A list of one or more PDBs to run remodel upon
-l <listfile>                                                 A file that lists one or more PDBs to run remodel upon, one PDB per line
-run::chain <letter>                                          Chain id of the chain to remodel

Database location - REQUIRED

-database <path/to/rosetta/main/database>                  Specifies the location of the rosetta_database

Blueprint - REQUIRED

-remodel:blueprint <blueprint>                                A file that specifies what operations should be performed to the input structure(s)
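The required inputs can be assembled into an argument list programmatically. The sketch below is only illustrative: the executable name "remodel" is a placeholder assumption (the name of the compiled binary depends on the build), the file names are hypothetical, and only flags listed in this document are used.

```python
import shlex
from typing import List, Optional


def remodel_command(pdb: str, blueprint: str, database: str, chain: str = "A",
                    executable: str = "remodel",
                    extra: Optional[List[str]] = None) -> List[str]:
    """Build an argv list from the required inputs plus optional extra flags."""
    argv = [
        executable,               # placeholder; the real binary name varies
        "-s", pdb,
        "-database", database,
        "-remodel:blueprint", blueprint,
        "-run::chain", chain,
    ]
    argv.extend(extra or [])
    return argv


cmd = remodel_command(
    "start.pdb", "start.blueprint", "path/to/rosetta/main/database",
    extra=["-remodel::num_trajectory", "10"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` once the executable path has been set for the local installation.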

Options

Remodel options

-remodel::num_trajectory <int>                                 the number of centroid level sampling trajectories to run (default: 10)
-remodel::dr_cycles <int>                                      number of design/refine cycles to run (default: 3)

-remodel::quick_and_dirty                                      only do fragment sampling; bypass refinement of final structures, which is slow. useful in early stages of design when one wants to sample different loop lengths to find an appropriate setup (default: false)
-remodel::bypass_fragments                                     skip creation of fragments for remodelling; do refinement only. no extensions or deletions are honored in the blueprint (default: false)
-remodel::use_blueprint_sequence                               find fragments which have the same amino acid sequence and secondary structure as what is specified in the second column of the blueprint file (default: false)
-remodel::use_same_length_fragments                            harvest fragments that match the length of the segment being rebuilt (default: true)

-remodel::build_disulf                                         use Remodel to find residue pairs - between the "build" and "landing" residues - which would make good disulfide bonds (default: false)
-remodel::match_rt_limit <float>                               the score cutoff to use for determining how closely a potential disulfide must match observed disulfide distributions (default: 0.4)
-remodel::disulf_landing_range <range>                         the range within which remodel attempts to find disulfides for positions in the "build" region (default: none)

-remodel::use_pose_relax                                       add fast relax to the refinement stage (instead of the default minimization step), but use constraints in a similar way (default: false)
-remodel::run_confirmation                                     run kinematic loop closure on the rebuilt segments to see how well they match what was built with CCD (default: false)
-remodel::swap_refine_confirm_protocols                        swap protocols used for refinement and confirmation test; i.e. use kinematic loop closure instead of CCD closure (default: false)
-remodel::repeat_structure                                     build identical repeats this many times (default: 1)

-symmetry::symmetry_definition                                 text file describing symmetry setup (default: none)

-enzdes::cstfile <constraint file>                             use constraints specified in given file during remodelling
-remodel::cstfilter                                            threshold to put on the atom_pair_constraint score type filter during the centroid build phase refinement (default: 10)

-remodel::domainFusion:insert_segment_from_pdb <segment.pdb>   segment PDB file to be inserted into the input structure
-remodel::checkpoint                                           turns on checkpointing, for use in preemptive scheduling environments. writes out the best pdbs collected after each design step. (default: false)
-remodel::save_top <int>                                       the number of structures to be processed in accumulators/clusters (default: 5)

-remodel::no_jumps                                             will setup simple foldtree and fold through it during centroid build (default: false)
-remodel::use_clusters                                         specifies whether to perform clustering during structure aggregation (default: false)

-remodel::core_cutoff <int>                                    number of neighbors required to consider core in auto design (default: 15)
-remodel::boundary_cutoff <int>                                number of neighbors required to consider boundary in auto design (default: 10)

-remodel::generic_aa <letter>                                  residue type to use as placeholder during centroid phase (default: V)
-remodel::cen_sfxn                                             score function to be used for centroid phase building (default: remodel_cen)
-remodel::cen_minimize                                         centroid minimization after fragment building (default: false)

-remodel::two_chain_tree                                       label the start of the second chain (default: none)

-remodel::lh_ex_limit                                          loophashing neighboring bin expansion limit (default: 5)
-remodel::lh_filter_string                                     loophash ABEGO filter target fragment type. list sequentially for each loop. (default: none)
-remodel::lh_cbreak_selection                                  loophash with cbreak dominant weight (default: 10)
-remodel::lh_closure_filter                                    filter for close rms when bypass_closure is used (default: false)
-remodel::no_design                                            skips all design steps. WARNING: will only output centroid level structures and dump all fragment tries (default: false)
-remodel::silent                                               dumps all structures by silent-mode WARNING: will work only during no_design protocol (see -no_design) (default: false)
-remodel::allow_rare_aro_chi                                   allow all aromatic rotamers, not issuing AroChi2 filter (default: false)
-remodel::skip_partial                                         skip the design stage that operates only on buried positions (default: false)
-remodel::design_neighbors                                     during automatic design mode, design neighbors (default: false)
-remodel::find_neighbors                                       during automatic design mode, find neighbors for design/repack (default: false)

-remodel::RemodelLoopMover::max_linear_chainbreak              if linear chainbreak is <= this value, loop is considered closed (default: 0.07)
-remodel::RemodelLoopMover::randomize_loops                    randomize loops prior to running main protocol (default: true)
-remodel::RemodelLoopMover::allowed_closure_attempts           the allowed number of overall closure attempts (default: 1)
-remodel::RemodelLoopMover::use_loop_hash                      centroid build with loop hash (default: false)
-remodel::RemodelLoopMover::loophash_cycles                    the number of loophash closure cycles to perform (default: 8)
-remodel::RemodelLoopMover::simultaneous_cycles                the number of simultaneous closure cycles to perform (default: 2)
-remodel::RemodelLoopMover::independent_cycles                 the number of independent closure cycles to perform (default: 8)
-remodel::RemodelLoopMover::boost_closure_cycles               the maximum number of possible lockdown closure cycles to perform (default: 30)
-remodel::RemodelLoopMover::force_cutting_N                    force a cutpoint at N-term side of blueprint assignment (default: false)
-remodel::RemodelLoopMover::bypass_closure                     skip the loop closure check during the centroid phase; sets linear_chainbreak term weight to zero. for tethered docking purpose (default: false)
-remodel::RemodelLoopMover::cyclic_peptide                     circularize structure joining N and C-term (default: false)
-remodel::RemodelLoopMover::temperature                        temperature for monte carlo (default: 2.0)

-run:show_simulation_in_pymol                                  show the trajectory in PyMOL (every X seconds) (default: 0)

-vall <fragment_database_file>                                 database file of fragments, e.g. vall.dat.2006-05-05

Remodel allows switching between different builders, but the default is to use the RemodelLoopMover as the builder protocol. It is slightly different from the other builders because it assumes no knowledge of amino acid types at the centroid level. Only "vdw", "rama" and "rg" are used to score centroid level structures. Since the identity of an amino acid is not known until the backbone is built, VALINE centroids are used as the generic amino acid type, acting as a placeholder for sidechain volume. One can change this assignment with the option "-generic_aa A", which changes the generic amino acid to ALANINE. For different types of building tasks (such as building helices or sheets), one will also want to set the relevant secondary structure flags to turn on their scoring during the centroid phase.

For helices:

-remodel:hb_srbb 1.0                                          turn on short-range backbone-backbone hydrogen bond term during centroid scoring (default: 0.0)

For sheets:

-remodel:hb_lrbb 1.0                                          turn on long-range backbone-backbone hydrogen bond term during centroid scoring (default: 0.0)
-remodel:rsigma 1.0                                           turn on rsigma scoring term during centroid scoring (default: 0.0)
-remodel:ss_pair 1.0                                          turn on secondary structure pair term during centroid scoring (default: 0.0)

Other centroid scorefunction weights:

-remodel:vdw                                                  change weight for centroid vdw energy (default: 1.0)
-remodel:rama                                                 change weight for centroid rama energy (default: 0.1)
-remodel:cbeta                                                turn on cbeta term during centroid scoring (default: 0.0)
-remodel:cenpack                                              turn on cenpack term during centroid scoring (default: 0.0)
-remodel:rg                                                   turn on rg term during centroid scoring (default: 0.0)

The weights can be adjusted freely, but a score term is only used if its weight is nonzero.
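As a rough illustration of how the secondary structure flags above might be grouped per building task, the following sketch assembles the command-line tokens for helices and sheets. The flag names and the 1.0 weight are taken from the listing above; the helper `centroid_flags` and the `SS_FLAGS` table are hypothetical conveniences and are not part of Rosetta.

```python
# Flag names are copied from the option listing above.
# SS_FLAGS and centroid_flags are hypothetical helpers, not Rosetta code.
SS_FLAGS = {
    "helix": ["-remodel:hb_srbb"],
    "sheet": ["-remodel:hb_lrbb", "-remodel:rsigma", "-remodel:ss_pair"],
}


def centroid_flags(kinds, weight=1.0):
    """Return command-line tokens that give each listed term a nonzero weight."""
    tokens = []
    for kind in kinds:
        for flag in SS_FLAGS[kind]:
            tokens += [flag, str(weight)]
    return tokens


# Tokens to append to a remodel command line when building both helices and sheets.
print(" ".join(centroid_flags(["helix", "sheet"])))
```

Keeping the terms in one table makes it harder to forget one of the three sheet-related flags when switching between tasks.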

Rotamers

-ex1                                                          Increase chi1 rotamer sampling for buried* residues +/- 1 standard deviation - RECOMMENDED
-ex2                                                          Increase chi2 rotamer sampling for buried* residues +/- 1 standard deviation - RECOMMENDED
-ex3                                                          Increase chi3 rotamer sampling for buried* residues +/- 1 standard deviation
-ex4                                                          Increase chi4 rotamer sampling for buried* residues +/- 1 standard deviation

-ex1:level <int>                                              Increase chi1 sampling for buried* residues to the given sampling level***
-ex2:level <int>                                              Increase chi2 sampling for buried* residues to the given sampling level
-ex3:level <int>                                              Increase chi3 sampling for buried* residues to the given sampling level
-ex4:level <int>                                              Increase chi4 sampling for buried* residues to the given sampling level

-extrachi_cutoff <int>                                        Set the number of Cbeta neighbors (counting the residue itself) at which a residue is considered buried.
                                                              A value of "1" will mean that all residues are considered buried for the purpose of rotamer building.
                                                              Use this option when you want to use extra rotamers for less buried positions.

-use_input_sc                                                 Include the side chain from the input PDB.  Default: false
                                                              Including the input sidechain is "cheating" if your goal is to measure sequence recovery,
                                                              but a good idea if your goal is to eventually synthesize the designed sequence

* Buried residues are those with >= threshold (default: 18) neighbors within 10 Angstroms (Cbeta-distance). This threshold can be controlled by the -extrachi_cutoff flag.
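The burial rule in this footnote can be sketched as a simple neighbor count. This is only an illustration of the stated definition, not Rosetta's implementation: the function names are hypothetical, the inclusive distance test (`<=`) is an assumption, and the coordinates are toy values. It requires Python 3.8 or later for `math.dist`.

```python
import math


def count_neighbors(cb_coords, index, radius=10.0):
    """Count Cbeta atoms within `radius` Angstroms of residue `index`.

    The residue itself is included in the count, matching the
    "counting its own" wording of -extrachi_cutoff. Using <= at the
    boundary is an assumption of this sketch.
    """
    center = cb_coords[index]
    return sum(1 for xyz in cb_coords if math.dist(center, xyz) <= radius)


def is_buried(cb_coords, index, cutoff=18):
    """Apply the '>= threshold' rule; `cutoff` stands in for -extrachi_cutoff."""
    return count_neighbors(cb_coords, index) >= cutoff


# Toy data: 20 Cbeta atoms spaced 1 Angstrom apart along the x axis.
coords = [(float(i), 0.0, 0.0) for i in range(20)]
print(is_buried(coords, 0))             # end of the chain, few neighbors
print(is_buried(coords, 10))            # middle of the chain
print(is_buried(coords, 0, cutoff=1))   # cutoff of 1 treats every residue as buried
```

With a cutoff of 1, every residue counts as buried because it always counts itself, which is why that setting enables extra rotamers everywhere.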

** Aromatic residues are HIS, TYR, TRP, and PHE. Note: Including both -ex1 and -ex1_aro does not increase the sampling for aromatic residues any more than including only the -ex1 flag. If, however, both -ex1 and -ex1_aro:level 4 are included on the command line, then aromatic residues will have more chi1 rotamer samples than non-aromatic residues. Note also that -ex1_aro can only increase the sampling for aromatic residues beyond that for non-aromatic residues: -ex1:level 4 and -ex1_aro:level 1 together will have the same effect as -ex1:level 4 alone.
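One way to read the interaction described in this footnote is that aromatic residues receive the larger of the two requested levels. The sketch below encodes only that reading; the function name is hypothetical, and it takes explicit integer levels rather than modeling what level a bare -ex1 or -ex1_aro flag implies.

```python
def effective_chi1_levels(ex1_level, ex1_aro_level):
    """Return (non_aromatic_level, aromatic_level) for chi1 sampling.

    Assumption of this sketch: -ex1_aro can only raise the aromatic
    level above the -ex1 level, never lower it, so the aromatic level
    is the maximum of the two.
    """
    return ex1_level, max(ex1_level, ex1_aro_level)


# -ex1:level 4 with -ex1_aro:level 1 behaves like -ex1:level 4 alone.
print(effective_chi1_levels(4, 1))  # (4, 4)
# -ex1:level 1 with -ex1_aro:level 4 samples aromatics more densely.
print(effective_chi1_levels(1, 4))  # (1, 4)
```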

Energy Function

-packing:soft_rep_design                                      use soft_rep_design energy function weights which linearize vdW repulsive energy

Other options

-overwrite                                                    Overwrite the output files, if they already exist. Not used by default.

-min_type <string>                                            When combined with the -minimize_sidechains flag, specifies the line-search algorithm to use in the
                                                              gradient-based minimization. "dfpmin" by default; "dfpmin_armijo_nonmonotone" recommended.

Tips

  • The flags -ex1 and -ex2 are recommended to get well-packed output structures.
  • A two-line resfile where the first line is "NATAA" and the second line is "start" is sufficient to say "only repack the input sequence". This optimizes only the input structure's side-chain rotamers, and leaves the sequence the same.
  • If you are redesigning with many rotamers (>6K), then using the linear-memory interaction graph will save both time and memory. Add the flag "-linmem_ig 10" to activate the linear-memory interaction graph.
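The two-line resfile mentioned in the tips can be written out with a few lines of Python. The file contents ("NATAA" followed by "start") come from the tip above; the function name and the file name `repack_only.resfile` are hypothetical.

```python
from pathlib import Path


def write_repack_only_resfile(path):
    """Write the two-line resfile that keeps the sequence and only repacks."""
    target = Path(path)
    target.write_text("NATAA\nstart\n")
    return target


if __name__ == "__main__":
    # The file name is an arbitrary choice for this sketch.
    resfile = write_repack_only_resfile("repack_only.resfile")
    print(resfile.read_text())
```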

Example command lines

Input files for all of these examples can be found in the integration test folder for RosettaRemodel, rosetta/main/tests/integration/tests/remodel/ .

rosetta/main/source/bin/remodel.macosgccrelease -database rosetta/main/database/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.remodel -run:chain A -remodel:num_trajectory 2 -remodel:quick_and_dirty -overwrite
blueprint.2ci2.remodel (excerpt):
36 G .
37 T L
38 I L
39 V L
40 T L
41 M L
42 E L
43 Y L
44 R L
45 I .

Use Remodel to sample alternate backbone conformations of (and change the identity of) loop residues 37-44, using the quick_and_dirty mode (which disables CCD loop closure and shortens runtime considerably). To hold the identity of the residues fixed, add the resfile assignment PIKAA X, where X is the residue's one-letter code (for example, PIKAA T for residue 37), at the end of the line for each residue being remodelled.
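For longer loops, blueprint rows like those in the excerpt can be generated rather than typed by hand. The sketch below is a hypothetical helper, not a Rosetta tool: it marks a residue range as loop ("L") and optionally appends a PIKAA assignment that pins each rebuilt residue to its original identity. The sequence and numbering are taken from the excerpt above.

```python
def loop_blueprint(start, sequence, first, last, keep_identity=False):
    """Build blueprint rows for residues numbered from `start`.

    Residues first..last (inclusive) are assigned 'L'; all others get '.'.
    With keep_identity=True each rebuilt residue also gets 'PIKAA <aa>'.
    """
    rows = []
    for offset, aa in enumerate(sequence):
        resnum = start + offset
        if first <= resnum <= last:
            row = f"{resnum} {aa} L"
            if keep_identity:
                row += f" PIKAA {aa}"
        else:
            row = f"{resnum} {aa} ."
        rows.append(row)
    return rows


# Residues 36-45 of the excerpt, rebuilding 37-44 with identities held fixed.
print("\n".join(loop_blueprint(36, "GTIVTMEYRI", 37, 44, keep_identity=True)))
```

Calling the helper with `keep_identity=False` reproduces the excerpt as shown; with `keep_identity=True` every "L" row carries its own PIKAA token.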

rosetta/main/source/bin/remodel.macosgccrelease -database rosetta/main/database/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.disulfides -remodel:build_disulf -remodel:match_rt_limit 6.0 
-remodel:disulf_landing_range 1 65 -remodel:bypass_fragments -remodel:use_clusters false -no_optH false -remodel:num_trajectory 1 -remodel:save_top 5 -remodel:use_pose_relax true -correct -overwrite 
-mute core.pack > log

Use Remodel to find disulfides in the PDB 2ci2.renumbered.pdb between residues 7-14 and the rest of the protein, with a score match of 6.0 or less to observed disulfide distributions, outputting at most 5 possible disulfided structures. The output structures will have been relaxed using Rosetta FastRelax, so they will have minor deviations from the starting structure. However, because the "-remodel:bypass_fragments" option is present, no large backbone changes will be made to the remodelled residues. The -remodel:disulf_landing_range 1 65 flag is not necessary; Remodel will determine the landing range from the blueprint file if it is not specified on the command line.

NOTE: In this case num_trajectory is 1, but we have requested outputting of the top 5 structures. This is because there is more than one possible disulfide in this one structure, and Remodel will consider them all in the 1 trajectory. However, it will only output the 5 best ones. If only 4 are possible, only 4 will be generated.

rosetta/main/source/bin/remodel.macosgccrelease -database rosetta/main/database/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.disulfides -remodel:build_disulf -remodel:match_rt_limit 1.5 -remodel:use_clusters false
-no_optH false -remodel:num_trajectory 3 -remodel:save_top 3 -remodel:use_pose_relax true -correct -overwrite -mute core.pack > log

Use Remodel to build disulfides in the PDB 2ci2.renumbered.pdb, with a score of 1.5 or better to observed disulfide distributions and outputting at most 3 possible disulfided structures, and use fast relax to minimize the disulfided structures. Note that in this example "-remodel:bypass_fragments" is not present, so Remodel will change the backbone of residues 7-14 to find better disulfides. In fact, Remodel will go through 3 trajectories in which it will remodel the residues specified in the blueprint file and then check for disulfides between the rebuilt region and the landing region. The top 3 disulfided structures will be output.

rosetta/main/source/bin/remodel.macosgccrelease -database rosetta/main/database/ -s 2ci2.renumbered.pdb -remodel:blueprint blueprint.2ci2.domaininsertion -remodel:domainFusion:insert_segment_from_pdb 2ci2.insert.pdb
-remodel:quick_and_dirty -run:chain A -remodel:num_trajectory 3 -overwrite
blueprint.2ci2.domaininsertion (excerpt):
39 V .
40 T L PIKAA T
41 M L PIKAA M
42 E L PIKAA E
43 Y L PIKAA Y
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
0 x I NATRO
44 R L PIKAA R
45 I .

Insert the residues in 2ci2.insert.pdb between residues 43 and 44 of the starting structure 2ci2.renumbered.pdb, remodel the flanking residues 40-43 and 44 (all specified in the blueprint file, excerpt shown) using the quick-and-dirty mode, and output 3 putative models. Note in the excerpt from the blueprint file that NATRO has been specified for the insert residues. If this tag were left off, the inserted residues would be remodelled and could change identity. Also, because NATRO has been specified for the inserted residues, Remodel is put in manual design mode. This means that all other remodelled residues will become valines (or whatever amino acid is specified by the "-remodel:generic_aa" flag) unless corresponding residue behaviours are added for those residues too. In the example above, we want to rebuild the residues flanking the insert but preserve the wild-type residue identity, hence the PIKAA tokens.
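The insertion rows in the excerpt follow a regular pattern, so they can be generated for an insert of any length. The helper below is a hypothetical sketch rather than part of Remodel: it adds a given number of "0 x I NATRO" rows after a chosen residue in an existing list of blueprint rows.

```python
def insert_segment_rows(blueprint_rows, after_resnum, n_insert):
    """Return a copy of the rows with n_insert '0 x I NATRO' rows added.

    The new rows are placed directly after the row whose first column
    equals after_resnum, mirroring the layout of the excerpt above.
    """
    out = []
    for row in blueprint_rows:
        out.append(row)
        if row.split()[0] == str(after_resnum):
            out.extend(["0 x I NATRO"] * n_insert)
    return out


flanks = [
    "39 V .",
    "40 T L PIKAA T",
    "41 M L PIKAA M",
    "42 E L PIKAA E",
    "43 Y L PIKAA Y",
    "44 R L PIKAA R",
    "45 I .",
]
# Ten inserted residues between 43 and 44, as in the excerpt.
print("\n".join(insert_segment_rows(flanks, 43, 10)))
```

The number of inserted rows should match the number of residues in the PDB passed to -remodel:domainFusion:insert_segment_from_pdb; that correspondence is inferred from the example and is not checked by this sketch.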

Expected Outputs

REMODEL HANDLES ITS OWN FILE I/O, and only uses the job_distributor to launch the process. Normally the job distributor will write out a file at the end of a run, usually in the format of XXXX_0001.pdb; this file is NOT TO BE TRUSTED IF YOU USE -num_trajectory GREATER THAN 1!! Instead, look for files that are simply named 1.pdb, 2.pdb, etc. Due to internal caching of structures in both the clustering and structure accumulation stages, the Remodel protocol generates more structures internally than the job_distributor expects. If one trajectory was used, then 1.pdb will have the same info as the XXXX_0001.pdb from the job_distributor. Once Accumulator/Clustering is done, due to sorting done internally, the structure with the lowest energy, according to score12, is output as XXXX_0001.
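To collect only the files Remodel writes itself, one can filter an output directory for names that consist of a number followed by ".pdb". This is a minimal sketch with a hypothetical function name; it assumes the outputs are in a single directory and relies only on the naming pattern described above.

```python
import re
from pathlib import Path


def remodel_outputs(directory):
    """Return files named <integer>.pdb in `directory`, in numeric order.

    Job distributor files such as XXXX_0001.pdb do not match the
    pattern and are therefore left out.
    """
    hits = [
        p for p in Path(directory).iterdir()
        if re.fullmatch(r"\d+\.pdb", p.name)
    ]
    return sorted(hits, key=lambda p: int(p.stem))


if __name__ == "__main__":
    for path in remodel_outputs("."):
        print(path.name)
```

Sorting by the integer value rather than by name keeps 10.pdb after 2.pdb.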

Post Processing

Output PDB files can be viewed in a molecular graphics program such as PyMOL. The log file output can be checked for protocol specific details (chainbreak energies, no. of trajectories, etc).

Troubleshooting

If you get a seg fault right after reading the blueprint, it may be because you have an empty line at the end of the file. The blueprint file should end at the last definition, with no blank lines after it.
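A blueprint can be cleaned before a run by dropping any blank lines at its end. The sketch below is a hypothetical helper, not a Rosetta utility. It takes the most conservative reading of the advice above and ends the text exactly at the last definition, without a terminating newline; whether a single final newline would also be accepted is not stated here.

```python
def strip_trailing_blank_lines(text):
    """Remove blank or whitespace-only lines from the end of blueprint text.

    The result ends at the last non-blank line with no newline after it
    (an assumption of this sketch, chosen to be conservative).
    """
    lines = text.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines)


cleaned = strip_trailing_blank_lines("44 R L\n45 I .\n\n   \n")
print(repr(cleaned))
```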

If you see errors complaining that some residue doesn't exist or that the ResfileReader generated problems, it's very likely that you have a chain definition in your PDB file and you are running manual design mode. The easiest solution is to set -run:chain (for example, -run:chain A, as in the example command lines above). This is a known bug (and is currently being fixed). You probably need this flag EVEN IF YOU ONLY HAVE ONE CHAIN IN YOUR POSE!!

For de novo design, make sure your stub PDB is at least two residues long.

For extensions, it is important to assign secondary structure to the flanking residues; otherwise errors will occur.

Why are some of my positions turned into alanine or valine? This can occur because you assigned some positions to build manually, but forgot to assign all rebuilding positions. Currently, once you are in manual mode, everything is controlled manually (except neighboring residues that can be picked automatically).

New things since last release

RosettaRemodel is being released for the first time with Rosetta v3.4.

See Also