Rosetta 3.4
Resfile syntax and conventions
Author:
Matthew O'Meara and Steven Lewis

This page describes the resfile format, syntax, and conventions. The resfile contains information that is passed to the PackerTask and controls the Packer. Internal details for the commands can be found in the "How to write new resfile commands" residue-level options how-to.

Syntax and Semantics

The syntax for the resfile format in extended-EBNF form is as follows.

<RESFILE> ::= <HEADER> | [<HEADER>] START\n <BODY> ;

<HEADER>  ::= {{<COMMAND>}*\n}* ;

<BODY>    ::= {<RESIDUE_IDENTIFIER> {<COMMAND>}*\n}* ;

<RESIDUE_IDENTIFIER> ::= <SINGLE_RESID> | <RANGE_RESID> | <CHAIN_RESID> ;

<SINGLE_RESID> ::= <PDBNUM>[<ICODE>] <CHAIN> ;

<RANGE_RESID>  ::= <PDBNUM>[<ICODE>] - <PDBNUM>[<ICODE>] <CHAIN> ;

<CHAIN_RESID>  ::= '*' <CHAIN> ;

<PDBNUM>  ::= [-]{digits}+ ;

<ICODE>   ::= {A-Za-z} ;

<COMMAND>  ::= <COMMAND_ID> {<COMMAND_PARAMS>}* ;

<COMMAND_ID> ::= ? see command types below ? ;

<COMMAND_PARAMS> ::= ? see command types below ? ;

Throughout, the resfile has the following conventions:

Specify in the <header> section the commands that should be applied by default to all residues that are not specified in the body. NOTE: Commands in this section are not applied to any residue which has a line in the body section. For example, if the header commands include "EX 1" and residue 10 has the behavior "ALLAAxc" in the body, then residue 10 will NOT get EX 1 behavior. Command line flags (e.g. "-ex1") will apply to all residues and can be used when the user wants to quickly specify global behavior.

Example header section:

# These commands will be applied to all residue positions that lack a specified behavior in the body:
ALLAA           # allow all amino acids
EX 1 EX 2       # allow extra chi rotamers at chi-id 1 and 2 (note: multiple commands can be on the same line.)
USE_INPUT_SC    # allow the use of the input side chain conformation   ( see below for more detailed description of commands)
start
#... the body would continue here.
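
The header/body split described above can be sketched in a few lines of Python. This is only an illustration, not Rosetta's parser: the function name `split_resfile` is hypothetical, and it assumes that comments start with `#` and that the start keyword is matched case-insensitively (the examples on this page use both "START" and "start").

```python
def split_resfile(text):
    """Split resfile text into (header_lines, body_lines) at the start line.

    ASSUMPTIONS (not taken from Rosetta's source): '#' starts a comment,
    blank lines are ignored, and the start keyword is case-insensitive.
    """
    header, body = [], []
    target = header
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if target is header and line.upper() == "START":
            target = body  # everything after this line belongs to the body
            continue
        target.append(line)
    return header, body


example = """\
ALLAA           # allow all amino acids
EX 1 EX 2       # extra chi rotamers at chi-id 1 and 2
USE_INPUT_SC
start
10 A PIKAA W
"""
header, body = split_resfile(example)
print(header)  # default commands
print(body)    # per-residue lines
```

A resfile with no start line is treated as header only, which matches the first alternative of the <RESFILE> rule in the grammar.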

Body

Specify in the <body> section residue level constraints to be sent to the packer and elsewhere. Each line should have one of the following formats:

Identification

Residue Identification

To specify commands for a single residue, use the following form

<PDBNUM>[<ICODE>] <CHAIN>  <COMMANDS>

If the pose the resfile is applied to has PDB information associated with it (e.g. it was read in from a PDB file), then <PDBNUM>[<ICODE>] corresponds to columns 22-26 of the PDB record counting from zero (columns 23-27 in the 1-based numbering of the PDB format specification). If the pose does not have PDB information (e.g. it was generated de novo or read from a silent file), <PDBNUM> is the residue index in the pose and <ICODE> should not be specified. The <PDBNUM> can be positive, zero, or negative. The <ICODE> is an optional character [A-Z] (case insensitive) that occurs in some PDB files to mark insertions or deletions in the sequence while keeping a consistent numbering scheme for the remainder of the sequence.

To accommodate structures with a large number of chains, following the PDB format, the chain can be any character [A-Za-z], where upper and lower case characters are treated as separate chains. For example:

10 _ PIKAA W  # Allow only Trp at residue 10 in the unlabeled chain
40 B NC ZN    # Allow zinc at residue 40 chain B
40A Q ALLAA   # Residue 40, insertion code A, on chain Q, use any residue type
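
As a rough illustration of the single-residue form, the sketch below parses one body line with a regular expression. The function name `parse_single` and the exact pattern are assumptions made for this example; it accepts any single non-space character as the chain so that the "_" used above for an unlabeled chain also parses, and it does not handle the range or chain-wide forms.

```python
import re

# <PDBNUM>[<ICODE>] <CHAIN> <COMMANDS>
# ASSUMPTION: hypothetical pattern, written to match the examples on this page.
SINGLE_RESID = re.compile(r"^(-?\d+)([A-Za-z]?)\s+(\S)(?:\s+(.*))?$")


def parse_single(line):
    """Return (pdbnum, icode, chain, command_tokens) for one body line."""
    line = line.split("#", 1)[0].strip()  # strip trailing comment
    match = SINGLE_RESID.match(line)
    if match is None:
        raise ValueError(f"not a single-residue line: {line!r}")
    pdbnum, icode, chain, commands = match.groups()
    # The insertion code is case insensitive; the chain is case sensitive.
    return int(pdbnum), icode.upper(), chain, (commands or "").split()


for text in ("10 _ PIKAA W", "40 B NC ZN", "40A Q ALLAA   # insertion code A"):
    print(parse_single(text))
```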

Range Identification

To specify commands for a sequence of residues at once, use the form

<PDBNUM>[<ICODE>] - <PDBNUM>[<ICODE>] <CHAIN>  <COMMANDS>

See the section on specifying a single residue, above, for comments on the format for the <PDBNUM>, <ICODE>, and <CHAIN> identifiers.

To determine which residues fall into the specified range, the sequence of residues in the pose is used. NOTE: it is an error if the first residue does not come before the second residue.
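
Because ranges follow the order of residues in the pose rather than numeric order, a residue such as 3C can precede residue 4 while 3A and 3B fall outside a range that starts at 3C. The sketch below illustrates this with a hypothetical helper, `residues_in_range`; it assumes that both endpoints are included in the range, which this page does not state explicitly.

```python
def residues_in_range(pose_order, first, last):
    """Select residues from first to last, following pose order.

    pose_order is a list of (pdbnum, icode, chain) tuples in pose sequence
    order. ASSUMPTION: both endpoints are included in the range.
    """
    start = pose_order.index(first)  # raises ValueError if not in the pose
    end = pose_order.index(last)
    if start >= end:
        raise ValueError("the first residue must come before the second")
    return pose_order[start:end + 1]


# A made-up pose fragment with three insertions after residue 3.
pose = [
    (2, "", "A"), (3, "", "A"), (3, "A", "A"), (3, "B", "A"),
    (3, "C", "A"), (4, "", "A"), (5, "", "A"),
]
print(residues_in_range(pose, (3, "C", "A"), (5, "", "A")))
```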

Chain Identification

To specify commands for all the residues of a specific chain, use the following form

* <CHAIN> <COMMANDS>

See the section on specifying a single residue, above, for comments on the format for the <PDBNUM>, <ICODE>, and <CHAIN> identifiers.

Residues specified multiple times

If a residue is specified at multiple levels, e.g. as a single residue, in a range, and as part of a whole chain, then the most specific specification supersedes the others. If a residue is specified multiple times at the same level, e.g. in multiple single-residue commands, then all the commands are used together. Note: the order in which commands are specified is not important.

For example,

NATRO
START

* A NATAA
3C - 20 A APOLAR
15 A PIKAA Y

This task sets residue 15 on chain A to be a tyrosine; all the residues between residue 3C (PDBnum=3, icode='C') and residue 20 on chain A, except residue 15, to be apolar; the rest of the residues on chain A to be their native amino acid but packable; and the rest of the residues in the pose to keep their input rotamers fixed.

The body of the above resfile could have been written like this:

15 A PIKAA Y
3C - 20 A APOLAR
* A NATAA
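
The precedence rule can be sketched as follows. This is an illustrative model rather than Rosetta's implementation: the names `effective_commands`, `SINGLE`, `RANGE`, and `CHAIN` are hypothetical, each body line is assumed to have already been expanded into the set of residues it covers, and the range starts at plain residue 3 instead of 3C to keep the data small.

```python
SINGLE, RANGE, CHAIN = 0, 1, 2  # lower value = more specific level


def effective_commands(residue, header, body):
    """Return the commands that apply to one residue.

    body is a list of (level, residues, commands) entries. The most specific
    matching level wins, entries at that level are combined, and the header
    is used only when no body entry mentions the residue.
    """
    matches = [(level, cmds) for level, residues, cmds in body if residue in residues]
    if not matches:
        return list(header)
    best = min(level for level, _ in matches)
    combined = []
    for level, cmds in matches:
        if level == best:
            combined.extend(cmds)
    return combined


header = ["NATRO"]
chain_a = {(n, "A") for n in range(1, 26)}  # made-up chain of 25 residues
body = [
    (CHAIN, chain_a, ["NATAA"]),
    (RANGE, {(n, "A") for n in range(3, 21)}, ["APOLAR"]),
    (SINGLE, {(15, "A")}, ["PIKAA Y"]),
]
for res in [(15, "A"), (10, "A"), (22, "A"), (2, "B")]:
    print(res, effective_commands(res, header, body))
```

Reordering the entries in `body` does not change the result for any residue, which mirrors the note that the order of the lines is not important.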

Commands

Commands

Each command restricts the amino acids allowed at a position. If multiple commands are combined, only amino acids that are allowed by each command individually are included. This is a consequence of the commutativity property for operations on the PackerTask class.

 - ALLAA ................ allow all 20 amino acids INCLUDING cysteine (same as ALLAAwc)
 - ALLAAwc .............. allow all 20 amino acids ( default )
 - ALLAAxc .............. allow all amino acids except cysteine
 - POLAR ................ allow only canonical polar amino acids
 - APOLAR ............... allow only canonical non polar amino acids
 - NOTAA <list of AAs> .. disallow only the specified amino acids ( use one letter codes, undelimited like ACFYRT )
 - PIKAA <list of AAs> .. allow only the specified amino acids ( use one letter codes, undelimited like ACFYRT )
 - NATAA ................ allow only the native amino acid
 - NATRO ................ preserve the input rotamer (do not pack at all)
 - EMPTY ................ disallow all canonical amino acids ( for use with non canonicals )
 - NC <ResidueTypeName> . allow the specific possibly non canonical residue type

For example:

NATRO #default command that applies to everything without a non-default setting; do not repack
start
10 A POLAR # consider polar amino acids at position 10
11 A POLAR PIKAA ACDEFGH # allow mutations to those in the intersection of two sets:
# ........................ the polar amino acids and { ALA, CYS, ASP, GLU, PHE, GLY & HIS}
# ........................ the intersection set {ASP, GLU, HIS}
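
The intersection behavior can be modeled with plain sets. In the sketch below the membership of `POLAR` and `APOLAR` is an assumption chosen to agree with the example above (it is not read from Rosetta), the helper names `allowed_by` and `combine` are hypothetical, and NATAA, NATRO, EMPTY, and NC are left out because they depend on the input structure or on non-canonical residue types.

```python
ALL_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")
# ASSUMPTION: polar/apolar membership picked to match the example above.
POLAR = frozenset("DEHKNQRST")
APOLAR = ALL_AA - POLAR


def allowed_by(command, arg=""):
    """Amino acids (one-letter codes) that a single command allows."""
    if command in ("ALLAA", "ALLAAwc"):
        return ALL_AA
    if command == "ALLAAxc":
        return ALL_AA - {"C"}
    if command == "POLAR":
        return POLAR
    if command == "APOLAR":
        return APOLAR
    if command == "PIKAA":
        return frozenset(arg.upper()) & ALL_AA
    if command == "NOTAA":
        return ALL_AA - frozenset(arg.upper())
    raise ValueError(f"unsupported command in this sketch: {command}")


def combine(commands):
    """Intersect the sets allowed by each (command, argument) pair."""
    allowed = ALL_AA
    for command, arg in commands:
        allowed = allowed & allowed_by(command, arg)
    return allowed


print(sorted(combine([("POLAR", ""), ("PIKAA", "ACDEFGH")])))  # ['D', 'E', 'H']
```

Since set intersection is commutative, listing the two commands in the opposite order gives the same result.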

Commands

 - NATRO  ...................................... fix natural rotamer ( fix identity and conformation )
 - EX (ARO) <chi-id> ( LEVEL <sample level> ) .. ( see below for detailed description
 - EX_CUTOFF <num-neighbors>  .................. about how to specify EX commands )
 - USE_INPUT_SC ................................ include native rotamer

Miscellaneous commands for controlling the packer:

These miscellaneous commands have meaning for particular protocols, but not for all.

Extra Rotamer Commands:

The packer considers discrete sampling of sidechain conformations; it samples discretely from the Dunbrack rotamer library, and by default, samples only at the center of the rotamer wells. The "extra" flags below control the inclusion of additional discrete samples.

Here are some examples:

10 B  EX 1 EX 2
9  B  EX 3      # WARNING: probably want to include EX 1 and EX 2 if EX 3 is specified!
8  B  EX ARO 2
7  B  EX ARO 3 # ERROR: ARO only works for chi ids 1 and 2
6  B  EX 1 LEVEL 7
13 B  EX 1 EX ARO 1 LEVEL 4 # include extra rotamers at chi 1 for all amino acids,
                            # but use sample level 4 for the aromatic amino acids
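
The EX syntax shown above, EX (ARO) <chi-id> ( LEVEL <sample level> ), can be tokenized with a short loop. The sketch below is a hypothetical helper, `parse_ex`, that only checks the one rule stated in the examples (ARO is valid for chi ids 1 and 2); it does not validate the chi id or sample level ranges, which this page does not specify.

```python
def parse_ex(tokens):
    """Parse a run of EX commands into (is_aro, chi_id, level) tuples.

    level is None when no LEVEL clause is given.
    """
    specs = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "EX":
            raise ValueError(f"expected EX, got {tokens[i]!r}")
        i += 1
        is_aro = i < len(tokens) and tokens[i] == "ARO"
        if is_aro:
            i += 1
        if i >= len(tokens):
            raise ValueError("EX is missing a chi id")
        chi_id = int(tokens[i])
        i += 1
        level = None
        if i < len(tokens) and tokens[i] == "LEVEL":
            if i + 1 >= len(tokens):
                raise ValueError("LEVEL is missing a sample level")
            level = int(tokens[i + 1])
            i += 2
        if is_aro and chi_id not in (1, 2):
            raise ValueError("ARO only works for chi ids 1 and 2")
        specs.append((is_aro, chi_id, level))
    return specs


print(parse_ex("EX 1 EX ARO 1 LEVEL 4".split()))
```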

This resfile will provide packability at most locations, fixed rotamers at a few, and designability at a few. Note the liberal use of comments for clarity. NOTAA C is equivalent to ALLAAxc.

NATAA # this default command applies to all residues that are not given non-default commands
start
#anchor
81 - 82 B NATAA #anchor
83 - 86 B NATRO #anchor

#loops

#133  B NOTAA C #loop
134 - 142 B NOTAA C #loop
#143  B NOTAA C #loop
#144  B NOTAA C #loop
#145  B NOTAA C #loop

#77 B NOTAA C #loop
#78 B NOTAA C #loop
#79 B NOTAA C #loop
80  B NOTAA C #loop
#ANCHOR
87  B NOTAA C #loop
88  B NOTAA C #loop
#89 B NOTAA C #loop

New Features

There are new features compared to the resfile format in Rosetta++:

Syntax and Semantics

The outline of a resfile is

<header>

start

<body>

Throughout, the resfile has the following conventions:

Specify in the <header> section the commands that should be applied to all residues by default if no other commands are specified in the body. NOTE: Commands in this section are not applied to any residue which has a line in the body section. As a consequence, if the header commands include "EX 1" and then residue 10 has the behavior "ALLAAxc" in the body, then residue 10 will NOT get EX 1 behavior. Command line flags (e.g. "-ex1") will apply to all residues and can be used when the user wants to quickly specify global behavior.

For example:

# These commands will be applied to all residue positions that lack a specified behavior in the body:
ALLAA           # allow all amino acids
EX 1 EX 2       # allow extra chi rotamers at chi-id 1 and 2
USE_INPUT_SC    # allow the use of the input side chain conformation   ( see below for more detailed description of commands)
start
#... the body would continue here.

Body

Specify in the <body> section residue level information to be sent to the packer. Each line should have the following format,

<res-id> <one or more commands>

The <res-id> should be of the form <PDB residue id[insertion code]> <chain>. If the pose you are working with does not have chain information put "_" in for the chain. Insertion code is (of course) optional, and should only be one character. For example

10 _ PIKAA W  # Allow only Trp at residue 10 in the unlabeled chain
0 A NOTAA W   # ERROR: residue 0 is not allowed!
40 B NC ZN    # Allow zinc at residue 40 chain B
40A Q ALLAA   # Residue 40, insertion code A, on chain Q, use any residue type

Silent Files

You can use resfiles with silent files, even though there's no PDB numbering or chain. Use raw residue numbering for the number field and count chains from A for the chains field.

Commands

Commands

Each command restricts the amino acids allowed at a position. If multiple commands are combined, only amino acids that are allowed by each command individually are included. This is a consequence of the commutativity property for operations on the PackerTask class.

 - ALLAA ................ allow all 20 amino acids INCLUDING cysteine (same as ALLAAwc)
 - ALLAAwc .............. allow all 20 amino acids ( default )
 - ALLAAxc .............. allow all amino acids except cysteine
 - POLAR ................ allow only canonical polar amino acids
 - APOLAR ............... allow only canonical non polar amino acids
 - NOTAA <list of AAs> .. disallow only the specified amino acids ( use one letter codes )
 - PIKAA <list of AAs> .. allow only the specified amino acids ( use one letter codes )
 - NATAA ................ allow only the native amino acid
 - NATRO ................ preserve the input rotamer (do not pack at all)
 - EMPTY ................ disallow all canonical amino acids ( for use with non canonicals )
 - NC <ResidueTypeName> . allow the specific possibly non canonical residue type

For example:

NATRO #default command that applies to everything without a non-default setting; do not repack
start
10 A POLAR # consider polar amino acids at position 10
11 A POLAR PIKAA ACDEFGH # allow mutations to those in the intersection of two sets:
# ........................ the polar amino acids and { ALA, CYS, ASP, GLU, PHE, GLY & HIS}
# ........................ the intersection set {ASP, GLU, HIS}

Commands

 - NATRO  ...................................... fix natural rotamer ( fix identity and conformation )
 - EX (ARO) <chi-id> ( LEVEL <sample level> ) .. ( see below for detailed description
 - EX_CUTOFF <num-neighbors>  .................. about how to specify EX commands )
 - USE_INPUT_SC ................................ include native rotamer

Miscellaneous commands for controlling the packer:

These miscellaneous commands have meaning for particular protocols, but not for all.

Extra Rotamer Commands:

The packer considers discrete sampling of sidechain conformations; it samples discretely from the Dunbrack rotamer library, and by default, samples only at the center of the rotamer wells. The "extra" flags below control the inclusion of additional discrete samples.

Here are some examples:

10 B  EX 1 EX 2
9  B  EX 3      # WARNING: probably want to include EX 1 and EX 2 if EX 3 is specified!
8  B  EX ARO 2
7  B  EX ARO 3 # ERROR: ARO only works for chi ids 1 and 2
6  B  EX 1 LEVEL 7
13 B  EX 1 EX ARO 1 LEVEL 4 # include extra rotamers at chi 1 for all amino acids,
                            # but use sample level 4 for the aromatic amino acids

This resfile will provide packability at most locations, fixed rotamers at a few, and designability at a few. Note the liberal use of comments for clarity. NOTAA C is equivalent to ALLAAxc.

NATAA # this default command applies to all residues that are not given non-default commands
start
#anchor
81  B NATAA #anchor
82  B NATAA #anchor
83  B NATRO #anchor
84  B NATRO #anchor
85  B NATRO #anchor
86  B NATRO #anchor

#loops

#133  B NOTAA C #loop
134 B NOTAA C #loop
135 B NOTAA C #loop
136 B NOTAA C #loop
137 B NOTAA C #loop
138 B NOTAA C #loop
139 B NOTAA C #loop
140 B NOTAA C #loop
141 B NOTAA C #loop
142 B NOTAA C #loop
#143  B NOTAA C #loop
#144  B NOTAA C #loop
#145  B NOTAA C #loop

#77 B NOTAA C #loop
#78 B NOTAA C #loop
#79 B NOTAA C #loop
80  B NOTAA C #loop
#ANCHOR
87  B NOTAA C #loop
88  B NOTAA C #loop
#89 B NOTAA C #loop