This glossary collects many Rosetta terms with short (sentence-to-paragraph) definitions. You'll see definitions of objects in the code, biophysics concepts, and administrivia. Many of these are terms of art in structural biology, given here with the particular nuances that apply in Rosetta.

# A

Term Description
ABEGO Designation that indicates a residue's position in Ramachandran space (A = right-handed alpha or 310 helix; B = right-handed beta strands and extended conformations; E = left-handed beta strands; G = left-handed helices) and cis omega angles (O). See citation here.
Ab Initio Structure Prediction Prediction of molecular structure given only its sequence. Known also as de novo modeling. In Rosetta, ab initio modeling uses statistical information from the PDB such as fragments and statistical potentials.
AllAtom Modeling See fullatom.
Alpha Helix A common motif in the secondary structure of proteins, the alpha helix (α-helix) is a right-handed coiled or spiral conformation, in which every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier (i+4 → i hydrogen bonding). Among types of local structure in proteins, the α-helix is the most regular and the most predictable from sequence, as well as the most prevalent.
Analogue Similar proteins without evolutionary relationships. See also homologue. Rosetta homology modeling doesn't actually need strict evolutionary relationships, and can use analogues as templates.
Annotated Sequence Rosetta will often record the sequence of a protein as one-letter amino acid codes, expanding when necessary with square brackets to indicate patches like post-translational modifications.
Atom A class storing the Cartesian position of an atom in a Residue.
Atom Tree The atom tree connects atoms in the pose, and is used to convert internal coordinates into Cartesian coordinates. Normally derived from the fold tree.
AtomTree core::kinematics class for defining atomic connectivity.
AtomType A class which stores the properties of a particular kind of atom (e.g. a carboxylate oxygen). See Rosetta AtomTypes for more details.
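
The ABEGO entry in this table can be sketched as a small classifier over backbone torsions. The bin boundaries below (the psi ranges and the 90 degree omega cutoff) are assumptions chosen for illustration; the glossary itself only names the regions, so check the cited reference before relying on exact values. The function name `abego` is hypothetical.

```python
def abego(phi: float, psi: float, omega: float) -> str:
    """Classify one residue's backbone torsions into an ABEGO bin.

    Angles are in degrees, in the range [-180, 180].
    All numeric boundaries are assumed values for illustration.
    """
    if abs(omega) < 90.0:
        # cis peptide bond takes priority over the phi/psi bins
        return "O"
    if phi >= 0.0:
        # left-handed half of the Ramachandran plot
        return "G" if -100.0 <= psi < 100.0 else "E"
    # right-handed half of the Ramachandran plot
    return "A" if -75.0 <= psi < 50.0 else "B"


if __name__ == "__main__":
    # An ideal alpha-helical residue with a trans peptide bond
    print(abego(-60.0, -45.0, 180.0))
```

Applying the function residue by residue gives a string with one ABEGO letter per position.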

# B

Term Description
B factor The "temperature factor" from crystallography, seen in PDB files. The larger the value, the more "flexible" the atom is.
Backbone In biopolymers, the backbone is those atoms which form the polymeric chain. In proteins these are the N, CA, C, and O atoms and their hydrogens. In nucleic acids it is the phosphate and sugars. See also main chain.
Backrub A change in torsion angles seen routinely in structures in solution. A change in one dihedral is compensated for by changes in the previous and next dihedrals. This 'move' was implemented as a backbone-sampling protocol by Tanja Kortemme.
Base The non-backbone portion of a nucleotide. Analogous to a protein's side chain.
Benchmark Study Tests done to confirm the performance of a new algorithm or method. Results are compared to previous results using the same starting data.
Beta Sheet A common motif in the secondary structure of proteins, the beta sheet (β-sheet) is a mostly flat extended structure made up of individual (possibly non-consecutive) extended chains (β-strand) held together by alternating hydrogen bonds.
Binding-Affinity or Binding-Energy How strongly two molecules are associated with one another.
Binding Interface The point of contact between two molecules. A common definition in Rosetta is any residue with the C-beta atom or any heavy atom within 6 Angstroms of the other binding partner.
Biopolymers Polymeric molecules important for biological systems. Typical biopolymers are proteins, RNA, DNA and carbohydrates.
Blind Docking Docking where the structure of the docked complex is unknown.
Bootcamp An intense week-long Rosetta training session for new developers.
Bound Docking Docking in which the complex structure used for reference in docking and RMSD calculations has been determined experimentally (by X-ray crystallography or NMR).
Branches A development term used in code version control. See trunk.

# C

Term Description
Cartesian Coordinates Coordinates with spatial positions specified by xyz coordinates. Contrast this with internal coordinates. The conversion between the two is kinematics.
Cartesian Minimization Gradient minimization based on moving atoms in xyz Cartesian space, rather than with internal coordinates. This requires an extra term (cart_bonded) to keep bond lengths and angles at their near-ideal values.
CCD Cyclic coordinate descent. A loop closure protocol where backbone dihedrals are progressively adjusted to minimize the gap in the loop backbone. [Add a reference]
Centroid A reduced representation mode, used for simplifying the representation of the system, to permit faster sampling and scoring. For proteins, each residue is represented by five backbone atoms (N, CA, C, O and the polar hydrogen on N) and one pseudo-atom, the “centroid,” to represent the side chain. [Explain further how centroid is calculated.] Gray et al., J. Mol. Biol. (2003) 331, 281–299
Chain In Rosetta, a chain is a single, covalently connected molecule. Internally, it is stored as a number. In the PDB format, a chain is all residues which share a chain identification label.
Chainbreak A gap in connectivity (in the AtomTree) between chemically connected / sequentially adjacent residues. These are used in CCD (Cyclic Coordinate Descent) loop closure.
ChemicalManager A singleton class in Rosetta which keeps track of things like ResidueTypeSets.
chi angles Chi angles are the dihedral angles which control the heavy atom positions of side chain components of residues. In carbohydrates, these are rotatable OH groups.
chi1, chi2, chi3, & chi4 Specific sidechain chi angles of protein residues. They are enumerated from the C-alpha atom outward, so chi1 would be the dihedral between N-Ca-Cb-Cg.
clash Two (or more) atoms being too close to be energetically favorable (essentially an overlap of vdW radii)
cluster Clustering of structures involves grouping each structure with "similar" structures. These groups of similar structures are called "clusters". The measure of structural similarity is typically either RMSD or GDT.
coarse grain Initial modeling, where all atoms or energy terms may not be represented.
commit This is a term related to how version control is used. A commit is when you upload your changes from your computer to the common code source.
comparative modeling Prediction of protein structure based on sequence and the structures of closely related proteins. Also called "Homology Modeling"
conformation The three dimensional organization of atoms in a structure.
Conformation A class which contains Residue objects and FoldTree. This is the part of the Pose which keeps track of coordinates. This is linked by the kinematic layer to describe internal-coordinate folding.
conformer One of a set of 3 dimensional orientations of a ligand, small molecule or amino acid side chain. Sometimes referred to in Rosetta as a "rotamer".
constraint When used with Rosetta, actually a "restraint": an adjustment to the score function to take into account additional geometric information
contact order Taken from "Contact order and ab initio protein structure prediction," Bonneau et al., Protein Science (2002), 11:1937-1944: "The relative CO is the average sequence separation of residues that form contacts in the three dimensional structure divided by the length of the protein."
Critical Assessment of PRediction of Interactions (CAPRI) Protein-protein interactions and other interactions between macromolecules are essential to all aspects of biology and medical sciences, and a number of methods have been developed to predict them. CAPRI is a community wide experiment designed to assess those that are based on structure. Since CAPRI began in 2001, the experiment has had two to four prediction rounds each year, with one or a few targets per round.
Critical Assessment of Techniques for Protein Structure Prediction (CASP) The Critical Assessment of protein Structure Prediction (CASP) experiments aim at establishing the current state of the art in protein structure prediction, identifying what progress has been made, and highlighting where future effort may be most productively focused. CASP has been held every two years starting in 1994. Rosetta has participated in several CASP experiments.
crystal neighbors Neighboring copies of a protein in a crystal lattice, which raise the question of whether a crystal structure is truly the "native" structure. Crystals are approximately 40-70% water, which gives crystallographers confidence that proteins in a crystal lattice represent proteins in biological environments, especially since proteins in a lattice can often still carry out the reactions they perform in cells. However, the lattice introduces an unavoidable artifact: regions where adjacent protein molecules touch, called crystal contacts. Conformations in these contact regions are altered to some extent. Rosetta sometimes samples conformations that are 3 or 4 Å RMSD from the "native crystal structure" but have lower energies, with the difference arising from variation in a loop segment that happens to sit at a crystal contact. This prompts care in defining the "native structure", which is supposed to be the conformation the protein adopts in the cell.
crystallographic phasing The critical step in solving a crystal structure is obtaining the phases, either by molecular replacement or by experimental methods. Molecular replacement is technically easier: crystallographers use existing structures with high structural similarity to guide the phase search. In hard cases, where no structurally similar structure exists or the available structures have too low a sequence identity (below 15-20%), crystallographers must obtain the phases experimentally, which is much more tedious and difficult. Rosetta can generate or refine models using a physically realistic full-atom force field, which sometimes yields more accurate comparative models. For some of these hard cases, Rosetta can therefore provide better initial search models for molecular replacement. Ref: Qian et al., High-resolution structure prediction and the crystallographic phase problem. Nature 450, 259-264 (2007).
CxxTest This is the framework we use for unit tests. See also http://cxxtest.com.
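
The relative contact order quoted in this table can be sketched as below. The contact definition is an assumption made for illustration: one representative atom per residue (for example CA) and a distance cutoff of 8 Å. The function name and parameters are hypothetical, and the cited paper's own contact definition may differ.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Average sequence separation of contacting residue pairs, divided by chain length.

    coords: list of (x, y, z) tuples, one representative atom per residue.
    cutoff: assumed contact distance in Angstroms.
    min_separation: smallest sequence separation counted as a contact.
    """
    n = len(coords)
    total_separation = 0
    contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                contacts += 1
    if contacts == 0:
        return 0.0
    return total_separation / (contacts * n)


if __name__ == "__main__":
    # Four residues on a straight line, 3.8 Angstroms apart
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    print(relative_contact_order(chain))
```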

# D

Term Description
database The Rosetta database directory contains key parameters for Rosetta. Examples of stored information include the force field, definitions of monomers (see: residue types), the representation of the model, fundamental constant parameters, etc.
ddG Also known as ΔΔG. The change in free energy (ΔG), typically of binding or folding, upon a mutation.
de novo modeling Prediction of molecular structure given only its sequence. Known also as ab initio structure prediction.
decoy A model produced by a computational protocol.
density map Experimental data showing where the electrons (and thus the atoms) are.
design Optimization of the amino acid sequence of a protein.
devel devel is one of the libraries within the Rosetta project. It contains code that is documented and tested but not necessarily scientifically validated to work well: code still under development. It is not available in the released version. It should be deleted.
dihedral angle A four-body angle encoding the respective orientation of two atoms around the axis connecting two other atoms. Also known as a torsion.
disulfides The covalent attachment of two cysteine residues in close proximity. This depends on the protein being present in an oxidizing environment (like outside of the cell), rather than a reducing environment (like the inside of the cell). This covalent attachment can greatly stabilize the folding of a protein.
docking Assembling two separate proteins (or protein-ligand, protein-surface) into their biologically relevant structure and finding the lowest free energy of the complex.
docking funnel An energy funnel (score versus RMSD) for docking runs.
Dunbrack library A sidechain rotamer library compiled by the Dunbrack laboratory; the standard rotamer library of Rosetta.
Dunbrack loop optimization See CCD
Dunbrack score Statistical energy term of the rotamer (-log(p) where p is the probability of the given rotamer.)
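
The dihedral angle entry in this table can be made concrete with a short sketch that computes the torsion defined by four points. It uses the standard atan2 formulation in plain Python; the function and helper names are hypothetical, and this is not taken from Rosetta's own code.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees for atoms p1-p2-p3-p4.

    The angle describes the orientation of p1 and p4 around the p2-p3 axis.
    """
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a protein, passing the coordinates of C(i-1), N, CA and C gives phi, and N, CA, C and N(i+1) gives psi, matching the phi and psi entries later in this glossary.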

# E

Term Description
Energies A class in Pose which stores the energies computed by the ScoreFunction.
energy function Also called a "score function". The prediction of structural energy over which Rosetta operates.
energy funnel An attempt at representing the energy landscape of the protein. A plot which (ideally) shows low rmsd structures having lower energies than high rmsd structures.
ensemble a group of closely related structures
EnergyMethod The class which implements the scoring of a particular score term for the ScoreFunction.
explicit water Water modeled as atoms, rather than implicitly.
ex1/ex2 Options that specify the size (extra sampling) of rotamer library being used

# F

Term Description
fasta Text-based format describing the peptide sequence of a protein, using single-letter amino acid codes.
filter A pass/fail check on structure quality during the middle of a run. Filters are applied to avoid wasting computational time on trajectories which are unlikely to succeed. Relevant metrics are calculated and those structures with poor values are discarded.
fixbb A Rosetta application which does fixed backbone design.
fixed backbone design Design of a protein where the backbone is not moved during the redesign.
fixed backbone packing Optimization (packing) of the side chain conformations, done without moving the backbone.
flag(s) file Also options file: a file that contains a set of flags (possibly with their respective parameters) to control the program or protocol. You can load it as an option when you start the program, instead of typing all options at the command line.
flexible backbone design Design of a protein where the backbone is allowed to move during design.
flexible backbone packing Optimization (packing) of the side chain conformations, where the backbone is allowed to move during optimization.
FoldTree (fold tree) A directed, acyclic graph (tree) connecting all the residues in the pose. The fold tree is the residue-level description of how internal coordinates and Cartesian coordinates interconvert, and how changes propagate between residues. Changing the dihedral of one residue will change the Cartesian coordinates of all residues "downstream" in the fold tree due to lever arm effects. By changing the fold tree you can limit the propagation of these effects, keeping portions of the protein backbone fixed which would normally move. See also atom tree and kinematics.
force field See scorefunction.
fragment A section of a protein. Typical Rosetta usage is for 3- and 9-mer backbone fragments selected from PDB structures.
fragment insertion Placing backbone dihedrals from a fragment into the structure. Used frequently for loop modeling and ab initio.
fragment picker A Rosetta application used to pick fragments.
fullatom Also "all atom": A representation of the protein where all physical atoms (including hydrogens) are present during modeling, in contrast to reduced representations like centroid mode.
full-atom energy function An energy function whose terms and interactions are calculated at the atomistic scale (atom-atom pairwise), as opposed to a reduced representation mode such as centroid.

# G

Term Description
GDT Global Distance Test. A metric used in CASP instead of RMSD, which is less sensitive to regions of unaligned structure. [Insert reference here]
GDTMM A Rosetta-specific name for GDT.
Git One of the most widely used distributed version control systems, used to version control the Rosetta code. Originally developed for Linux development by Linus Torvalds. We use GitHub for hosting.
Gollum Gollum (external link) is a Git-based wiki, used to create this wiki you are reading.
global minimum The 3 dimensional conformation of a protein which corresponds to the lowest possible energy state. This is (usually) the conformation found in nature.
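
The GDT entry in this table can be illustrated with a simplified sketch. It assumes the model and reference are already superimposed and have matched residues, and it uses the commonly quoted 1, 2, 4 and 8 Å cutoffs, which are an assumption here. A full GDT calculation searches over many superpositions to maximize each fraction, which this sketch does not do. The function name is hypothetical.

```python
import math


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-style score (0 to 100) under one fixed superposition.

    model, reference: equal-length lists of (x, y, z) tuples for matched
    residues (for example CA atoms), already superimposed.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("model and reference must be non-empty and equal in length")
    distances = [math.dist(a, b) for a, b in zip(model, reference)]
    # Fraction of residues within each distance cutoff
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    reference = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
    model = [(0, 0.5, 0), (1, 1.5, 0), (2, 3, 0), (3, 10, 0)]
    print(gdt_ts(model, reference))
```

Because each residue only counts as inside or outside a cutoff, one badly placed region lowers the score by a bounded amount, which is why the metric is less sensitive than RMSD to unaligned regions.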

# H

Term Description
hard rep Normal Lennard Jones repulsive - used in contrast to soft_rep.
heavy atom All atoms except hydrogens.
homologue Evolutionarily related proteins. They usually have similar structure and sequences, but don't necessarily have to. Within Rosetta however we are only interested in homologues that are similar in structure. Ones that are similar in sequence but not in structure are not necessarily useful, though proteins that share more than 20-25% of their sequence are usually structurally similar. (The 20-25% region is called the "twilight zone" of homology.) A protein that is structurally similar but not evolutionarily related is an analogue.
homology modeling Homology modeling of protein refers to constructing an atomic-resolution model of the "target" protein from its amino acid sequence and an experimental three-dimensional structure of a related homologous protein (the "template"). Homology modeling relies on the identification of one or more known protein structures likely to resemble the structure of the query sequence, and on the production of an alignment that maps residues in the query sequence to residues in the template sequence. Related to threading.

# I

Term Description
idealization Rosetta normally works only with changing dihedral angles. The idealize application loads the PDB file and replaces all bond lengths and bond angles with the values defined in the Rosetta database. The result of this simulation is non-deterministic, so many runs may be attempted. See also Cartesian minimization, which works with non-ideal bond lengths and angles.
interaction graph A representation of protein interactions during packing; can affect simulation speed
interface The region of a structure where two chains interact
internal coordinates Storage of the positions of atoms based on bond lengths, angles and dihedrals, rather than Cartesian coordinates (xyz coordinates). The conversion between the two is kinematics.

# J

Term Description
jump A portion of the fold tree representing a rigid body (non-covalent) movement.

# K

Term Description
knowledge-based potentials Also "statistical potentials": energy function terms based on the probability of occurrence in a data set

# L

Term Description
Lennard-Jones Also "LJ": A function that approximates the non-bonded interactions of neutral atoms, combining Pauli repulsion and the van der Waals attractive term (also known as the Lennard-Jones 6-12 potential).
ligand A molecule which binds a protein; for Rosetta this is specifically a non-polymeric small molecule
local minimum The lowest energy 3 dimensional state of a protein in a neighborhood of similar conformations. A protein may have many local minima, but only one global minimum.
loop Structurally a loop region is a combination of phi-psi angles which is in a certain area of the Ramachandran plot. Loops are very loosely defined: a working definition is secondary structure that isn't defined as either an alpha helix or a beta sheet. [XXX: A picture would be good here] In Rosetta code a loop is anything between two fixed ends that you want to model. This usually corresponds to the structural definition of loops, but can also refer to regions which aren't.
low energy A 3 dimensional model of a protein is low energy if it has good packing, satisfied polar or charged residues, appropriately placed small molecules or ligands, etc. However, it may need to be minimized into the given energy function before further use to prevent artifacts.
low resolution An experimentally determined structure of a protein is low resolution if individual atoms are not distinct; typically this corresponds to a crystal structure resolution worse (numerically larger) than 3-4 angstroms.
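
The Lennard-Jones 6-12 potential described in this table can be written out as a short function. This is the textbook form, parameterized by the well depth and the distance of the energy minimum. It is a sketch only: how Rosetta splits the function into attractive and repulsive score terms, or softens it at short range, is not shown. The parameter names are hypothetical.

```python
def lennard_jones(r, epsilon, r_min):
    """Textbook 6-12 potential.

    r: interatomic distance (same units as r_min).
    epsilon: well depth (energy at the minimum is -epsilon).
    r_min: distance at which the energy is lowest.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    ratio6 = (r_min / r) ** 6
    # ratio6**2 is the steep repulsive r^-12 term; -2*ratio6 is the attractive r^-6 term
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


if __name__ == "__main__":
    for distance in (3.0, 4.0, 6.0):
        print(distance, lennard_jones(distance, epsilon=0.1, r_min=4.0))
```

At distances shorter than `r_min` the repulsive term dominates and the energy rises steeply, which is the behavior the clash, hard rep and soft_rep entries refer to.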

# M

Term Description
main chain Used interchangeably with backbone atoms.
Metropolis criterion Used by Monte Carlo methods, this equation tells whether to accept or reject a random move
MiniCON This was the winter Rosetta developer's meeting, which moved around the country to be hosted by different RosettaCommons labs. We discussed code issues of wide interest and narrow. The name has changed to Winter RosettaCON.
minimization Optimize the protein structure by making small movements to lower energy conformations
minirosetta The name of the Rosetta3 project during initial development. Also, the name of a wrapper program which exposes multiple protocols, mainly used for Rosetta@Home.
MiniRosettaCON See MiniCON.
mmCIF Macromolecular Crystallographic Information File, file format used to describe the 3 dimensional structure of a protein
model A representation of the 3 dimensional structure of a protein. "All models are wrong, but some are useful." (George Box)
MOL format A file type that contains information about the structure of a chemical; same as "SDF format"
MolProbity MolProbity is a general-purpose web server offering quality validation for 3D structures of proteins, nucleic acids and complexes. It provides detailed all-atom contact analysis of any steric problems within the molecules as well as updated dihedral-angle diagnostics and it can calculate and display the H-bond and van der Waals contacts in the interfaces between components. MolProbity: all-atom contacts and structure validation for proteins and nucleic acids, Davis et al., Nucleic Acids Res. 2007 July; 35(Web Server issue): W375–W383.
monomer Monomeric protein
Monte Carlo method Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results. In Rosetta, Monte Carlo methods are used frequently to sample: rotamers in repacking, amino acids in design, fragments in folding, numerous other chemical changes. (Discuss specifics of Monte Carlo in Rosetta)
MoveMap A class in Rosetta which contains lists of mobile and immobile degrees of freedom. Normally used during minimization to specify which parts of the Pose can be minimized. (e.g. for fixed backbone minimization)
Mover An abstract class and parent of all protocols. Every protocol in Rosetta has to inherit from this class and implement the apply function, which then alters the Pose and implements the protocol.
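
The Metropolis criterion entry in this table can be sketched as a single accept/reject function. Moves that lower the energy are always accepted; moves that raise it are accepted with probability exp(-ΔE/kT). The function name, the `kT` default and the injectable `rng` argument are assumptions for illustration and do not reflect Rosetta's own API.

```python
import math
import random


def metropolis_accept(delta_e, kT=1.0, rng=random.random):
    """Return True if a move with energy change delta_e should be accepted.

    delta_e: energy after the move minus energy before it.
    kT: temperature factor in the same units as delta_e (assumed default).
    rng: callable returning a uniform random number in [0, 1).
    """
    if delta_e <= 0:
        # Downhill (or neutral) moves are always accepted
        return True
    # Uphill moves are accepted with Boltzmann probability
    return rng() < math.exp(-delta_e / kT)


if __name__ == "__main__":
    trials = 10000
    accepted = sum(metropolis_accept(1.0) for _ in range(trials))
    print(f"accepted {accepted} of {trials} uphill moves with delta_e = 1 kT")
```

Raising `kT` makes uphill moves more likely to be accepted, which is the knob that simulated annealing turns down over the course of a run.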

# N

Term Description
native structure The structure of a protein, ligand, etc that is found in nature, usually refers to the crystal or NMR structure of a protein
NNMAKE An earlier version of the fragment picker application.
nstruct The number of models that Rosetta will output

# O

Term Description
options User specified directions given to Rosetta, either through the command line or through the options file, sometimes called "flags"

# P

Term Description
PackerTask A class which sets up what is allowed in packing.
packing In Rosetta, optimizing the conformation (and identity) of protein sidechains. The Rosetta Packer uses Metropolis Monte Carlo Simulated Annealing to optimize rotamers.
packing density How close atoms are to each other; closer is better, up to a point
params file A file which tells Rosetta how a residue behaves.
Parser Another name for RosettaScripts
patch files A file which makes a small adjustment to a score function
PDB Can refer either to the Protein Data Bank, a website that contains structural information on proteins, usually determined by X-ray crystallography or NMR, or to the file type used by the Protein Data Bank to represent the 3 dimensional structure of a protein.
phi The dihedral angle describing the position of the C-N-Calpha-C atoms
pilot apps Rosetta applications written by the community that have not yet been officially released.
Pose Represents a molecular structure in Rosetta (of proteins, RNA, etc) and contains all of its properties such as Energies, FoldTree, Conformation and more. Each and every Mover in Rosetta operates on a pose through its apply function.
protocol Workflow to do specific calculations in Rosetta; sometimes a protocol uses movers.
psi The dihedral angle describing the position of the N-Calpha-C-N atoms

# R

Term Description
ReferenceCount ReferenceCount was the core class in the smart pointer system that Rosetta3 used up until 2015. Nearly every class in Rosetta ultimately inherits from this class. The class remains as an empty class, because it was too hard to move after Luki Goldschmidt's transition to our newer smart pointers, and because having a base class for nearly all classes is useful for the Pose DataCache.
refinement Starting from a low-resolution model, use the full-atom energy function to modify the conformation so it is closer to an experimentally determined structure.
relax A protocol in Rosetta which optimizes the structure of the protein
release Releases are when we make Rosetta code available to academic and industrial users. The code in trunk is copied into a branch in git, cleaned up to remove unreleasable code (usually devel and pilot_apps), then posted for wider use. We are currently on a "weekly release" schedule, where a new release is produced more-or-less each week. (It is not every week, as certain weeks the code does not pass our quality control measures.)
repack Determine the conformation of sidechains which minimizes the energy
representation How Rosetta sees a protein molecule. Rosetta supports two representations: 1. fullatom - full atom representation, slow but accurate. 2. centroid - a reduced representation, faster but less precise.
repulsive term fa_rep: The part of the Lennard Jones equation which describes the effects of overlapping electron orbitals, the energy will be positive
resfile The resfile is a file format used to manually pass complex instructions to the packer / PackerTask.
residue Each Pose/Conformation is broken down into small units called "Residues", which could be amino acids, nucleic acids or any group of atoms with certain rules of what they are and how they are connected, such as a small chemical ligand moiety. The chemical content of a Residue is stored in an object called "ResidueType" and aside from that each Residue has other data storing actual coordinate information of each atom it contains as well as coordinate-related data such as mainchain/sidechain torsion angles, sequence position etc. For example, in a protein there might be multiple Leucine residues, each of which will be an individual "Residue" object. Each Leu Residue has its own coordinate data, but all Leu will have the same Leu ResidueType which contains information on what the atoms are, their names, chemical elements and connectivity. This setup also allows a sidechain Rotamer to be represented just as a Residue.
residue types A set of atoms defined for each residue known to Rosetta. The set also defines bonds and local geometry. The data are stored in the database. Each kind of residue normally has distinct ResidueType objects for each of the different Rosetta representations.
Residue A class in Rosetta which stores the coordinates and details about a specific residue in a Pose.
ResidueType A class in Rosetta specifying how a particular residue behaves chemically. It does not contain the coordinates of the residue (that is stored in a Residue object), but rather things like chemical connectivity and atom properties.
ResidueTypeSet A class containing a collection of ResidueTypes all of the same type. The standard ResidueTypeSets are centroid and fullatom.
restraints Adjustments to the energy function; often called "constraints" in Rosetta
REU Rosetta Energy Units - Rosetta's arbitrary energy unit, which does not correspond to physical energy measurements. With REF2015, REU is now optimized to approximate kcal/mol.
rigid-body A treatment with no intramolecular flexibility among the protein backbone atoms; that is, the backbone bonds and angles are frozen.
Rohl review This term refers to Rohl et al., 2004, [Protein structure prediction using Rosetta](http://www.ncbi.nlm.nih.gov/pubmed/15063647), the earliest review paper of Rosetta. See its entry in the Rosetta Canon.
root mean square deviation (RMSD) [It's not enough to just explain how RMSD is calculated: it's also important to discuss what significance it plays in Rosetta, and what values are to be expected calculating it in various places.]
Robetta An online, automated tool for protein structure prediction and analysis.
Rosetta Best software ever? Or merely the easiest to use? You decide!
Rosetta++ Rosetta++ was the 2.x edition of Rosetta. It is so-named because it was in C++, as a human-assisted machine translation of the original FORTRAN Rosetta.
Rosetta3 paper ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules, which was the paper that described the transition from C++-but-monolithic Rosetta++ to object-oriented-C++ Rosetta3.
RosettaCommons This is the organization that manages the intellectual property of the Rosetta code.
RosettaCON This is a summer convention held every year, usually around the last week of July-first week of August, usually at the Sleeping Lady in Leavenworth, WA. It's a scientific conference just for Rosetta developers and users in industry or RosettaCommons labs, along with a few invited speakers.
Rosetta Developer's Meeting This is a one-day addendum to RosettaCON, usually held at the University of Washington in Seattle the day before RosettaCON. It's used to handle Rosetta code issues of wide interest that are too technical for the RosettaCON audience.
RosettaScripts An XML based interface for controlling Rosetta, allows the user greater control of methods, score functions, etc, without requiring the user to change the source code of Rosetta.
rotamer Rotamers, rotameric isomers, represent the most stable sidechain configurations, which are commonly observed in crystal structures. Using rotamers allows Rosetta to efficiently consider many discrete side chain conformations, where continuous side chain motion would be expensive.
rotamer trial minimization The optimal combination of rotamers (sidechains) is found using a simulated-annealing Monte Carlo search. Minimization techniques are applied afterwards to optimize sidechains and rigid-body displacements simultaneously.
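
The root mean square deviation (RMSD) entry in this table is still a placeholder, so the sketch below shows only the basic calculation: the square root of the mean squared distance between matched atoms. It assumes the two structures are already superimposed and have the same atoms in the same order. Which atoms Rosetta includes in a given RMSD, and how superposition is handled, is not covered. The function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched coordinate lists.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples,
    already superimposed in a common frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = [math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    decoy = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(native, decoy))
```

Plotting this value against score for many decoys gives the score-versus-RMSD plot described in the energy funnel and docking funnel entries.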

# S

Term Description
SASA Solvent accessible surface area – the area of a protein that can be reached by water or another solvent
scorefile A flat-text file produced by Rosetta applications that contains all energy component values. Each row provides values for a single pose (structure). An equivalent file can also be made from a silent file with the following grep command: grep SCORE silentfile.out > scorefile.sc
ScoreFunction The class in Rosetta which handles scoring the pose. A particular Rosetta run can use multiple different ScoreFunctions, each with their own weights files and settings.
scoring grid A rapid pre-calculation of scoring for ligand docking
secondary structure Secondary structures describe classes of local conformations of a molecule (usually a nucleic acid or protein). The most basic formulation of protein secondary structure classes is alpha helices, beta sheets, and loops. In ab initio projects, Rosetta uses secondary structure prediction programs to predict the secondary structure of the target protein. The predicted secondary structure is then used to select fragments from the Vall database of fragments. Secondary structure prediction in Rosetta is currently achieved by a combination of the following three methods: - PsiPred. D.T. Jones, J. Mol. Biol. 292, 195 (1999). - SAM-T99. K. Karplus, R. Karchin, C. Barrett, S. Tu, M. Cline, M. Diekhans, L. Grate, J. Casper, R. Hughey, Protein Struct. Funct. Genet. S5, 86 (2001). - JUFO. J. Meiler, M. Muller, A. Zeidler, F. Schmaschke, J. Mol. Model. 7, 360 (2001).
sequence Peptide sequence or amino acid sequence is the order in which amino acid residues, connected by peptide bonds, lie in the chain in peptides and proteins. The sequence is generally reported from the N-terminal end containing free amino group to the C-terminal end containing free carboxyl group. Peptide sequence is often called protein sequence if it represents the primary structure of a protein. In Rosetta, a sequence is input in a *.fasta file format.
SDF format A file format that describes the structure and connectivity of a molecule, used primarily for small molecules, not for proteins; also known as MOL format
Shultzy's Shultzy's is a favorite bar and sausage grill of the Rosetta community when in Seattle for RosettaCON. It's on the east side of The Ave.
side chain The 20 standard amino acids each contain an amino group (NH2), a carboxylic acid group (COOH), and one of various side chains (R), with the general formula NH2-CH(R)-COOH. The side chain is the R group that distinguishes one amino acid from another.
silent file A flat-text file that stores poses (structures) computed with Rosetta along with the relevant scores (energies). By default, the file name is default.out, but it may be changed with the -out::silent flag. The file contains only the internal degrees of freedom of a pose (phi, psi, omega, and chi angles). Cartesian coordinates must be restored with the extract_pdbs application.
small molecule For Rosetta, anything that's not a polymeric biomacromolecule
soft_rep An energy function in which the Lennard-Jones potential is softened so that clashes are penalized less severely; contrast "hard_rep"
ss2 File format used to store secondary structure information. Originally introduced by the PsiPred program (by D. Jones)
symmetry definitions symdef files tell Rosetta how to treat a symmetric protein
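
The `grep SCORE` recipe in the scorefile entry above can be mirrored in a few lines of Python. This is a minimal sketch, not a Rosetta tool: it assumes that the first `SCORE` line is a header naming the columns and that later `SCORE` lines hold one pose each. The `SAMPLE` text is made up for illustration, and the function names are hypothetical.

```python
def grep_score(lines):
    """Mimic `grep SCORE`: keep every line that contains the text 'SCORE'."""
    return [ln.rstrip("\n") for ln in lines if "SCORE" in ln]


def _to_number(text):
    """Convert a field to float when possible; otherwise keep the string."""
    try:
        return float(text)
    except ValueError:
        return text


def parse_scores(lines):
    """Turn SCORE lines into one dict per pose, keyed by column name.

    Assumes the first SCORE line is the header (column names)."""
    rows = [ln.split() for ln in grep_score(lines)]
    if not rows:
        return []
    header = rows[0][1:]               # drop the leading 'SCORE:' tag
    records = []
    for fields in rows[1:]:
        values = fields[1:]
        if values == header:           # skip a repeated header line
            continue
        records.append({k: _to_number(v) for k, v in zip(header, values)})
    return records


# Made-up, heavily abbreviated silent-file text (values are illustrative).
SAMPLE = """SEQUENCE: ACD
SCORE: score fa_atr fa_rep description
SCORE: -10.5 -20.0 9.5 model_0001
(coordinate lines omitted)
SCORE: -12.0 -21.0 9.0 model_0002
"""

records = parse_scores(SAMPLE.splitlines())
lowest = min(records, key=lambda rec: rec["score"])
```

Once the rows are dictionaries, picking the lowest-scoring pose or sorting by any energy term is a one-liner, as the `lowest` line shows.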

# T

Term Description
target sequence The sequence of the protein of unknown structure you're trying to model
TaskOperation A specification in RosettaScripts which tells the Packer how to optimize rotamers
test servers The Gray lab maintains a testing server which runs a set of standardized tests on each commit of the code to trunk. The tests ensure that code compiles for each platform and distribution type.
Thai Tom Thai Tom is one of the two 'Rosetta restaurants' that many developers like to visit in Seattle before/after RosettaCON. 4543 University Way NE, Seattle, WA 98105 (it's on the west side of The Ave). Excellent Thai food, can be very spicy. Wait times are a problem if 40 Rosetta people show up at once.
theozyme A theozyme, or "theoretical enzyme," is a concept from enzyme design: a geometrically ideal active site built to stabilize the desired transition state conformation. Once you have set one up, you can thread it onto a pose.
threading Protein threading, also known as fold recognition, is a method of protein modeling (i.e. computational protein structure prediction) used for proteins that have the same fold as proteins of known structure but have no homologous proteins of known structure. Threading is the process of placing the amino acids of a target protein onto the 3D structure of a template according to a sequence alignment. A comparative model of the target protein sequence can then be built.
_Top7 / Top7 paper_ Top7 is the name of a protein de novo designed with Rosetta. Its paper, "Design of a novel globular protein fold with atomic-level accuracy," is also of broad interest for its description of the early energy function.
torsion angle aka dihedral; the degree of freedom of rotating around a bond
torsion space Internal coordinates; torsion space minimization optimizes the protein by rotating dihedrals
trunk trunk is a name for where the developers' current version of Rosetta lives. It's called trunk because it's the main line of the code; side development projects are in branches. Also known as master.
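
To make the "torsion angle" entry concrete, the sketch below computes the dihedral defined by four points using a standard atan2 formula. It is a generic geometry example in plain Python, not Rosetta code; the function names are hypothetical, and the sign of the result follows the particular formula used here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle (degrees) for rotation about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)                  # the central bond
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)                # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)                # normal of the plane p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four points with the central bond along the z axis.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Working in torsion space means treating angles like these as the variables to optimize, rather than the Cartesian coordinates of every atom.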

# U

Term Description
unbound docking Docking in which the crystal PDB structures of the two proteins are determined separately and then combined into one complex

# V

Term Description
Vall Pronounced "V-all". The Vall database is a condensed representation of the entire PDB for the purpose of fragment picking. The fragment picker filters the Vall database based on the sequence and secondary structure predictions (and other information) to pull out those backbone conformations which represent the desired fragments.
van der Waals Describes the interactions between neutral, non-bonded atoms. In protein structure prediction, the term is often used interchangeably with the Lennard-Jones potential.
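
For reference, the textbook 12-6 form of the Lennard-Jones potential is easy to write down. The sketch below is that generic form only; it is not the way Rosetta implements its attractive and repulsive terms, and the function name and parameter values are hypothetical.

```python
def lennard_jones(r, well_depth, r_min):
    """Textbook 12-6 Lennard-Jones energy at separation r.

    well_depth: depth of the energy minimum (positive number)
    r_min:      separation at which the energy is lowest
    """
    ratio6 = (r_min / r) ** 6
    return well_depth * (ratio6 * ratio6 - 2.0 * ratio6)


# Hypothetical parameters: well depth 0.2, minimum at 4.0 (arbitrary units).
at_minimum = lennard_jones(4.0, 0.2, 4.0)       # bottom of the well
clashing = lennard_jones(3.0, 0.2, 4.0)         # atoms too close: repulsive
far_apart = lennard_jones(8.0, 0.2, 4.0)        # weakly attractive
```

The steep rise at short distances is the clash penalty that a softened potential such as soft_rep is meant to reduce.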

# W

Term Description
weights file The file which specifies the coefficients to use when linearly combining score terms into a scoring function.
Winter RosettaCON This is the winter Rosetta developers' meeting, which moves around the country to be hosted by different RosettaCommons labs. We discuss code issues of both wide and narrow interest.
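
The linear combination described in the "weights file" entry can be sketched as follows. This is a simplified illustration, not Rosetta's parser: it assumes a file of plain `term weight` lines and skips anything else, and the weight and energy values are made up.

```python
def parse_weights(text):
    """Read 'term weight' lines into a dict (simplified, assumed format).

    Comments starting with '#' and lines that are not exactly
    'name number' are ignored."""
    weights = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) != 2:
            continue
        try:
            weights[fields[0]] = float(fields[1])
        except ValueError:
            continue
    return weights


def total_score(weights, terms):
    """Weighted sum of score terms; terms missing from `terms` count as 0."""
    return sum(w * terms.get(name, 0.0) for name, w in weights.items())


# Made-up weights and unweighted term values, for illustration only.
WEIGHTS_TEXT = """# hypothetical weights
fa_atr 1.0
fa_rep 0.5
fa_sol 0.75
"""
weights = parse_weights(WEIGHTS_TEXT)
terms = {"fa_atr": -20.0, "fa_rep": 4.0, "fa_sol": 8.0}
score = total_score(weights, terms)
```

Changing a coefficient in the weights text changes how strongly that term contributes to the total without touching the term values themselves.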

# X

Term Description
XML A hierarchical data format; a custom version is used by RosettaScripts