Rosetta 3.3
Documentation for pepspec application
Author:
Chris King dr.chris.king@gmail.com

Metadata

Last updated 4/13/2010; P.I. Phil Bradley pbradley@fhcrc.org

Code and Demo

This application lives in src/apps/pilot/chrisk/pepspec.cc. The demo lives in demo/pepspec. The integration test lives in test/integration/tests/pepspec.

References

C.A. King and P. Bradley, Structure-based prediction of protein–peptide specificity in Rosetta, Proteins 78 (2010), pp. 3437–3449.

Purpose

This application can be used for structure-based prediction of protein-peptide specificity. The algorithm is not restricted to any one peptide-binding domain family and does not require a structure of the target peptide or any information about sequence specificity, although known structural data can be incorporated when available to improve performance. Supplied with a target protein structure and one or more homologous protein-peptide complexes, pepspec will simultaneously design sequence and structure for peptides bound to a region of the target protein. These peptides are then ranked by predicted binding affinity to produce a position-specific scoring matrix for the target protein.

Algorithm

The pepspec application implements an anchored, flexible-backbone peptide docking and design algorithm in which the sequence and structure of the peptide are simultaneously optimized. Rather than performing global peptide docking searches, pepspec requires as input an approximate location for a key "anchor" residue of the peptide; the remainder of the peptide is assembled from fragments as in de novo structure prediction and refined with simultaneous sequence optimization. Backbone flexibility of the protein is optionally incorporated implicitly by docking into a structural ensemble for the protein partner.

Limitations

This application is *NOT* for structure prediction of an entire protein. You need to have a model of the peptide-binding protein, although this model may be derived from experiment, homology modeling, or de novo protein folding.

This application does *NOT* move the backbone of the input protein structure. Backbone ensembles can be generated with the backrub or relax applications.

This application does *NOT* support de novo docking of the peptide anchor residue; at a bare minimum, you need a model of a protein-peptide complex homologous to your target protein. To dock a single residue with no knowledge of where the binding pocket might be, consider using the docking application.

Modes

This application has two major modes: Anchor Docking and Peptide Design.

Anchor Docking: If you already have a structure of the target protein bound to an N-mer peptide, you may not need to do this step. If you need to dock an anchor residue onto your protein, the anchor docking mode allows you to use structures of homologous protein-peptide complexes to predict the position of the anchor residue on your target protein. You provide a single structure of your target protein or an ensemble of structures, along with a set of homologous complexes. The homologues must be aligned to the target protein! The algorithm uses the relative positions of the homologues' anchor residues to dock a new anchor residue to your target protein, and outputs the structures and associated score data for use in the next step.

Peptide Design: In the peptide design phase, putative binding peptides are designed at the surface of the target protein. The algorithm takes as input one or more protein-peptide complexes. The "peptide" may be a single residue docked in the previous phase. The existing peptide is optionally extended from each terminus by a user-defined number of residues, and low-resolution backbone sampling takes place before high-resolution peptide sequence design. The low-resolution step uses a full-atom (not centroid) poly-A or poly-G peptide with a minimal score function that only penalizes atomic clashes and ensures the peptide remains near the surface of the protein. The design phase attempts full combinatorial sequence design, first with soft repulsive atoms and then with full repulsive atoms, followed by minimization. Then, the sequence is diversified using a Monte-Carlo+minimization design phase (which can optionally be run as a multi-state design to optimize binding score instead of total score). In this way, each peptide backbone generates many different peptide sequences.

Sequence-score data is output for post-processing, and protein structures may optionally be saved as well.

Input Files

Anchor Docking Input Files

<homolog_pdb_filename> <peptide_chain> <peptide_anchor_res>

for each homolog. It is highly recommended that you pre-align the homolog structures to your target structure. Rosetta can optionally attempt a sequence alignment followed by a structural alignment, but the resulting alignment may not be ideal.
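As an illustration of the three-field line format above, here is a minimal Python sketch that reads such a list before it is handed to pepspec. It is not part of Rosetta: the names HomologEntry and parse_homolog_list are hypothetical, the example file names are made up, and treating the anchor residue as an integer is an assumption based only on the placeholder name.

```python
from typing import List, NamedTuple


class HomologEntry(NamedTuple):
    """One line of the homolog list (hypothetical container, not a Rosetta type)."""
    pdb_filename: str
    peptide_chain: str
    peptide_anchor_res: int  # assumed to be an integer residue number


def parse_homolog_list(text: str) -> List[HomologEntry]:
    """Parse whitespace-separated '<pdb> <chain> <anchor_res>' lines, skipping blanks."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore empty lines
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        pdb, chain, anchor = fields
        entries.append(HomologEntry(pdb, chain, int(anchor)))
    return entries


if __name__ == "__main__":
    example = "homolog1.pdb B 5\nhomolog2.pdb C 12\n"  # made-up file names
    for entry in parse_homolog_list(example):
        print(entry)
```

A check like this catches malformed lines early, before a long docking run is started.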


<atom_name> <peptide_position> <x_coord> <y_coord> <z_coord> <0.0> <std_dev> <tolerance>
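The eight-field line format above can be read with a short Python sketch such as the following. This is only an illustration: the names CoordConstraint and parse_constraint_line are hypothetical, the field types are inferred from the placeholder names alone, and the sixth field is stored without interpretation because the format shows it only as the literal 0.0.

```python
from typing import NamedTuple


class CoordConstraint(NamedTuple):
    """One constraint line (hypothetical container; types inferred from field names)."""
    atom_name: str
    peptide_position: int
    x: float
    y: float
    z: float
    fixed_value: float  # shown as <0.0> in the format; kept as given
    std_dev: float
    tolerance: float


def parse_constraint_line(line: str) -> CoordConstraint:
    """Split one whitespace-separated line into its eight fields."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    atom_name, position = fields[0], int(fields[1])
    x, y, z, fixed_value, std_dev, tolerance = (float(f) for f in fields[2:])
    return CoordConstraint(atom_name, position, x, y, z, fixed_value, std_dev, tolerance)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(parse_constraint_line("CA 2 1.5 -3.0 10.25 0.0 0.5 1.0"))
```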

Options

You will probably only need to use General and Typical Options. These options will make more sense after you read the Tips section below.

-option:name [data_type] - this is a description (default_value)

General Options

Anchor Docking Options (Typical)

Anchor Docking Options (Extra)

Pepspec Options (Typical)

Pepspec Options (Extra)

Benchmarking Options

Tips

Anchor Docking Input Files


Expected Outputs

This application produces protein-peptide structures and scorefiles. The scorefiles may be used to generate sequence-specificity position-weight matrices by using the scripts described below.

Post Processing

A position-weight matrix (PWM) can be generated from pepspec output using the script gen_pepspec_pwm.py found in (ROSETTA_LOCATION)/analysis/apps. This script will sort all peptide sequences by Rosetta binding score and generate a matrix of peptide positions by residue frequencies. A background PWM can optionally be supplied to normalize the raw pepspec output PWM (see References at the top of this document). Run 'gen_pepspec_pwm.py help' for more information.
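The sort-then-count idea described above can be sketched in a few lines of Python. This is not the implementation of gen_pepspec_pwm.py; the function name build_pwm, the top_fraction cutoff, and the in-memory (sequence, score) input are all hypothetical, and the sketch assumes that a lower Rosetta binding score means better predicted binding.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_pwm(scored: List[Tuple[str, float]],
              top_fraction: float = 0.1) -> List[Dict[str, float]]:
    """Return per-position residue frequencies over the best-scoring sequences.

    scored: (peptide_sequence, binding_score) pairs; lower score is assumed better.
    top_fraction: hypothetical cutoff for how many ranked sequences to keep.
    """
    if not scored:
        raise ValueError("no sequences given")
    lengths = {len(seq) for seq, _ in scored}
    if len(lengths) != 1:
        raise ValueError("all peptide sequences must have the same length")
    peptide_length = lengths.pop()

    # Rank by score and keep the best-scoring fraction (at least one sequence).
    ranked = sorted(scored, key=lambda pair: pair[1])
    n_keep = max(1, int(len(ranked) * top_fraction))
    kept = [seq for seq, _ in ranked[:n_keep]]

    # Count residue occurrences at each peptide position and normalize.
    pwm = []
    for pos in range(peptide_length):
        counts = Counter(seq[pos] for seq in kept)
        pwm.append({aa: n / n_keep for aa, n in counts.items()})
    return pwm


if __name__ == "__main__":
    # Made-up sequences and scores, for illustration only.
    data = [("AK", -10.0), ("AR", -9.0), ("GK", -1.0), ("GG", 5.0)]
    for position, freqs in enumerate(build_pwm(data, top_fraction=0.5)):
        print(position, freqs)
```

Normalizing against a background PWM, as mentioned above, would be a further step applied to these raw frequencies.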

New things since last release

Nope.
