Fragments are used in the assembly of proteins, whether for structure prediction or design, to cut down on the size of the protein-folding search space. Fragment libraries are used by many Rosetta protocols and are a core part of ab initio.
Fragment libraries follow a complex naming scheme. Terms in fragment filenames:
- series: The two-character code used to disambiguate Rosetta runs. For fragment libraries this is almost always aa.
- pdb: The four-character code of the PDB the fragments were generated for.
- chain: The one-character code of the protein chain the fragments were generated for. Usually "_" for all.
- size: Fragment size, either a 3-mer or 9-mer. Acceptable values are 03 and 09.
- strategy: Fragment selection strategy used in NNMake. Acceptable values are 04, 05 and 06. See below.
- depth: Number of fragments of descending score which are kept in the library, usually 200.
- version: Version of NNMake, usually v1_3.
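The terms above can be pulled out of a filename mechanically. The following is a minimal sketch, not part of Rosetta: the exact layout (series, pdb, chain, size, "_", strategy, ".", depth, "_", version) is an assumption inferred from an example name such as aa2ptl_03_05.200_v1_3, and the function name `parse_fragment_name` is hypothetical.

```python
import re

# Assumed layout, inferred from the example aa2ptl_03_05.200_v1_3:
# <series:2><pdb:4><chain:1><size:2>_<strategy:2>.<depth>_<version>
FRAGMENT_NAME = re.compile(
    r"^(?P<series>.{2})"        # two-character series code, usually "aa"
    r"(?P<pdb>.{4})"            # four-character PDB code
    r"(?P<chain>.)"             # one-character chain id, "_" for all
    r"(?P<size>03|09)_"         # fragment size: 3-mer or 9-mer
    r"(?P<strategy>04|05|06)\." # NNMake selection strategy
    r"(?P<depth>\d+)_"          # number of fragments kept per position
    r"(?P<version>v\d+_\d+)$"   # NNMake version, e.g. v1_3
)


def parse_fragment_name(filename):
    """Split a fragment library filename into its named terms."""
    match = FRAGMENT_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a recognised fragment filename: {filename!r}")
    terms = match.groupdict()
    terms["size"] = int(terms["size"])    # "03" -> 3, "09" -> 9
    terms["depth"] = int(terms["depth"])  # "200" -> 200
    return terms


if __name__ == "__main__":
    print(parse_fragment_name("aa2ptl_03_05.200_v1_3"))
```

Names that do not fit the assumed layout raise a `ValueError` rather than being parsed loosely, so a misnamed library is noticed early.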
Making Fragments from Free Web Server: Robetta Server
Making Fragments by yourself:
DATABASES:
- nr: downloadable from ftp://ftp.ncbi.nih.gov/blast/db/
- nnmake_database: included in release
- chemshift_database: included in release
PROGRAMS:
- PSI-BLAST
- PSIPRED
- JUFO
- PROFphd
- SAM
- nnmake: included in release
- chemshift: included in release
Configure the paths at the top of nnmake/make_fragments.pl to point to these databases and programs. PSI-BLAST must be installed locally. After PSI-BLAST and PSIPRED are installed, refer to the PSIPRED README, or see the quick directions below, on how to create a filtered "NR" sequence data bank called "filtnr", which is also used by make_fragments.pl. Quick directions for creating filtnr:
tcsh% pfilt nr.fasta > filtnr
tcsh% formatdb -t filtnr -i filtnr
tcsh% cp filtnr.p?? $BLASTDB
Obtain a fasta file for the desired sequence. This file must have 60 characters/line with no white space. First line can be a comment starting with the '>' character.
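If a sequence file does not already meet the 60 characters per line requirement, it can be rewrapped before running the fragment maker. The following is a minimal sketch under stated assumptions: it handles a single-record file only, and the function name `wrap_fasta` is hypothetical, not part of the Rosetta scripts.

```python
def wrap_fasta(text, width=60):
    """Return single-record FASTA text with whitespace removed and
    the sequence rewrapped to `width` characters per line."""
    header = None
    pieces = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None or pieces:
                raise ValueError("only a single FASTA record is supported")
            header = line  # optional comment line, kept unchanged
            continue
        pieces.append("".join(line.split()))  # drop any internal white space
    sequence = "".join(pieces)
    lines = [header] if header is not None else []
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    raw = ">2ptl_ example\n" + "A" * 75 + "\n" + "C C C\n"
    print(wrap_fasta(raw), end="")
```

The header line is left untouched; only the sequence lines are joined and rewrapped.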
Obtain secondary structure predictions from web servers, or set up shareware locally so that make_fragments.pl can run secondary structure predictions locally. The fragment maker can use predictions from psipred (.jones or .psipred extension), PhD (.phd), SAM-T99 in rdb format (.rdb) and jufo (.jufo). At least one prediction must be used, and up to three can be used. The getSSpred.pl script can be used to obtain predictions off the web. Edit the config portion of this script to include your email address and the correct path to the httpget script. To use this script, provide the fasta filename and the desired method (invoke the command without arguments to see the usage explanation). Retrieve the secondary structure predictions from your email inbox.
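Before running the fragment maker it can help to see which prediction files are already in place. The following is a minimal sketch: the extension-to-method mapping comes from the paragraph above, while the assumption that each file is named as the five-character base name plus its extension, and the helper names `classify_predictions` and `usable_count`, are illustrative only.

```python
# Extensions listed in the text, grouped by prediction method.
SS_METHODS = {
    "psipred": (".psipred", ".jones"),
    "phd": (".phd",),
    "sam": (".rdb",),
    "jufo": (".jufo",),
}


def classify_predictions(filenames, base):
    """Map each prediction method to the first matching file name.

    Assumes files are named <base><extension>, e.g. 2ptl_.psipred.
    """
    found = {}
    for name in filenames:
        for method, extensions in SS_METHODS.items():
            if any(name == base + ext for ext in extensions):
                found.setdefault(method, name)
    return found


def usable_count(found):
    """At least one prediction must be used, and at most three."""
    return 1 <= len(found) <= 3


if __name__ == "__main__":
    import os
    predictions = classify_predictions(os.listdir("."), "2ptl_")
    print(predictions, usable_count(predictions))
```

A result with no entries does not necessarily mean the run will fail, since make_fragments.pl may be configured to run the methods locally.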
(Optional) Prepare files with NMR data if available. These include the .cst and .dpl files, which are the same files that Rosetta uses, and the .chsft_in file, which contains chemical shift information. The information from these files can help Rosetta pick better fragments. See the file 'data_formats.README' for the formatting information.
Run make_fragments.pl. Invoke without arguments for usage options. Likely the only argument you need to provide is the fasta file.
$> make_fragments.pl -verbose 2ptl_.fasta
Two fragment files will be generated with names like aa2ptl_03_05.200_v1_3 and aa2ptl_09_05.200_v1_3. The prefix "aa" can be changed with the -xx option. "2ptl_" is the five-letter base name, which can be specified with the -id option or is otherwise derived from the name of the fasta file. 03 and 09 indicate the lengths of the fragments.
If you want to exclude homologous sequences from the fragment search, add the -nohoms argument.
$> make_fragments.pl -verbose -nohoms 2ptl_.fasta
Note that if you want to exclude homologs from the chemical shift/TALOS search, you need to edit the talos database. See the README in the chemshift_source directory for instructions.
If you do not have a particular type of secondary structure prediction (say the .jufo file) and you do NOT want make_fragments.pl to try to run the method locally, use the -nojufo option.
$> make_fragments.pl -verbose -nohoms -nojufo 2ptl_.fasta
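When fragment runs are scripted for several targets, the command line can be assembled programmatically. The following is a minimal sketch that uses only the flags mentioned in this section (-verbose, -nohoms, -nojufo). The function name `build_make_fragments_command` is hypothetical, and the script is assumed to be reachable as make_fragments.pl on the PATH.

```python
def build_make_fragments_command(fasta, verbose=True, nohoms=False,
                                 nojufo=False, script="make_fragments.pl"):
    """Assemble the argument list for a make_fragments.pl run."""
    command = [script]
    if verbose:
        command.append("-verbose")
    if nohoms:
        command.append("-nohoms")   # exclude homologous sequences
    if nojufo:
        command.append("-nojufo")   # do not try to run jufo locally
    command.append(fasta)           # the fasta file comes last
    return command


if __name__ == "__main__":
    print(" ".join(build_make_fragments_command("2ptl_.fasta", nohoms=True)))
```

The returned list can be handed to `subprocess.run(command, check=True)` to launch the run from the directory that holds the input files.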
Generate a loop library in addition to the fragment files. Run make_fragments.pl with the -template option, such as (the five-letter code is 2ptl_ in this example):
$> make_fragments.pl -template 2ptl_ 2ptl_.fasta
This requires 2ptl_.pdb and 2ptl_.zones to be present in your run directory. The pdb is a template pdb file generated by createTemplate.pl, as described in README.loops. Loops are defined from the zones file, and a library of loop conformations for each defined loop is compiled, based on fragment picking, into a file called "2ptl_.loops_all" (which usually contains 2000 loop conformations). The script "trimLoopLibrary.pl" is then called automatically to reduce the size of the loop library and write the result to "2ptl_.loops". This file is later used in the Rosetta loop modeling mode to build variable loops onto the template structure. A loop library differs from a fragment library mainly in that geometrical information is used to pick "loop" fragments of the desired length that can roughly close the gap based on the "take-off" stub positions.
A newer version of the vall database (2006-05-05) is provided in nnmake_database together with the original version (2001-02-02). You can make fragments using either version; just modify make_fragments.pl to point to the version you want to use. Currently, making a loop library only works with the 2001-02-02 version, as some newly developed loop modeling methods no longer need a loop library.
NOTES:
1. Name all your files with a five-character base name followed by the appropriate extension. The base name should be the four-letter pdb code and one-letter chain id.
2. See also pNNMAKE for a listing of the files involved in the fragment process.
3. If a pdb file is in the directory you are making fragments in, nnmake will evaluate the fragment match to the pdb. Note that if the pdb file disagrees with the fasta file, the program will detect an error and stop.
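A loop-library run depends on several input files sharing the same five-character base name, so a quick check before launching can save a failed run. The following is a minimal sketch: the required extensions (.fasta, .pdb, .zones) are taken from the text, while the function name `missing_template_inputs` is hypothetical.

```python
TEMPLATE_EXTENSIONS = (".fasta", ".pdb", ".zones")


def missing_template_inputs(base, filenames):
    """Return the input files a -template run needs but that are absent.

    `base` is the five-character base name (four-letter pdb code plus
    one-letter chain id); `filenames` lists the run directory contents.
    """
    if len(base) != 5:
        raise ValueError(f"base name must be five characters: {base!r}")
    present = set(filenames)
    return [base + ext for ext in TEMPLATE_EXTENSIONS
            if base + ext not in present]


if __name__ == "__main__":
    import os
    print(missing_template_inputs("2ptl_", os.listdir(".")))
```

An empty list means all three inputs were found; it does not check that the pdb and fasta sequences agree, which nnmake verifies itself.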
How to Make a vall Without Knowing What You are Doing (Jack Schonbrun, May 27, 2004): You need to use a Rosetta executable to make vall data. Please check "How to make a Vall" for details.