Structure refinement with RDC constraint

Hi~

I've done a structure refinement of a protein with Rosetta 3.8.

However, I'm not sure whether what I've done is correct.

The protocol I followed is listed below:

First, I submitted my protein sequence (.fasta file), torsion angles (predicted by TALOS+), and RDC values (experimental data) to the website (http://robetta.bakerlab.org/fragmentsubmit.jsp) for fragment generation.

Second, I ran Rosetta with the command "minirosetta.default.linuxgccrelease @HGDC_broker_cst.options".

The file "HGDC_broker_cst.options" is shown below:

============HGDC_broker_cst.options======================================================================================================
#make sure all variable names have been replaced with absolute path and that no line begins with a $ or ~s
-in
            -file
                           -native HGDC.pdb # native PDB file (optional)
                           -fasta HGDC.fasta # protein sequence in fasta format
                           -frag3 aat000_03_06.200_v1_3 # protein 3-residue fragments file
                           -frag9 aat000_09_06.200_v1_3 # protein 9-residue fragments file
                           -rdc HGDC_native_RDC.rdc
-abinitio
            -increase_cycles 10 # Increase the number of cycles at each stage in AbinitioRelax by this factor
            -rg_reweight 0.5 # Reweight contribution of radius of gyration to total score by this scale factor
            -rsd_wt_helix 0.5 # Reweight env, pair, and cb scores for helix residues by this factor
            -rsd_wt_loop 0.5 # Reweight env, pair, and cb scores for loop residues by this factor
            -stage1_patch rdc_patch.wts_patch
            -stage2_patch rdc_patch.wts_patch
            -stage3a_patch rdc_patch.wts_patch
            -stage3b_patch rdc_patch.wts_patch
            -stage4_patch rdc_patch.wts_patch
            -relax
-run
            -protocol abrelax
            -reinitialize_mover_for_each_job # jd generate fresh copy of its mover before each apply (once per job)
-score
            -find_neighbors_3dgrid # Use a 3D lookup table for doing neighbor calculations. For spherical, well-distributed conformations
            -patch /opt/rosetta/tools/protein_tools/scripts/generate_atom_pair_constraint_file.py
-out
            -nstruct 1000 # how many structures do you want to generate?  Usually want to fold at least 1,000.
            -file
                           -silent HGDC_broker_cst.out # full path to silent file output
                           -silent_struct_type binary # we want binary silent files
                           -scorefile HGDC_broker_cst.fsc
-user_tag 002
-overwrite # overwrite any existing output with the same name you may have generated
-fold_cst
-force_minimize # minimize the structure after making a move, even if no restraints given
==============================================================================================================================

 

Finally, I've got a score.fsc file like this:

SCORE:     score     fa_atr     fa_rep     fa_sol    fa_intra_rep    fa_elec    pro_close    hbond_sr_bb    hbond_lr_bb    hbond_bb_sc    hbond_sc    dslf_fa13       rama      omega     fa_dun    p_aa_pp    yhh_planarity        ref    Filter_Stage2_aBefore    Filter_Stage2_bQuarter    Filter_Stage2_cHalf    Filter_Stage2_dEnd    co        rms     maxsub    clashes_total    clashes_bb       time    user_tag    description
SCORE:  -227.165   -772.819     83.957    444.286           2.071    -84.559        0.478        -18.327        -42.232        -18.666     -13.370        0.000     -8.885     17.856    234.566    -30.453            0.395    -21.463                    0.000                     0.000                  0.000                 0.000 27.286     19.088     54.000            0.000         0.000    719.000 002 S_002_00000001
SCORE:  -247.325   -770.395     78.062    439.372           2.099    -90.080        0.374        -20.420        -38.220        -12.950     -20.084        0.000    -13.605     15.676    235.575    -31.396            0.130    -21.463                    0.000                     0.000                  0.000                 0.000 20.812     15.127     68.000            0.000         0.000    687.000 002 S_002_00000002

 

 

My questions are:

1. I don't understand what the flags -run:reinitialize_mover_for_each_job, -score:find_neighbors_3dgrid, and -out:file:silent_struct_type do.

Should I use them for structure refinement?

Where can I find a detailed explanation of them?

(I found this page (https://www.rosettacommons.org/docs/latest/full-options-list) that describes them, but the descriptions are not very clear.)

2. In my score.fsc file, what do fa_atr, fa_rep, and so on stand for?

Most tutorials suggest comparing the score term. What about the other terms? Are they all equally important?

3. If I want the RMSD of each structure relative to the lowest-energy structure, what flags should I use?

(just like the CS-Rosetta online server does)

4. Can I refine an approximate structure by providing a PDB file and other constraints, instead of rebuilding the structure from picked fragments?

I'd really appreciate your advice and suggestions.

Thu, 2018-01-18 00:44
shushunur

1) The page you found is probably the best documentation of options. Depending on the option, there may be more explanation elsewhere in the documentation.

     -run:reinitialize_mover_for_each_job is just a safety measure. The "mover" is what actually acts on the structure. Some movers have internal state that can carry over from one output structure to the next (and in fact, some protocols rely on this). The -run:reinitialize_mover_for_each_job option says that each output structure (each job) should get a fresh copy of the mover at the start of the protocol, so you don't inadvertently carry state over between output structures. I'd keep this on. It won't appreciably slow your protocol.
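To make the idea of carried-over state concrete, here is a toy Python sketch. It is not Rosetta code: `ToyMover`, `run_jobs`, and the call counter are hypothetical names invented for illustration, and the counter simply stands in for whatever internal state a real mover might keep.

```python
import copy


class ToyMover:
    """Hypothetical stand-in for a mover; not a Rosetta class."""

    def __init__(self):
        self.calls = 0  # internal state that survives between apply() calls

    def apply(self, structure_name):
        self.calls += 1
        return f"{structure_name} (mover call #{self.calls})"


def run_jobs(prototype, n_jobs, reinitialize):
    """Run n_jobs jobs, optionally giving each job a fresh copy of the mover."""
    results = []
    mover = copy.deepcopy(prototype)
    for job in range(1, n_jobs + 1):
        if reinitialize:
            # Fresh copy per job: nothing leaks from the previous structure.
            mover = copy.deepcopy(prototype)
        results.append(mover.apply(f"S_{job:08d}"))
    return results


shared = run_jobs(ToyMover(), 3, reinitialize=False)
fresh = run_jobs(ToyMover(), 3, reinitialize=True)
print(shared)  # the call counter keeps growing across jobs
print(fresh)   # every job sees a mover whose counter starts from zero
```

With the shared mover, the third job sees a mover that has already been applied twice. With a fresh copy per job, every job starts from the same state.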

   -out:file:silent_struct_type sets the type of silent file output. Silent files are an efficient but Rosetta-only file format. There are two main silent file formats: protein and binary. Protein silent files are only for proteins and require the protein to have ideal bond lengths and angles. Binary silent files can handle more residue types and arbitrary atom positions. Rosetta has some autodetection logic to choose the best format, but there's nothing wrong with telling it to always use binary silent files, since that is the more general format. (And contrary to the name, a binary silent file is still an ASCII-only text file.)

   -score:find_neighbors_3dgrid: I haven't encountered this before, but it apparently changes how the scoring function finds neighboring residues. From what I can determine, it should speed up runs for very large protein/protein complexes. It would have minimal or even negative impact for smaller protein systems, which is why it isn't turned on by default.
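The option's own description mentions a "3D lookup table" for neighbor calculations. As a rough illustration of that general idea (a generic grid or cell-list search, not Rosetta's actual implementation), the Python sketch below bins points into cubic cells and only compares points in adjacent cells. The function names and the sample coordinates are made up for the example; `math.dist` needs Python 3.8 or newer.

```python
import math
from collections import defaultdict
from itertools import product


def neighbors_brute_force(points, cutoff):
    """Reference version: compare every pair of points."""
    pairs = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                pairs.add((i, j))
    return pairs


def neighbors_grid(points, cutoff):
    """Grid version: only compare points that fall in adjacent cells."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        # Cell edge equals the cutoff, so any neighbor lies in one of the
        # 27 cells surrounding (and including) the point's own cell.
        cell = tuple(math.floor(c / cutoff) for c in p)
        grid[cell].append(idx)

    pairs = set()
    for (cx, cy, cz), members in list(grid.items()):
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            other = grid.get((cx + dx, cy + dy, cz + dz), [])
            for i in members:
                for j in other:
                    if i < j and math.dist(points[i], points[j]) <= cutoff:
                        pairs.add((i, j))
    return pairs


# Made-up coordinates, purely for illustration.
points = [(0, 0, 0), (1, 0, 0), (5, 5, 5), (5.5, 5, 5), (20, 0, 0)]
print(sorted(neighbors_grid(points, cutoff=2.0)))
```

For a handful of points the bookkeeping costs more than it saves, which is consistent with the remark that the option mainly pays off for very large systems.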

2) For a good rundown of all of Rosetta's energy terms, see Alford et al. (https://doi.org/10.1021/acs.jctc.7b00125). See also https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/score-types and the links in the "See Also" section of that page.
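One thing that can be checked directly from the score file posted above: in both rows, the columns from fa_atr through ref add up to the score column, which suggests that those per-term values are already weighted contributions to the total. The Python sketch below does that arithmetic. The parser, the `ENERGY_TERMS` list, and the trimmed copy of the two rows (only the score-through-ref columns plus the description) are assumptions made for this illustration, and the check is only verified for these two rows.

```python
# Trimmed copy of the two rows from the post: energy columns plus description.
SCOREFILE_TEXT = """\
SCORE: score fa_atr fa_rep fa_sol fa_intra_rep fa_elec pro_close hbond_sr_bb hbond_lr_bb hbond_bb_sc hbond_sc dslf_fa13 rama omega fa_dun p_aa_pp yhh_planarity ref description
SCORE: -227.165 -772.819 83.957 444.286 2.071 -84.559 0.478 -18.327 -42.232 -18.666 -13.370 0.000 -8.885 17.856 234.566 -30.453 0.395 -21.463 S_002_00000001
SCORE: -247.325 -770.395 78.062 439.372 2.099 -90.080 0.374 -20.420 -38.220 -12.950 -20.084 0.000 -13.605 15.676 235.575 -31.396 0.130 -21.463 S_002_00000002
"""

# Columns treated as energy terms here (an assumption based on the header).
ENERGY_TERMS = [
    "fa_atr", "fa_rep", "fa_sol", "fa_intra_rep", "fa_elec", "pro_close",
    "hbond_sr_bb", "hbond_lr_bb", "hbond_bb_sc", "hbond_sc", "dslf_fa13",
    "rama", "omega", "fa_dun", "p_aa_pp", "yhh_planarity", "ref",
]


def parse_scorefile(lines):
    """Return one dict per structure, keyed by the header column names."""
    header = None
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("SCORE:"):
            continue  # skip anything that is not a score line
        fields = line.split()[1:]
        if header is None:
            header = fields  # the first SCORE: line names the columns
        else:
            rows.append(dict(zip(header, fields)))
    return rows


rows = parse_scorefile(SCOREFILE_TEXT.splitlines())
for row in rows:
    term_sum = sum(float(row[term]) for term in ENERGY_TERMS)
    print(row["description"], float(row["score"]), round(term_sum, 3))
```

The remaining columns in the full file (co, rms, maxsub, clashes, time, and the filter columns) are not part of this sum.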

3) Most Rosetta protocols are once-through, so to compute an RMSD to one of the output structures, you'll need to do a separate Rosetta run. If all you want is the RMSD, this is relatively straightforward. Simply do something like: `score.linuxgccrelease -in:file:silent HGDC_broker_cst.out -in:file:native lowest_energy_structure.pdb -out:file:scorefile rmsd.sc -score:weights empty`. Unfortunately, you will need to extract the lowest-energy structure as a PDB before running it. (The score application can't handle silent file-based "natives".)
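Before extracting anything, you need to know which structure has the lowest energy. A minimal Python sketch for finding its tag from a score file follows. The function name is hypothetical, and the sample lines are a trimmed version of the two rows in the post (score, rms, and description only); for a real file you would pass the lines of HGDC_broker_cst.fsc instead.

```python
def lowest_energy_tag(lines):
    """Return (description, score) for the lowest-scoring SCORE: line."""
    header = None
    best = None
    for line in lines:
        line = line.strip()
        if not line.startswith("SCORE:"):
            continue  # skip anything that is not a score line
        fields = line.split()[1:]
        if header is None:
            header = fields  # the first SCORE: line names the columns
            continue
        row = dict(zip(header, fields))
        score = float(row["score"])
        if best is None or score < best[1]:
            best = (row["description"], score)
    return best


# Trimmed sample taken from the two rows shown in the post.
sample = [
    "SCORE: score rms description",
    "SCORE: -227.165 19.088 S_002_00000001",
    "SCORE: -247.325 15.127 S_002_00000002",
]
tag, score = lowest_energy_tag(sample)
print(tag, score)
```

The resulting tag identifies the structure to pull out of the silent file as a PDB. Rosetta's extract_pdbs application with -in:file:tags is one way to do that, though check the documentation for your version.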

4) I'm not entirely sure what you mean by this, but if you're talking about starting from an existing structure rather than a fasta sequence file, the protocol you are currently using cannot do so (it always starts from an extended sequence). However, if you look into "comparative modeling" protocols, you should be able to find something related that will likely work. Comparative modeling also works if you use a (partial) structure of the current protein as your "homolog". (This approach has had some success with refinement into electron density information.)

Tue, 2018-01-30 09:08
rmoretti