Cannot use SimpleMetrics


Hello,

I am trying to use SimpleMetrics to report the total_score after successive steps of my RosettaScripts protocol for homology modelling, but I get an error saying that <SIMPLE_METRICS> is not expected (XML script and error output below).

XML script:

<ROSETTASCRIPTS>

    <TASKOPERATIONS>
    </TASKOPERATIONS>

    <SCOREFXNS>
        <ScoreFunction name="stage1" weights="stage1" symmetric="0">
            <Reweight scoretype="atom_pair_constraint" weight="1"/>
        </ScoreFunction>
        <ScoreFunction name="stage2" weights="stage2" symmetric="0">
            <Reweight scoretype="atom_pair_constraint" weight="0.5"/>
        </ScoreFunction>
        <ScoreFunction name="fullatom" weights="stage3_rlx" symmetric="0">
            <Reweight scoretype="atom_pair_constraint" weight="0.5"/>
            <Reweight scoretype="pro_close" weight="0.0" />
            <Reweight scoretype="cart_bonded" weight="0.625" />
        </ScoreFunction>
        <ScoreFunction name="ref_2k15" weights="ref2015" symmetric="0">
        </ScoreFunction>
        <ScoreFunction name="ref_2k15_cart" weights="ref2015" >
            <Reweight scoretype="pro_close" weight="0.0" />
            <Reweight scoretype="cart_bonded" weight="0.625" />
        </ScoreFunction>
    </SCOREFXNS>

    <FILTERS>
    </FILTERS>
    
    <SIMPLE_METRICS>
        <TotalEnergyMetric name="total_energy" scorefxn="ref_2k15"/>
    </SIMPLE_METRICS>    

    <MOVERS>
        <Hybridize name="hybridize" stage1_scorefxn="stage1" stage2_scorefxn="stage2" fa_scorefxn="fullatom" batch="1" stage1_increase_cycles="1.0" stage2_increase_cycles="1.0" linmin_only="1">
            <Fragments three_mers="../../fragments/aat000_03_05.200_v1_3" nine_mers="../fragments/aat000_09_05.200_v1_3"/>
            <Template pdb="inputs/model1_0007.pdb" cst_file="AUTO" weight="1.000" />
            <Template pdb="inputs/model2_0010.pdb" cst_file="AUTO" weight="1.000" />
            <Template pdb="inputs/model3_0010.pdb" cst_file="AUTO" weight="1.000" />
            <Template pdb="inputs/model4_0009.pdb" cst_file="AUTO" weight="1.000" />
            <Template pdb="inputs/model5_0010.pdb" cst_file="AUTO" weight="1.000" />
        </Hybridize>
        <FastRelax name="relax" scorefxn="ref_2k15" repeats="8" min_type="dfpmin_armijo_nonmonotone">
            <MoveMap>
                <Chain number="1" chi="True" bb="True"/>
            </MoveMap>
        </FastRelax>
        <MinMover name="min_cart" scorefxn="ref_2k15_cart" chi="true" bb="true" cartesian="T" />
        <RunSimpleMetrics name="report_score_1" metrics="total_energy" prefix="score_hybridize_" />
        <RunSimpleMetrics name="report_score_2" metrics="total_energy" prefix="score_relax_" />        
        <RunSimpleMetrics name="report_score_3" metrics="total_energy" prefix="score_min_" />
        <ClearConstraintsMover name="clear_constraints" />
    </MOVERS>

    <PROTOCOLS>
        <Add mover="hybridize"/>
        <Add mover="report_score_1"/>
        <Add mover="relax"/>
        <Add mover="report_score_2"/>
        <Add mover="min_cart" />
        <Add mover="report_score_3"/>
    </PROTOCOLS>

    <OUTPUT scorefxn="ref_2k15" />

</ROSETTASCRIPTS>

 

Error output:

Error messages were:
From line 29:
Error: Element 'SIMPLE_METRICS': This element is not expected. Expected is ( PROTOCOLS ).

24:     </SCOREFXNS>
25:
26:     <FILTERS>
27:     </FILTERS>
28:     
29:     <SIMPLE_METRICS>
30:         <TotalEnergyMetric name="total_energy" scorefxn="ref_2k15"/>
31:     </SIMPLE_METRICS>    
32:
33:     <MOVERS>
34:         <Hybridize name="hybridize" stage1_scorefxn="stage1" stage2_scorefxn="stage2" fa_scorefxn="fullatom" batch="1" stage1_increase_cycles="1.0" stage2_increase_cycles="1.0" linmin_only="1">


 

Is there a way of fixing this? Or am I doing something wrong?

 

Regards,

Veda.

Sun, 2018-06-10 23:38
Vedasheersh

SimpleMetrics are a relatively recent addition. If you're not running a recent weekly release, they might not be available in the version you're using.

If you don't want to update, you can try using the FilterReportAsPoseExtraScoresMover https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/FilterReportAsPoseExtraScoresMover with one of the scoring filters. This is also somewhat recent, though it's been around longer than SimpleMetrics, so it may be present in your version.
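As a rough illustration of that fallback, the fragment below is a minimal sketch rather than a tested script. It assumes the ScoreType filter and the `report_as` / `filter_name` attributes as described on the linked documentation page, so check both against your Rosetta version. The names `total_score_filter` and `score_hybridize_total` are hypothetical, and the high threshold is only there so the filter reports a value without rejecting anything.

```xml
<!-- Sketch only: fragments to merge into the existing script, replacing the
     SIMPLE_METRICS block and the RunSimpleMetrics movers. -->
<FILTERS>
    <!-- Hypothetical filter name; reports total_score under the ref_2k15 score function. -->
    <ScoreType name="total_score_filter" scorefxn="ref_2k15" score_type="total_score" threshold="999999"/>
</FILTERS>

<MOVERS>
    <!-- Stores the filter's value in the pose extra scores under the label given in report_as. -->
    <FilterReportAsPoseExtraScoresMover name="report_score_1" report_as="score_hybridize_total" filter_name="total_score_filter"/>
</MOVERS>

<PROTOCOLS>
    <Add mover="hybridize"/>
    <Add mover="report_score_1"/>
</PROTOCOLS>
```

One such mover per reporting point (after hybridize, relax, and minimisation), each with its own `report_as` label, would mirror the three RunSimpleMetrics movers in the original script.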

Mon, 2018-06-11 08:16
rmoretti

Hi rmoretti,

Thank you for the reply. I understand the point. I will try with the latest weekly release.

I have some questions about homology modelling with RosettaCM, and I would be happy to hear your suggestions. I have a relatively small protein (95 aa long) with a decent number of templates (around 5) at ~25% sequence identity. I want to generate a homology model for this protein and later use it for designing single-point and double-point mutations.

  1. From the docs, I read that an nstruct of ~10^6 is required for structure determination (https://www.rosettacommons.org/docs/latest/getting_started/Rosetta-on-different-scales). But I do not have supercomputing facilities (just a 4 GHz 8-core workstation). So I thought of using more refined models as input to RosettaCM, i.e., output models from other servers (Itasser / SwissModel / Robetta), to lower the nstruct needed for the results to converge. Is this approach correct in my case? If yes, how many structures do I need to generate?
  2. If I do separate runs with different sets of input structures (say one set using models from SwissModel and the other using models from Itasser), can I merge all the output models and use clustering to arrive at the final structure? Can this improve the chances of finding the true conformation?
  3. Finally, what is the best way to choose the final structures for modelling mutations? I read that clustering the output models is necessary, but in an initial test on around 800 models, clustering at 1.1 Angstrom gave ~160 clusters, which is too many. So I used a 2 Angstrom cutoff, which gave only 32 clusters (the largest containing ~400 models). But many models with very different scores (ranging from -300 to -30) fall in the same cluster, i.e., they are structurally the same but have different scores. Also, many models from different clusters have the same score, i.e., they are structurally different but score the same. How can I then tell whether my results are converging, and how should I choose the final model for further analysis?

 

Sorry for the long questions. Please let me know if you need any further clarification on any of the points.

Veda.

Mon, 2018-06-11 19:25
Vedasheersh

The short answer is that you are asking good questions.  The long answer is that the way to find out the answer is to do a long research project.

I don't think you need 10^6 for homology modeling - you need that for ab initio. You should be able to get away with a LOT less if you have good alignments and good templates, probably on the order of thousands. Another way to look at it is: "if the answer is converging, I have enough models; if the answer is not converging, I don't have enough". If you run 10 models and none are similar: not enough. If you run 1000 and all of the top 10 by energy are similar: you've probably done enough.

I'm not sure what you mean by using outputs from other programs as input to RosettaCM: as templates?  If they are good models, then yes it will make RosettaCM better and you can run fewer models; if they aren't good models it won't help.  I don't know how you'd know if they were good models; if you already had good models there then presumably you'd be done with the problem already.

You can mix model populations in clustering if you want - it may be laborious to set up the input - but unless they've all been relaxed into the same scorefunction their scores will be wildly different.

The purpose of clustering is to present a manageable number of models for the scientist to examine manually. Set the clustering parameters to return a number of models you feel like dealing with. It clusters by structure, not energy; you should expect the energies to vary within a cluster. The results are converging when running more models does not change which structural cluster contains the lowest-energy model (which cluster by structure, I mean, not by rank) and the value of that lowest energy doesn't vary by much.

Mon, 2018-06-11 20:37
smlewis

Hi smlewis,

Thank you for the reply. I understand how to look at convergence of my runs.

As for the input models: yes, I have no way of telling whether they are good or bad in the first place. I hope that they are not too bad and that RosettaCM can improve them.

You say that convergent results mean the lowest-energy models are nearly the same. In that case the scores may depend heavily on the final refinement step of my protocol, so could you please clarify the following?

I am using a relax step and a minimise step after the Hybridize mover. Is that enough for refinement? Is it better to use relax with ref2015_cart instead of ref2015? Also, should I remove the constraints from the pose after the Hybridize mover, before performing the refinement steps?

Veda.

Mon, 2018-06-11 22:10
Vedasheersh

The point about relax is simply that if you want to compare models from Rosetta hybridize with models directly from other software, they need the same treatment (like relax) before the comparison.  The protocol you describe is sufficient.  I don't know if cart / not cart and constraints/no constraints are better for a final refinement step, unfortunately.  
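Since the thread leaves the cart / not cart and constraints / no constraints questions open, one way to explore them is to run a variant protocol and compare the reported scores against the original. The fragment below is a sketch under assumptions, not a recommendation: `ref_2k15_cartwts` and `relax_cart` are hypothetical names, the `cartesian` attribute on FastRelax and the `ref2015_cart` weights set should be checked against your Rosetta version, and `hybridize`, `report_score_1` and `report_score_2` are the movers defined in the script from the first post.

```xml
<!-- Sketch only: fragments to merge into the script from the first post. -->
<SCOREFXNS>
    <!-- Hypothetical name; uses the cartesian variant of the ref2015 weights. -->
    <ScoreFunction name="ref_2k15_cartwts" weights="ref2015_cart"/>
</SCOREFXNS>

<MOVERS>
    <!-- Already declared in the original script, but never added to PROTOCOLS there. -->
    <ClearConstraintsMover name="clear_constraints"/>
    <!-- Hypothetical cartesian-space relax, for comparison with the original relax mover. -->
    <FastRelax name="relax_cart" scorefxn="ref_2k15_cartwts" repeats="8" cartesian="1"/>
</MOVERS>

<PROTOCOLS>
    <Add mover="hybridize"/>
    <Add mover="report_score_1"/>
    <!-- Drop the template-derived constraints before refinement. -->
    <Add mover="clear_constraints"/>
    <Add mover="relax_cart"/>
    <Add mover="report_score_2"/>
</PROTOCOLS>
```

Whichever variant is used, all models being compared should go through the same final treatment, as noted above.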

Tue, 2018-06-12 10:54
smlewis