Models which are not recorded (Abinitio protocol)

Hello everyone,

I was running the Abinitio protocol and noticed that a certain sequence yields sequentially numbered models (S_000001, S_000002, S_000003) in the absence of the secondary structure prediction file. Conversely, when this file is included in the protocol, some models were not generated (or, at least, not recorded). In fact, I get something like the sequence S_00002, S_00009, S_00013, etc. Of the 100,000 models requested, only around 10,000 could be extracted. Is there a reasonable explanation for this?

Additionally, where is the secondary structure prediction file used in the modelling, and what is its weight in the score?

Cheers,
Állan.

Fri, 2015-08-28 13:39
allan.ferrari

I'm not sure if this is exactly what's happening in your case, but one possibility for "missing" output structures in certain protocols is filtering. Rosetta generates the structure internally, but it fails certain quality checks, so Rosetta doesn't bother to output it. Normally you'll see this indicated in the (non-muted) tracer output. Depending on the particular protocol, Rosetta will either reuse the output name for the next attempt or skip to the next output structure. Changing the input parameters means that the filters and thresholds applied may change slightly, so that may be why you're seeing it in some cases but not others.

Another possibility is that there was some issue during the run - particularly if you used multiple processors - where output structures were skipped or corrupted due to conflicts between the processors, or issues with accessing the disk.
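
Either way, it can help to see exactly which model numbers are absent. A minimal Python sketch, assuming the recorded tags have already been collected into a list and that each tag ends in an underscore followed by the model number, as in S_000001. The function name `missing_indices` is hypothetical and is not part of Rosetta:

```python
import re

def missing_indices(tags, expected_total):
    """Return the model numbers in 1..expected_total that have no recorded tag."""
    seen = set()
    for tag in tags:
        # Assumption: the model number is the run of digits after the last underscore.
        match = re.search(r"_(\d+)$", tag)
        if match:
            seen.add(int(match.group(1)))
    return [i for i in range(1, expected_total + 1) if i not in seen]

# Tags taken from the original question; 13 is just the highest number seen there.
tags = ["S_00002", "S_00009", "S_00013"]
print(missing_indices(tags, 13))
```

Long, regular gaps would hint at a problem with one of the processors, while scattered gaps are more consistent with individual structures being rejected.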

On the secondary structure prediction file, I'm not quite understanding the context. Which secondary structure prediction are you talking about? How are you providing it? In the most basic ab initio protocol, the secondary structure prediction is used for fragment picking, but not for the actual model building run. There are other variants which could incorporate the secondary structure information during the model building itself, but the details would vary based on what you're hoping to achieve and which ab initio protocol you're using.

Fri, 2015-09-04 13:06
rmoretti

Thank you for your answer rmoretti.

On the secondary structure prediction file, I was referring to the psipred file that can be used with the flag (in:file:psipred_ss2 file.psipred_ss2).

The interesting fact is that when I don't use this flag (in:file:psipred_ss2), all the models are recorded.

Is it irrelevant to the score? If not, how does it affect the prediction?

Best regards,

Állan.

Wed, 2015-09-16 09:38
allan.ferrari

If you pass the option -abinitio:use_filters, there are a number of structural quality filters which get turned on. The default set includes radius of gyration, contact order, and beta sheet filters (as well as others). These filters need to know what the predicted secondary structure composition of the protein is in order to determine where to set the pass/fail cutoff for the filter. So if you don't pass the -in:file:psipred_ss2 option, these filters are turned off. When you do pass it, the filters are on, and some of the structures will be rejected due to not having a structure consistent with the desired secondary structure.

You should be able to see this in the tracer output. You should be getting lines like "protocols.simple_filters.AbinitioBaseFilter: apply filter: " and, when you don't provide the -in:file:psipred_ss2 option, also lines like "protocols.simple_filters.AbinitioBaseFilter: Warning: Needs psipred_ss2 to run filters".
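
If the tracer output is long, a small script can count these lines for you. This is only a sketch in Python, assuming the tracer output was saved to a text file; the two search strings are the ones quoted above, while the function name and the file name `run.log` are hypothetical:

```python
FILTER_CHANNEL = "protocols.simple_filters.AbinitioBaseFilter"
NEEDS_SS2 = "Needs psipred_ss2 to run filters"

def summarize_filter_lines(lines):
    """Count tracer lines from the filter channel and the missing-psipred warnings."""
    counts = {"filter_lines": 0, "needs_psipred_warnings": 0}
    for line in lines:
        if FILTER_CHANNEL not in line:
            continue
        counts["filter_lines"] += 1
        if NEEDS_SS2 in line:
            counts["needs_psipred_warnings"] += 1
    return counts

# Usage, assuming the log was written to run.log (hypothetical file name):
# with open("run.log") as handle:
#     print(summarize_filter_lines(handle))
```

A non-zero warning count would suggest the filters were skipped for lack of a psipred_ss2 file.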

As best I can tell, the -in:file:psipred_ss2 option is only used by the filters - I don't believe there's anything that it would change about the scoring itself.

By the way, if you want to keep the filters, but also write out the failing structures, there's an "-abinitio:no_write_failures" option that you can use to control this output ... though as I read things it should default to writing everything out, and omitting failed structures would have to be explicitly turned on. But anyway, when structures are written, you can see if they failed by the letter in their name:

  • S -- successful run
  • P -- had problems with the abinitio run itself
  • F -- failed filters
  • C -- couldn't close loops
  • X -- couldn't close loops *and* failed the filters.
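
To get a quick overview of a run, you could tally the output names by that first letter. A rough Python sketch, assuming the tags are already in a list and that failed structures follow the same letter-underscore-number pattern as S_000001 (the F_ tag in the example is made up for illustration, and `tally_by_prefix` is a hypothetical helper, not a Rosetta function):

```python
from collections import Counter

# Meanings copied from the list above.
PREFIX_MEANING = {
    "S": "successful run",
    "P": "had problems with the abinitio run itself",
    "F": "failed filters",
    "C": "couldn't close loops",
    "X": "couldn't close loops and failed the filters",
}

def tally_by_prefix(tags):
    """Count tags by their first letter and return a plain dict."""
    return dict(Counter(tag[0] for tag in tags if tag))

# Example tags; F_000002 is an assumed name, not taken from a real run.
example = ["S_000001", "F_000002", "S_000003"]
for letter, count in sorted(tally_by_prefix(example).items()):
    print(letter, count, PREFIX_MEANING.get(letter, "unknown prefix"))
```
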
Thu, 2015-09-17 12:42
rmoretti