Up to date Vall database


Dear all,

I want to know how to create an up-to-date Vall database (one containing all current PDB structures) for use with fragment_picker.
I have downloaded nr.gz from:

ftp://ftp.ncbi.nih.gov/blast/db/FASTA/

Could you please describe the next steps or commands to run?

What about the Robetta server? Which version does it use to create fragment libraries?

Thank you in advance.

Jad

Sat, 2013-06-29 10:35
JadAbbass

Take a look at rosetta_tools/fragment_tools/pdb2vall/

Making a Vall is a bit of a black art, though.

Mon, 2013-07-01 10:18
rmoretti

I am curious which version of the VALL Robetta currently uses.

A related question: doesn't the fragment database play a role, to a certain extent, in benchmarking the scoring function?

Thu, 2017-06-08 00:21
attesor

The most up-to-date version of the Vall database is the one we distribute with Rosetta in Rosetta/tools/fragment_tools/vall.jul19.2011.gz, so it is based on the PDB as it existed in mid-2011. I've also heard that for some purposes people actually prefer even older versions of the Vall, as the lower number of structures in the PDB back then means that you're not "oversampling" structures. (It also helps with benchmarking, as you know the newer structures aren't represented in the older Vall.)
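
If you want a rough idea of how many chains a given Vall file covers, something like the sketch below may help. It is only a sketch under assumptions: it treats lines starting with '#' as header or comment lines and takes the first whitespace-separated token of every other line as the chain identifier. Check that against your own copy of the file before trusting the numbers. The function name summarize_vall is made up for this example and is not part of Rosetta.

```python
import gzip
from collections import Counter


def summarize_vall(path):
    """Count data lines per first-column token in a gzipped text file.

    Assumption (not taken from the Rosetta docs): blank lines and lines
    starting with '#' carry no residue data, and the first token of every
    other line identifies the source chain.
    """
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            counts[line.split()[0]] += 1
    return counts
```

Calling summarize_vall("vall.jul19.2011.gz") would then give a per-identifier line count, and len() of the result gives the number of distinct identifiers.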

 

The fragment database has only a roundabout effect on scoring function benchmarking. Rosetta doesn't currently use a score term that references statistics from the fragment database, so there's no direct influence of fragments on the scorefunction. The only effect of the fragment set on scoring comes from the coupling of sampling and scoring: changing the fragment set subtly changes the sampling landscape, which can change which terms are important for distinguishing good decoys from poor decoys.

However, typically in benchmarking efforts the fragment set is fixed to a particular version before the different versions of the scorefunction are tested.  Note that it's not just the Vall database which will have an effect on fragment selection. Which algorithms (and algorithm versions) you use for secondary structure prediction and homolog profile assembly will also cause a difference in which fragments you select. If you allow too many parameters to shift, things get overwhelmingly hard to optimize, so typically people benchmarking scorefunctions consider the sampling strategy (including the fragment selection) to be "fixed" and optimize with respect to that fixed version.
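
One simple way to make sure the fragment set really stays fixed across scorefunction runs is to record a checksum of the fragment files and compare it before each run. The sketch below is just an illustration using the Python standard library. The function name fragment_set_fingerprint and the idea of fingerprinting the files are my own assumptions for this example, not something Rosetta does for you.

```python
import hashlib
from pathlib import Path


def fragment_set_fingerprint(paths):
    """Return one SHA-256 hex digest covering all the given files.

    Files are processed in sorted path order, so the order in which the
    paths are passed in does not change the result.
    """
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        # Include the file name so that swapping two files' contents is detected.
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Store the digest alongside the benchmark results. If it differs between two runs, the fragment files changed and the scorefunction comparison is no longer like for like.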

Tue, 2017-06-20 09:02
rmoretti