How to read the output score file from rosetta?

Dear all, 

I find the score file output by Rosetta difficult to read. For example, a score-snugdock.sf file produced by SnugDock looks like the following in gedit:

https://www.dropbox.com/s/6oba8nitiwob7pb/score.PNG?dl=0

I cannot tell which column names go with which values. The Rosetta documentation gives a sort command that picks out the structure with the lowest score:

 sort -nk5 score-snugdock.sf | head -n1 | awk '{print $NF}'

But I don't understand why the value for the "-k" option is 5. As I see it, the sort should be based on the "total_score" column, which is the second column, so I think the value for the "-k" option should be 2.

I have also tried opening the file in LibreOffice Calc with all of the separator options selected, but the arrangement is also hard to read. Could you tell me how to display the score file with each column neatly aligned?

Best regards.

Sun, 2017-08-27 05:29
Sunyp_IM

The issue you're having with gedit is that gedit is wrapping the lines for you. It will likely be much easier to read if you turn off line wrapping. I think this varies a bit with the version of gedit you have, but for me it's Edit->Preferences->View->"Enable text wrapping".

With sort, the -k value is the number of the column you want to sort by. To figure out which one that is, look at the column headers and pick the appropriate one. Unfortunately, the columns often move around between scorefiles, so you need to set the value based on where the column sits in your scorefile. For example, in your scorefile column 5 is I_sc (the score of the protein-protein interface), whereas total_score is column 2. Note that very often you will want to select the model with the best binding score rather than the one with the best total score, which may be why the example uses -k5.
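Since the column position can change from one scorefile to the next, one way to avoid hard-coding the number is to look it up from the header row first. The following is only a rough sketch, not something from the Rosetta documentation: score_col and best_model are hypothetical helper names, and it assumes the header row is the first line containing the column name and that data rows hold a plain decimal number in that column.

```shell
# Hypothetical helper: print the 1-based position of a named column.
# $1 = scorefile, $2 = column name (e.g. total_score or I_sc)
score_col() {
    awk -v name="$2" '
        { for (i = 1; i <= NF; i++) if ($i == name) { print i; exit } }
    ' "$1"
}

# Hypothetical helper: print the last field (the model description) of the
# row with the lowest value in the named column.
best_model() {
    col=$(score_col "$1" "$2")
    [ -n "$col" ] || return 1          # column name not found
    # Keep only rows whose value in that column looks like a plain number
    # (this drops the header row), then sort numerically on that column.
    awk -v c="$col" '$c ~ /^-?[0-9]+(\.[0-9]+)?$/' "$1" \
        | LC_ALL=C sort -n -k "$col,$col" \
        | head -n 1 \
        | awk '{ print $NF }'
}

# Example use (file name taken from the question above):
#   score_col  score-snugdock.sf I_sc
#   best_model score-snugdock.sf I_sc
```

Passing total_score instead of I_sc would rank by the overall score rather than the interface score.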

LibreOffice Calc is also a good choice for looking at scorefiles. Often when you open the file it will automatically bring up the "import" option, but sometimes it will open the file with everything as a string in a single column. To fix this (to bring up the import option), you can select the columns/cells which need to be split and go to Data->"Text to Columns".  This will bring up the same "text import" box, where you can change the separator options (Check "Tab", "Space" and "Merge delimiters" to successfully import a silent file.)
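If you would rather stay in the terminal, the columns can also be padded to a common width there. The sketch below is just one possible approach: align_scores is a hypothetical helper name, and it assumes every row of interest starts with "SCORE:" (as Rosetta scorefiles normally do) and skips everything else, such as the SEQUENCE: line.

```shell
# Hypothetical helper: print the SCORE: rows with every column padded to the
# width of its widest entry. $1 = scorefile.
align_scores() {
    # The file is read twice: the first pass records column widths,
    # the second pass prints the padded rows.
    awk '
        $1 != "SCORE:" { next }
        NR == FNR {
            for (i = 1; i <= NF; i++) if (length($i) > w[i]) w[i] = length($i)
            next
        }
        {
            line = ""
            for (i = 1; i < NF; i++) line = line sprintf("%-" w[i] "s  ", $i)
            print line $NF
        }
    ' "$1" "$1"
}

# Example use; less -S chops long lines instead of wrapping them:
#   align_scores score-snugdock.sf | less -S
```

Where the column utility is installed, piping the file through column -t should give a similar result with less typing.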

A final option, if you have a somewhat recent Rosetta, is to use the Rosetta/main/source/scripts/python/public/select_columns.py script. This will allow you to pull out just the columns you want (using either their numeric position or their name), which you can then process however you want.

Mon, 2017-08-28 09:32
rmoretti