Running the Matcher on multiple PDBs

I'm having issues getting the Matcher to run. My .flags file looks like this (file names changed):

-s PDB1.pdb
-s PDB2.pdb
-match:lig_name LIG
-match:scaffold_active_site_residues POS.pos
-match:geometric_constraint_file CST.cst
-extra_res_fa PARAM.params
-ex1
-ex2
-use_input_sc

Both PDBs are in the same directory. The Matcher runs fine in that it finds a number of matches for PDB1.pdb, but then it just ends without ever moving on to PDB2.pdb. See:

protocols.match.output.PDBWriter: 2 upstream hits and 0 downstream hits for geom cst 2
protocols.match.output.PDBWriter: A total of 62 models were written for match group 77 in 1 files.
protocols.match.output.PDBWriter: beginning writing cloud for group 78
protocols.match.output.PDBWriter: 22 upstream hits and 29 downstream hits for geom cst 1
protocols.match.output.PDBWriter: 1 upstream hits and 0 downstream hits for geom cst 2
protocols.match.output.PDBWriter: A total of 30 models were written for match group 78 in 1 files.
apps.public.match.match: Matcher ran for 38726 seconds, where finding hits took 38646 seconds and processing the matches took 80 seconds.

I'm wondering what I'm doing wrong, and any help would be appreciated. I'd rather not have to run it individually for each PDB I have.

Thanks!

Thu, 2016-02-18 08:50
Jhreed

The matcher is a little different from other Rosetta protocols. In particular, it isn't set up to take multiple starting structures in a single run. (That is, running on only the first PDB is the intended behavior.)

However, running the matcher on a large number of starting structures *is* the general use case. The way this is normally done is by launching a separate matcher run for each structure. You don't necessarily need to do this manually, though. I'd recommend using something like a shell script to iterate over a list of PDBs and launch each job. For example, the bash script:

for pdb in PDB*.pdb; do
    # One matcher run per structure; the quotes protect filenames containing spaces.
    ~/Rosetta/main/source/bin/match.linuxgccrelease @options.txt -s "${pdb}"
done

This iterates through all the PDBs in the current directory with names matching "PDB*.pdb" (that is, everything listed by "ls PDB*.pdb") and launches the match jobs one after the other, filling in "${pdb}" with the actual filename of each PDB file.

You can, of course, get more complicated than this, which is useful if you want to split up the jobs on multiple processors.
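
As a rough sketch of what "more complicated" could look like, the function below runs one matcher process per PDB but keeps several going at once. It assumes an xargs that supports -P (GNU and BSD xargs do; it is not in POSIX). MATCH_BIN and FLAGS default to the path and options file from the loop above. NJOBS and the per-structure log name (.match.log) are made up for this illustration, and the options file is assumed to contain the shared flags with no -s lines of its own.

```shell
#!/usr/bin/env bash
# Sketch only: adjust the path, options file, and job count for your own setup.
MATCH_BIN="${MATCH_BIN:-$HOME/Rosetta/main/source/bin/match.linuxgccrelease}"
FLAGS="${FLAGS:-options.txt}"   # shared flags; assumed to contain no -s lines
NJOBS="${NJOBS:-4}"             # hypothetical knob: matcher processes to run at once

run_matcher_parallel() {
    # Arguments: the PDB files to process.
    [ "$#" -gt 0 ] || return 0
    # Feed the filenames NUL-separated to xargs, which starts one matcher
    # process per file and keeps at most NJOBS of them running at a time.
    # Each run writes its output to <name>.match.log next to the PDB.
    printf '%s\0' "$@" |
        xargs -0 -P "$NJOBS" -I{} \
            sh -c '"$1" "@$2" -s "$3" > "${3%.pdb}.match.log" 2>&1' \
            _ "$MATCH_BIN" "$FLAGS" {}
}
```

After sourcing this, "run_matcher_parallel PDB*.pdb" would launch the jobs. Each structure still gets its own independent matcher run, exactly as in the simple loop; the only difference is that the runs overlap in time. Since a single run can take hours, it is probably worth keeping NJOBS at or below the number of cores you have free.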

Thu, 2016-02-18 14:39
rmoretti