Run AbinitioRelax (rosetta 3.3) on cluster

grisha
Offline
Joined: 2011-12-11
Category: Structure prediction

Hi everyone!!!

I am a new Rosetta user.
I have compiled Rosetta with MPI support.
Now I don't know which parameters to use to run AbinitioRelax on a cluster (24 cores).

Thanks for your help.

Grigor

rmoretti
Offline
Joined: 2011-02-28

Have you taken a look at the manual? (http://www.rosettacommons.org/manuals/archive/rosetta3.3_user_guide/ ) Specifically the abinitio modeling section (http://www.rosettacommons.org/manuals/archive/rosetta3.3_user_guide/app_abinitio.html). As with most academic programs, documentation for Rosetta can be threadbare or obscure in places (though we're trying to get better), but basic usage should be covered. Feel free to ask if anything's not clear or you have additional questions.

Regarding using MPI, a lot of that is going to depend on how your cluster is set up. I'd recommend consulting your local sysadmin/cluster guru for general help on running MPI programs.

smlewis
Offline
Joined: 2010-08-03

Abinitio still doesn't work under MPI. Ugh. Please read one of the fifteen other threads on the subject that explain the pros and cons (namely that there's no speed benefit to MPI). After reading that, if you want the abinitio mpi patch, let me know here and I'll send it along.

grisha
Offline
Joined: 2011-12-11

Hi
Thank you for your answers!
As I said, I'm new to Rosetta. I'm trying to run AbinitioRelax on all cores to save time.

smlewis, can you please send me the patch and tell me how to do the run with it?

grisha
Offline
Joined: 2011-12-11

And two more questions:
1) Do other applications also lack MPI support?
2) What value should the -nstruct option have to obtain the best structure?

smlewis
Offline
Joined: 2010-08-03

1) Some do, some don't. The oldest ones that existed in Rosetta++ have spotty MPI support; newer ones for Rosetta3 generally have MPI support. Abinitio is the big one that doesn't. Loop modeling does, I think, but it works differently than the rest. Docking used to not have it but does now. Fixbb does.

2) Well, infinity, I guess. It depends on what you are doing. You should probably be aiming in terms of 10,000 to 1,000,000 structures generated depending on the scope of your problem.

grisha
Offline
Joined: 2011-12-11

Thank you very much smlewis for your helpful advice and for the patch!
I compiled it!
More questions:
1) If I use AbInitio_MPI.linuxgccrelease, will my task be performed on all cores of the cluster (24 cores)?
2) To perform de novo simulations, does AbInitio_MPI.linuxgccrelease use the same commands and flags files as AbinitioRelax.linuxgccrelease, or do they differ?
If they differ, please tell me what they should be for AbInitio_MPI.linuxgccrelease.

Thank you !

Grigor

smlewis
Offline
Joined: 2010-08-03

The MPI patch allows it to communicate via MPI. You'd use it with MPI the way you'd use any other MPI program, with mpirun and whatever flags (to mpirun) your implementation takes. I'd guess "-np 24" for 24 processors, so something like "mpirun -np 24 [mpi flags] AbInitio_MPI.linuxgccrelease [rosetta flags]". (Also consider "-mpi_tracer_to_file proc" as a Rosetta flag to separate the log files by processor instead of piling them all together.)

It should use the same flags as AbinitioRelax to my knowledge. The patch was never intended for public use so it's not documented by itself. Eventually I'll convince someone to either fix or replace the AbinitioRelax executable with one that works for MPI.
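
As a rough illustration of the ordering described above (mpirun options first, then the executable, then the Rosetta options), here is a minimal sketch. The executable path and the flags file name abinitio.flags are placeholders, not values from this thread, and by default the function only prints the command instead of running it.

```sh
#!/bin/sh
# Placeholder path to the patched MPI executable (hypothetical, adjust to your install).
ROSETTA_BIN=/path/to/AbInitio_MPI.linuxgccrelease
# RUNNER defaults to "echo" so the command is only printed; set RUNNER= (empty) to really launch.
RUNNER=${RUNNER-echo}

launch() {
    # $1 = number of MPI processes; all remaining arguments are passed to Rosetta.
    np=$1
    shift
    $RUNNER mpirun -np "$np" "$ROSETTA_BIN" "$@" -mpi_tracer_to_file proc
}

# Example: 24 processes, Rosetta options read from a (hypothetical) flags file.
launch 24 @abinitio.flags
```

Any machinefile or scheduler-specific options would go between "mpirun" and the executable, and depend on the MPI implementation installed on the cluster.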

grisha
Offline
Joined: 2011-12-11

Hi!
Smlewis thank you very much for your help!

I started abinitio with the following parameters:

#!/bin/sh

cd /home/grisha/rosetta/

nohup \
mpirun \
mpiexec -machinefile /home/grisha/mpich/m2file_3n8c \
-np 24 /home/biosoft/rosetta3.3_bundles/rosetta_source/bin/AbInitio_MPI.linuxgccrelease \
-database ../../../home/biosoft/rosetta3.3_bundles/rosetta_database \
-in:file:fasta ./inp/b30_2_.fasta \
-in:file:frag3 ./inp/aab30_2_03_05.200_v1_3 \
-in:file:frag9 ./inp/aab30_2_09_05.200_v1_3 \
-abinitio:relax \
-relax:fast \
-abinitio::increase_cycles 10 \
-abinitio::rg_reweight 0.5 \
-abinitio::rsd_wt_helix 0.5 \
-abinitio::rsd_wt_loop 0.5 \
-use_filters true \
-psipred_ss2 ./inp/b30_2.psipred_ss2 \
-out:file:silent ./out/b30_2_silent.out \
-nstruct 1 < /dev/null

Attachment          Size
ab_mpi.txt          692 bytes
b30_2_silent.txt    503.68 KB

grisha
Offline
Joined: 2011-12-11

But if you look at the silent file you will see that there are 24 structures, although in my case -nstruct = 1, and if I remember correctly this means the silent file should contain only one structure.
Does this mean that the cores do not work together on one and the same process, but each works in isolation?

smlewis, please help me, if possible, to make all cores perform a single task together.

smlewis
Offline
Joined: 2010-08-03

"/home/biosoft/rosetta3.3_bundles/rosetta_source/bin/AbInitio_MPI.linuxgccrelease" is a symlink. To what does it point? You may be running the single-threaded executable under MPI instead of the MPI executable under MPI. The target of the symlink path should have MPI in it.

This would explain your weird result of 24 structures from -nstruct 1.
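
One way to inspect the symlink is sketched below. It assumes GNU readlink (for the -f option) and assumes that an MPI build sits under a directory literally named "mpi"; that directory layout is an assumption to check against your own build tree, not something confirmed in this thread.

```sh
#!/bin/sh
# Resolve a symlink and report whether the directory of its target
# contains a path component named "mpi" (assumed layout of an MPI build).
check_mpi_build() {
    target=$(readlink -f "$1") || return 2
    dir=$(dirname "$target")
    echo "$1 -> $target"
    case "$dir" in
        */mpi|*/mpi/*)
            echo "target directory looks like an MPI build"
            ;;
        *)
            echo "target directory does NOT look like an MPI build"
            return 1
            ;;
    esac
}

# Usage (path taken from the script posted above):
# check_mpi_build /home/biosoft/rosetta3.3_bundles/rosetta_source/bin/AbInitio_MPI.linuxgccrelease
```

Running "ls -l" on the same path shows the link target as well, without the check.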

jadolfbr
Offline
Joined: 2010-08-03

If you do ab initio prediction on a sequence, you need anywhere between 15 - 30 thousand structures depending on the size of your protein. There will almost never be a case where you would have nstruct=1.

Each processor will take the job of doing one of those 30 thousand structures. Even if code were written to distribute a single structure among all of those processors, it would probably end up taking longer, as each processor needs to get data from every other processor and then send it back out again. This works well in MD (NAMD), where you have a huge system, with pieces of the system being distributed to each processor, but for Rosetta the benefit mainly comes from independent trajectories, and the abinitio step of abrelax is actually fairly quick on a single processor. The relax step takes longer depending on how many you do (-fast vs -thorough), but I don't know if it is possible or beneficial to farm the minimization part of the relax step out to multiple processors.

grisha
Offline
Joined: 2011-12-11

Sorry, but I don't think that is the case. I did the same thing with nstruct = 10 on different numbers of cores (8, 16, 24). In all cases, with nstruct = 1 or 10, the number of copies of each structure number in the silent file matched the number of active cores (8, 16 or 24).
E.g. with nstruct = 1, the number of s_001 entries = 8 (8 cores), 16 (16 cores), 24 (24 cores).

With nstruct = 10, the number of entries for each of s_001 ... s_010 = 8 (8 cores), 16 (16 cores), 24 (24 cores),

which I think indicates that each core does the same job! I used nstruct = 1 or 10 only to quickly verify that MPI is working.
Since I will need nstruct = 10000 - 100000, I would like to understand how to run MPI to save time!

So please help me to run one job on all 24 cores!

rmoretti
Offline
Joined: 2011-02-28

"E.g. with nstruct = 1, the number of s_001 entries = 8 (8 cores), 16 (16 cores), 24 (24 cores).
With nstruct = 10, the number of entries for each of s_001 ... s_010 = 8 (8 cores), 16 (16 cores), 24 (24 cores),
which I think indicates that each core does the same job!"

This is how it is supposed to work. Each core will work on its own structure. They all may be named _001, but if you take a look at them, the different _001's from different cores should be different structures. (If not, you might have to adjust things so that the different cores get different random number seeds.)
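
To check this on an existing silent file, something like the following sketch can count how often each tag occurs and list the score stored with each copy. It assumes the usual silent-file layout in which every decoy has one line starting with "SCORE:" whose last field is the tag, whose second field is the total score, and whose header line ends in "description"; if your file differs, the field positions need adjusting.

```sh
#!/bin/sh
# Count how many decoys share each tag in a silent file.
count_tags() {
    awk '$1 == "SCORE:" && $NF != "description" { n[$NF]++ }
         END { for (t in n) print n[t], t }' "$1" | sort -k2
}

# List "tag score" pairs; identical scores under the same tag would suggest
# that several cores produced the same structure (same random seed).
tag_scores() {
    awk '$1 == "SCORE:" && $NF != "description" { print $NF, $2 }' "$1" | sort
}

# Usage:
# count_tags ./out/b30_2_silent.out
# tag_scores ./out/b30_2_silent.out
```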

"I would like to understand how to run MPI for save time!"

In case smlewis wasn't clear enough, running Abinitio (or for that matter most Rosetta protocols) under MPI won't save time. The recommended way of using 24 cores in Abinitio is to launch 24 separate single-core jobs and then combine the result files. Running a 24-core MPI job is entirely equivalent to running 24 separate 1-core jobs. A single result structure is only ever processed on one core, and the only difference between different result structures is which random numbers were used.

The only reason it would be of significant benefit to run Abinitio with MPI would be if your cluster software *required* the use of MPI-enabled software. And even then all the MPI version of Abinitio would be doing is faking things enough so that 24 separate one-core jobs get launched.
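
A minimal sketch of the "24 separate single-core jobs" approach on a single Linux machine follows. The executable path and the flags file abinitio.flags are placeholders. The options -run:constant_seed and -run:jran are used here on the assumption that they fix the random seed per job; please check them against the option list of your Rosetta version. By default the script only writes the command lines to log files instead of running anything.

```sh
#!/bin/sh
NJOBS=${NJOBS:-24}
# Placeholder path to the ordinary (non-MPI) executable.
ROSETTA_BIN=${ROSETTA_BIN:-/path/to/AbinitioRelax.linuxgccrelease}
# RUNNER defaults to "echo" (dry run); set RUNNER= (empty) to really start Rosetta.
RUNNER=${RUNNER-echo}

launch_jobs() {
    i=1
    while [ "$i" -le "$NJOBS" ]; do
        mkdir -p "job_$i"
        # Each job runs in its own directory, with its own seed and its own output file.
        ( cd "job_$i" &&
          $RUNNER "$ROSETTA_BIN" @../abinitio.flags \
              -run:constant_seed -run:jran "$((1000000 + i))" \
              -out:file:silent "abinitio_$i.out" > rosetta.log 2>&1 ) &
        i=$((i + 1))
    done
    # Block until every background job has finished.
    wait
}

# Usage: create abinitio.flags in the current directory, then call
# launch_jobs
```

On a cluster with a batch scheduler, the same idea is usually expressed as 24 scheduler jobs (or one array job) rather than background processes.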

grisha
Offline
Joined: 2011-12-11

Thank you rmoretti for your answer!

If I understand correctly, each core performs the same job but gives a different structure as output.

That is, with nstruct = 1 on 24 cores I get 24 different structures (I looked, and all 24 structures produced by the different cores are indeed different).

By analogy, can I set nstruct = 416 so that it is roughly equivalent to nstruct = 10000 (416 x 24 = 9984)?
Is this the right approach?

But is there any -run option with which I could run all cores for a single process?

I believe the laboratories using Rosetta run it on superclusters,

so in my opinion there should be a -run option that uses all the cores for a single job to save time.

rmoretti
Offline
Joined: 2011-02-28

"But are there any -run command through which I could run all cores for a single process?"

In general, no. Even the Rosetta protocols that are explicitly written to take advantage of MPI typically use the approach of running separate, independent subprotocols on each core. There's certainly use of Rosetta on superclusters, but that's mainly by using all of the available cores with separate jobs.

Frankly speaking, it's more efficient that way. You get linear scaling (twice the number of processors gives you twice the output results), whereas more complex schemes typically give you less than linear scaling (e.g. twice the cores gives only 1.8 times the processing power) because of communication overhead and synchronization issues, so they are most useful in cases like MD where you can't use trivial parallelism. (E.g. if you gave 20-40 ns to processor 2 and 40-60 ns to processor 3, processor 3 would be idle until processor 2 got done, but giving decoys 20-39 to processor 2 and 40-59 to processor 3 allows processor 3 to start decoy 40 even before processor 2 gets done with decoy 20.)
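
Because the decoys are independent, splitting a target number of structures across cores is plain division. A small sketch of the arithmetic, using the 10000 structures and 24 cores mentioned in this thread:

```sh
#!/bin/sh
# Ceiling division: how many decoys each job must produce
# so that all jobs together reach at least the requested total.
per_job() {
    total=$1
    jobs=$2
    echo $(( (total + jobs - 1) / jobs ))
}

per_job 10000 24   # prints 417 (417 x 24 = 10008; 416 x 24 would give 9984)
```

The result is the value to pass as -nstruct to each of the independent jobs.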

grisha
Offline
Joined: 2011-12-11

Thank you for your help rmoretti!
I am now convinced of the efficiency loss when using multiple cores for one job, and that the loss grows with the number of cores.
Now I'm trying to run a separate job on each core (using all cores in parallel).

I tried to run separate jobs from different directories, but even though the jobs were started from different directories and at different times, they both ran on the same core.

Question:
1) How do I run separate jobs on different cores (in more detail, if possible)?

rmoretti
Offline
Joined: 2011-02-28

Which core a particular Rosetta application runs on is controlled by your system, not Rosetta. On most systems, two Rosetta runs launched one after the other will each run on separate cores, but your system may be set up in such a way that you need to do something else to get it to work.

I would recommend consulting your local system administrator or cluster guru to learn how to launch such jobs on the system you are using.
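
For a plain Linux machine without a batch scheduler, the sketch below shows one way to see which core a process is on and to pin a job to a chosen core. It assumes the procps version of ps (for the psr column) and taskset from util-linux are installed; neither is mentioned in this thread, and a cluster with a scheduler will normally handle placement itself. The pinning function only prints the command unless RUNNER is set to an empty value.

```sh
#!/bin/sh
# RUNNER defaults to "echo" (dry run); set RUNNER= (empty) to really start the command.
RUNNER=${RUNNER-echo}

# Show PID, current core (PSR), CPU usage and name of processes whose name matches a pattern.
show_cores() {
    ps -eo pid,psr,pcpu,comm | awk -v pat="$1" 'NR == 1 || $4 ~ pat'
}

# Start a command bound to one CPU core: $1 = core number, rest = command line.
run_pinned() {
    core=$1
    shift
    $RUNNER taskset -c "$core" "$@"
}

# Usage:
# show_cores Abinitio
# run_pinned 3 ./AbinitioRelax.linuxgccrelease @abinitio.flags &
```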

grisha
Offline
Joined: 2011-12-11

Dear Rocco Moretti, thank you very much for your help!