
Choosing an appropriate cluster -- parallel granularity



I am interested in any advice on choosing an appropriate cluster on which to run Rosetta 3.4. Most likely I'll be using it for relaxation and high-resolution all-atom refinement. I am wondering about the parallel granularity -- the ratio of communication to computation (i.e. coarse- vs fine-grained). Is it better to go with a throughput cluster or a more tightly coupled, low-latency cluster?

Here are my main options (pertaining to the SHARCNET systems):
Kraken: "Throughput clusters, an amalgamation of older point-of-presence and throughput clusters, suitable for serial applications, and small-scale low latency demanding parallel MPI applications."
Orca: "Low latency parallel applications."
Requin: "Large, tightly-coupled MPI jobs."
Saw: "Parallel applications."

More info is here:

Any advice would be much appreciated. So far, I have been using the kraken throughput cluster, but I'd like some confirmation that that's a suitable choice.



Tue, 2012-06-19 09:57

For most applications, Rosetta does zero or nearly zero communication. Most applications are parallelized ONLY at the independent-trajectory level, so the work is "embarrassingly parallel" and throughput (structures per unit time) scales linearly, with slope 1, in the number of processors. As a corollary, you cannot accelerate a single trajectory.

These applications communicate little or not at all at the start of trajectories (to decide which processors run which jobs). This is extremely insensitive to cluster architecture; poor communication performance won't hurt at all. When using silent-file output, there is some communication at the end of each job, as the job's results are sent to a single node responsible for disk I/O; this is a burst rather than constant communication. I think this is only weakly sensitive to communication performance.

Certain applications behave differently, but none are communication-bound; all are strongly computation-bound.

BTW, your URL requires a login, so we can't see what you wanted us to see.

Tue, 2012-06-19 11:35