Rosetta 3.4
protocols::jd2::MPIWorkPoolJobDistributor Class Reference

#include <MPIWorkPoolJobDistributor.hh>



Public Member Functions

virtual ~MPIWorkPoolJobDistributor ()
 dtor WARNING WARNING! SINGLETONS' DESTRUCTORS ARE NEVER CALLED IN MINI! DO NOT TRY TO PUT THINGS IN THIS FUNCTION! here's a nice link explaining why: http://www.research.ibm.com/designpatterns/pubs/ph-jun96.txt
virtual void go (protocols::moves::MoverOP mover)
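 Interface wrapper: calls the master or slave version depending on processor rank.
virtual core::Size get_new_job_id ()
 Interface wrapper: calls the master or slave version depending on processor rank.
virtual void mark_current_job_id_for_repetition ()
 Interface wrapper: calls the master or slave version depending on processor rank.
virtual void remove_bad_inputs_from_job_list ()
 Interface wrapper: calls the master or slave version depending on processor rank.
virtual void job_succeeded (core::pose::Pose &pose, core::Real run_time)
 Interface wrapper: calls the master or slave version depending on processor rank.
virtual void mpi_finalize (bool finalize)
 Sets whether the go() function should call MPI_Finalize(). It probably should; this is true by default.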

Protected Member Functions

 MPIWorkPoolJobDistributor ()
 ctor is protected; singleton pattern
virtual void handle_interrupt ()
 Called when a job has not yet finished and is terminated abnormally (ctrl-c, kill, etc.). When implementing it in subclasses, make sure to delete all in-progress data that your job spawned.
virtual void master_go (protocols::moves::MoverOP mover)
 Handles receiving job requests from the slave nodes and sending job ids to them.
virtual void slave_go (protocols::moves::MoverOP mover)
 Proceeds to the parent class go_main() as usual.
virtual core::Size master_get_new_job_id ()
 Always returns zero; simply increments next_job_to_assign_ to the next job that should be run, based on what has been completed and the overwrite flags.
virtual core::Size slave_get_new_job_id ()
 Requests, receives, and returns a new job id from the master node, or returns the current job id if the repeat_job_ flag is set to true.
virtual void master_mark_current_job_id_for_repetition ()
 This should never be called, because this is handled internally by the slave nodes; if called, it utility_exits.
virtual void slave_mark_current_job_id_for_repetition ()
 Sets the repeat_job_ flag to true.
virtual void master_remove_bad_inputs_from_job_list ()
 Increments next_job_to_assign_ to the next job that should be run, based on what has been completed and on whether a job's input tag matches that of the job marked as having bad input.
virtual void slave_remove_bad_inputs_from_job_list ()
 Sends a message to the head node that contains the id of a job that had bad input.
virtual void master_job_succeeded (core::pose::Pose &pose)
 This should never be called, because this is handled internally by the slave nodes; if called, it utility_exits.
virtual void slave_job_succeeded (core::pose::Pose &pose)
 Sends a message to the head node upon successful job completion to avoid output interleaving.

Protected Attributes

core::Size npes_
 total number of processing elements
core::Size rank_
 rank of the "local" instance
core::Size current_job_id_
 where slave jobs store current job id
core::Size next_job_to_assign_
 where master stores next job to assign (in a good state after get_new_job_id up until it's used)
core::Size bad_job_id_
 where master temporarily stores id of jobs with bad input
bool repeat_job_
 where slave stores whether it should repeat its current job id
bool finalize_MPI_
 Should the go() function call MPI_Finalize()? There are very few cases where this should be false.

Friends

class JobDistributorFactory

Detailed Description

This job distributor is meant for running jobs where the machine you are using has a large number of processors, the number of jobs is much greater than the number of processors, or the runtimes of the individual jobs could vary greatly. It dedicates the head node (whichever processor gets processor rank #0) to handling job requests from the slave nodes (all nonzero ranks). Unlike the MPIWorkPartitionJobDistributor, this JD will not work at all without MPI, and the implementations of all but the interface functions have been put inside ifdef directives. Generally, each function has a master and a slave version, and the interface functions call one or the other depending on processor rank.
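
The rank-based dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the Rosetta source: the class name WorkPoolDispatchSketch and the string return values are hypothetical and exist only so the chosen branch can be observed. No MPI calls are made.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the class described above (not the Rosetta source).
// It only shows how an interface function can select the master or slave
// version from the processor rank.
class WorkPoolDispatchSketch {
public:
    explicit WorkPoolDispatchSketch(std::size_t rank) : rank_(rank) {}

    // Interface function: rank 0 is the head node, all nonzero ranks are slaves.
    std::string get_new_job_id() const {
        return rank_ == 0 ? master_get_new_job_id() : slave_get_new_job_id();
    }

private:
    // Placeholders that return a label so the dispatch can be observed.
    std::string master_get_new_job_id() const { return "master"; }
    std::string slave_get_new_job_id() const { return "slave"; }

    std::size_t rank_;
};
```

In this sketch, an instance constructed with rank 0 takes the master branch and any other rank takes the slave branch.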


Constructor & Destructor Documentation

protocols::jd2::MPIWorkPoolJobDistributor::MPIWorkPoolJobDistributor ( ) [protected]

ctor is protected; singleton pattern

constructor. Notice it calls the parent class! It also builds some internal variables for determining which processor it is in MPI land.

References npes_, and rank_.

protocols::jd2::MPIWorkPoolJobDistributor::~MPIWorkPoolJobDistributor ( ) [virtual]

dtor WARNING WARNING! SINGLETONS' DESTRUCTORS ARE NEVER CALLED IN MINI! DO NOT TRY TO PUT THINGS IN THIS FUNCTION! here's a nice link explaining why: http://www.research.ibm.com/designpatterns/pubs/ph-jun96.txt
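
One common way a singleton's destructor ends up never running is sketched below. It assumes the instance is allocated with new and never deleted; that mechanism is an assumption for illustration and is not confirmed by this page. The class SingletonSketch and its counter are hypothetical.

```cpp
#include <cstddef>

// Hypothetical singleton, not the Rosetta class. The instance is allocated
// with new and never deleted, so its destructor never runs.
class SingletonSketch {
public:
    static SingletonSketch* get_instance() {
        if (instance_ == nullptr) {
            instance_ = new SingletonSketch(); // intentionally never deleted
        }
        return instance_;
    }

    // Number of times the destructor has run (stays 0 in this sketch).
    static int destructor_calls() { return destructor_calls_; }

private:
    SingletonSketch() {}
    // Any cleanup placed here would be skipped, because nothing deletes the instance.
    ~SingletonSketch() { ++destructor_calls_; }

    static SingletonSketch* instance_;
    static int destructor_calls_;
};

SingletonSketch* SingletonSketch::instance_ = nullptr;
int SingletonSketch::destructor_calls_ = 0;
```

Under that assumption, cleanup that must happen (flushing output, releasing external resources) belongs in an explicit function rather than in the destructor.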



Member Function Documentation

core::Size protocols::jd2::MPIWorkPoolJobDistributor::get_new_job_id ( ) [virtual]

Interface wrapper: calls master_get_new_job_id() or slave_get_new_job_id() depending on processor rank.

Implements protocols::jd2::JobDistributor.

References master_get_new_job_id(), rank_, and slave_get_new_job_id().

void protocols::jd2::MPIWorkPoolJobDistributor::go ( protocols::moves::MoverOP  mover) [virtual]

Interface wrapper: calls master_go() or slave_go() depending on processor rank.

Reimplemented from protocols::jd2::JobDistributor.

References finalize_MPI_, master_go(), rank_, and slave_go().

virtual void protocols::jd2::MPIWorkPoolJobDistributor::handle_interrupt ( ) [inline, protected, virtual]

Called when a job has not yet finished and is terminated abnormally (ctrl-c, kill, etc.). When implementing it in subclasses, make sure to delete all in-progress data that your job spawned.

Implements protocols::jd2::JobDistributor.

void protocols::jd2::MPIWorkPoolJobDistributor::job_succeeded ( core::pose::Pose & pose, core::Real run_time ) [virtual]

Interface wrapper: calls master_job_succeeded() or slave_job_succeeded() depending on processor rank.

Reimplemented from protocols::jd2::JobDistributor.

References master_job_succeeded(), rank_, and slave_job_succeeded().

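void protocols::jd2::MPIWorkPoolJobDistributor::mark_current_job_id_for_repetition ( ) [virtual]

Interface wrapper: calls the master or slave version depending on processor rank.

core::Size protocols::jd2::MPIWorkPoolJobDistributor::master_get_new_job_id ( ) [protected, virtual]

Always returns zero; simply increments next_job_to_assign_ to the next job that should be run, based on what has been completed and the overwrite flags.
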
void protocols::jd2::MPIWorkPoolJobDistributor::master_go ( protocols::moves::MoverOP  mover) [protected, virtual]

Handles receiving job requests from the slave nodes and sending job ids to them.

This is the heart of the MPIWorkPoolJobDistributor. It consists of two while loops: the job distribution loop (JDL) and the node spin down loop (NSDL). The JDL has three functions. The first is to receive and process messages from the slave nodes requesting new job ids. The second is to receive and process messages from the slave nodes indicating a bad input. The third is to receive and process job_success messages from the slave nodes and block while the slave node is writing its output. This prevents interleaving of output in score files and silent files. The function of the NSDL is to keep the head node alive while there are still slave nodes processing. Without the NSDL, if a slave node finished its allocated job after the head node had finished handing out all of the jobs and exited (a very likely scenario), it would wait indefinitely for a response from the head node when requesting a new job id.
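
The two-loop structure can be sketched as a single-process simulation. This is a rough illustration under several assumptions, not the actual implementation: no MPI calls are made, only new-job-id requests are modeled (bad-input and job_success messages are left out), job ids are taken to be 1-based, and a reply of 0 is assumed to mean "no jobs remain". The names simulate_master and MasterRunLog are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical record of what the simulated head node did.
struct MasterRunLog {
    std::vector<std::size_t> assigned;  // job ids handed out, in order
    std::size_t spin_down_replies;      // slaves told that no jobs remain
};

// Single-process simulation of the head node; no MPI calls are made.
MasterRunLog simulate_master(std::size_t n_jobs, std::size_t n_slaves) {
    assert(n_slaves >= 1);  // a work pool needs at least one slave rank

    MasterRunLog log;
    log.spin_down_replies = 0;

    // Ranks currently waiting for a job id; every slave starts by asking.
    std::deque<std::size_t> requests;
    for (std::size_t rank = 1; rank <= n_slaves; ++rank) requests.push_back(rank);

    std::size_t next_job_to_assign = 1;  // assumption: ids are 1-based, 0 means "done"

    // Job distribution loop (JDL): answer requests while jobs remain.
    while (next_job_to_assign <= n_jobs) {
        std::size_t rank = requests.front();
        requests.pop_front();
        log.assigned.push_back(next_job_to_assign++);
        requests.push_back(rank);  // the slave runs the job, then asks again
    }

    // Node spin down loop (NSDL): each slave asks once more and must still
    // get an answer, otherwise it would wait forever.
    std::size_t slaves_remaining = n_slaves;
    while (slaves_remaining > 0) {
        requests.pop_front();
        ++log.spin_down_replies;  // stands in for replying with job id 0
        --slaves_remaining;
    }
    return log;
}
```

In this sketch the head node only exits after it has sent one spin-down reply per slave, which is the role of the NSDL.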

Reimplemented in protocols::unfolded_state_energy_calculator::UnfoldedStateEnergyCalculatorMPIWorkPoolJobDistributor.

References protocols::jd2::BAD_INPUT_TAG, bad_job_id_, protocols::jd2::JOB_SUCCESS_TAG, master_get_new_job_id(), master_remove_bad_inputs_from_job_list(), MPI_ANY_SOURCE, protocols::jd2::NEW_JOB_ID_TAG, next_job_to_assign_, npes_, rank_, and protocols::jd2::TR().

Referenced by go().

void protocols::jd2::MPIWorkPoolJobDistributor::master_job_succeeded ( core::pose::Pose & pose) [protected, virtual]

This should never be called, because this is handled internally by the slave nodes; if called, it utility_exits.

References rank_, and protocols::jd2::TR().

Referenced by job_succeeded().

void protocols::jd2::MPIWorkPoolJobDistributor::master_mark_current_job_id_for_repetition ( ) [protected, virtual]

This should never be called, because this is handled internally by the slave nodes; if called, it utility_exits.

References rank_, and protocols::jd2::TR().

Referenced by mark_current_job_id_for_repetition().

void protocols::jd2::MPIWorkPoolJobDistributor::master_remove_bad_inputs_from_job_list ( ) [protected, virtual]

Increments next_job_to_assign_ to the next job that should be run, based on what has been completed and on whether a job's input tag matches that of the job marked as having bad input.

References bad_job_id_, protocols::jd2::JobDistributor::get_jobs(), protocols::jd2::JobDistributor::job_outputter(), master_get_new_job_id(), next_job_to_assign_, rank_, and protocols::jd2::TR().

Referenced by protocols::unfolded_state_energy_calculator::UnfoldedStateEnergyCalculatorMPIWorkPoolJobDistributor::master_go(), master_go(), and remove_bad_inputs_from_job_list().
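
The bad-input skip might look roughly like the sketch below. It models only the input-tag comparison, ignores the "already completed" check, uses 0-based indices, and assumes jobs sharing an input are adjacent in the job list. The function skip_bad_input and its parameters are hypothetical and are not part of the documented class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: advance next_job_to_assign past every upcoming job
// whose input tag equals the tag of the job reported as having bad input.
// input_tags[i] is the input tag of job i (0-based in this sketch).
std::size_t skip_bad_input(const std::vector<std::string>& input_tags,
                           std::size_t bad_job,
                           std::size_t next_job_to_assign) {
    assert(bad_job < input_tags.size());
    const std::string& bad_tag = input_tags[bad_job];
    while (next_job_to_assign < input_tags.size() &&
           input_tags[next_job_to_assign] == bad_tag) {
        ++next_job_to_assign;  // this job would fail for the same reason
    }
    return next_job_to_assign;
}
```

If the next job to assign already uses a different input, the index is returned unchanged.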

void protocols::jd2::MPIWorkPoolJobDistributor::mpi_finalize ( bool  finalize) [virtual]

Sets whether the go() function should call MPI_Finalize(). It probably should; this is true by default.

Reimplemented from protocols::jd2::JobDistributor.

References finalize_MPI_.

void protocols::jd2::MPIWorkPoolJobDistributor::remove_bad_inputs_from_job_list ( ) [virtual]

Interface wrapper: calls the master or slave version depending on processor rank.

core::Size protocols::jd2::MPIWorkPoolJobDistributor::slave_get_new_job_id ( ) [protected, virtual]

Requests, receives, and returns a new job id from the master node, or returns the current job id if the repeat_job_ flag is set to true.

References current_job_id_, protocols::jd2::NEW_JOB_ID_TAG, rank_, repeat_job_, and protocols::jd2::TR().

Referenced by get_new_job_id().
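
The interaction between repeat_job_ and the request to the master can be sketched as follows. This is an illustration only: the class SlaveJobIdSketch is hypothetical, the request to the master is replaced by a caller-supplied callback instead of MPI messages, and clearing the flag after one repetition is an assumption that this page does not state.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the slave-side logic (not the Rosetta source).
class SlaveJobIdSketch {
public:
    // request_from_master stands in for the MPI request/receive round trip.
    explicit SlaveJobIdSketch(std::function<std::size_t()> request_from_master)
        : request_from_master_(request_from_master),
          current_job_id_(0),
          repeat_job_(false) {}

    // Mirrors slave_mark_current_job_id_for_repetition(): set the flag.
    void mark_current_job_id_for_repetition() { repeat_job_ = true; }

    // Mirrors slave_get_new_job_id(): repeat the current id or ask for a new one.
    std::size_t get_new_job_id() {
        if (repeat_job_) {
            repeat_job_ = false;  // assumption: the flag is cleared after one repeat
        } else {
            current_job_id_ = request_from_master_();
        }
        return current_job_id_;
    }

private:
    std::function<std::size_t()> request_from_master_;
    std::size_t current_job_id_;
    bool repeat_job_;
};
```

With a callback that hands out 1, 2, 3, ..., marking the current job for repetition makes the next call return the same id without contacting the master.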

void protocols::jd2::MPIWorkPoolJobDistributor::slave_go ( protocols::moves::MoverOP  mover) [protected, virtual]

Proceeds to the parent class go_main() as usual.

References protocols::jd2::JobDistributor::go_main(), and rank_.

Referenced by go().

void protocols::jd2::MPIWorkPoolJobDistributor::slave_job_succeeded ( core::pose::Pose & pose) [protected, virtual]

Sends a message to the head node upon successful job completion to avoid output interleaving.

References protocols::jd2::JobDistributor::current_job(), current_job_id_, protocols::jd2::JobDistributor::job_outputter(), protocols::jd2::JOB_SUCCESS_TAG, rank_, and protocols::jd2::TR().

Referenced by job_succeeded().

void protocols::jd2::MPIWorkPoolJobDistributor::slave_mark_current_job_id_for_repetition ( ) [protected, virtual]

Sets the repeat_job_ flag to true.

References current_job_id_, rank_, repeat_job_, and protocols::jd2::TR().

Referenced by mark_current_job_id_for_repetition().

void protocols::jd2::MPIWorkPoolJobDistributor::slave_remove_bad_inputs_from_job_list ( ) [protected, virtual]

Sends a message to the head node that contains the id of a job that had bad input.

References protocols::jd2::BAD_INPUT_TAG, current_job_id_, and rank_.

Referenced by remove_bad_inputs_from_job_list().


Friends And Related Function Documentation

friend class JobDistributorFactory [friend]

Reimplemented from protocols::jd2::JobDistributor.


Member Data Documentation

bool protocols::jd2::MPIWorkPoolJobDistributor::finalize_MPI_ [protected]

Should the go() function call MPI_Finalize()? There are very few cases where this should be false.

Referenced by go(), and mpi_finalize().

core::Size protocols::jd2::MPIWorkPoolJobDistributor::next_job_to_assign_ [protected]

Where the master stores the next job to assign (in a good state after get_new_job_id up until it's used).

Referenced by master_get_new_job_id(), protocols::unfolded_state_energy_calculator::UnfoldedStateEnergyCalculatorMPIWorkPoolJobDistributor::master_go(), master_go(), and master_remove_bad_inputs_from_job_list().

bool protocols::jd2::MPIWorkPoolJobDistributor::repeat_job_ [protected]

Where the slave stores whether it should repeat its current job id.

Referenced by slave_get_new_job_id(), and slave_mark_current_job_id_for_repetition().


The documentation for this class was generated from the following files: