MPI stall
Hi all,
Has anyone had problems with the master MPI process stalling while it is cleaning up or handing more work to a slave, or shortly afterward?
I am running more tests tomorrow, but I have had it stall (while still showing 100% CPU usage) with its output garbled like this just before it stops communicating with the rest of the slaves. Once it stops communicating, all the slaves continue to run, but nothing gets output and the job needs to be cancelled. No error message is reported by either MPI or Rosetta. It just silently stops: