Dear Sir or Madam,
I have a question regarding a memory leak in the FragmentPicker.cxx11thread application.
When I launch the threaded Rosetta FragmentPicker application on SLURM (with all relevant frags:non-local and frags:contacts settings turned on), memory consumption grows rapidly, the cluster node gets overloaded, and the run is aborted (St9bad_alloc, i.e. std::bad_alloc) with ExitCode 134.
I noticed that with a larger number of threads the memory usage grows faster, and consequently the software stops even sooner. On this basis I suppose that the problem is thread-related. I have read that such problems are often connected with certain features of C++.
I have tried to restrict the input and the amount of processed data by supplying a denied_pdb list and by reducing the number of candidate and selected fragments per position, respectively. I should note that the error described above (or a hang) occurs at the stage where the fragments are saved.
UPD:
After a thorough examination of the slurm[jobid].out file, however, I noticed that memory usage keeps increasing even after all worker threads have finished and only one thread is left (that is, the application is still running). At least, slurm[jobid].out does not show any further multithreaded activity, despite the highest possible log output ('trace') level of 500; the progress output probably gets stuck at some stage. This contradicts what I see on the cluster: htop shows the specified number of CPUs in use until the very end.
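To make the discrepancy between the log and htop easier to check, the resident memory and thread count of the running process could be logged side by side. Below is a minimal sketch, assuming a Linux compute node where /proc/<pid>/status is readable; the function names, the log file name mem_log.tsv and the 10-second interval are hypothetical choices for illustration and are not part of Rosetta or SLURM.

```python
import time
from pathlib import Path


def parse_status(text):
    """Extract resident memory (kB) and thread count from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    # VmRSS looks like "123456 kB"; it is absent for kernel threads.
    rss_kb = int(fields["VmRSS"].split()[0]) if "VmRSS" in fields else None
    threads = int(fields["Threads"]) if "Threads" in fields else None
    return rss_kb, threads


def sample(pid):
    """Read one (rss_kb, threads) sample for a live process."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())


def monitor(pid, interval_s=10.0, out_path="mem_log.tsv"):
    """Append timestamp, RSS (kB) and thread count until the process exits."""
    with open(out_path, "a") as out:
        while True:
            try:
                rss_kb, threads = sample(pid)
            except FileNotFoundError:
                break  # the process has finished or was killed
            out.write(f"{time.time():.0f}\t{rss_kb}\t{threads}\n")
            out.flush()
            time.sleep(interval_s)
```

Calling monitor() with the PID of the running picker process from a second shell on the same node would produce a table showing whether the thread count really drops to one while the memory keeps growing.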
Would you be so kind as to tell me how to fix this problem?
I will be sincerely grateful for your response.
Best regards,
Corvin.