RuntimeError: CUDA out of memory.


I am getting an error while running RFDiffusion.

I don't know where is the 

RuntimeError: CUDA out of memory. Tried to allocate 60.00 MiB (GPU 0; 1.95 GiB total capacity; 931.08 MiB already allocated; 25.00 MiB free; 984.00 MiB reserved in total by PyTorch)

Tue, 2023-03-07 00:27

Is the code to run RFDiffusion publicly available?

Also, even if it is, there is far too little info here to help you: you did not share the command you ran, etc.

Tue, 2023-03-07 00:31

CUDA out of memory errors simply mean that your GPU is too small for whatever it is you're trying to do with it.
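To make that concrete, here is a rough sketch that pulls the sizes out of the error message quoted in the question and does the arithmetic. It assumes "free" is the free memory on the whole device and "reserved" is what PyTorch's own allocator is holding in this process, so whatever is left over is being held by something else (other programs, the CUDA context, etc.). The function name and the regular expression are just illustrative, and the total is rounded in the message, so treat the result as an estimate.

```python
import re

# The message from the original post, copied verbatim.
MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 60.00 MiB "
    "(GPU 0; 1.95 GiB total capacity; 931.08 MiB already allocated; "
    "25.00 MiB free; 984.00 MiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Return the sizes quoted in the message, converted to MiB."""
    pattern = re.compile(
        r"(\d+(?:\.\d+)?) (KiB|MiB|GiB) "
        r"(total capacity|already allocated|free|reserved)"
    )
    sizes = {}
    for value, unit, label in pattern.findall(message):
        sizes[label] = float(value) * UNIT_TO_MIB[unit]
    return sizes


sizes = parse_oom_message(MESSAGE)

# Memory that is neither free nor held by PyTorch's allocator.
outside_pytorch = sizes["total capacity"] - sizes["free"] - sizes["reserved"]
print(f"Held outside PyTorch's allocator: about {outside_pytorch:.0f} MiB")
```

On these numbers roughly half of a (very small, ~2 GiB) card is taken by something other than PyTorch's allocator, which is why it's worth looking at what else is using the GPU.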

The first thing to check is whether a bunch of unneeded programs are taking up your GPU memory. I've found `nvidia-smi` to be a good tool for this -- it shows which programs are using how much GPU memory, and if some of them (e.g. Chrome, Firefox) are using a lot, you could potentially close them.
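If you'd rather check this from a script than read the `nvidia-smi` table by eye, something like the following should work. It's a minimal sketch using `nvidia-smi`'s CSV query mode; the function names are my own, and it only reports per-GPU totals (run plain `nvidia-smi` to see the per-process list).

```python
import subprocess

# With "nounits", the memory columns are plain numbers in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total,name",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn the CSV text printed by the query above into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        # The name is last, so a comma inside it cannot shift the columns.
        index, used, total, name = [f.strip() for f in line.split(",", 3)]
        rows.append(
            {
                "index": int(index),
                "used_mib": int(used),
                "total_mib": int(total),
                "name": name,
            }
        )
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return per-GPU memory usage (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` before you start the run tells you how much of the card is already in use before PyTorch allocates anything.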

If you're running the code from within a Python session or a Jupyter notebook, it may be that there's a bunch of "old" results hanging around which are taking up GPU memory. There are ways to flush things, but the simplest method is to restart the session/notebook, as this will get rid of everything you don't need. Also, for notebooks, be careful of PyTorch (or other ML) objects you have stored in variables (or in the implicit variable that holds the last printed result). If you need to keep a PyTorch tensor from an ML calculation, it may be worth calling `.cpu()` on it first.
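If restarting isn't convenient, here is a sketch of the "flush things" route. The two helper names are made up for illustration; the PyTorch calls they rely on (`.detach()`, `.cpu()`, `torch.cuda.empty_cache()`) are standard. Note that `empty_cache()` only hands back memory that nothing references any more, so you still have to delete (or overwrite) the variables holding GPU tensors first.

```python
import gc


def to_host(result):
    """Return a detached CPU copy of a tensor-like object; pass other values through."""
    if hasattr(result, "detach") and hasattr(result, "cpu"):
        return result.detach().cpu()
    return result


def free_cached_gpu_memory():
    """Collect garbage, then ask PyTorch to release its unused cached GPU blocks.

    Returns True if the cache was cleared, False if PyTorch or CUDA is unavailable.
    """
    gc.collect()
    try:
        import torch  # imported lazily so the helper also loads without PyTorch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()
    return True


# Typical use in a notebook cell (model_output is a hypothetical GPU tensor):
#   kept = to_host(model_output)   # keep only a CPU copy
#   del model_output               # drop the reference to the GPU copy
#   free_cached_gpu_memory()
```

In a notebook, remember that IPython also keeps displayed results in `_` and `Out`, so a tensor you merely printed as the last line of a cell can stay alive until those are cleared or the kernel is restarted.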

Other than that, it could just be that your GPU is underpowered for the sort of modeling you're trying to do. If you can't find a bigger GPU, you may need to try running a smaller protein system, or tweaking the parameters to use "simpler" models and techniques. (I'm not familiar enough with RFDiffusion internals to give specific recommendations for things to try - I'd just play around with things to see what might be helpful.)

But again, the general recommendation for CUDA out-of-memory issues is to find a bigger GPU to run things on. (The recent ML models are almost always GPU hungry.)

Tue, 2023-03-07 08:18