We're using SLURM to manage job scheduling on our computing cluster, and we are experiencing a problem with memory management. Specifically, we can't figure out how to allocate memory for a specific task.
Consider the following setup:
- Each node has 32GB memory
- We have a SLURM job that sets
--mem=24GB
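For reference, a minimal sketch of what such a runscript.sh might look like. The job name and the echo payload are placeholders, not our real script, and the memory request is written here with the single-letter suffix (24G):

```shell
#!/bin/bash
#SBATCH --job-name=memtest   # hypothetical job name
#SBATCH --nodes=1            # each job should fit on a single node
#SBATCH --ntasks=1
#SBATCH --mem=24G            # memory requested per node (of the 32GB available)

# Placeholder payload: report which job is running on which node.
# SLURM_JOB_ID and SLURMD_NODENAME are set by SLURM inside a job;
# the fallbacks only apply when the script is run outside SLURM.
msg="job ${SLURM_JOB_ID:-none} on ${SLURMD_NODENAME:-${HOSTNAME:-unknown}}"
echo "$msg"
```

Since two 24GB requests add up to more than 32GB, two copies of this job should not fit on one node if the memory request is actually enforced as an allocation.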
Now, assume we want to run that SLURM job twice, concurrently. What we expect (or want) to happen is that when we queue it twice by calling sbatch runscript.sh twice, one of the two jobs will run on one node and the other will run on another node, since two 24GB requests do not fit into 32GB. However, SLURM currently schedules both jobs on the same node.
One possible cause we've identified is that SLURM appears to check only whether 24GB of memory is free (i.e., not actively in use by other jobs on the node), instead of checking whether it has already been requested/allocated.
The question here is: is it possible to allocate/reserve memory per task in SLURM?
Thanks for your help!
SelectType=select/cons_res and SelectTypeParameters=CR_Core - engelen