
GPU selection

To use a GPU, submit your job to the gpu queue and request GPU shares. Each node equipped with GPUs provides as many GPU shares as it has CPU cores, regardless of how many GPUs are installed. For example, on nodes with 24 CPU cores, the following gives you exclusive access to the node's GPUs:

#BSUB -R "rusage[ngpus_shared=24]"

Note that you do not necessarily have to request 24 cores with -n 24 as well: jobs from the MPI queue can use the CPU cores you leave free. The latest “gpu” nodes have two GPUs each, and you should use both if possible.
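As a rough sketch, a complete job script for exclusive GPU use on a 24-core node might look like the following. The single slot (-n 1) and the nvidia-smi check are illustrative assumptions, not site requirements.

```shell
#!/bin/bash
#BSUB -q gpu                          # GPU queue
#BSUB -n 1                            # one CPU slot only (assumed value); free cores stay usable by other jobs
#BSUB -R "rusage[ngpus_shared=24]"    # all GPU shares of a 24-core node

# List the GPUs of the node, if the NVIDIA tools are installed there.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
fi
```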

If you request fewer shares than there are cores, other jobs may also use the GPUs. However, we currently have no mechanism to assign a specific GPU to a job; this has to be handled in the application or in your job script.

If your programs each use only one GPU, a good way to use the nodes with two GPUs is to put two runs together in one job script and preselect a GPU for each of them.
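One possible way to do this preselection is the CUDA_VISIBLE_DEVICES environment variable, which restricts the devices a CUDA program can see. The sketch below assumes a 24-core node with two GPUs; the helper run_on_gpu, the log file names, and the "sh -c" stand-in tasks are hypothetical and only there to keep the example runnable.

```shell
#!/bin/bash
#BSUB -q gpu
#BSUB -n 2
#BSUB -R "select[ngpus=2] rusage[ngpus_shared=24]"

# Run a command with exactly one GPU visible to it.
run_on_gpu() {
    local gpu_id="$1"
    shift
    env CUDA_VISIBLE_DEVICES="$gpu_id" "$@"
}

# Placeholder tasks: replace each "sh -c ..." with your own single-GPU program.
run_on_gpu 0 sh -c 'echo "task A sees GPU $CUDA_VISIBLE_DEVICES"' > task_a.log &
run_on_gpu 1 sh -c 'echo "task B sees GPU $CUDA_VISIBLE_DEVICES"' > task_b.log &

# Wait for both background tasks before the job ends.
wait
```

Both tasks run concurrently, and the job ends only after the slower one has finished.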

Currently we have several generations of NVIDIA GPUs in the cluster, selectable in the same way as CPU generations:

nvgen=1 : Kepler
nvgen=2 : Maxwell
nvgen=3 : Pascal
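The exact request syntax is not spelled out here. Assuming nvgen can be used in a -R string the same way as the ngpus resource shown further down this page, a request for Pascal GPUs might look like this:

```shell
#BSUB -q gpu
#BSUB -R "nvgen=3"    # Pascal generation; syntax assumed to mirror "ngpus=2"
```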

Most GPUs are commodity graphics cards and provide good performance only for single-precision calculations. If you need double-precision performance or error-correcting memory (ECC RAM), you can select the Tesla GPUs with:

#BSUB -R tesla

Our Tesla K40 GPUs are of the Kepler generation (nvgen=1).

To make sure your job runs on a node equipped with two GPUs, use:

#BSUB -R "ngpus=2"

Memory selection

Note that the following paragraph is about selecting nodes with enough memory for a job. The mechanism to actually reserve that memory does not change: The memory you are allowed to use equals memory per core times slots (-n option) requested.
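As a small worked example of that rule, the calculation can be sketched as follows. Both input values are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical values for illustration only.
mem_per_core_mb=2500   # memory per core on the node type (assumed)
slots=4                # slots requested with -n 4

# Allowed memory = memory per core times requested slots.
allowed_mb=$(( mem_per_core_mb * slots ))
echo "Allowed memory: ${allowed_mb} MB"
```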

You can select a node either by currently available memory (mem) or by maximum available memory (maxmem). If you request complete nodes, the difference is actually very small, as a free node's available memory is close to its maximum memory. All requests are in MB.

To select a node with more than about 500 GB of available memory, use:

#BSUB -R "mem>500000"

To select a node with more than about 6 GB of maximum memory per core, use:

#BSUB -R "maxmem/ncpus>6000"

(Yes, you can do basic math in the requirement string!)
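To see which nodes such an expression would match, the same arithmetic can be reproduced by hand. The sketch below uses made-up node names and sizes, and integer division, which may differ slightly from how LSF evaluates the expression.

```shell
#!/bin/bash
# Return success if a node satisfies "maxmem/ncpus > threshold" (all in MB).
node_matches() {
    local maxmem_mb="$1" ncpus="$2" threshold_mb="$3"
    [ $(( maxmem_mb / ncpus )) -gt "$threshold_mb" ]
}

# Hypothetical nodes: name, maxmem in MB, number of cores.
for node in "node_a 64000 16" "node_b 256000 24"; do
    set -- $node
    if node_matches "$2" "$3" 6000; then
        echo "$1: selected ($(( $2 / $3 )) MB per core)"
    else
        echo "$1: not selected ($(( $2 / $3 )) MB per core)"
    fi
done
```

The real values for each host can be looked up with lshosts, which lists ncpus and maxmem.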

It bears repeating: none of the above is a memory reservation. If you actually want to reserve “mem” memory, the easiest way is to combine -R "mem>…" with -x for an exclusive job.
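In a job script, that combination might look like the following sketch. The 500000 MB threshold is just the example value used earlier on this page.

```shell
#BSUB -x                  # exclusive job: no other job shares the node
#BSUB -R "mem>500000"     # node must currently have more than about 500 GB available
```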

Finally, note that the -M option just denotes the memory limit of your job per core (in KB). This is of no real consequence, as we do not enforce these limits and it has no influence on the host selection.

Besides the options shown in this article, you can of course use the options for controlling walltime limits (-W), output (-o), and your other requirements as usual. You can also continue to use job scripts instead of the command line (with the #BSUB <option> <value> syntax).
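Put together, a job script header combining the usual options with a memory selection might look like this sketch. The walltime, output file name, and slot count are example values, not recommendations.

```shell
#!/bin/bash
#BSUB -W 2:00                      # walltime limit of 2 hours (example value)
#BSUB -o job.%J.out                # output file; %J expands to the job ID
#BSUB -n 4                         # number of slots (example value)
#BSUB -R "maxmem/ncpus>6000"       # nodes with more than about 6 GB maximum memory per core

echo "Job started on $(hostname)"
```

A script like this is submitted with bsub < jobscript.sh, where jobscript.sh is a placeholder file name.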

Please consult the LSF man pages if you need further information.
