How to use our new GPU cluster “Grete” as an HLRN user

We are happy to announce the start of regular user operation for our new GPU cluster “Grete” in Göttingen.

The main part of the cluster is available via the new partition grete and consists of 33 nodes, each equipped with 4 NVIDIA Tesla A100 40 GB GPUs, 2 AMD EPYC CPUs, and an InfiniBand HDR interconnect. In addition, “Grete” has a dedicated new login node, glogin9, which is also reachable via its DNS alias glogin-gpu.hlrn.de.
For smaller single-GPU jobs and interactive usage, we provide additional partitions; a minimal job script is sketched below.
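For example, a minimal Slurm batch script for a job on the new partition could look like the following sketch. The project account, module name, and application binary are placeholders; the exact GPU request syntax for your use case is described in [1].

    #!/bin/bash
    #SBATCH --partition=grete        # the new A100 partition
    #SBATCH --nodes=1                # one of the 33 GPU nodes
    #SBATCH --gpus-per-node=4        # request all 4 A100 GPUs of the node
    #SBATCH --time=02:00:00
    #SBATCH --account=myproject      # placeholder: your HLRN project account

    module load cuda                 # module name may differ, see [1]
    srun ./my_gpu_application        # placeholder for your GPU binary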

“Grete” also includes a new dedicated flash-based WORK storage system for the active data sets of currently running jobs, which is mounted at /scratch on the new GPU nodes and on glogin9. The existing “Emmy” WORK file system is reachable at /scratch-emmy. The HOME and PERM file systems are shared between “Emmy” and “Grete”.
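As an illustration, input data from an earlier “Emmy” run can be staged onto the new flash-based WORK file system before a GPU job reads it; the directory layout below is only an example:

    # Stage input data from the Emmy WORK file system onto the
    # new flash-based scratch (directory names are illustrative).
    mkdir -p /scratch/$USER/myrun
    cp -r /scratch-emmy/$USER/input-data /scratch/$USER/myrun/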

CUDA, the NVIDIA HPC SDK, and CUDA-enabled OpenMPI versions are available via the module system.
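A typical session on glogin9 might look like the following sketch; the exact module names and versions depend on the installed software stack and are listed by module avail (see [1]):

    module avail                     # list the installed software stack
    module load cuda                 # a CUDA toolkit
    module load openmpi              # a CUDA-enabled OpenMPI build
    # alternatively: module load nvhpc for the NVIDIA HPC SDK
    # (the module names here are assumptions, not confirmed names)
    nvcc --version                   # verify the CUDA compiler is on the PATH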

More information about using the new GPU system can be found at [1]. The accounting documentation has been extended to cover the GPUs and MIG slices [2].
Please do not hesitate to contact us if you have questions or need support migrating suitable applications to the GPU system.

The existing GPU nodes ggpu[01-03] with NVIDIA V100 32 GB GPUs will be migrated to the same site (“RZGö”) as “Grete” in mid-May. They will resume operation with the same “Rocky Linux 8”-based OS image as the new GPU nodes.

[1] https://www.hlrn.de/doc/display/PUB/GPU+Usage
[2] https://www.hlrn.de/doc/display/PUB/Accounting+in+Core+Hours
