====== High Performance Computing ======

Welcome to the official documentation of the Scientific Compute Cluster (SCC). It is the high performance computing system operated by the GWDG for both the Max Planck Society and the University of Göttingen.

This documentation will give you the necessary information to get access to the system, find the right software or compile your own, and run calculations.

===== Latest News =====
{{rss>https://listserv.gwdg.de/~parallel/hpc-announce.rss 5 date 10m }}

An archive of all news items can be found at the [[https://listserv.gwdg.de/pipermail/hpc-announce/|HPC-announce mailing list]].

===== Accessing the system =====
To use the compute cluster, you need a [[en:services:general_services:customer_portal:account_info|full GWDG account]]. Most employees of the University of Göttingen and the Max Planck Institutes already have such an account. This account is not activated for the use of the compute resources by default. More information on how to get your account activated or how to get an account can be found [[en:services:application_services:high_performance_computing:account_activation|here]].

Once your account is activated, you can log in to ''login-mdc.hpc.gwdg.de''. These nodes are only accessible via ssh from the GÖNET. If you are connecting from the internet, you need to either use a VPN or go through our login server. You can find detailed instructions [[en:services:application_services:high_performance_computing:connect_with_ssh|here]].
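
For example, connecting from within the GÖNET (or through the VPN) with a standard SSH client looks like this; the username ''jdoe'' is only a placeholder for your own GWDG account name:
<code>
# replace "jdoe" with your GWDG username
$ ssh jdoe@login-mdc.hpc.gwdg.de
</code>
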
===== Submitting jobs =====
Our compute cluster is divided into frontends and compute nodes. The frontends are meant for editing, compiling, and interacting with the batch system. Please do not use them for intensive testing, i.e. calculations longer than a few minutes. All users share resources on the frontends and will be impaired in their daily work if you overuse them.

To run a program on one (or more) of the compute nodes, you need to interact with our batch system, or scheduler, Slurm. You can do this with several different commands, such as ''srun'', ''sbatch'', and ''squeue''((As you may have noticed, they all start with an s)). A very simple example for such an interaction would be this:
<code>
$ srun hostname
dmp023</code>
This runs the program ''hostname''((a program that just prints the name of the host)) on one of our compute nodes. However, the program would only get access to a single core and very little memory. Not a problem for the ''hostname'' program, but if you want to calculate something more serious, you will need access to more resources. You can find out how to do that in our [[en:services:application_services:high_performance_computing:running_jobs_slurm|Slurm documentation]].
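
For jobs that need more than the defaults, you typically write a batch script and submit it with ''sbatch''. The following is a minimal sketch; the partition name ''medium'' and the resource values are placeholders that you should adapt using our Slurm documentation:
<code>
#!/bin/bash
#SBATCH --job-name=myjob         # a name for your job
#SBATCH --partition=medium       # placeholder: pick a partition from the Slurm documentation
#SBATCH --ntasks=4               # number of tasks (cores)
#SBATCH --mem-per-cpu=2G         # memory per core
#SBATCH --time=01:00:00          # wall time limit (hh:mm:ss)

# report which node(s) the job ran on
srun hostname
</code>
You would save this as, for example, ''jobscript.sh'', submit it with ''sbatch jobscript.sh'', and check its status with ''squeue -u $USER''.
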
===== Software =====
We provide a growing number of programs, libraries, and software on our system. These are available as ''modules''. You can find a list with the ''module avail'' command and load them via ''module load''. For example, if you want to run [[https://www.gromacs.org|GROMACS]], you simply use ''module load gromacs'' to get the most recent version. Additionally, we use a package management tool called Spack to install software. A guide on how to use modules and Spack is available [[en:services:application_services:high_performance_computing:spack_and_modulefiles|here]].
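
A typical module workflow could look like this (the exact names and versions reported by ''module avail'' will differ over time):
<code>
# list the available software modules
$ module avail

# load the most recent GROMACS version and check which modules are loaded
$ module load gromacs
$ module list
</code>
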
If you want to compile your software yourself, we provide various compilers and libraries. As with the rest of the software, these are available as modules. They include the compilers ''gcc'', ''intel'', and ''nvhpc'', the MPI libraries ''openmpi'' and ''intel-oneapi-mpi'', and other libraries such as ''mpi4py'', ''fftw'', and ''hdf5''. You can find more specific instructions on code compilation on our [[en:services:application_services:high_performance_computing:software:compilation|dedicated page]].
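
As a short sketch, building a small MPI program could look like the following; ''hello.c'' is a placeholder for your own source file:
<code>
# load a compiler and an MPI library
$ module load gcc openmpi

# compile the program with the MPI compiler wrapper
$ mpicc -O2 -o hello hello.c
</code>
The resulting binary should then be run through Slurm as described above, not on the frontends.
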
===== Performance Engineering and Analysis =====
Performance engineering, analysis, and optimization are imperative for HPC applications, especially considering the huge amount of resources spent on assembling and operating large computing systems with complex microprocessors (X-PUs) and memory architectures.

Performance analysis and optimization of HPC applications mainly involve three steps: application instrumentation, run-time measurement of key events, and visual analysis of profiles and event traces.

The performance tools currently available on our clusters are "LIKWID", "Score-P", "Vampir", and "Scalasca" for CPUs, and the Nsight toolset (Systems, Compute, and Graphics) for Nvidia GPUs. The tools and how to use them on the cluster are documented on the [[en:services:application_services:high_performance_computing:Performance Engineering and Analysis|Performance Tools page]].
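
As a rough, hedged example of the instrumentation step, Score-P can instrument an MPI application by prefixing the usual compiler call; the module name ''scorep'' and the source file ''myapp.c'' are assumptions, so check ''module avail'' and substitute your own code:
<code>
# load Score-P (the module name may differ on the cluster)
$ module load scorep

# build with Score-P instrumentation by prefixing the compiler call
$ scorep mpicc -O2 -o myapp myapp.c
</code>
Running the instrumented binary through Slurm then produces profile data that can be examined with the analysis and visualization tools listed above.
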
===== A short note on naming =====

The frontends and [[en:services:application_services:high_performance_computing:transfer_data|transfer]] nodes also have descriptive names of the form ''$func-$site.hpc.gwdg.de'' based on their primary function and site, where ''$func'' is either ''login'' or ''transfer'' while ''$site'' is ''mdc'' (modular data center, access to ''scratch''). For example, to reach any login node at the MDC site, you would connect to ''login-mdc.hpc.gwdg.de''.


===== Hardware Overview =====

[[https://gwdg.de/hpc/systems/scc/|Please see the list of the hardware located in Göttingen on our HPC website.]]