Usage of Slurm within a Singularity Container

When working with complex codes that require a specific environment, e.g. particular software in a particular version, it can be useful to provide this code to other users as a singularity container.

The user can then work with (execute) this singularity container on the GWDG HPC system as explained here. One obstacle, however, is that by default slurm commands (or, equally, pbs commands, which are outdated and therefore not discussed here) cannot be executed from within the singularity container.

This page is a tutorial on how to use slurm from within a singularity container on the SCC of the GWDG (as of 2020-05-12). The explanations assume that the singularity container <my_container> provides a UNIX-like operating system; at least for this case the procedure described below has been tested to work.

Short Summary

In order to execute a slurm command from inside a singularity container, follow these steps:

singularity shell -B /opt/slurm,/usr/lib64/libreadline.so.6,/usr/lib64/libhistory.so.6,/usr/lib64/libtinfo.so.5,/var/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/run/munge <my_container> 
  >> export LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH
  >> echo "slurmadmin:x:300:300::/opt/slurm/slurm:/bin/false" >> /etc/passwd
  >> echo "slurmadmin:x:300:" >> /etc/group
  >> <any_other_code>
  >> /opt/slurm/bin/<any_slurm_command> <any_optional_arguments>
  >> <any_other_code>

Detailed Explanation

Two steps are necessary in the procedure:

Step 1: Binding Files/Paths into the Container

First, the container needs access to the slurm installation on the host system, i.e. the SCC, and to various other required libraries. Therefore, the following paths/files need to be bound when starting the container:

  1. /opt/slurm
  2. /usr/lib64/libreadline.so.6
  3. /usr/lib64/libhistory.so.6
  4. /usr/lib64/libtinfo.so.5
  5. /var/run/munge
  6. /usr/lib64/libmunge.so.2
  7. /usr/lib64/libmunge.so.2.0.0
  8. /run/munge

such that the final command for executing a command in <my_container> reads

singularity exec -B /opt/slurm,/usr/lib64/libreadline.so.6,/usr/lib64/libhistory.so.6,/usr/lib64/libtinfo.so.5,/var/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/run/munge <my_container> <any_command_to_be_executed_within_my_container>

Here, 1. lets the container know where to find slurm on the host system, 2.-4. provide libraries required by slurm itself (note that problems may occur if your container needs a different major version of any of these libraries!), and the remaining files/paths give access to the munge authentication service, which the GWDG uses to ensure that only registered users operate on the system (with their respective rights).
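
Since this list of bind paths is rather long, it can be convenient to collect it in a shell variable on the host and reuse it in every call. The following is a minimal sketch; the variable name BIND_PATHS is a placeholder and not required by singularity:

# collect all required bind paths once, then reuse them for every call
BIND_PATHS=/opt/slurm,/usr/lib64/libreadline.so.6,/usr/lib64/libhistory.so.6,/usr/lib64/libtinfo.so.5,/var/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/run/munge
singularity exec -B $BIND_PATHS <my_container> <any_command_to_be_executed_within_my_container>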

Step 2: Adjust Settings inside the Container

Once the container has been started, the following commands have to be executed once within it so that it can make use of the libraries/paths bound in the previous step:

  1. export LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH
  2. echo "slurmadmin:x:300:300::/opt/slurm/slurm:/bin/false" >> /etc/passwd
  3. echo "slurmadmin:x:300:" >> /etc/group

The first command adds /usr/lib64 to the library search path, letting the container know that additional libraries are to be loaded from there. The other two commands make the user's identity on the host known to the container by creating an account inside the container that does not require a password (the x field). Normally such an account would not grant any rights; here it is only important that the container knows the user's identity on the host, since all slurm requests are redirected to the host, where the user has his/her native rights. This information is added to the files /etc/passwd and /etc/group, respectively.
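
If the container's filesystem is writable and the commands from this step happen to be executed more than once in the same container instance, the two echo lines would append duplicate entries on every run. A simple guard, sketched below with standard grep, keeps the files clean:

grep -q "^slurmadmin:" /etc/passwd || echo "slurmadmin:x:300:300::/opt/slurm/slurm:/bin/false" >> /etc/passwd
grep -q "^slurmadmin:" /etc/group || echo "slurmadmin:x:300:" >> /etc/group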

Remarks on Optimum Usage

Starting a singularity container puts considerable load on the host's operating system, and if too many requests to start singularity containers arrive at the same time, starting may fail. It is therefore beneficial to keep the number of calls to singularity to a minimum.

For this purpose, singularity's exec command is suitable, with the commands explained in Step 2 moved into a shell (sub-)script. A call to singularity executing code that incorporates a slurm command on the host system may then look like this:

<any_other_code>
singularity exec -B /opt/slurm,/usr/lib64/libreadline.so.6,/usr/lib64/libhistory.so.6,/usr/lib64/libtinfo.so.5,/var/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/run/munge <my_container> bash <a_file_to_be_executed_within_the_container>.sh
<any_other_code>

with <a_file_to_be_executed_within_the_container>.sh containing

<any_other_code>
export LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH
echo "slurmadmin:x:300:300::/opt/slurm/slurm:/bin/false" >> /etc/passwd
echo "slurmadmin:x:300:" >> /etc/group
/opt/slurm/bin/<any_slurm_command> <any_optional_arguments>
<any_other_code>

The prefix to the slurm command (/opt/slurm/bin/) may be omitted when the PATH variable is updated inside the singularity container via

export PATH=/opt/slurm/bin:$PATH

which may be added to <a_file_to_be_executed_within_the_container>.sh after exporting the LD_LIBRARY_PATH.
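
Putting everything together, <a_file_to_be_executed_within_the_container>.sh might then look as follows. This is a minimal sketch; squeue is used purely as an example slurm command, and any other command from /opt/slurm/bin works in the same way:

#!/bin/bash
# make the host's slurm and munge libraries visible inside the container
export LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH
# add the slurm binaries to the search path so the /opt/slurm/bin/ prefix can be dropped
export PATH=/opt/slurm/bin:$PATH
# register the slurmadmin account inside the container (see Step 2)
echo "slurmadmin:x:300:300::/opt/slurm/slurm:/bin/false" >> /etc/passwd
echo "slurmadmin:x:300:" >> /etc/group
# example slurm command, executed against the host's slurm installation
squeue -u $USER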