{"id":22815,"date":"2023-04-17T09:12:13","date_gmt":"2023-04-17T07:12:13","guid":{"rendered":"https:\/\/info.gwdg.de\/news\/?p=22815"},"modified":"2023-04-27T14:45:02","modified_gmt":"2023-04-27T12:45:02","slug":"what-scientists-should-know-to-efficiently-use-the-scientific-computing-cluster","status":"publish","type":"post","link":"https:\/\/info.gwdg.de\/news\/what-scientists-should-know-to-efficiently-use-the-scientific-computing-cluster\/","title":{"rendered":"What scientists should know to efficiently use the Scientific Computing Cluster"},"content":{"rendered":"<pre>Author: Dorothea Sommer<\/pre>\n<h3>Introduction<\/h3>\n<p>Switching from executing code locally to running it on a compute cluster can be a daunting and sometimes confusing experience. This article provides a brief overview of the fundamental concepts needed to run code efficiently on the cluster. It is <strong>not<\/strong> a step-by-step guide on how to set up a particular environment or run a specific piece of software. Rather, it gives an introductory explanation of concepts and ideas to keep in mind while using the cluster, particularly if you have little to no prior experience.<\/p>\n<p>Further information regarding this article can be found <a href=\"https:\/\/gitlab-ce.gwdg.de\/hpc-team-public\/science-domains-blog\/-\/blob\/main\/20230417_cluster-practical.mdcluster\" class=\"external\" rel=\"nofollow\">here.<\/a><\/p>\n<h3>Overview<\/h3>\n<ol dir=\"auto\" data-sourcepos=\"20:1-25:0\">\n<li data-sourcepos=\"20:1-20:42\">General Workflow<\/li>\n<li data-sourcepos=\"21:1-21:48\">Hardware Components<\/li>\n<li data-sourcepos=\"22:1-22:68\">Requesting Hardware via Slurm<\/li>\n<li data-sourcepos=\"23:1-23:48\">Cluster Connections<\/li>\n<li data-sourcepos=\"24:1-25:0\">Using Environment Variables in Slurm<\/li>\n<\/ol>\n<h3 dir=\"auto\" data-sourcepos=\"26:1-26:20\"><a id=\"user-content-general-workflow\" class=\"anchor\" href=\"#general-workflow\" aria-hidden=\"true\"><\/a>General Workflow<\/h3>\n<p dir=\"auto\" 
data-sourcepos=\"28:1-28:330\">We will start with a simplified cluster concept and add details when they become relevant. For now, a cluster consists of some computing units (called <strong>nodes<\/strong>) that are connected to each other. There are some nodes reserved for login (<strong>green<\/strong> in Figure 1) and some reserved for the actual computations (<strong>blue<\/strong> in Figure 1).<\/p>\n<p dir=\"auto\" data-sourcepos=\"30:1-30:51\"><a href=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.018.jpeg\" class=\"external\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" style=\"width: 100%; height: 100%;\" src=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.018.jpeg\" alt=\"\" width=\"1920\" height=\"1080\" \/><\/a><\/p>\n<p dir=\"auto\" data-sourcepos=\"32:1-32:1097\">Once you have logged into a frontend node (<strong>1<\/strong>), do not directly execute the code that you would like to run. (If some users run heavy computations on the frontend nodes, other users will have problems logging in!) Instead, there is a <strong>scheduler<\/strong> installed on the cluster. This is a programme that coordinates who gets to calculate when on which nodes. The scheduler on the scientific compute cluster is called <a href=\"https:\/\/slurm.schedmd.com\/documentation.html\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">Slurm<\/a>. Thus, to run code, you prepare a Slurm script (<strong>2<\/strong>). It specifies both the programme you would like to run and the computing resources (hardware) you need. Such a script is called a <strong>job<\/strong> script. The Slurm controller takes the script and, after some waiting (<strong>3<\/strong>), you get the requested resources and your code is executed automatically. All nodes have access to a shared file system, here called <strong>shared scratch<\/strong>. You can save the output of your computations in this file system. 
For many users, the least intuitive part of this workflow is understanding which hardware resources they need, so we will cover this topic next.<\/p>\n<blockquote dir=\"auto\" data-sourcepos=\"34:1-34:121\">\n<p data-sourcepos=\"34:3-34:121\"><strong><em>TAKEAWAY:<\/em><\/strong> Please schedule your programme as a Slurm job instead of blocking the frontend with heavy computations.<\/p>\n<\/blockquote>\n<h3 dir=\"auto\" data-sourcepos=\"36:1-36:23\"><a id=\"user-content-hardware-components\" class=\"anchor\" href=\"#hardware-components\" aria-hidden=\"true\"><\/a>Hardware Components<\/h3>\n<p dir=\"auto\" data-sourcepos=\"37:1-37:580\">So, which resources should you request? This depends on your computation, for instance on whether the code uses multithreading or multiprocessing. We have a <a href=\"https:\/\/docs.gwdg.de\/doku.php?id=en:services:application_services:high_performance_computing:start#hardware_overview\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">brief overview<\/a> of the available hardware. However, to understand <em>how<\/em> software components map to the hardware and then decide which you need, let us start with an amazing hardware-software overview in Figure 2, which is adapted from <a href=\"https:\/\/smileipic.github.io\/Smilei\/parallelization.html\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">Smilei<\/a>.<\/p>\n<p dir=\"auto\" data-sourcepos=\"40:1-40:51\"><a href=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.019.jpeg\" class=\"external\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" style=\"width: 100%; height: 100%;\" src=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.019.jpeg\" alt=\"\" width=\"1920\" height=\"1080\" \/><\/a><\/p>\n<p dir=\"auto\" data-sourcepos=\"42:1-42:232\">The software (<strong>bottom<\/strong> in Figure 2) consists of processes and threads (i.e., lines of execution). 
It can be mapped to hardware components (<strong>top<\/strong>), which are the physical elements of the cluster. The hardware elements consist of:<\/p>\n<ul dir=\"auto\">\n<li><strong>Shared Scratch<\/strong> All nodes share the same file system. This is similar to the hard drive of your computer. Files (such as images) that are needed in subsequent computations are stored here. All nodes (and thus all programs) have access to this storage.<\/li>\n<li><strong>Node<\/strong> A node is a computing unit. Intuitively, you can think of a node as being similar to your computer. It has RAM, multiple cores and can run multiple programs. As depicted, the nodes can differ with respect to these components (such as having more or fewer cores).<\/li>\n<li><strong>Cores<\/strong> Cores are the processing units of a CPU or GPU. Each node can have a different number of cores.<\/li>\n<li><strong>Memory (RAM)<\/strong> Each node has its own (temporary) random-access memory (RAM). This is memory used for temporary computations, such as remembering numbers in a computation. The RAM is not shared between nodes and also not shared between processes. However, multiple cores can share the same RAM.<\/li>\n<\/ul>\n<p dir=\"auto\" data-sourcepos=\"58:1-58:983\">These hardware components can be mapped to the software components. Most importantly, each <strong>process<\/strong> (so each of your programmes) is assigned to <em>one<\/em> node. A process might have multiple <strong>threads<\/strong>, so that different lines of execution run in parallel (<strong>multithreading<\/strong>). For instance, in Monte Carlo simulations, you could run multiple simulations at once (in multiple threads) and then collect all your results in one thread at the end. Importantly, these threads share the same memory. Two or more threads can also share the same core. This can be useful if one thread does not have much to do while the others are active (such as collecting the results of the simulation at the end). 
In contrast to this multithreading, you can also run multiple processes that communicate with each other (<strong>multiprocessing<\/strong>). They can be either on the same node or even on different nodes! However, they do not share the same RAM, so this has to be accounted for in their communication.<\/p>\n<blockquote dir=\"auto\" data-sourcepos=\"60:1-60:165\">\n<p data-sourcepos=\"60:3-60:165\"><strong><em>TAKEAWAY:<\/em><\/strong> Think of the hardware your programme needs (in terms of running one process, multithreading or multiprocessing), e.g. how many cores this maps to.<\/p>\n<\/blockquote>\n<h3 dir=\"auto\" data-sourcepos=\"63:1-63:33\"><a id=\"user-content-requesting-hardware-via-slurm\" class=\"anchor\" href=\"#requesting-hardware-via-slurm\" aria-hidden=\"true\"><\/a>Requesting Hardware via Slurm<\/h3>\n<p dir=\"auto\" data-sourcepos=\"64:1-64:532\">In this part, we will link the more abstract hardware view to practically requesting resources. First of all, the more resources you request, the longer you need to wait for them. Even if your code execution might be fastest on a ton of resources, there is a sweet spot between balancing the waiting time (for the resources) and the actual execution time of your code. Moreover, to keep the waiting time to a minimum, it is also important that you correctly estimate what your programme will need, with respect to time and hardware.<\/p>\n<p dir=\"auto\" data-sourcepos=\"66:1-66:194\">The resources are assigned to each job by the Slurm scheduler. To do so, a script is submitted to it via <code>sbatch personal-slurm-script.sh<\/code>. 
Each Slurm script consists of the following parts:<\/p>\n<ol dir=\"auto\" data-sourcepos=\"67:1-70:0\">\n<li data-sourcepos=\"67:1-67:53\"><strong>requesting resources<\/strong> hardware and compute time<\/li>\n<li data-sourcepos=\"68:1-68:79\"><strong>environment setup<\/strong> such as loading modules, setting environment variables<\/li>\n<li data-sourcepos=\"69:1-70:0\"><strong>code<\/strong><\/li>\n<\/ol>\n<h4 dir=\"auto\" data-sourcepos=\"71:1-71:19\"><a id=\"user-content-single-program\" class=\"anchor\" href=\"#single-program\" aria-hidden=\"true\"><\/a>Single Program<\/h4>\n<p dir=\"auto\" data-sourcepos=\"72:1-72:679\">The simplest case is to run a script with neither multithreading nor multiprocessing. Thus, we have 1 process and 1 thread. This directly translates to the Slurm script. In the first part, at the top, we request 1 node, for 1 task (1 code execution) and 1 CPU per task (so 1 core). Could you speed up your calculation by requesting multiple cores? No, these resources would increase the waiting time, but not improve the actual execution time, because the script cannot make use of the additional resources. 
The script ideas are taken from <a href=\"https:\/\/researchcomputing.princeton.edu\/get-started\/guide-princeton-clusters\/3-first-slurm-job\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">Princeton Computing<\/a> and can be found as text in the appendix.<\/p>\n<p dir=\"auto\" data-sourcepos=\"74:1-74:51\"><a href=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.020.jpeg\" class=\"external\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" style=\"width: 100%; height: 100%;\" src=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.020.jpeg\" alt=\"\" width=\"1920\" height=\"1080\" \/><\/a><\/p>\n<h4 dir=\"auto\" data-sourcepos=\"76:1-76:19\"><a id=\"user-content-multithreading\" class=\"anchor\" href=\"#multithreading\" aria-hidden=\"true\"><\/a>Multithreading<\/h4>\n<p dir=\"auto\" data-sourcepos=\"77:1-77:131\">A multithreaded program runs on 1 node. The memory is shared for multiple cores, which is reflected in requesting 4 cores (4 CPUs).<\/p>\n<p dir=\"auto\" data-sourcepos=\"79:1-79:51\"><a href=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.021.jpeg\" class=\"external\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" style=\"width: 100%; height: 100%;\" src=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.021.jpeg\" alt=\"\" width=\"1920\" height=\"1080\" \/><\/a><\/p>\n<h4 dir=\"auto\" data-sourcepos=\"81:1-81:20\"><a id=\"user-content-multiprocessing\" class=\"anchor\" href=\"#multiprocessing\" aria-hidden=\"true\"><\/a>Multiprocessing<\/h4>\n<p dir=\"auto\" data-sourcepos=\"83:1-83:308\">For multiprocessing, multiple nodes are requested. In this example, we request 2 nodes with 2 CPU cores. Optimally, we would try to fit the processes onto one node, because the communication would be faster than between nodes. 
However, this example illustrates that you can also do computations across nodes.<\/p>\n<p dir=\"auto\" data-sourcepos=\"85:1-85:51\"><a href=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.022.jpeg\" class=\"external\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" style=\"width: 100%; height: 100%;\" src=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.022.jpeg\" alt=\"\" width=\"1920\" height=\"1080\" \/><\/a><\/p>\n<h4 dir=\"auto\" data-sourcepos=\"87:1-87:23\"><a id=\"user-content-things-not-covered\" class=\"anchor\" href=\"#things-not-covered\" aria-hidden=\"true\"><\/a>Things not covered<\/h4>\n<p dir=\"auto\" data-sourcepos=\"88:1-88:59\">There are some things that we will <strong>not<\/strong> cover in detail.<\/p>\n<ul dir=\"auto\" data-sourcepos=\"90:1-94:0\">\n<li data-sourcepos=\"90:1-90:185\">What if you would like to run a job <em>n<\/em> times with different seeds? It is possible to do so by using a <code>job array<\/code>. In principle, the resources are then requested <em>n<\/em> times.<\/li>\n<li data-sourcepos=\"91:1-91:191\">Using <code>srun<\/code>, it is possible to use Slurm in interactive mode (piping the output to the shell). This can be useful for debugging, but the job will terminate once you close the shell.<\/li>\n<li data-sourcepos=\"92:1-92:154\">You can specify on which partition (i.e., on which part of the cluster) the code should be executed. Note that some labs also have their own partitions.<\/li>\n<li data-sourcepos=\"93:1-94:0\">If you are not sure what to request (e.g., in terms of RAM) &#8211; the easiest answer is: try. 
After your job has finished, you can see its details with <code>sacct -j {jobid}<\/code> by providing the relevant fields (for a list of fields, see <a href=\"https:\/\/slurm.schedmd.com\/sacct.html\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">--helpformat<\/a>).\n<p data-sourcepos=\"95:3-95:121\">\n<\/li>\n<\/ul>\n<p><strong><em>TAKEAWAY:<\/em><\/strong> Map the hardware requirements to the Slurm script. In case of doubt, guess, assess the usage and adapt.<\/p>\n<h3 dir=\"auto\" data-sourcepos=\"97:1-97:23\"><a id=\"user-content-cluster-connections\" class=\"anchor\" href=\"#cluster-connections\" aria-hidden=\"true\"><\/a>Cluster Connections<\/h3>\n<p dir=\"auto\" data-sourcepos=\"99:1-99:385\">Having explored the nodes, we now add some more details about the connections in the cluster. Knowing which connection type to use might speed up your calculations. The workflow remains as described in the first section: you log into the frontend (<strong>left<\/strong>) and send the job to the Slurm controller (<strong>bottom center<\/strong>), which then executes your script on some nodes (<strong>center<\/strong>).<\/p>\n<p dir=\"auto\" data-sourcepos=\"101:1-101:51\"><a href=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.023.jpeg\" class=\"external\" rel=\"nofollow\"><img loading=\"lazy\" decoding=\"async\" style=\"width: 100%; height: 100%;\" src=\"https:\/\/info.gwdg.de\/news\/wp-content\/uploads\/2023\/04\/figures_paper.023.jpeg\" alt=\"\" width=\"1920\" height=\"1080\" \/><\/a><\/p>\n<h4 dir=\"auto\" data-sourcepos=\"103:1-103:21\"><a id=\"user-content-connection-speed\" class=\"anchor\" href=\"#connection-speed\" aria-hidden=\"true\"><\/a>Connection Speed<\/h4>\n<p dir=\"auto\" data-sourcepos=\"104:1-104:835\">The connection speed depends on the hardware used between the components. In Figure 6, it is indicated by <strong>color<\/strong> (red and bold means faster). 
The connection can be either Ethernet (<strong>black<\/strong>) or a very fast connection with InfiniBand\/Omni-Path hardware (<strong>red<\/strong>). There are some practical takeaways. First, the nodes are connected with InfiniBand\/Omni-Path, so if you can select a communication channel in your programme, make use of this. You can use <code>ifconfig<\/code> to see the exact network endpoints you can use. (Of course, the fastest connection is within a node.) Secondly, if you need to access files in a computation that are stored on the hard disk, put them in the shared filesystem (<code>shared scratch<\/code>, <strong>top right<\/strong>). This filesystem has a fast connection to the nodes, in contrast to the home folders of each user (<strong>center right<\/strong>).<\/p>\n<h4 dir=\"auto\" data-sourcepos=\"106:1-106:12\"><a id=\"user-content-storage\" class=\"anchor\" href=\"#storage\" aria-hidden=\"true\"><\/a>Storage<\/h4>\n<p dir=\"auto\" data-sourcepos=\"107:1-107:455\">Using the <code>shared scratch<\/code> for your computation results is great; for long-term storage, it is not. This file system has no backup! Your home folder has a backup. This is indicated by the connection to the tape archive (<strong>right center<\/strong>). These are literal tapes that are, after being written, stored as physical components not connected to any electricity. Lastly, it is also possible to connect S3 buckets to the whole cluster (<strong>bottom right<\/strong>).<\/p>\n<blockquote dir=\"auto\" data-sourcepos=\"109:1-109:77\">\n<p data-sourcepos=\"109:3-109:77\"><strong><em>TAKEAWAY:<\/em><\/strong> Use the fastest connection. 
Think about speed and backups.<\/p>\n<\/blockquote>\n<h3 dir=\"auto\" data-sourcepos=\"112:1-112:40\"><a id=\"user-content-using-environment-variables-in-slurm\" class=\"anchor\" href=\"#using-environment-variables-in-slurm\" aria-hidden=\"true\"><\/a>Using Environment Variables in Slurm<\/h3>\n<p dir=\"auto\" data-sourcepos=\"113:1-113:223\">Lastly, we illustrate where important <strong>environment variables<\/strong> point to. Environment variables describe where things are stored, such as the location of your Python installation or specific dependencies of your programme.<\/p>\n<p dir=\"auto\" data-sourcepos=\"115:1-116:116\">Why is this important to know when computing? Well, sometimes things go wrong, such as a programme not finding its dependencies. It is nice to have a rough overview of which environment variables are set and where the important ones point to. More importantly, the following variables can also be used in the code for practical purposes (such as saving a file with your Slurm job name and a Slurm job id). 
Let us inspect the following Slurm submission script paragraph by paragraph; you can follow along <a href=\"#appendix\" target=\"_blank\" rel=\"nofollow noreferrer noopener\">interactively<\/a>.<\/p>\n<div class=\"gl-relative markdown-code-block js-markdown-code\">\n<pre id=\"code-4\" class=\"code highlight js-syntax-highlight language-shell white\" lang=\"shell\" data-sourcepos=\"118:1-141:3\"><code><span id=\"LC1\" class=\"line\" lang=\"shell\"><span class=\"c\">#!\/bin\/bash<\/span><\/span>\r\n<span id=\"LC2\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH --job-name=testing-variables<\/span><\/span>\r\n<span id=\"LC3\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -p medium<\/span><\/span>\r\n<span id=\"LC4\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -t 02:00<\/span><\/span>\r\n<span id=\"LC5\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -o job-%x-%j.out # will be saved as job-{name}-{jobid}.out<\/span><\/span>\r\n<span id=\"LC6\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -e job-%x-%j.err # will be saved as job-{name}-{jobid}.err<\/span><\/span>\r\n<span id=\"LC7\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC8\" class=\"line\" lang=\"shell\"><span class=\"c\"># 1. 
Variables for job description.<\/span><\/span>\r\n<span id=\"LC9\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">Variables set by Slurm<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC10\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"User: <\/span><span class=\"k\">${<\/span><span class=\"nv\">USER<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC11\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Slurm job ID: <\/span><span class=\"k\">${<\/span><span class=\"nv\">SLURM_JOB_ID<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC12\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Name of this job: <\/span><span class=\"k\">${<\/span><span class=\"nv\">SLURM_JOB_NAME<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC13\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Submitting job with sbatch from directory: <\/span><span class=\"k\">${<\/span><span class=\"nv\">SLURM_SUBMIT_DIR<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC14\" class=\"line\" lang=\"shell\"><span class=\"c\"># echo \"Array job ID: ${SLURM_ARRAY_JOB_ID}\"<\/span><\/span>\r\n<span id=\"LC15\" class=\"line\" lang=\"shell\"><span class=\"c\"># echo \"Array task ID: ${SLURM_ARRAY_TASK_ID}\"<\/span><\/span>\r\n<span id=\"LC16\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC17\" class=\"line\" lang=\"shell\"><span class=\"c\"># 2. 
Saving (temporary) files.<\/span><\/span>\r\n<span id=\"LC18\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">The following directories are available for saving files<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC19\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Local address of the node for temporary files: <\/span><span class=\"k\">${<\/span><span class=\"nv\">TMP_LOCAL<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC20\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Local Address on scratch for temporary files: <\/span><span class=\"k\">${<\/span><span class=\"nv\">TMP_SCRATCH<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC21\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Home directory: <\/span><span class=\"k\">${<\/span><span class=\"nv\">HOME<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC22\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Working directory: <\/span><span class=\"nv\">$PWD<\/span><span class=\"s2\">\"\r\n<\/span><\/span><\/code><\/pre>\n<\/div>\n<ol dir=\"auto\" data-sourcepos=\"143:1-145:0\">\n<li data-sourcepos=\"143:1-143:255\"><strong>Setting the script output<\/strong> Standard output and standard error can be set in the #SBATCH section with <code>-o<\/code> (output) and <code>-e<\/code> (error). They refer to the output of the Slurm script. 
The job name is denoted by <code>%x<\/code> and the Slurm job id by <code>%j<\/code>.<\/li>\n<li data-sourcepos=\"144:1-145:0\"><strong>Using environment variables<\/strong> Some environment variables are already set by us; for instance, the job id (<code>$SLURM_JOB_ID<\/code>) and job name (<code>$SLURM_JOB_NAME<\/code>). These variables can be used in your script (use <code>env<\/code> to get a complete list). You can also define your own environment variables, but avoid overwriting the ones already set as this can break stuff.<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<blockquote dir=\"auto\" data-sourcepos=\"146:1-146:80\">\n<p data-sourcepos=\"146:3-146:80\"><strong><em>TAKEAWAY:<\/em><\/strong> Make use of environment variables set by your friendly admins.<\/p>\n<\/blockquote>\n<p dir=\"auto\" data-sourcepos=\"148:1-148:158\">Now let us dive a little deeper into what happens when you set environment variables by loading packages or when you need to debug. (This is not needed for a quick start.)<\/p>\n<div class=\"gl-relative markdown-code-block js-markdown-code\">\n<pre id=\"code-5\" class=\"code highlight js-syntax-highlight language-shell white\" lang=\"shell\" data-sourcepos=\"150:1-168:3\"><code><span id=\"LC1\" class=\"line\" lang=\"shell\"><span class=\"c\"># (continued)<\/span><\/span>\r\n<span id=\"LC2\" class=\"line\" lang=\"shell\"><span class=\"c\"># 3. 
Conda package manager example (for Python): what happens when loading packages.<\/span><\/span>\r\n<span id=\"LC3\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">How to explore modules in depth (example)<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC4\" class=\"line\" lang=\"shell\">module spider anaconda3 2&gt;&amp;1 <span class=\"c\"># which anaconda3 versions are available (to standard out)<\/span><\/span>\r\n<span id=\"LC5\" class=\"line\" lang=\"shell\">module show anaconda3 2&gt;&amp;1 <span class=\"c\"># which paths are modified by loading anaconda3 (to standard out)<\/span><\/span>\r\n<span id=\"LC6\" class=\"line\" lang=\"shell\">module load anaconda3 <span class=\"c\"># loading module<\/span><\/span>\r\n<span id=\"LC7\" class=\"line\" lang=\"shell\">conda <span class=\"nb\">env <\/span>list <span class=\"c\"># your personal conda environments<\/span><\/span>\r\n<span id=\"LC8\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC9\" class=\"line\" lang=\"shell\"><span class=\"c\"># Direct meddling with environment variables.<\/span><\/span>\r\n<span id=\"LC10\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">Setting environment variables directly<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC11\" class=\"line\" lang=\"shell\"><span class=\"nb\">export <\/span><span class=\"nv\">EXAMPLE_DIR<\/span><span class=\"o\">=<\/span><span class=\"s2\">\"example-value\"<\/span> <span class=\"c\"># to manually set environment variables (no output)<\/span><\/span>\r\n<span id=\"LC12\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Set an environment variable to <\/span><span class=\"nv\">$EXAMPLE_DIR<\/span><span 
class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC13\" class=\"line\" lang=\"shell\"><span class=\"c\"># printenv # to see all current environment variables<\/span><\/span>\r\n<span id=\"LC14\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC15\" class=\"line\" lang=\"shell\"><span class=\"c\"># 4. For debugging purposes.<\/span><\/span>\r\n<span id=\"LC16\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">Example of debugging path in program.\"<\/span><\/span>\r\n<span id=\"LC17\" class=\"line\" lang=\"shell\">strace <span class=\"nt\">-f<\/span> <span class=\"nt\">-e<\/span> trace=open,openat python example.py <span class=\"c\"># Trace the system calls that open files (modern systems mostly use openat), here for the (non-existent) example.py; output goes to standard error\r\n\r\n<\/span><\/span><\/code><\/pre>\n<\/div>\n<ol dir=\"auto\" start=\"3\" data-sourcepos=\"170:1-173:0\">\n<li data-sourcepos=\"170:1-171:0\">\n<p data-sourcepos=\"170:4-170:633\"><strong>Loading packages<\/strong> Software needs packages or specific paths to be set. As there are many users on a cluster and all have different needs (e.g., different versions of a programme, different environment variables they want to set), there is a <code>module<\/code> system that dynamically handles the loading. You can inspect which packages are available (with <code>module spider anaconda3<\/code>). If you are interested in which variables are modified, use <code>module show anaconda3<\/code>. Usually, you just load the module(s) you need and execute your script. However, you can also explicitly modify environment variables with <code>export<\/code>.<\/p>\n<\/li>\n<li data-sourcepos=\"172:1-173:0\">\n<p data-sourcepos=\"172:5-172:566\"><strong>Debugging<\/strong> Sometimes, things go wrong. For instance, you install and execute a programme written by someone else; this programme has a <code>config<\/code> file and you are not able to find its location. 
The command <a href=\"https:\/\/man7.org\/linux\/man-pages\/man1\/strace.1.html\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\"><code>strace<\/code><\/a> tracks system calls, so you can inspect files that are opened by a specific programme even without looking at the code! This command can also be helpful to see where temporary files are used. When necessary, you can also <a href=\"https:\/\/jvns.ca\/categories\/strace\/\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">dive deeper<\/a>.<\/p>\n<\/li>\n<\/ol>\n<blockquote dir=\"auto\" data-sourcepos=\"174:1-174:95\">\n<p data-sourcepos=\"174:3-174:95\"><strong><em>TAKEAWAY:<\/em><\/strong> For debugging which paths and files a programme actually uses, you can use <code>strace<\/code>.<\/p>\n<\/blockquote>\n<h3 dir=\"auto\" data-sourcepos=\"177:1-177:11\"><a id=\"user-content-wrap-up\" class=\"anchor\" href=\"#wrap-up\" aria-hidden=\"true\"><\/a>Wrap-Up<\/h3>\n<p dir=\"auto\" data-sourcepos=\"179:1-179:329\">Hopefully, having a clear picture of the job submission workflow and the hardware concepts makes it easier to set up your own workflow and to follow <a href=\"https:\/\/docs.gwdg.de\/doku.php?id=en:services:application_services:high_performance_computing:start\" target=\"_blank\" rel=\"nofollow noreferrer noopener\" class=\"external\">further documentation<\/a>. 
Lastly, have fun computing and don&#8217;t be afraid to break things.<\/p>\n<h2 dir=\"auto\" data-sourcepos=\"181:1-181:9\"><a id=\"user-content-thanks\" class=\"anchor\" href=\"#thanks\" aria-hidden=\"true\"><\/a>Thanks<\/h2>\n<p dir=\"auto\" data-sourcepos=\"183:1-183:142\">Thanks to Sebastian Krey and Timon Vogt for teaching me a thing or two about the scientific compute cluster and for proofreading this article.<\/p>\n<h2 dir=\"auto\" data-sourcepos=\"186:1-186:13\"><a id=\"user-content-references\" class=\"anchor\" href=\"#references\" aria-hidden=\"true\"><\/a>References<\/h2>\n<h2 dir=\"auto\" data-sourcepos=\"189:1-189:11\"><a id=\"user-content-appendix\" class=\"anchor\" href=\"#appendix\" aria-hidden=\"true\"><\/a>Appendix<\/h2>\n<p dir=\"auto\" data-sourcepos=\"190:1-190:151\">For copying-and-pasting scripts, save the respective code under <code>test.sh<\/code> and submit it in the corresponding directory with <code>sbatch .\/test.sh<\/code>.<\/p>\n<p dir=\"auto\" data-sourcepos=\"192:1-192:65\">Slurm script for requesting resources: no parallelization, 1 core<\/p>\n<div class=\"gl-relative markdown-code-block js-markdown-code\">\n<pre id=\"code-6\" class=\"code highlight js-syntax-highlight language-plaintext white\" lang=\"plaintext\" data-canonical-lang=\"\" data-sourcepos=\"194:1-214:3\"><code><span id=\"LC1\" class=\"line\" lang=\"plaintext\">#!\/bin\/bash<\/span>\r\n<span id=\"LC2\" class=\"line\" lang=\"plaintext\">#SBATCH --job-name=custom-name<\/span>\r\n<span id=\"LC3\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC4\" class=\"line\" lang=\"plaintext\">#SBATCH --nodes=1                    # total number of nodes<\/span>\r\n<span id=\"LC5\" class=\"line\" lang=\"plaintext\">#SBATCH --ntasks=1                   # total number of tasks<\/span>\r\n<span id=\"LC6\" class=\"line\" lang=\"plaintext\">#SBATCH --cpus-per-task=1            # cpu cores per task<\/span>\r\n<span id=\"LC7\" class=\"line\" lang=\"plaintext\">#SBATCH 
--mem-per-cpu=4GB            # memory given to each cpu<\/span>\r\n<span id=\"LC8\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC9\" class=\"line\" lang=\"plaintext\">#SBATCH --time=00:01:00              # total run time limit (HH:MM:SS)<\/span>\r\n<span id=\"LC10\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-type=begin            # send mail when job begins<\/span>\r\n<span id=\"LC11\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-type=end              # send mail when job ends<\/span>\r\n<span id=\"LC12\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-user=custom@mail.com <\/span>\r\n<span id=\"LC13\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC14\" class=\"line\" lang=\"plaintext\"># Prepare the environment.<\/span>\r\n<span id=\"LC15\" class=\"line\" lang=\"plaintext\">module load anaconda3\/2021.05       <\/span>\r\n<span id=\"LC16\" class=\"line\" lang=\"plaintext\">source activate custom-env<\/span>\r\n<span id=\"LC17\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC18\" class=\"line\" lang=\"plaintext\"># Run the script.<\/span>\r\n<span id=\"LC19\" class=\"line\" lang=\"plaintext\">python custom-script.py<\/span><\/code><\/pre>\n<\/div>\n<p dir=\"auto\" data-sourcepos=\"216:1-216:53\">Slurm script for requesting resources: multithreading<\/p>\n<div class=\"gl-relative markdown-code-block js-markdown-code\">\n<pre id=\"code-7\" class=\"code highlight js-syntax-highlight language-plaintext white\" lang=\"plaintext\" data-canonical-lang=\"\" data-sourcepos=\"217:1-237:3\"><code><span id=\"LC1\" class=\"line\" lang=\"plaintext\">#!\/bin\/bash<\/span>\r\n<span id=\"LC2\" class=\"line\" lang=\"plaintext\">#SBATCH --job-name=custom-name<\/span>\r\n<span id=\"LC3\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC4\" class=\"line\" lang=\"plaintext\">#SBATCH --nodes=1                    # total number of nodes<\/span>\r\n<span id=\"LC5\" class=\"line\" lang=\"plaintext\">#SBATCH --ntasks=1                
   # total number of tasks<\/span>\r\n<span id=\"LC6\" class=\"line\" lang=\"plaintext\">#SBATCH --cpus-per-task=4            # cpu cores per task, &gt; 1 but less than cores per node<\/span>\r\n<span id=\"LC7\" class=\"line\" lang=\"plaintext\">#SBATCH --mem-per-cpu=4GB            # memory given to each cpu<\/span>\r\n<span id=\"LC8\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC9\" class=\"line\" lang=\"plaintext\">#SBATCH --time=00:01:00              # total run time limit (HH:MM:SS)<\/span>\r\n<span id=\"LC10\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-type=begin            # send mail when job begins<\/span>\r\n<span id=\"LC11\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-type=end              # send mail when job ends<\/span>\r\n<span id=\"LC12\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-user=custom@mail.com <\/span>\r\n<span id=\"LC13\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC14\" class=\"line\" lang=\"plaintext\"># Prepare the environment.<\/span>\r\n<span id=\"LC15\" class=\"line\" lang=\"plaintext\">module load anaconda3\/2021.05        <\/span>\r\n<span id=\"LC16\" class=\"line\" lang=\"plaintext\">source activate custom-env<\/span>\r\n<span id=\"LC17\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC18\" class=\"line\" lang=\"plaintext\"># Run the script.<\/span>\r\n<span id=\"LC19\" class=\"line\" lang=\"plaintext\">python custom-script.py<\/span><\/code><\/pre>\n<\/div>\n<p dir=\"auto\" data-sourcepos=\"239:1-239:71\">Slurm script for requesting resources: multiple nodes, multiple threads<\/p>\n<div class=\"gl-relative markdown-code-block js-markdown-code\">\n<pre id=\"code-8\" class=\"code highlight js-syntax-highlight language-plaintext white\" lang=\"plaintext\" data-canonical-lang=\"\" data-sourcepos=\"241:1-261:3\"><code><span id=\"LC1\" class=\"line\" lang=\"plaintext\">#!\/bin\/bash<\/span>\r\n<span id=\"LC2\" class=\"line\" lang=\"plaintext\">#SBATCH 
--job-name=custom-name<\/span>\r\n<span id=\"LC3\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC4\" class=\"line\" lang=\"plaintext\">#SBATCH --nodes=2                    # total number of nodes<\/span>\r\n<span id=\"LC5\" class=\"line\" lang=\"plaintext\">#SBATCH --ntasks-per-node=1          # number of tasks per node<\/span>\r\n<span id=\"LC6\" class=\"line\" lang=\"plaintext\">#SBATCH --cpus-per-task=2            # cpu cores per task, &gt; 1 but less than cores per node<\/span>\r\n<span id=\"LC7\" class=\"line\" lang=\"plaintext\">#SBATCH --mem-per-cpu=4GB            # memory given to each cpu<\/span>\r\n<span id=\"LC8\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC9\" class=\"line\" lang=\"plaintext\">#SBATCH --time=00:01:00              # total run time limit (HH:MM:SS)<\/span>\r\n<span id=\"LC10\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-type=begin            # send mail when job begins<\/span>\r\n<span id=\"LC11\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-type=end              # send mail when job ends<\/span>\r\n<span id=\"LC12\" class=\"line\" lang=\"plaintext\">#SBATCH --mail-user=custom@mail.com <\/span>\r\n<span id=\"LC13\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC14\" class=\"line\" lang=\"plaintext\"># Prepare the environment.<\/span>\r\n<span id=\"LC15\" class=\"line\" lang=\"plaintext\">module load anaconda3\/2021.05        <\/span>\r\n<span id=\"LC16\" class=\"line\" lang=\"plaintext\">source activate custom-env<\/span>\r\n<span id=\"LC17\" class=\"line\" lang=\"plaintext\"><\/span>\r\n<span id=\"LC18\" class=\"line\" lang=\"plaintext\"># Run the script as one task per node; without srun it would only run on the first node.<\/span>\r\n<span id=\"LC19\" class=\"line\" lang=\"plaintext\">srun python custom-script.py<\/span><\/code><\/pre>\n<\/div>\n<p dir=\"auto\" data-sourcepos=\"263:1-263:59\">Slurm script for exploring different environment variables.<\/p>\n<div class=\"gl-relative markdown-code-block js-markdown-code\">\n<pre class=\"code highlight 
js-syntax-highlight language-shell white\" lang=\"shell\" data-sourcepos=\"264:1-304:3\"><code><\/code><\/pre>\n<pre class=\"code highlight\" lang=\"shell\"><span id=\"LC1\" class=\"line\" lang=\"shell\"><span class=\"c\">#!\/bin\/bash<\/span><\/span>\r\n<span id=\"LC2\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH --job-name=testing-variables<\/span><\/span>\r\n<span id=\"LC3\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -p medium<\/span><\/span>\r\n<span id=\"LC4\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -t 02:00<\/span><\/span>\r\n<span id=\"LC5\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -o job-%x-%j.out # will be saved as job-{name}-{jobid}.out<\/span><\/span>\r\n<span id=\"LC6\" class=\"line\" lang=\"shell\"><span class=\"c\">#SBATCH -e job-%x-%j.err # will be saved as job-{name}-{jobid}.err<\/span><\/span>\r\n<span id=\"LC7\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC8\" class=\"line\" lang=\"shell\"><span class=\"c\"># 1. 
Variables for job description.<\/span><\/span>\r\n<span id=\"LC9\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">Variables set by Slurm<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC10\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"User: <\/span><span class=\"k\">${<\/span><span class=\"nv\">USER<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC11\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Slurm job ID: <\/span><span class=\"k\">${<\/span><span class=\"nv\">SLURM_JOB_ID<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC12\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Name of this job: <\/span><span class=\"k\">${<\/span><span class=\"nv\">SLURM_JOB_NAME<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC13\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Job submitted with sbatch from directory: <\/span><span class=\"k\">${<\/span><span class=\"nv\">SLURM_SUBMIT_DIR<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC14\" class=\"line\" lang=\"shell\"><span class=\"c\"># echo \"Array job ID: ${SLURM_ARRAY_JOB_ID}\"<\/span><\/span>\r\n<span id=\"LC15\" class=\"line\" lang=\"shell\"><span class=\"c\"># echo \"Array task ID: ${SLURM_ARRAY_TASK_ID}\"<\/span><\/span>\r\n<span id=\"LC16\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC17\" class=\"line\" lang=\"shell\"><span class=\"c\"># 2. 
Saving (temporary) files.<\/span><\/span>\r\n<span id=\"LC18\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">The following directories are available for saving files<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC19\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Local address of the node for temporary files: <\/span><span class=\"k\">${<\/span><span class=\"nv\">TMP_LOCAL<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC20\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Local Address on scratch for temporary files: <\/span><span class=\"k\">${<\/span><span class=\"nv\">TMP_SCRATCH<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC21\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Home directory: <\/span><span class=\"k\">${<\/span><span class=\"nv\">HOME<\/span><span class=\"k\">}<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC22\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Working directory: <\/span><span class=\"nv\">$PWD<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC23\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC24\" class=\"line\" lang=\"shell\"><span class=\"c\"># 3. 
Example with the package manager conda (for Python): what loading a module does.<\/span><\/span>\r\n<span id=\"LC25\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">How to explore modules in depth (example)<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC26\" class=\"line\" lang=\"shell\">module spider anaconda3 2&gt;&amp;1 <span class=\"c\"># which anaconda3 versions are available (to standard out)<\/span><\/span>\r\n<span id=\"LC27\" class=\"line\" lang=\"shell\">module show anaconda3 2&gt;&amp;1 <span class=\"c\"># which paths are modified by loading anaconda3 (to standard out)<\/span><\/span>\r\n<span id=\"LC28\" class=\"line\" lang=\"shell\">module load anaconda3 <span class=\"c\"># load the module<\/span><\/span>\r\n<span id=\"LC29\" class=\"line\" lang=\"shell\">conda <span class=\"nb\">env <\/span>list <span class=\"c\"># your personal conda environments<\/span><\/span>\r\n<span id=\"LC30\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC31\" class=\"line\" lang=\"shell\"><span class=\"c\"># Setting environment variables directly.<\/span><\/span>\r\n<span id=\"LC32\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">Setting environment variables directly<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC33\" class=\"line\" lang=\"shell\"><span class=\"nb\">export <\/span><span class=\"nv\">EXAMPLE_DIR<\/span><span class=\"o\">=<\/span><span class=\"s2\">\"example-value\"<\/span> <span class=\"c\"># to manually set environment variables (no output)<\/span><\/span>\r\n<span id=\"LC34\" class=\"line\" lang=\"shell\"><span class=\"nb\">echo<\/span> <span class=\"s2\">\"Set an environment variable to <\/span><span class=\"nv\">$EXAMPLE_DIR<\/span><span 
class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC35\" class=\"line\" lang=\"shell\"><span class=\"c\"># printenv # to see all current environment variables<\/span><\/span>\r\n<span id=\"LC36\" class=\"line\" lang=\"shell\"><\/span>\r\n<span id=\"LC37\" class=\"line\" lang=\"shell\"><span class=\"c\"># 4. For debugging purposes.<\/span><\/span>\r\n<span id=\"LC38\" class=\"line\" lang=\"shell\"><span class=\"nb\">printf<\/span> <span class=\"s2\">\"<\/span><span class=\"se\">\\n\\n<\/span><span class=\"s2\">Example of debugging paths in a programme.<\/span><span class=\"se\">\\n<\/span><span class=\"s2\">\"<\/span><\/span>\r\n<span id=\"LC39\" class=\"line\" lang=\"shell\">strace <span class=\"nt\">-f<\/span> <span class=\"nt\">-e<\/span> trace=open,openat python example.py <span class=\"c\"># Trace the system calls that open files (modern systems mostly use openat), here for the (non-existent) example.py; strace prints to standard error<\/span><\/span>\r\n<\/pre>\n<pre id=\"code-9\" class=\"code highlight js-syntax-highlight language-shell white\" lang=\"shell\" data-sourcepos=\"264:1-304:3\"><code><span id=\"LC39\" class=\"line\" lang=\"shell\"><\/span><\/code><\/pre>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Author: Dorothea Sommer Introduction Switching from executing code locally to running it on a compute cluster can be a daunting and sometimes also confusing experience. This article provides a brief overview of fundamental concepts to efficiently run code on the cluster. 
It is not a step-by-step guide on how to set up a particular environment &#8230; <a title=\"What scientists should know to efficiently use the Scientific Computing Cluster\" class=\"read-more\" href=\"https:\/\/info.gwdg.de\/news\/what-scientists-should-know-to-efficiently-use-the-scientific-computing-cluster\/\" aria-label=\"Mehr Informationen \u00fcber What scientists should know to efficiently use the Scientific Computing Cluster\">Weiterlesen<\/a><\/p>\n","protected":false},"author":166,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[136],"tags":[120],"class_list":["post-22815","post","type-post","status-publish","format-standard","hentry","category-science-domains","tag-hpc-en"],"_links":{"self":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts\/22815","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/users\/166"}],"replies":[{"embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/comments?post=22815"}],"version-history":[{"count":15,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts\/22815\/revisions"}],"predecessor-version":[{"id":23005,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/posts\/22815\/revisions\/23005"}],"wp:attachment":[{"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/media?parent=22815"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/categories?post=22815"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/info.gwdg.de\/news\/wp-json\/wp\/v2\/tags?post=22815"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}