Running Jupyter on the SCC (outdated)

Jupyter is a web application for creating interactive notebooks that combine live Python code with text, equations and visualizations for data science and scientific computing.

If the resources of the interactive queue are sufficient for your needs, the easiest way to use Jupyter on the SCC is via Jupyter-Hub on the SCC.

Otherwise, Jupyter can be run on the SCC in a Docker container. The rest of this page documents the Docker version.

Please note that you can also use our Jupyter-Hub-Beta if you do not need direct access to data on the SCC and have only low to medium compute resource requirements.

Jupyter container

The Jupyter container is preloaded in Docker and based on the ubuntu:latest image. It contains Jupyter itself and Python 3 with the numpy, scipy and matplotlib packages preinstalled. Other packages can be installed as well; see the section on installing Python packages below.

Once Jupyter is running on one of the compute nodes, you can connect to its web interface with the browser on your local PC. This requires port forwarding, which is only described for UNIX-like operating systems here.

Submitting the job

Before submitting the job, run the following two commands to prepare for the creation of the Docker container:

umask a+r
export PATH=$PATH:/bin

The first command adjusts the default permission mask so that newly created files are readable by everyone. Be aware of the security implications: anyone will be able to read files you create in this session, so it is best to set this only when submitting a Jupyter job.

The second command adds the directory /bin to the list of paths the shell searches for executables.
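
To check the effect of the changed mask, you can create a throwaway file and inspect its permissions (a minimal sketch; the file name test_file is just an example):

umask a+r
touch test_file
ls -l test_file    # should show the read bit set for user, group and others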

Below is an example of a Jupyter job script file that can be submitted using bsub.

#!/bin/sh
#BSUB -q mpi
#BSUB -app djupyter
#BSUB -ISs

jupyter.lsf

Usage of -app (the Docker application djupyter must be used) and -ISs (interactive session) is mandatory. Using the prepared script jupyter.lsf is highly recommended, since in most cases it does everything needed to start a Jupyter server and begin working with it.
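
For example, assuming the script above has been saved as jupyter_job.sh (the file name is arbitrary), it can be submitted by feeding it to bsub on standard input; the same options can also be given directly on the command line:

bsub < jupyter_job.sh
# or equivalently:
bsub -q mpi -app djupyter -ISs jupyter.lsf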

After the job has been submitted, the output of the container will follow. All further instructions are contained in that output; the next sections explain them in detail.

Environment

Inside the Docker container you will have access to the folder from which you submitted the job. This folder becomes your $HOME directory inside the container.
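
It is therefore a good idea to change into the directory that holds your notebooks and data before submitting, so that it is visible inside the container. A minimal sketch, assuming a hypothetical project folder ~/my_project and a job script jupyter_job.sh:

cd ~/my_project
umask a+r
export PATH=$PATH:/bin
bsub < jupyter_job.sh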

Connecting to the web interface

After the container has launched successfully, you should see instructions like the following:

[Jupyter] container is up. Preparing the environment...
[Jupyter] running jupyter notebook...
[Jupyter] jupyter is up. You can access it by following the instructions.
[Jupyter] 1. Open a new terminal and tunnel port 8888 from the host gwdc047
[Jupyter] a) With direct access to nodes: ssh -L 0.0.0.0:8888:172.18.0.2:8888 yourlogin@gwdc047.gwdg.de -N
[Jupyter] b) Without direct access: ssh -L 0.0.0.0:8888:0.0.0.0:8640 yourlogin@login.gwdg.de ssh -L 0.0.0.0:8640:0.0.0.0:8640 gwdu101 ssh -L 0.0.0.0:8640:172.18.0.2:8888 -N gwdc047
[Jupyter] 2. Open the link in a browser: http://0.0.0.0:8888/?token=ce28b2500a5f2c0e637c7c8a67fa318155e4c36bb3e5608b

The important part here is step 1. If you can log in directly to compute nodes such as gwdc047.gwdg.de, run command 1a in a terminal on your local machine; otherwise, run command 1b. The command sets up the port tunnel and then appears to hang; do not close that terminal. If everything has worked, open the link from step 2 in your browser and you should reach your Jupyter instance.
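
If you prefer not to keep a dedicated terminal open, ssh can put the tunnel into the background with its standard -f and -N options once the connection is established. A sketch for the direct-access case from the example output above (replace the login name, host and ports with the values printed for your own job):

ssh -f -N -L 0.0.0.0:8888:172.18.0.2:8888 yourlogin@gwdc047.gwdg.de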

Installing Python packages

To install Python packages, open a terminal in Jupyter (New -> Terminal). This takes you to a web page with a shell running inside the container, where you can execute commands.

You have no root access inside the container, so Python packages should be installed into your $HOME directory (remember that $HOME inside the container is actually the job submission folder). You can do this with pip3:

pip3 install --user package_name

After that, the package will be installed into $HOME/.local and will remain available even after the container is destroyed. To have access to these packages in other Jupyter jobs, submit them from the same folder.
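
As an illustration (the package name seaborn is just an example), you could install a package from the Jupyter terminal and then confirm that it can be imported and where it was placed:

pip3 install --user seaborn
python3 -c "import seaborn; print(seaborn.__version__)"
ls $HOME/.local/lib    # user-installed packages end up below this directory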