Jupyter / Jupyter-Hub

GWDG offers Jupyter-Hub as a beta service for users of Python, Julia or R.

What is Jupyter / Jupyter-Hub?

Jupyter makes it possible to work interactively with Python, Julia or R using only a browser. Source code is written, edited and executed directly in the user's browser, in a so-called “notebook”. Each notebook has a “kernel” that determines its type: a Python, Julia or R notebook.
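
The link between a notebook and its kernel is recorded in the notebook file itself. As a minimal sketch, assuming the version 4 notebook format in which an .ipynb file is JSON with a metadata.kernelspec entry, the kernel of a notebook could be read as follows. The sample notebook content and the helper name kernel_of are made up for illustration.

```python
import json

# A made-up, minimal notebook document in the version 4 JSON layout.
SAMPLE_NOTEBOOK = """
{
  "cells": [],
  "metadata": {
    "kernelspec": {"name": "python3", "display_name": "Python 3", "language": "python"}
  },
  "nbformat": 4,
  "nbformat_minor": 5
}
"""


def kernel_of(notebook_text):
    """Return (kernel name, language) stored in a notebook's metadata, or None."""
    notebook = json.loads(notebook_text)
    spec = notebook.get("metadata", {}).get("kernelspec")
    if spec is None:
        return None
    return spec.get("name"), spec.get("language")


if __name__ == "__main__":
    print(kernel_of(SAMPLE_NOTEBOOK))
```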

Jupyter-Hub is the portal where users log in to start and manage their notebooks and associated files.

Please note that you can also use Jupyter on our SCC if you need direct access to data on the SCC and have medium to high compute resource requirements.

How to use Jupyter / Jupyter-Hub?

Prerequisites

Notebooks are stored and executed on the server. The client does not need to install any software or meet any prerequisites other than a reasonably modern browser. A valid GWDG user account is required to log in to the service.

Starting a notebook

After a successful login to Jupyter-Hub, the overview page shows a drop-down menu in the top right corner. Under “New”, a notebook kernel can be selected and started. Previously used notebooks and their files are also listed on the overview page. Next to “New” is the “Upload” button, which transfers files to the hub for use in notebooks.

A new notebook is started as soon as a kernel is selected. It opens in a new window and can be used immediately. The “Help” menu offers an interface tour, an overview of the keyboard shortcuts and, if available, links to the documentation of the kernel in use.

Examples

The folder public-ro contains sample notebooks from the official Jupyter project repository. The notebooks explain the basic Jupyter usage.

Managing notebooks

An active notebook can be closed through the “File” > “Close and Halt” menu or by closing the browser window or tab. The notebook itself is preserved in its last state and can be launched again from the overview page of Jupyter-Hub.

To delete a notebook, select its file entry in the list on the overview page and click the trash can icon above the list.

Usage

  • Please note that all directories except the home directory are volatile and will be lost when the notebook server is closed.
  • A maximum of 50 GB disk space and 4 GB RAM can be used.
  • jupyter.gwdg.de is not suitable for continuous computations over multiple days.
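
To keep an eye on the disk space limit, the size of the files below a directory can be added up from within a notebook. This is only a rough sketch: it sums file sizes, which can differ from what the service actually counts, and it reads the limit as decimal gigabytes, which is an assumption. The names directory_size and usage_report are hypothetical helpers.

```python
import os

# Disk space limit from the list above, read here as decimal gigabytes (assumption).
LIMIT_BYTES = 50 * 1000**3


def directory_size(root):
    """Sum the sizes of all regular files below root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the sum.
                pass
    return total


def usage_report(root, limit=LIMIT_BYTES):
    """Describe how much of the limit the files below root take up."""
    used = directory_size(root)
    return (
        f"{used / 1000**3:.2f} GB of {limit / 1000**3:.0f} GB used "
        f"({100 * used / limit:.1f} %)"
    )
```

Calling usage_report(os.path.expanduser("~")) in a notebook cell would report on the home directory; on a large home directory the walk can take a while.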

Installing additional Python modules

Additional Python modules can be installed via the terminal and the Python package manager “pip”. To do this, a terminal must be opened via the menu “New” → “Terminal”. Afterwards

pip install --user <module>

installs a new module in the home directory.
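
To check where such a module ends up, the per-user site-packages directory of the running Python can be queried. A minimal sketch using the standard site module; the exact path depends on the Python version of the kernel, and user_site_status is a hypothetical helper name.

```python
import site
import sys


def user_site_status():
    """Return the per-user site-packages directory and whether it is on sys.path."""
    user_site = site.getusersitepackages()
    return user_site, user_site in sys.path


if __name__ == "__main__":
    directory, importable = user_site_status()
    print("user site-packages:", directory)
    print("on sys.path:", importable)
```

If the directory is not on sys.path, for example because it did not exist when the kernel started, restarting the kernel after the first installation may be needed before the module can be imported.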

Installing large Python modules and disk space

The installation of large Python modules like “tensorflow” may fail with the message “No space left on device”. This happens because the temporary space under “/tmp” is too small for pip to process the downloaded packages. The following steps use a temporary directory in the much larger user home directory instead:

mkdir -v ~/.user-temp
TEMP=~/.user-temp pip install --user <module>

Prefixing the installation with the TEMP variable makes pip use that location for this one installation.
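
Python's tempfile module consults the environment variables TMPDIR, TEMP and TMP in that order, so a TEMP prefix only takes effect when TMPDIR is not already set. The sketch below shows this lookup, assuming pip chooses its working directory through tempfile. The helper temp_dir_with is a hypothetical name used only for this illustration.

```python
import os
import tempfile

# Order in which tempfile consults the environment.
TEMP_VARS = ("TMPDIR", "TEMP", "TMP")


def temp_dir_with(**overrides):
    """Return the directory tempfile picks when only the given variables are set."""
    saved = {name: os.environ.get(name) for name in TEMP_VARS}
    try:
        for name in TEMP_VARS:
            os.environ.pop(name, None)
        os.environ.update(overrides)
        tempfile.tempdir = None  # drop the cached choice so the lookup runs again
        return tempfile.gettempdir()
    finally:
        # Restore the original environment and clear the cache once more.
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
        tempfile.tempdir = None
```

With an existing, writable directory such as ~/.user-temp, temp_dir_with(TEMP=...) returns that directory, while also passing TMPDIR makes TMPDIR win. A directory that does not exist is skipped silently, which is why the mkdir step comes first.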

Installing additional R packages

1) mkdir -p ~/R/library ~/tmp
2) Create the file "/home/jovyan/.Renviron" with two lines:
"R_LIBS_USER=/home/jovyan/R/library" and "TMPDIR=/home/jovyan/tmp"
3) R
4) source("https://bioconductor.org/biocLite.R")
5) biocLite()

These steps are necessary because R downloads and installs packages via the default tmp directory,
from which it cannot execute files. Using a tmp directory inside the home directory solves
this problem.
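
The directory and .Renviron preparation can also be scripted. A minimal sketch in Python, with the directory names R/library and tmp below the home directory taken from the variable values above; prepare_r_environment is a hypothetical helper, and it overwrites an existing .Renviron file.

```python
from pathlib import Path


def prepare_r_environment(home):
    """Create the library and tmp directories and write .Renviron below home."""
    home = Path(home)
    library = home / "R" / "library"
    tmp = home / "tmp"
    library.mkdir(parents=True, exist_ok=True)
    tmp.mkdir(parents=True, exist_ok=True)

    # Caution: this replaces any existing .Renviron in that directory.
    renviron = home / ".Renviron"
    renviron.write_text(f"R_LIBS_USER={library}\nTMPDIR={tmp}\n")
    return renviron
```

Calling prepare_r_environment(Path.home()) from a notebook or terminal session would set this up for the current user; R reads the file the next time it starts.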

How to install packages from GitHub (in R):

1) library(devtools)
2) options(unzip = "internal")
3) install_github("repo/package")

Transfer data to the Unix / Linux home directory

In order to facilitate access to larger amounts of data on jupyter.gwdg.de, the Unix / Linux home directory can be used. To do this, data is transferred using the rsync tool. Here is an example that needs to be adapted to the user's environment:

Open a new Jupyter terminal via the menu “New” → “Terminal”:

jovyan@0d5793127e96:~$ ls mynotebooks/
myfile.txt
jovyan@0d5793127e96:~$ rsync -av ~/mynotebooks/ bbrauns@login.gwdg.de:/usr/users/bbrauns/mynotebooks/
bbrauns@login.gwdg.de's password:
sending incremental file list
./
myfile.txt

sent 145 bytes  received 44 bytes  75.60 bytes/sec
total size is 12  speedup is 0.06
jovyan@0d5793127e96:~$
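
The same transfer can be started from a script or notebook cell. The following is a sketch only: it wraps the rsync call from the session above in Python's subprocess module, and the function names build_rsync_command and transfer are hypothetical. User name, host and paths need to be adapted exactly as in the terminal example.

```python
import subprocess


def build_rsync_command(source_dir, user, host, target_dir):
    """Build the rsync call from the session above as an argument list."""
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    source = source_dir.rstrip("/") + "/"
    target = target_dir.rstrip("/") + "/"
    return ["rsync", "-av", source, f"{user}@{host}:{target}"]


def transfer(source_dir, user, host, target_dir):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    command = build_rsync_command(source_dir, user, host, target_dir)
    return subprocess.run(command, check=True)
```

Because no shell is involved, “~” is not expanded; pass an absolute path or use os.path.expanduser first. The password prompt still appears on the terminal, so running this from a notebook cell probably requires key-based SSH login.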

For accessing the data in the Unix / Linux home directory from a Windows machine, see: Samba Server

Installing an additional kernel with pipenv

Open a new Jupyter terminal via the menu “New” → “Terminal”:

pip install pipenv --user
mkdir myproject
cd myproject
export PATH=~/.local/bin/:$PATH     # make the user-installed pipenv available
pipenv --python /usr/bin/python3.6  # needed because of conda
pipenv install ipykernel networkx
pipenv shell
ipython kernel install --user --name=projectname
  • Stop and restart the notebook server via the control panel.
  • Afterwards, “projectname” is available as a new kernel.
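
To verify that the kernel was registered, the per-user kernel directory can be inspected. A sketch under the assumption that kernels installed with --user live in ~/.local/share/jupyter/kernels, the usual default on Linux, which may differ if JUPYTER_DATA_DIR or XDG_DATA_HOME is set. The function list_user_kernels is a hypothetical helper.

```python
import json
from pathlib import Path

# Assumed default location for kernels installed with --user on Linux.
DEFAULT_KERNEL_DIR = Path.home() / ".local" / "share" / "jupyter" / "kernels"


def list_user_kernels(kernel_dir=DEFAULT_KERNEL_DIR):
    """Map each kernel directory name to the display name in its kernel.json."""
    kernels = {}
    kernel_dir = Path(kernel_dir)
    if not kernel_dir.is_dir():
        return kernels
    for spec_file in sorted(kernel_dir.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        name = spec_file.parent.name
        kernels[name] = spec.get("display_name", name)
    return kernels


if __name__ == "__main__":
    print(list_user_kernels())
```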

More information about the service being in beta

To learn more about the practical implications of a beta service see the explanation of a beta service.