Jupyter / Jupyter-Hub
Due to the limited scalability of the service jupyter.gwdg.de, it will be replaced by jupyter-cloud.gwdg.de on March 31, 2020. The functionality of the new system is identical, and users can have their files transferred by visiting jupyter-cloud.gwdg.de/migrate.
Further details and answers can be found at our information page.
What is Jupyter / Jupyter-Hub?
Jupyter makes it possible to work interactively with Python, Julia or R using only a browser. Source code is written, edited and executed directly in the user's browser. This happens in a so-called “notebook”. Each notebook has a “kernel” that determines its type: Python, Julia or R.
Jupyter-Hub is the portal where users log in to start and manage their notebooks and associated files.
Please note that you can also use Jupyter on our SCC if you need direct access to data on the SCC and have medium to high compute resource requirements.
How to use Jupyter / Jupyter-Hub?
Notebooks are stored and executed on the server. The client does not need to install any software or meet any prerequisites other than a fairly modern browser. To log into the service a valid GWDG user account is required.
Starting a notebook
After successful login at Jupyter-Hub there is a drop-down menu at the top right corner of the overview page. Under “New” a notebook kernel can be selected and started. Previously used notebooks and their files are listed on the overview page as well. Next to the “New” option is the button “Upload”, with which files can be transferred to the hub to be used in notebooks.
The actual starting of a new notebook happens when a kernel is selected. The notebook is started in a new window and can be used immediately. The menu “Help” has an interface tour, an overview of the keyboard shortcuts and links to the documentation of the kernel in use, if available.
The folder public-ro contains sample notebooks from the official Jupyter project repository. The notebooks explain the basic Jupyter usage.
An active notebook can be closed through the “File” > “Close and Halt” menu or by closing the browser window or tab. The notebook is preserved in its last state and can be launched again from the overview page of Jupyter-Hub.
A notebook is deleted on the overview page by selecting its file entry from the list and clicking the trash can icon above the list.
- ⚠ Please note that all directories except the home directory are volatile and will be lost when the notebook server is closed.
- A maximum of 50GB disk space and 4GB RAM can be used.
- jupyter.gwdg.de is not suitable for continuous computations over multiple days.
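To keep an eye on the disk limit mentioned above, the size of the home directory can be summed up from within a notebook. The following is a minimal sketch. The function names are made up for this example, and it assumes that the 50GB limit refers to the files in the home directory and is counted in binary gigabytes, which this page does not state.

```python
import os

# Assumption: the 50GB limit is counted in binary gigabytes (1024**3 bytes).
QUOTA_BYTES = 50 * 1024**3


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # Files may vanish or be unreadable while walking; skip them.
                pass
    return total


def quota_fraction(used_bytes, quota_bytes=QUOTA_BYTES):
    """Return the used share of the quota (0.5 means half of it is used)."""
    return used_bytes / quota_bytes


# Example call in a notebook cell:
# used = dir_size_bytes(os.path.expanduser("~"))
# print("{:.2f} GB used, {:.1%} of the limit".format(used / 1024**3, quota_fraction(used)))
```

Walking a large home directory can take a while, so it may be preferable to run this on a single subdirectory first.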
Installing additional python modules
Additional Python modules can be installed via the terminal and the Python package manager “pip”. To do this, a terminal must be opened via the menu “New” → “Terminal”. Afterwards
pip install --user <module>
installs a new module in the home directory.
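To check that a module really ended up in the home directory, its location can be compared with Python's user site directory. This is a small sketch using only the standard library; the helper name is_in_user_site is chosen for this example and is not part of pip or Jupyter.

```python
import os
import site


def is_in_user_site(module, user_site=None):
    """Return True if the module's file lives below the user site directory."""
    # site.getusersitepackages() is the directory that "pip install --user" targets.
    user_site = os.path.realpath(user_site or site.getusersitepackages())
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # Built-in modules have no file on disk.
        return False
    module_path = os.path.realpath(module_file)
    return os.path.commonpath([user_site, module_path]) == user_site


# Example call in a notebook cell, after installing some module:
# import some_module   # placeholder for the module that was installed
# print(site.getusersitepackages(), is_in_user_site(some_module))
```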
Installing large python modules and disk space
The installation of large Python modules like “tensorflow” may fail with the message “No space left on device”. This is caused by the temporary space under “/tmp” being too small for pip to process the downloaded packages. The following steps use a temporary directory in the much larger user home directory instead:
mkdir -v ~/.user-temp
TEMP=~/.user-temp pip install --user <module>
Prefixing the installation with the TEMP variable makes pip use that location for this one installation.
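The effect of the TEMP variable can be reproduced with Python's tempfile module, which looks at TMPDIR, TEMP and TMP in that order. The sketch below assumes that pip relies on this module for its temporary files; the helper name tempdir_for is made up for this example.

```python
import os
import tempfile


def tempdir_for(temp_value):
    """Return the directory tempfile picks when TEMP=temp_value and TMPDIR is unset."""
    saved_env = {key: os.environ.get(key) for key in ("TMPDIR", "TEMP")}
    saved_cache = tempfile.tempdir
    try:
        os.environ.pop("TMPDIR", None)
        os.environ["TEMP"] = temp_value
        # Drop the cached choice so that the environment is read again.
        tempfile.tempdir = None
        return tempfile.gettempdir()
    finally:
        # Restore the environment and the cached value exactly as they were.
        for key, value in saved_env.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
        tempfile.tempdir = saved_cache


# Example call:
# print(tempdir_for(os.path.expanduser("~/.user-temp")))
```

Because TMPDIR is checked first, a TMPDIR that is already set in the terminal would take precedence over TEMP. In that case the prefix TMPDIR=~/.user-temp would be needed instead.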
Notebook fails to start after package installation or update, error 500
If every notebook fails to start after a package installation or upgrade (error 500) the issue can be resolved by renaming the folder
mv -v .local/ .local.gwdg-disable
Afterwards the notebook server should be restarted: in the upper right corner click on “Control Panel” → “Stop My Server” → “My Server”. Renaming the folder deactivates all additionally installed packages, so the notebook should start normally again.
Installation of additional packages and environments via Conda
Management of software packages and environments with Conda requires a terminal session started from the notebook server. The terminal is available after login via New → Terminal.
Before working with conda commands the necessary conda functions need to be loaded (mind the dot at the beginning!):
Creating a new environment
The following describes the creation of a new, simple environment wikidoku, the installation of the package jinja2 as an example, and how to make the environment available in the kernel selection of the notebook.
Creating and activating the environment:
conda create -y --prefix ./wikidoku
conda activate ./wikidoku
As an example the package jinja2 will be installed next. This is the step to install the desired software packages from various Conda channels.
conda install -y jinja2
Next the new environment will be registered with the notebook. Terms for --name and --display-name (optional) can be chosen freely, the latter being shown in the kernel selection of the notebook. The second command lists all known environments. The last command exits the environment.
python3 -m ipykernel install --user --name wikidoku --display-name "Python (wikidoku)"
jupyter kernelspec list
conda deactivate
If installation of the kernel fails with the message /usr/bin/python: No module named ipykernel, the additional package jupyter needs to be installed and installation of the kernel repeated:
python3 -m pip install jupyter
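To double-check from Python which kernels were registered with --user, the kernel.json files can be read directly. This sketch assumes the usual Linux location ~/.local/share/jupyter/kernels, which can differ if JUPYTER_DATA_DIR or XDG_DATA_HOME is set; the function name list_user_kernels is made up for this example.

```python
import json
import os


def list_user_kernels(kernels_dir=None):
    """Map kernel names to their display names, read from kernel.json files."""
    if kernels_dir is None:
        # Assumption: default per-user data directory of Jupyter on Linux.
        kernels_dir = os.path.expanduser("~/.local/share/jupyter/kernels")
    result = {}
    if not os.path.isdir(kernels_dir):
        return result
    for name in sorted(os.listdir(kernels_dir)):
        spec_file = os.path.join(kernels_dir, name, "kernel.json")
        if os.path.isfile(spec_file):
            with open(spec_file, encoding="utf-8") as fh:
                spec = json.load(fh)
            # Fall back to the directory name if no display name is set.
            result[name] = spec.get("display_name", name)
    return result


# Example call:
# for name, display_name in list_user_kernels().items():
#     print(name, "->", display_name)
```

After the registration above, the environment should show up here with the name wikidoku and the display name "Python (wikidoku)".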
Restarting of the notebook server
After installation of a new environment it is recommended to restart the notebook server. Exit all open terminals and close all open notebooks. On the Jupyter overview page click on Control Panel, stop the server by clicking Stop My Server and restart it with Start My Server.
Selecting the new environment
Via the menu New a new notebook with the new environment can be started. In existing notebooks the kernel can be changed after starting the notebook by selecting Kernel → Change Kernel.
Installing additional kernels in a Conda environment
Installing a new, independent Python kernel for the current environment is possible. As an example an older Python 2.7 kernel will be installed next.
A new environment needs to be created and activated as per the steps above.
Next follows the installation of the kernel and of the jupyter module for that kernel. Finally the kernel is made available for selection in the kernel list.
conda install -y python=2.7
python3 -m pip install jupyter
python3 -m ipykernel install --user --name oldpython --display-name "Python 2.7 (oldpython)"
The new kernel is now available for new and existing notebooks after restarting the notebook server. The current kernel version can be queried from within Python:
import sys
print(sys.version)
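Besides the version, it can help to see which interpreter and environment a kernel actually runs in, for example to confirm that a notebook uses the new environment and not the default one. The following sketch uses only the sys module and avoids f-strings so that it also runs in a Python 2.7 kernel; the function name interpreter_summary is made up for this example.

```python
import sys


def interpreter_summary():
    """Collect version, executable and environment prefix of the running kernel."""
    return {
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "executable": sys.executable,  # path of the Python binary in use
        "prefix": sys.prefix,          # root directory of the active environment
    }


# Example call in a notebook cell:
# for key, value in sorted(interpreter_summary().items()):
#     print("{}: {}".format(key, value))
```

For a kernel that runs inside the Conda environment created above, the prefix would be expected to point to the wikidoku directory.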
Removing an environment
In order to remove an environment it has to be de-registered from the notebook server and then its files removed (optional but recommended). We list the installed kernels, de-register and remove the environment's files:
jupyter kernelspec list
jupyter kernelspec remove wikidoku
rm -rf ./wikidoku
Installing additional R packages
1) mkdir -p ~/R/library; mkdir ~/tmp
2) create a file "/home/jovyan/.Renviron" with 2 lines:
R_LIBS_USER=/home/jovyan/R/library
TMPDIR=/home/jovyan/tmp
3) R
4) source("https://bioconductor.org/biocLite.R")
5) biocLite()
This is necessary because R downloads packages to the default tmp directory and installs them from there, but it cannot execute files from that directory. Using a tmp directory inside the home directory solves this problem.
How to install packages from Github (in R):
1) library(devtools)
2) options(unzip = "internal")
3) install_github("repo/package")
Transfer data to the Unix / Linux home directory
In order to facilitate access to larger amounts of data on jupyter.gwdg.de, the Unix / Linux home directory can be used. To do this, data is transferred using the rsync tool. Here is an example that needs to be adapted to the user's environment:
Open a new jupyter terminal via the menu “New” → “Terminal”
jovyan@0d5793127e96:~$ ls mynotebooks/
myfile.txt
jovyan@0d5793127e96:~$ rsync -av ~/mynotebooks/ firstname.lastname@example.org:/usr/users/bbrauns/mynotebooks/
email@example.com's password:
sending incremental file list
./
myfile.txt
sent 145 bytes received 44 bytes 75.60 bytes/sec
total size is 12 speedup is 0.06
jovyan@0d5793127e96:~$
For accessing the data in the Unix / Linux home directory from a Windows machine, see: Samba Server
Install additional kernel with pipenv
Open a new jupyter terminal via the menu “New” → “Terminal”
pip install pipenv --user
mkdir myproject
cd myproject
export PATH=~/.local/bin/:$PATH
pipenv --python /usr/bin/python3.6 #needed because of conda
pipenv install ipykernel networkx
pipenv shell
ipython kernel install --user --name=projectname
- Stop and restart server via control panel
- Afterwards “projectname” is usable as new kernel
Install additional julia packages with an extra kernel
The jupyter docker stacks image sets the variable JULIA_DEPOT_PATH to the path /opt/julia. However, this is volatile, since only the home directory is kept persistent. The following describes the installation of a new julia kernel, which has its package directory pointed to the home directory:
Start a terminal.
Temporarily change the julia package directories:
export JULIA_DEPOT_PATH=/home/jovyan/.julia-depot
export JULIA_PKGDIR=/home/jovyan/.julia-depot
Create the directory for custom packages and the new julia kernel:
> mkdir /home/jovyan/.julia-depot
> julia
julia > # switch to pkg with ']' character
pkg > add IJulia
# switch back to julia with CTRL+C
julia > using IJulia
installkernel("My-Julia-kernel", env=Dict("JULIA_DEPOT_PATH"=>"/home/jovyan/.julia-depot"))
Restart the notebook server.
Create a new notebook with the "My-Julia-kernel" kernel.
Add package example:
using Pkg
Pkg.add("DataFrames")