Jupyter Notebooks

In this section, we’ll go over how to launch a Jupyter Notebook on a compute node, but open and interact with it on your local computer. We will do this by choosing a specific port number to run our Jupyter Notebook on, and forwarding this port from rcfcluster to our local computer. This port number can be chosen to be any number between 1024 and 65535. For this tutorial, we will use 8889.
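
If you prefer a port other than 8889, a small check like the one below can confirm that it falls in the allowed range before you use it in the later steps. This is only a sketch: valid_port is a hypothetical helper name, and it checks the range only, not whether another user is already running something on that port.

```shell
# Return success (exit status 0) only if the argument is a whole number
# between 1024 and 65535, the range this tutorial allows for Jupyter.
# valid_port is a hypothetical helper, not part of any cluster tooling.
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}

port=8889
if valid_port "$port"; then
    echo "port $port is in range"
else
    echo "port $port is out of range" >&2
fi
```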

PREREQUISITES: Before using Jupyter for the first time, you will need to install the jupyter package into your Python virtual environment.

Starting Jupyter Through an Interactive Job

  1. SSH into rcfcluster, and submit an interactive job request to slurm. For example:

    srun -N 1 --cpus-per-task=8 --mem=4GB --time=1:00:00 --job-name=jupyter-example --pty bash
    
  2. Wait a moment for slurm to log you in to a compute node. Once you have a terminal on a compute node, load the miniconda module and activate your Python virtual environment (replace ‘py37’ with the name of your conda environment):

    module load miniconda3
    conda activate py37
    
  3. Choose a port to run Jupyter on. This can be any number between 1024 and 65535; for this tutorial we will use 8889. Then run the following command to forward all traffic on this port from the compute node to the head-node (rcfcluster):

    ssh -N -f -R 8889:localhost:8889 rcfcluster
    
  4. Launch jupyter on the port chosen in Step 3. We use the --no-browser option to indicate that Jupyter should not try to launch a browser on the command-line-only interface:

    # Switch directory to where you want to launch jupyter from...
    cd /home/username/jupyter_example
    
    # And start the jupyter-notebook:
    jupyter-notebook --no-browser --port 8889
    

As Jupyter starts up, it will log text to standard output that resembles the following:

[I 14:08:11.921 NotebookApp] Serving notebooks from local directory: /home/username/jupyter_example
[I 14:08:11.922 NotebookApp] Jupyter Notebook 6.1.4 is running at:
[I 14:08:11.922 NotebookApp] http://localhost:8889/?token=eaddc9e1a77d15a1f3db7c97f76a6f50c8354b41b57ce7b5
[I 14:08:11.922 NotebookApp]  or http://127.0.0.1:8889/?token=eaddc9e1a77d15a1f3db7c97f76a6f50c8354b41b57ce7b5
[I 14:08:11.922 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:08:11.935 NotebookApp]

To access the notebook, open this file in a browser:
  file:///home/username/.local/share/jupyter/runtime/nbserver-31455-open.html
Or copy and paste one of these URLs:
  http://localhost:8889/?token=eaddc9e1a77d15a1f3db7c97f76a6f50c8354b41b57ce7b5
or http://127.0.0.1:8889/?token=eaddc9e1a77d15a1f3db7c97f76a6f50c8354b41b57ce7b5

Copy one of the two URLs listed at the end of this output.

  5. Next, we will need to set up port forwarding from rcfcluster to our local computer. We will do this by creating an “SSH tunnel”. The commands differ slightly depending on how you are connected to the internet. In both cases, replace username and the port number 8889 with your own:

If you are connected to the math department network (via VPN, wired, or wireless connection), then we can tunnel directly from rcfcluster to the local computer. Open a second terminal on your local computer, and enter the following command:

ssh -L 8889:localhost:8889 username@rcfcluster.math.umass.edu

If you are NOT on the math department network, then you will have to make an intermediate stop through the system ssh.math.umass.edu, since rcfcluster is only available on our internal network. Open a second terminal on your local computer, and enter the following commands, one after the other:

ssh -L 8889:localhost:8889 username@ssh.math.umass.edu

ssh -L 8889:localhost:8889 rcfcluster

(Note that as long as the same port number is used in both steps, it does not matter whether the SSH tunnel is started before or after submitting the interactive job request.)

The SSH tunnel will stay open for as long as this terminal window stays open. If you accidentally close the terminal window before you are finished, you will break the port forwarding. If this happens, you can re-connect by simply running this step again.

  6. Finally, open up a browser window on your local computer, and paste the URL you copied in Step 4. You can now run and interact with the Jupyter Notebook as if you were working on your local computer.

  7. When you are finished running your code, simply click the “Quit” button in the Jupyter interface, and this will complete the slurm job. To close the SSH connection and stop the port forwarding, simply type exit at the command line, or close the terminal window.
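
As a quick reference for Step 5, the sketch below prints the tunnel command(s) for either network situation without running them. The function name tunnel_commands and the labels internal and external are made up for this illustration; the hostnames are the ones used above.

```shell
# Print (do not run) the SSH tunnel command(s) from Step 5.
# tunnel_commands is a hypothetical helper. Arguments:
#   $1 = your username
#   $2 = the port chosen in Step 3
#   $3 = "internal" if you are on the math department network,
#        anything else if you are not
tunnel_commands() {
    tc_user=$1
    tc_port=$2
    tc_network=$3
    if [ "$tc_network" = "internal" ]; then
        # On the department network: tunnel straight to rcfcluster.
        echo "ssh -L $tc_port:localhost:$tc_port $tc_user@rcfcluster.math.umass.edu"
    else
        # Off the network: first hop to ssh.math.umass.edu, then on to rcfcluster.
        echo "ssh -L $tc_port:localhost:$tc_port $tc_user@ssh.math.umass.edu"
        echo "ssh -L $tc_port:localhost:$tc_port rcfcluster"
    fi
}

tunnel_commands username 8889 external
```

For the off-network case, the second printed command is meant to be entered after the first one has logged you in to ssh.math.umass.edu.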

Summary of Commands (as a SBATCH Script)

The following script summarizes the process of configuring and launching a Jupyter notebook on rcfcluster. It can be used as a template, although you will want to adjust the paths and requested resources to your own needs. Be sure to match the port number used in this script to the port number you use to set up port forwarding from your local computer (Step 5).

#!/bin/bash

#SBATCH --job-name=jupyter-example
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=8GB
#SBATCH --time=1:00:00

# Load miniconda module
module load miniconda3

# Activate conda environment (replace py37 with the name of your conda environment)
eval "$(conda shell.bash hook)"
conda activate py37

# Set up port forwarding from compute-node to head-node:
# For this you can choose any port number between 1024 and 65535. The default jupyter port is 8888
port=8889
ssh -N -f -R $port:localhost:$port rcfcluster

# Switch directory to where you want to launch jupyter from...
cd /home/username/jupyter_example

# And start the jupyter-notebook:
jupyter-notebook --no-browser --port $port

Submit the job to slurm (in this example, we have named the above sbatch script jupyter-example.sh):

sbatch jupyter-example.sh

If the job is submitted successfully, you should see an output file in your current working directory called slurm-1100.out (where 1100 is the unique identifier used to refer to this specific job). Print the contents of this file with the cat command:

cat slurm-1100.out

This file should contain text similar to the output shown in Step 4 above. Copy one of the two URLs listed at the end of the file. Note that you may have to wait a minute or two while Jupyter launches before this text appears in the .out file.

Next, start the SSH tunnel as described in Step 5, then open up a browser window and paste the URL from the .out file. You can now run the Jupyter Notebook as if you were working on your local computer.
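
Instead of reading through the .out file by eye, the URL can be pulled out with grep. The sketch below assumes the log format shown in Step 4; jupyter_url is a hypothetical helper name, and the sample line stands in for a real slurm-1100.out file.

```shell
# Print the first localhost Jupyter URL (including its token) found in a file.
# jupyter_url is a hypothetical helper; pass it the path to your .out file.
jupyter_url() {
    grep -o 'http://localhost:[0-9]*/?token=[0-9a-f]*' "$1" | head -n 1
}

# Demonstration using the sample log line from Step 4 in place of a real .out file:
sample=$(mktemp)
echo '[I 14:08:11.922 NotebookApp] http://localhost:8889/?token=eaddc9e1a77d15a1f3db7c97f76a6f50c8354b41b57ce7b5' > "$sample"
jupyter_url "$sample"
rm -f "$sample"
```

With a real job you would run jupyter_url slurm-1100.out, replacing 1100 with your own job number. If nothing is printed, Jupyter has probably not finished starting yet; wait a moment and try again.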

Jupyter Lab

Jupyter Lab is a helpful interface that provides a code editor, console, data viewer, file browser, and much more. For full documentation on how to use JupyterLab, see https://jupyterlab.readthedocs.io/en/stable/

PREREQUISITES: Before using Jupyter Lab for the first time, you will need to install both jupyterlab and nodejs into your virtual environment.

You can launch a Jupyter Lab instance on an rcfcluster compute node by following the exact same steps as in “Starting Jupyter Through an Interactive Job”. Simply replace all instances of jupyter-notebook with the command jupyter lab.
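
If you already have the sbatch script from the previous section, the same replacement can be made with sed rather than by hand. This is a minimal sketch: to_jupyterlab is a hypothetical helper name, and the demonstration feeds it the launch line from the script above.

```shell
# Replace every occurrence of jupyter-notebook with "jupyter lab".
# Reads the files given as arguments, or standard input if there are none,
# and writes the result to standard output. to_jupyterlab is a hypothetical helper.
to_jupyterlab() {
    sed 's/jupyter-notebook/jupyter lab/g' "$@"
}

# Demonstration on the launch line from the sbatch script:
echo 'jupyter-notebook --no-browser --port $port' | to_jupyterlab
```

To convert a whole script, redirect the output to a new file, for example to_jupyterlab jupyter-example.sh > jupyterlab-example.sh (the second filename is only an example), and submit that file with sbatch.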