Submitting Jobs

DO NOT RUN JOBS DIRECTLY ON THE HEAD NODE!! What makes a computing cluster so powerful is its ability to use multiple CPUs across different compute nodes. Running your jobs directly on the head node not only limits your available computing power, it also slows down the cluster for other users!

Instead, submit your jobs to slurm, the scheduling manager, by either using the “srun” command (for interactive jobs) or writing an “sbatch” script (for batch jobs). Interactive jobs log you in to a compute node (or set of compute nodes) and give you command-line access on those node(s). Batch jobs, by contrast, let you submit tasks, step away, and return when they are complete. Each method is described in more detail in the following sections.

Submitting Interactive Jobs

After logging in to rcfcluster, enter an interactive bash environment for submitting jobs with the following command:

srun --pty bash

You can beef up this command by specifying the number of requested nodes, CPUs per task, memory, time limit, etc. See https://slurm.schedmd.com/srun.html for a complete list of options. For example, to request 8 cores on a single compute node for 1 hour:

srun -N 1 --cpus-per-task=8 --time=1:00:00 --pty bash

This command will automatically log you in to the compute node that best fits your resource requests. You can use the hostname command to determine which node you are on. From this point, you can run your desired scripts on the command line. Note that you can NOT ssh into any nodes that have not been allocated to you by slurm.
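The --time value in the srun example above is written as hours:minutes:seconds. Slurm also accepts a days-hours:minutes:seconds form, which is handy for longer requests. As a rough sketch (the helper name format_slurm_time is made up for this guide and is not part of slurm), a duration in seconds could be turned into such a string like this:

```python
def format_slurm_time(total_seconds: int) -> str:
    """Format a duration as [D-]HH:MM:SS, a form accepted by slurm's --time option."""
    if total_seconds < 0:
        raise ValueError("duration must be non-negative")
    days, rem = divmod(total_seconds, 86400)   # 86400 seconds per day
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    clock = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    # Only include the "D-" prefix when the request spans at least one day
    return f"{days}-{clock}" if days else clock


print(format_slurm_time(3600))       # a one-hour request
print(format_slurm_time(72 * 3600))  # a three-day request
```

The resulting string can be pasted after --time= in an srun command or an #SBATCH line.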

Unfortunately, there is no single resource allocation that works “best” for every application. For more information on how to estimate the resources you will need for specific tasks, see the Resource Requests section.

Submitting Batch Jobs

An sbatch script is a shell script that contains both the step-by-step code you wish to execute and a header of additional arguments to pass to slurm. The options you provide in the header are identical to those passed to srun when submitting interactive jobs. Again, a complete list of #SBATCH options can be found in the official slurm documentation.

Since batch jobs are run “hands-free”, there are some particularly helpful options that are not used as frequently in interactive jobs. These include the option to send an email when a job has started, failed, or completed, and to define an output and/or error file for recording output generated by your job. Note that if you do not define an output file, one will be created for you, with the name slurm-###.out, where ### is the jobID assigned to your submission. If you only define an output file, and not an error file, errors will be written to the same file as your output. The placeholder %j can be used in the #SBATCH arguments to indicate this unique job ID.

An example sbatch script, hello_world.sh, that uses each of these options is provided below. This script simply reports which compute node the job has been allocated to (and is currently running on), and then prints “Minute X” to the output file once a minute, for 3 minutes. NOTE: You must change the paths in this script to point to locations in your home directory in order to run it!

#!/bin/bash

#SBATCH --job-name=helloWorld
#SBATCH --output=/home/username/examples/helloworld-%j.out
#SBATCH --error=/home/username/examples/helloworld-%j.err
# Time limit (minutes:seconds); leaves headroom beyond the 3 minutes of sleeping
#SBATCH --time=5:00
#SBATCH --nodes=1

# Email address to use for notifications
#SBATCH --mail-user=netid@umass.edu
# Send email when the job begins, ends, or fails
# (list all events in a single comma-separated option):
#SBATCH --mail-type=BEGIN,END,FAIL

# Enter the code you wish to run:
echo "Hello from $(hostname)"
for i in 1 2 3
do
   echo "Minute $i"
   sleep 60
done

Then, to submit the sbatch script to slurm, type at the command line:

sbatch hello_world.sh

When a job is submitted successfully, you will see the output Submitted batch job #### (where #### is the unique jobID). Any output generated while the job is running will be in the text file you defined in the #SBATCH --output= option. You can view this output on the command line with the cat command, or copy any output files back to your local computer with the scp command.
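If you submit jobs from a script, it can be useful to pull the jobID out of that confirmation message and work out where the output will land. The following is a minimal sketch, assuming the message has the form shown above and that the output path uses only the %j placeholder; the function names parse_job_id and expected_output_path are invented for illustration:

```python
import re


def parse_job_id(sbatch_output: str) -> int:
    """Extract the jobID from sbatch's 'Submitted batch job ####' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_output!r}")
    return int(match.group(1))


def expected_output_path(pattern: str, job_id: int) -> str:
    """Fill in the %j placeholder of an --output/--error pattern."""
    return pattern.replace("%j", str(job_id))


# Example message text; on the cluster this would come from running sbatch
job_id = parse_job_id("Submitted batch job 1100\n")
print(job_id)
print(expected_output_path("/home/username/examples/helloworld-%j.out", job_id))
```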

Choosing a Partition

If you do not add an argument for a partition to your sbatch script, your job will be placed by default in the Normal queue. For most jobs this should be sufficient; however, you should consider changing this if either of the following situations applies to your job:

  • You expect your job to take over 72 hours: submit to the long partition by adding:

    #SBATCH --partition=long

  • Your job is CPU-architecture-specific: to use only nodes with AMD CPUs, submit to the amd partition by adding:

    #SBATCH --partition=amd

    And to use only nodes with Intel CPUs, submit to the intel partition by adding:

    #SBATCH --partition=intel

See the Requested Resources page for more info.
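The rules above can be written down as a small decision function. This is only an illustrative sketch: the name choose_partition is hypothetical, and because this guide does not say what to do when a job is both longer than 72 hours and architecture-specific, the sketch raises an error in that case rather than guessing.

```python
from typing import Optional

LONG_JOB_THRESHOLD_HOURS = 72  # threshold taken from the list above


def choose_partition(expected_hours: float, cpu_vendor: Optional[str] = None) -> Optional[str]:
    """Return the partition name to request, or None to keep the default."""
    is_long = expected_hours > LONG_JOB_THRESHOLD_HOURS
    if cpu_vendor is not None:
        vendor = cpu_vendor.lower()
        if vendor not in ("amd", "intel"):
            raise ValueError(f"unknown CPU vendor: {cpu_vendor!r}")
        if is_long:
            # Assumption: the guide does not cover this combination
            raise ValueError("long, architecture-specific jobs are not covered here")
        return vendor
    return "long" if is_long else None


partition = choose_partition(expected_hours=100)
if partition is not None:
    print(f"#SBATCH --partition={partition}")
```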

Submitting Multiple Batch Jobs at Once

To run multiple jobs in parallel, you can write scripts (in the language of your choice) to generate and submit multiple batch scripts at once.

Example: suppose we want to run an R script called “simulation.R” that depends on a parameter “p”:

# simulation.R - This is a (rather useless) R script that can be submitted to slurm

# Get arguments from the command line:
args = commandArgs(TRUE)
p = args[1]

# Set a random seed for consistency across simulations
set.seed(77)

# Run your R code below:
print(paste("Running simulation with p =", p))

We could write an sbatch script to run this for a given value of p, and then submit it with sbatch:

#!/bin/bash
#SBATCH --job-name=R_sim_p1
#SBATCH --output=/home/username/R_sim_p1-%j.out
#SBATCH --error=/home/username/R_sim_p1-%j.err
#SBATCH --time=1:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

module load R
Rscript simulation.R 1

However, suppose we want to test out 100 different values of p. Instead of writing 100 individual sbatch scripts, let’s use a python script to generate and submit these scripts from a “for loop”. Note that while this submission script is written in python, the jobs that it submits to slurm run R, and could in fact run any available software (R, python, sage, etc):

submit_R_simulations.py:

# This script (submit_R_simulations.py) creates and submits a series of sbatch
# scripts, in order to run an R script with different values of a given parameter.

# Credit: This script was inspired by an example available at:
# https://vsoch.github.io/lessons/sherlock-jobs/

import os
from pathlib import Path  # standard library; no third-party "path" package needed
import numpy as np
from subprocess import Popen, PIPE

# This tutorial assumes all files are located in the folder $HOME/cluster/simulations_example.
# You should edit these paths to reflect your directory structure:

# Top-level directory:
job_dir = Path.home() / 'cluster' / 'simulations_example'
# Name of job script we want to run multiple times on the cluster:
job_script = job_dir / 'simulation.R'
# Directory to save output files / results under:
output_dir = job_dir / 'results'
# Directory for storing submitted sbatch scripts:
submission_dir = job_dir / 'job_submissions'

# Make the output & job submission directories if they don't already exist:
os.makedirs(output_dir, exist_ok=True)
os.makedirs(submission_dir, exist_ok=True)

# Define which values of p we want to run the simulation script with:
p_array = np.arange(1, 101, 1) # Try p ranging from 1-100 (the upper bound is exclusive)

# Create an SBATCH script for each value of p, and submit it to slurm:
for p in p_array:
    # Name each submission script & output file with its value of p:
    submission_script = os.path.join(submission_dir,'Rsim_p%d.sh' % p)
    output_file = os.path.join(output_dir, 'Rsim_p%d.out' % p)
    error_file = os.path.join(output_dir,'Rsim_p%d.err' % p)

    # Set resource requirements:
    n_threads = 4 # Choose how many CPU threads should be dedicated to each individual simulation
    mem = 512     # Choose how much memory (in MB) should be dedicated to each individual simulation
    time = "24:00:00" # Note time must be a string

    # Create an SBATCH script from within python:
    # (Note the use of the "w" option to overwrite the file if it already exists)
    with open(submission_script, "w") as f:
        # Write SBATCH arguments to the header of the file:
        f.write("#!/bin/bash\n")
        f.write("#SBATCH --job-name=Rsim_p%d\n" % p)
        f.write("#SBATCH --output=%s\n" % output_file)
        f.write("#SBATCH --error=%s\n" % error_file)
        f.write("#SBATCH --time=%s\n" % time)
        f.write("#SBATCH --mem=%d\n" % mem)
        f.write("#SBATCH --nodes=1\n")
        f.write("#SBATCH --ntasks=1\n")
        f.write("#SBATCH --cpus-per-task=%d\n" % n_threads)

        # Write the job steps to the sbatch script:
        # In this example, the job we want to submit is an R script, so we must
        # first load the R module:
        f.write("module load R\n")

        # And then we add the command to run the job_script, via Rscript, passing
        # in the current value of "p" as an argument:
        f.write("Rscript %s %d\n" % (job_script, p))

    # Now submit the sbatch script to slurm! We will use the "Popen" interface
    # from the subprocess module to spawn a new system process from python.
    proc = Popen(["sbatch", submission_script], stdout=PIPE)
    output, error = proc.communicate()
    print(output.rstrip().decode("utf-8"))

The above script can be run from the command line with python3 submit_R_simulations.py.

This same template can be used to submit jobs that run any script written in R, python, bash, sage, etc. Additional examples of these types of customizations will be provided in the later sections of this guide. The submission script itself does not have to be written in python either; users can write similar scripts in R, bash, etc.
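Before submitting many jobs, it can help to inspect one generated script without touching slurm at all. One way to do that, sketched here under the assumption that the header options match the submission script above, is to build the sbatch text in a function and print it. The function name render_sbatch_script and its default values are illustrative only:

```python
def render_sbatch_script(p: int, job_script: str, output_dir: str,
                         n_threads: int = 4, mem_mb: int = 512,
                         time_limit: str = "24:00:00") -> str:
    """Return the text of an sbatch script that runs job_script with parameter p."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name=Rsim_p{p}",
        f"#SBATCH --output={output_dir}/Rsim_p{p}.out",
        f"#SBATCH --error={output_dir}/Rsim_p{p}.err",
        f"#SBATCH --time={time_limit}",
        f"#SBATCH --mem={mem_mb}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={n_threads}",
        "module load R",
        f"Rscript {job_script} {p}",
    ]
    return "\n".join(lines) + "\n"


# Dry run: print the script for p = 7 instead of submitting it
print(render_sbatch_script(
    7,
    "/home/username/cluster/simulations_example/simulation.R",
    "/home/username/cluster/simulations_example/results",
))
```

Keeping the text generation separate from the sbatch call makes it easy to check paths and resource requests for a single value of p before launching the full loop.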

Submitting Multiple Batch Jobs with Job Arrays

In the case where we want to run a similar script multiple times (such as repeating an experiment with different values of a given parameter), we can actually bypass the need to generate multiple SBATCH scripts by using a slurm Job Array.

For example, say we want to run our simulation.R script from the previous example. This script takes in a single parameter, p, which we want to vary from 1 to 100. We create a single SBATCH script such as the following:

#!/bin/bash
#SBATCH --job-name=R_sim_p1-100
#SBATCH --time=1:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

# send an email when each individual array task starts/ends/fails
#SBATCH --mail-user=netID@umass.edu
#SBATCH --mail-type=BEGIN,END,FAIL,ARRAY_TASKS

# Define what values the variable "$SLURM_ARRAY_TASK_ID" should take on:
#SBATCH --array=1-100
# The placeholder "%a" will take on the current array value
#SBATCH --output=/home/username/R_sim_p%a.out

# Run our R script for each value of $SLURM_ARRAY_TASK_ID defined in the ``--array`` argument
module load R
Rscript simulation.R $SLURM_ARRAY_TASK_ID

If you have an sbatch file with a very large job array, you should consider specifying a maximum number of simultaneously running tasks. This will ensure that your sbatch script does not hog all the cluster resources at once, allowing other users to still submit jobs while your array tasks patiently wait their turn. For example, to limit the maximum number of simultaneously running array tasks to 5, use the notation --array=1-100%5.
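As a small illustration of the notation above, the following sketch builds the --array value and estimates how many "waves" of tasks a throttled array needs. The helper names array_spec and min_waves are made up for this guide, and the wave count is only a lower bound that assumes every task takes about the same time:

```python
import math
from typing import Optional


def array_spec(first: int, last: int, max_running: Optional[int] = None) -> str:
    """Build the value for --array, e.g. '1-100' or '1-100%5'."""
    if last < first:
        raise ValueError("last must be >= first")
    spec = f"{first}-{last}"
    if max_running is not None:
        if max_running < 1:
            raise ValueError("max_running must be at least 1")
        spec += f"%{max_running}"  # '%N' caps simultaneously running tasks
    return spec


def min_waves(first: int, last: int, max_running: int) -> int:
    """Smallest number of rounds needed if max_running tasks run at a time."""
    n_tasks = last - first + 1
    return math.ceil(n_tasks / max_running)


print("#SBATCH --array=" + array_spec(1, 100, max_running=5))
print(min_waves(1, 100, 5))
```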

Checking the Status of a Job

To view the status of a running interactive job, you can use the sstat command. For example, to view the status of the job with jobID 1100:

sstat -j 1100

Note that sstat reports only on job steps that are currently running, such as those launched with the srun command. To check on the status of jobs submitted with sbatch, use the sacct command instead. By default, running sacct with no additional arguments will show the job ID, name, partition, account, number of allocated CPUs, state (pending/running/failed/completed), and exit code of each job you have submitted since the start of the current day. You can customize the columns shown in this output table with the --format argument. For example, the following command will print your entire job history since Jan 2, 2021, showing the job ID, job name, start time, end time, total runtime, and the hostname(s) of the node(s) the job ran on:

sacct --starttime 2021-01-02 --format=JobID,Jobname,start,end,elapsed,nodelist

For a complete list of available column names, use sacct --helpformat (or see the man page for sacct).
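If you want to post-process your job history in a script, sacct can print "|"-delimited rows with its --parsable2 (-P) option, which are easy to read with python's csv module. The sketch below parses a made-up sample of such output (the job rows are invented, not real cluster data); real sacct output may also include per-step rows such as 1100.batch, which this example does not handle specially:

```python
import csv
import io

# Hypothetical sample of `sacct -P --format=JobID,JobName,Elapsed,State` output
SAMPLE = (
    "JobID|JobName|Elapsed|State\n"
    "1100|helloWorld|00:03:01|COMPLETED\n"
    "1101|R_sim_p1|00:00:04|FAILED\n"
)


def parse_sacct(text: str) -> list:
    """Turn '|'-delimited sacct output (with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


def failed_job_ids(rows: list) -> list:
    """Return the JobID of every row whose State starts with FAILED."""
    return [row["JobID"] for row in rows if row["State"].startswith("FAILED")]


rows = parse_sacct(SAMPLE)
print(failed_job_ids(rows))
```

On the cluster, the sample text would be replaced by the captured output of the sacct command itself.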

sacct shows only the jobs submitted by YOU. If you wish to see ALL jobs that are in either the RUNNING or PENDING state on the cluster, you can use squeue.