SLURM

SLURM is the resource manager and batch scheduler used on Liger. The official SLURM documentation at https://slurm.schedmd.com covers it in full.

Overview of commands

Slurm is loaded by default when you log in to Liger, so you do not have to load a slurm module to use it. Some of the most common commands are described below.

For detailed information on each command, refer to its man page or its --help option. Also see the slurm man page for commands not listed here.

Reserving a node

The login nodes on ICI-SC clusters are intended for editing and compiling. All other work should be carried out on nodes under batch-system control (interactive nodes or compute nodes). Short test jobs may be executed on the interactive nodes. Reserving dedicated nodes through the batch system gives you exclusive access to the requested resource (you are the only person allowed to log in). To run interactively on one node for 300 minutes using SLURM, execute (on the login node):

salloc -N 1 -t 300 bash

This will start a bash shell within the allocation, which lets you work interactively. Note that you are still located on the login node, e.g. if you run hostname it will report the hostname of the login node. You need to use a specific command to launch your program on the compute nodes, and that command varies according to the system you are using.
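As a minimal sketch, assuming srun is the launcher on your system (as noted above, the exact command varies), you could check which node you were given and start a program there:

srun hostname      # reports the hostname of the allocated compute node
srun ./myexe       # launches ./myexe (a hypothetical executable) on that node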

salloc -N vs -n option

Note that the salloc options -n and -N have two different meanings: -N defines how many nodes you want to book, whilst -n defines the maximum number of tasks that should be spawned on the nodes you book. Note also that -n and -N are shorthand for --ntasks and --nodes respectively; we recommend the longer notation in your job scripts, as it makes them easier to read and less error-prone.
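For example, the interactive request above could be written with the long options (the task count of 24 is only an illustrative value):

salloc --nodes=1 --ntasks=24 --time=300 bash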

Requesting a specific type of node

It is also possible in SLURM to request a specific type of node, e.g. when the cluster contains a mix of large-memory and small-memory nodes:

salloc -t 1:00:00 -N 1 --mem=64000

which requests only nodes with at least 64 GB of RAM (--mem is given in MB), or

salloc -N 1 -t 300 --mincpus=24

which requests only nodes with at least 24 logical CPUs.

If the cluster does not have enough nodes of that type then the request will fail with an error message.
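The two requests can also be combined in a single allocation; a minimal sketch reusing the values above:

salloc -N 1 -t 1:00:00 --mem=64000 --mincpus=24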

Submitting parallel jobs

As an alternative, you can submit job scripts. Below is a sample job script that starts 128 MPI tasks across 4 nodes for one hour:

#!/bin/bash -l
# The -l above is required to get the full environment with modules

# Set the allocation to be charged for this job
# not required if you have set a default allocation
#SBATCH -A 201X-X-XX

# The name of the script is myjob
#SBATCH -J myjob

# Only 1 hour wall-clock time will be given to this job
#SBATCH -t 0-01:00:00

# Number of nodes
#SBATCH -N 4

# Number of MPI processes per node (the following is actually the default)
#SBATCH --ntasks-per-node=32

# Number of MPI processes.
#SBATCH -n 128

#SBATCH -e error_file.e
#SBATCH -o output_file.o

# Run the executable named myexe 
# and write the output into my_output_file
srun -n 128 ./myexe > my_output_file 2>&1

You can submit this using the sbatch command:

sbatch myjobscript.sh

You can also specify the parameters on the command line; if a parameter is given both in the job script and on the command line, the command-line value takes precedence. For example:

sbatch -t 3:00:00 myjobscript.sh # run for 3 hours instead

Examining the queue

The command

squeue 

will show the current SLURM queue. You can restrict the listing to a specific user with the -u option, e.g.

squeue -u <username>

You can also use the -j option to specify a comma-separated list of job IDs. Adding -i 5 makes squeue refresh its output every 5 seconds.
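For example, to watch only your own jobs, or a specific set of jobs, with a 5-second refresh (the job IDs below are hypothetical):

squeue -u <username> -i 5
squeue -j 123456,123457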

The queue lists a status for each job. The most common status codes are given below; see also the squeue man page. You can filter by status code with the -t option of squeue, e.g. squeue -t TO,CD.

Status code  Meaning
CA           Cancelled - job was explicitly cancelled by the user or an administrator
CD           Completed
CG           Completing - job is completing; some processes are still running on some nodes
PD           Pending - job is in the queue and waiting to start
R            Running
TO           Timeout - job terminated upon reaching its time limit

You can also use smap, which provides a text-based (curses) view of jobs and partitions; with -i 5 it updates every 5 seconds:

smap -i 5

You can then use the keys r to show reservations, s to show partitions, and j to go back to showing jobs. Use q to quit.

Removing a job

First, find the job ID of the job you want to remove, for example with squeue -u <username>. Then use the scancel command:

scancel <jobID>
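scancel can also act on several jobs at once, for example by user or by job name:

scancel -u <username>      # cancel all of your own jobs
scancel -n myjob           # cancel jobs named myjob (the name from the sample script)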

Finding queue settings

You can find the current SLURM settings by using the command

scontrol show config

You can also see the current settings for the partitions with

scontrol show partition

The state of the partitions and their nodes is given by

sinfo
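For a node-oriented, long listing of node states you can use:

sinfo -N -l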

To see the default and maximum time limits configured on the partitions, use

sinfo -o "%10P %.11L %.11l"

Keeping track of time spent

You can get information about a job with the sacct command. This gives the number of tasks, the elapsed time, and many other details. You can customize the output with the --format option, for example:

sacct --format=jobid,elapsed,ncpus,ntasks,state

This will give a list of jobs with their job IDs, elapsed time, number of CPUs allocated, number of tasks in the job, and the state of the job.
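You can also restrict the report to a single job with the -j option (the job ID below is hypothetical):

sacct -j 123456 --format=jobid,elapsed,ncpus,ntasks,state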

Following your CPU-hour consumption

To report the CPU hours used by a specific account since a given date:

sreport cluster AccountUtilizationByUser  start=[start_date] accounts=[your_project_account] -t hours 
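As a concrete sketch, with a hypothetical project account and start date:

sreport cluster AccountUtilizationByUser start=2016-01-01 accounts=myproject -t hours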

Keeping track of different jobs

You can assign a name to a job with the -J option of, for example, sbatch. You can then find the job in the queue with squeue -n <name>. You can also see more details about a queued or running job with the command

scontrol show job <jobID>
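For instance, using the job name from the sample script above and a hypothetical job ID:

squeue -n myjob
scontrol show job 123456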

When will my job run?

squeue --start

Is my job running?

scontrol show job <jobid> | grep StartTime=

or

squeue -o "%S"  -j <jobid>

Environment variables

The following variables can be used within a batch-script submitted using sbatch:

SLURM_JOB_ID		The ID of the job allocation.
SLURM_JOB_NAME		The name of the job.
SLURM_JOB_NODELIST	List of nodes allocated to the job.
SLURM_JOB_NUM_NODES	Number of nodes allocated to the job.
SLURM_NTASKS		Number of tasks in the job.
SLURM_SUBMIT_DIR	The directory from which sbatch was invoked.

For a complete list, see the man page for sbatch.
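As a minimal sketch of how these variables might be used inside a job script (the job name and executable are placeholders):

#!/bin/bash -l
#SBATCH -J envdemo
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 0-00:10:00

echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) runs on $SLURM_JOB_NUM_NODES nodes: $SLURM_JOB_NODELIST"
echo "Submitted from $SLURM_SUBMIT_DIR with $SLURM_NTASKS tasks"
srun ./myexe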
