FAQ Slurm

JOB SUBMISSION

How to submit an interactive job?

You can run an interactive job like this:

$ srun --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --time=01:00:00 --pty bash -i

This requests one node, one task and one CPU core for one hour, with the default amount of memory.

Or simply request a bash session using default resources:

$ srun --pty bash -i

The command prompt will appear as soon as the job starts.
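To check that the prompt you are looking at belongs to the job and not to the login node, one option is to test the SLURM_JOB_ID environment variable, which Slurm sets inside a job. This is only a minimal sketch; the function name in_slurm_job is hypothetical and not part of Slurm.

```shell
#!/bin/bash
# in_slurm_job is a hypothetical helper name, not a Slurm command.
# Slurm exports SLURM_JOB_ID inside a job, so it is normally unset in a login shell.
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Inside Slurm job ${SLURM_JOB_ID} on $(hostname)"
else
    echo "Not inside a Slurm job"
fi
```

Run inside the interactive session, this should report the job ID and the compute node name.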

How to request memory per CPU core?

Use the --mem-per-cpu option to request memory per allocated CPU core, for example:

$ sbatch --mem-per-cpu=6G ...
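The same option can also be written as an #SBATCH directive in a batch script. The sketch below is an assumption about a typical script, not something taken from this site's configuration: it requests 4 cores with 6G each and prints the resulting total. SLURM_CPUS_PER_TASK and SLURM_MEM_PER_CPU are set by Slurm inside the job when the matching options are given (the latter is assumed here to be in megabytes); the fallback values only keep the script runnable outside a job.

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=6G

# Inside a job Slurm sets these variables; outside a job they are unset,
# so fall back to the values requested above (6G = 6144 MB, assumed unit: MB).
cpus="${SLURM_CPUS_PER_TASK:-4}"
mem_per_cpu_mb="${SLURM_MEM_PER_CPU:-6144}"

# Total memory available to the task is cores x memory per core.
total_mb=$(( cpus * mem_per_cpu_mb ))
echo "Total memory for this task: ${total_mb} MB"
```

Because the request is per core, raising --cpus-per-task also raises the total memory of the job.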

How to request specific node(s) at job submission?

Use the -w (or --nodelist) option, for example:

$ sbatch -w <cn1-1>,<cn1-2> ...

How to exclude specific nodes from a job?

Use the -x (or --exclude) option, for example:

$ sbatch -x <cn1-1>,<cn1-2> ...

How to delay the start of a job?

Use the -b (or --begin) option to defer the allocation of the job until the specified time.

Examples:

$ sbatch --begin=20:00 ...               # job can start after 8 p.m.
$ sbatch --begin=now+1hour ...           # job can start 1 hour after submission
$ sbatch --begin=2023-12-24T20:00:00 ... # job can start after specified date/time

How to submit dependency (chain) jobs?

Use the -d (or --dependency) option, for example:

$ sbatch -d afterany:123456 ...

This defers the submitted job until job 123456 has terminated, whatever its final state.

Tip: Multiple jobs can be specified by separating their job IDs with colons (:), for example:

$ sbatch -d afterany:123456:123457 ...

This will defer the submitted job until the specified jobs 123456 and 123457 have both finished.
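When chaining jobs from a script, the job ID of the first submission has to be captured and passed to the next one. The following is a sketch under stated assumptions, not a site-provided tool: build_dependency and submit_chain are hypothetical helper names, and the script names passed to submit_chain are placeholders. It relies on sbatch --parsable, which prints the job ID instead of the usual message.

```shell
#!/bin/bash
# build_dependency and submit_chain are hypothetical helper names.

# Join a dependency type and job IDs with colons,
# e.g. "afterany 123456 123457" -> "afterany:123456:123457".
build_dependency() {
    local dep_type="$1"
    shift
    local IFS=:
    echo "${dep_type}:$*"
}

# Submit two batch scripts so that the second waits for the first.
# Usage: submit_chain step1.sh step2.sh   (placeholder script names)
submit_chain() {
    local first_id
    # --parsable prints the job ID, optionally followed by ";cluster".
    first_id=$(sbatch --parsable "$1") || return 1
    first_id="${first_id%%;*}"
    sbatch -d "$(build_dependency afterany "$first_id")" "$2"
}
```

Keeping the string building separate from the submission makes it easy to check the dependency specification before anything is sent to the scheduler.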

How to choose a QoS?

At submit time, you can choose a different QoS (Quality of Service) with the --qos QOS_NAME option.

For example, if you want to submit a job to a project for which nodes have been reserved, we will create a QoS dedicated to this project, e.g. QOS_PROJECT_A:

$ sbatch --qos QOS_PROJECT_A ...

JOB MONITORING AND CONTROL

How to view information about submitted jobs?

Use the squeue command, for example:

$ squeue               # all jobs owned by user (all jobs owned by all users for admins)
$ squeue --me          # all jobs owned by user (same as squeue for regular users)
$ squeue -u <username> # jobs of specific user
$ squeue -t PENDING    # pending jobs only
$ squeue -t RUNNING    # running jobs only

Tip: The output format of squeue (and most other Slurm commands) is highly configurable. Look for the --format or --Format options.
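As an illustration of a custom output format, squeue can be asked to print only the job state, which is then easy to summarise. This is a minimal sketch: count_by_state is a hypothetical helper name, and the squeue call is shown as a comment because it only works on the cluster.

```shell
#!/bin/bash
# count_by_state is a hypothetical helper name.
# Reads one job state per line on stdin and prints "STATE COUNT", sorted by state.
count_by_state() {
    awk 'NF { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Intended use on the cluster (-h drops the header, -o "%T" prints only the state):
#   squeue --me -h -o "%T" | count_by_state
```

The helper reads from standard input, so it can be tried with any list of states before being connected to squeue.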

How to cancel jobs?

Use the scancel command, for example:

$ scancel <jobid>         # cancel specific job
$ scancel <jobid>_<index> # cancel indexed job in a job array
$ scancel -u <username>   # cancel all jobs of specific user
$ scancel -t PENDING      # cancel pending jobs
$ scancel -t RUNNING      # cancel running jobs

How to get the estimated start time of a job?

$ squeue --start

Tip: Estimated start times are dynamic and can change at any moment.

How to prevent (hold) jobs from being scheduled for execution?

$ scontrol hold <job_id>

How to release (unhold) a job?

$ scontrol release <job_id>

How to requeue (cancel and resubmit) a particular job?

$ scontrol requeue <job_id>

How to modify a pending/running job?

Use:

$ scontrol update JobId=<jobid> ...

For example, to set the time limit of job 666 to 4 days (the value 4-0 uses the days-hours format):

$ scontrol update JobId=666 TimeLimit=4-0