FAQ Slurm
JOB SUBMISSION
How to submit an interactive job?
You can run an interactive job like this:
$ srun --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --time=01:00:00 --pty bash -i
This requests one node, one task, and one CPU core for one hour, with the default amount of memory.
Or simply request a bash session using default resources:
$ srun --pty bash -i
The command prompt will appear as soon as the job starts.
How to request memory per CPU core?
Use the --mem-per-cpu option to request memory per core, for example:
$ sbatch --mem-per-cpu=6G ...
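The same request can be placed in a batch script as #SBATCH directives. The following is a minimal sketch, not a site-specific template: the resource values are illustrative, and the helper total_mem_mb is a hypothetical name introduced here. Because the limit applies per core, the memory available to a task is the per-core value multiplied by the number of cores.

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=6G
#SBATCH --time=01:00:00

# Hypothetical helper: total memory (MB) for one task = cores * MB per core.
total_mem_mb() {
    echo $(( $1 * $2 ))
}

# Inside a job, Slurm sets SLURM_CPUS_PER_TASK (when --cpus-per-task is given)
# and SLURM_MEM_PER_CPU (in MB, when --mem-per-cpu is given). The defaults
# below only let the script run outside Slurm.
cpus=${SLURM_CPUS_PER_TASK:-1}
mem_per_cpu_mb=${SLURM_MEM_PER_CPU:-0}

echo "cores: ${cpus}, MB per core: ${mem_per_cpu_mb}, total MB: $(total_mem_mb "$cpus" "$mem_per_cpu_mb")"
```

With the directives above, 4 cores at 6G each give the task 24G in total.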
How to request specific node(s) at job submission?
Use the -w (or --nodelist) option, for example:
$ sbatch -w <cn1-1>,<cn1-2> ...
How to exclude specific nodes from job?
Use the -x (or --exclude) option, for example:
$ sbatch -x <cn1-1>,<cn1-2> ...
How to delay the start of a job?
Use the -b (or --begin) option to defer the allocation of the job until the specified time.
Examples:
$ sbatch --begin=20:00 ... # job can start after 8 p.m.
$ sbatch --begin=now+1hour ... # job can start 1 hour after submission
$ sbatch --begin=2023-12-24T20:00:00 ... # job can start after specified date/time
How to submit dependency (chain) jobs?
Use the -d (or --dependency) option, for example:
$ sbatch -d afterany:123456 ...
This defers the submitted job until job 123456 has terminated.
Note: To depend on several jobs, separate their job IDs with colons (:), for example:
$ sbatch -d afterany:123456:123457 ...
This defers the submitted job until jobs 123456 and 123457 have both terminated.
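A chain of several jobs can be submitted in one go by capturing each job ID and passing it to the next submission. This is a minimal sketch, assuming sbatch is on the PATH; the function name submit_chain and the script names in the usage line are hypothetical. It relies on sbatch --parsable, which prints the job ID (optionally followed by ";cluster") instead of the usual message.

```shell
# Hypothetical helper: submit the given batch scripts as a chain, so that each
# job becomes eligible only after the previous one has terminated.
submit_chain() {
    local prev=""
    local jobid
    local script
    for script in "$@"; do
        if [ -z "$prev" ]; then
            # First job in the chain: no dependency.
            jobid=$(sbatch --parsable "$script") || return 1
        else
            jobid=$(sbatch --parsable --dependency="afterany:${prev}" "$script") || return 1
        fi
        # --parsable may print "jobid;cluster"; keep the job ID only.
        prev=${jobid%%;*}
        echo "$prev"
    done
}

# Usage (hypothetical script names):
# submit_chain step1.sh step2.sh step3.sh
```

The function prints one job ID per submitted script, which can be passed to squeue or scancel afterwards.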
How to choose a QoS?
At submission time, you can select a different QoS (Quality of Service) with --qos QOS_NAME.
For example, if nodes have been reserved for a project, we create a QoS dedicated to that project, e.g. QOS_PROJECT_A:
$ sbatch --qos QOS_PROJECT_A ...
JOB MONITORING AND CONTROL
How to view information about submitted jobs?
Use the squeue command, for example:
$ squeue # all jobs owned by user (all jobs owned by all users for admins)
$ squeue --me # all jobs owned by user (same as squeue for regular users)
$ squeue -u <username> # jobs of specific user
$ squeue -t PENDING # pending jobs only
$ squeue -t RUNNING # running jobs only
The output format of squeue (and most other Slurm commands) is highly configurable.
Look for the --format or --Format options.
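As an illustration of --format, the sketch below defines a wider job listing. The field selection and widths are only an example; the specifiers used are %i (job ID), %P (partition), %j (job name), %T (state), %M (time used), %l (time limit), %D (node count) and %R (reason or node list). Exporting the SQUEUE_FORMAT environment variable makes the layout the default for later squeue calls in the same shell.

```shell
# Example field layout; adjust widths and fields to taste.
fmt="%.10i %.9P %.30j %.8T %.10M %.10l %.6D %R"

# One-off use (requires a Slurm installation, hence commented out here):
# squeue --me --format="$fmt"

# Make it the default for subsequent squeue calls in this shell.
export SQUEUE_FORMAT="$fmt"
```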
How to cancel jobs?
Use the scancel command, for example:
$ scancel <jobid> # cancel specific job
$ scancel <jobid>_<index> # cancel indexed job in a job array
$ scancel -u <username> # cancel all jobs of specific user
$ scancel -t PENDING # cancel pending jobs
$ scancel -t RUNNING # cancel running jobs
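The filters can be combined in a single call. A minimal sketch, assuming you want to cancel only your own pending jobs; the function name cancel_my_pending is hypothetical, and any extra arguments are passed through to scancel unchanged.

```shell
# Hypothetical helper: cancel the calling user's pending jobs only.
# Extra arguments are forwarded to scancel to narrow the selection further.
cancel_my_pending() {
    scancel --user="$(id -un)" --state=PENDING "$@"
}

# Usage:
# cancel_my_pending                     # all of your pending jobs
# cancel_my_pending --name=<jobname>    # only pending jobs with this name
```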
How to get estimated start time of a job?
$ squeue --start
Estimated start times are dynamic and can change at any moment.
How to prevent (hold) jobs from being scheduled for execution?
$ scontrol hold <job_id>
How to release a held job?
$ scontrol release <job_id>
How to requeue (cancel and resubmit) a particular job?
$ scontrol requeue <job_id>
How to modify a pending/running job?
Use:
$ scontrol update JobId=<jobid> ...
For example:
$ scontrol update JobId=666 TimeLimit=4-0 # set the time limit to 4 days (format: days-hours)