Slurm
SLURM is the workload manager and job scheduler for tron.ift.uni.wroc.pl.
Basic usage
| sinfo -alN | Show node information | 
| squeue | Show job queue | 
| squeue -u <username> | List all current jobs for a user | 
| squeue -u <username> -t RUNNING | List all running jobs for a user | 
| squeue -u <username> -t PENDING | List all pending jobs for a user | 
| scancel <jobid> | To cancel one job | 
| scancel -u <username> | To cancel all the jobs for a user | 
| scancel -t PENDING -u <username> | To cancel all the pending jobs for a user | 
| scancel --name myJobName | To cancel one or more jobs by name | 
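As a small illustration of combining these commands, the sketch below counts a user's jobs per state. The helper name count_states is made up for this example; the squeue options -h (no header) and -o '%T' (print only the job state) are standard. The squeue call is guarded so the snippet also runs on a machine without SLURM.

```shell
# Hypothetical helper: read one job state per line on standard input
# and print "STATE=count" for each distinct state, sorted by name.
count_states() {
    sort | uniq -c | awk '{print $2 "=" $1}'
}

# Only query the queue where squeue is actually installed.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" -h -o '%T' | count_states
fi
```

On a login node this prints lines such as RUNNING=2, depending on what is currently queued.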
Slurm batch
The following parameters can be passed on the command line to sbatch and srun, or set in a job script (see the job script example below).
Basic settings
| Parameter | Function | 
| --job-name=<name> | Job name, as displayed by e.g. squeue | 
| --output=<path> | Path to the file the job's output (including error output) is written to | 
| --mail-type=<type> | Turn on mail notification; type can be one of BEGIN, END, FAIL, REQUEUE or ALL | 
| --mail-user=<email_address> | Email address to send notifications to | 
Resources
| Parameter | Function | 
| --time=<d-hh:mm:ss> | Time limit for the job. SLURM kills the job once the limit is reached. Format: days-hours:minutes:seconds | 
| --nodes=<num_nodes> | Number of nodes. Multiple nodes are only useful for distributed-memory jobs (e.g. MPI). | 
| --mem=<MB> | Memory (RAM) per node, in megabytes unless a unit suffix is appended, e.g. 16G | 
| --mem-per-cpu=<MB> | Memory (RAM) per requested CPU core | 
| --ntasks-per-node=<num_procs> | Number of (MPI) processes per node. More than one is only useful for MPI jobs. The maximum depends on the number of cores per node | 
| --cpus-per-task=<num_threads> | CPU cores per task. For pure MPI jobs use 1. For multithreaded applications this is the number of threads. | 
| --exclusive | Job will not share nodes with other running jobs. You will be charged for the complete nodes even if you asked for less. | 
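To show how the resource options fit together for a multithreaded (for example OpenMP) program, here is a minimal sketch of a job script header. The memory and time values are placeholders, not site recommendations, and the job name is invented. SLURM_CPUS_PER_TASK is set by SLURM when --cpus-per-task is requested; the fallback to 1 is only there so the script also runs outside a job.

```shell
#!/bin/bash -l
#SBATCH --job-name=threads-example
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=0-02:00:00

# Use as many threads as cores were requested per task.
# Outside a SLURM job the variable is unset, so default to 1.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Using ${OMP_NUM_THREADS} thread(s)"

# Start the real program here, e.g. ./my_threaded_app (hypothetical name).
```

Keeping the thread count tied to the SLURM variable means only the --cpus-per-task line needs to change when the request is resized.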
Additional
| Parameter | Function | 
| --array=<indexes> | Submit a collection of similar jobs, e.g. --array=1-10 (sbatch only). See the official SLURM documentation | 
| --dependency=<state:jobid> | Delay the start of the job until the specified dependencies have been satisfied, e.g. --dependency=afterok:123456 | 
| --ntasks-per-core=2 | Enables hyperthreading. Only useful in special circumstances. | 
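A dependency can name several jobs by joining their IDs with colons (afterok:ID1:ID2). The sketch below builds such an option from a list of job IDs; the function name afterok_flag and the job IDs are made up for illustration.

```shell
# Hypothetical helper: join all arguments with ':' and print the
# resulting --dependency option for sbatch.
afterok_flag() {
    local IFS=':'
    printf -- '--dependency=afterok:%s\n' "$*"
}

afterok_flag 123456 123457
# prints: --dependency=afterok:123456:123457
```

It could then be used as sbatch "$(afterok_flag "$id1" "$id2")" postprocess.sh, where postprocess.sh stands for a follow-up job script of your own.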
Job array
Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily; job arrays with millions of tasks can be submitted in milliseconds (subject to configured size limits). All jobs in an array must have the same initial options (e.g. size and time limit).
#SBATCH --array 1-200
#SBATCH --array 1-200%5   # %N suffix, where N is the maximum number of tasks active at the same time
Variables
| SLURM_ARRAY_JOB_ID | will be set to the first job ID of the array | 
| SLURM_ARRAY_TASK_ID | will be set to the job array index value | 
| SLURM_ARRAY_TASK_COUNT | will be set to the number of tasks in the job array | 
| SLURM_ARRAY_TASK_MAX | will be set to the highest job array index value | 
| SLURM_ARRAY_TASK_MIN | will be set to the lowest job array index value | 
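A common use of SLURM_ARRAY_TASK_ID is to let each array task pick its own input. The following is a minimal sketch, assuming a text file with one input name per line; the helper nth_line and the sample file names are invented for the example, and the temporary list only exists so the snippet runs anywhere.

```shell
# Hypothetical helper: print line N (first argument) of a list file.
nth_line() {
    sed -n "${1}p" "$2"
}

# Demo list; a real job would keep such a file next to the job script.
list=$(mktemp)
printf '%s\n' sample_a.dat sample_b.dat sample_c.dat > "$list"

# SLURM sets SLURM_ARRAY_TASK_ID for each array task; default to 1 elsewhere.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
input=$(nth_line "$task_id" "$list")
echo "Task ${task_id} would process ${input}"

rm -f "$list"
```

Submitted with an array range of 1-3, each of the three tasks would select a different line of the list.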
Job script example
#!/bin/bash -l

# Give your job a name, so you can recognize it in the queue overview
#SBATCH --job-name=example

# Define how many nodes you need. Here, we ask for 1 node.
# Each node has 8 cores.
#SBATCH --nodes=1

# You can further define the number of tasks with --ntasks-per-*
# See "man sbatch" for details. e.g. --ntasks=4 will ask for 4 cpus.
#SBATCH --ntasks=4

# How much memory you need.
# --mem will define memory per node and
# --mem-per-cpu will define memory per CPU/core. Choose one of those.
#SBATCH --mem=5GB
##SBATCH --mem-per-cpu=1500MB    # this one is not in effect, due to the double hash

# Turn on mail notification. There are many possible self-explanatory values:
# NONE, BEGIN, END, FAIL, ALL (including all aforementioned)
# For more values, check "man sbatch"
##SBATCH --mail-type=END,FAIL    # this one is not in effect, due to the double hash

# You may not place any commands before the last SBATCH directive

# Define workdir for this job
WORK_DIRECTORY=/home/${USER}/test
cd ${WORK_DIRECTORY}

# This is where the actual work is done. In this case, the script only waits.
# The time command is optional, but it may give you a hint on how long the
# command worked
time sleep 10
#sleep 10

# Finish the script
exit 0
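A job script like the one above is handed to the scheduler with sbatch. The sketch below assumes the script was saved as job.sh (a hypothetical file name) and uses the --parsable option, which makes sbatch print only the job ID, optionally followed by ";cluster". The helper jobid_from_parsable is made up for this example, and the submission is guarded so the snippet is harmless where SLURM or the script is missing.

```shell
# Hypothetical helper: strip an optional ";cluster" suffix from
# the output of "sbatch --parsable".
jobid_from_parsable() {
    printf '%s\n' "${1%%;*}"
}

# Submit only where sbatch is installed and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f job.sh ]; then
    out=$(sbatch --parsable job.sh)
    jobid=$(jobid_from_parsable "$out")
    echo "Submitted job ${jobid}"
    squeue -j "$jobid"
fi
```

Capturing the job ID this way makes it easy to pass it on to squeue, scancel or a --dependency option.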
Interactive mode
Get interactive access to a shell on a compute node:
srun --nodes=1 --ntasks-per-node=1 --time=01:00:00 --pty bash -i
# Request specific node by name
srun --nodelist=node2 --ntasks-per-node=1 --time=01:00:00 --pty bash -i




