Job Types on Fox
Fox is designed to run serial and small ("narrow") parallel jobs, in addition to GPU jobs. If you need to run "wider" parallel jobs, the national clusters are a better choice.
The basic allocation units on Fox are cpu, memory and GPU. For jobs that don't request GPUs or much memory, the "billing units" below are simply the number of cpus the job requests. For other jobs, see Projects and Accounting for how the units are calculated.
There are four types of jobs on Fox, for different needs:
Name | Description | Job limits | Max walltime | Priority |
---|---|---|---|---|
normal | default job type | 1--128 units | 5 days | normal |
accel | jobs needing GPUs | 1--128 units | 1 day | normal |
accel_long | long jobs needing GPUs | 1--128 units | 7 days | normal |
devel | development jobs (compiling, testing)[^1] | 1--32 units | 2 hours | high |
In this setting, we do not differentiate between batch and interactive jobs -- all of the above job types can be either.
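As a rough illustration of the table above, the following sketch maps a requested walltime and GPU count to the corresponding sbatch/salloc flags. The function name fox_job_flags is a hypothetical helper written for this page, not something installed on Fox, and it only encodes the walltime limits from the table (it does not check the unit limits or the devel job type).

```shell
#!/bin/bash
# Hypothetical helper (not part of Fox): print the sbatch/salloc flags for a
# job, given its walltime in whole hours and the number of GPUs it needs.
fox_job_flags() {
    local hours=$1 gpus=${2:-0}
    if [ "$gpus" -gt 0 ]; then
        if [ "$hours" -le 24 ]; then            # accel: at most 1 day
            echo "--partition=accel --gpus=$gpus"
        elif [ "$hours" -le 168 ]; then         # accel_long: at most 7 days
            echo "--partition=accel_long --gpus=$gpus"
        else
            echo "error: GPU jobs are limited to 7 days" >&2
            return 1
        fi
    elif [ "$hours" -le 120 ]; then             # normal: at most 5 days
        echo ""                                 # normal is the default, no flags
    else
        echo "error: normal jobs are limited to 5 days" >&2
        return 1
    fi
}

fox_job_flags 12 2    # prints: --partition=accel --gpus=2
fox_job_flags 72 1    # prints: --partition=accel_long --gpus=1
```

For short test runs that fit within the devel limits, --qos=devel can be added to any of these flag sets by hand.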
Normal
- Allocation units: cpus and memory
- Job Limits:
  - maximum 128 units
- Maximum walltime: 5 days
- Priority: normal
- Available resources:
  - 24 nodes with 128 cpus and 501 GiB RAM, in total 3072 cpus and 11.7 TiB RAM.
- Parameters for sbatch/salloc:
  - None, normal is the default
- Job Scripts: Normal jobs
This is the default job type. Most jobs are normal jobs.
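A minimal sketch of what a normal job script might look like is shown below. The project name ec00, the resource sizes and the commented-out my_program are placeholders chosen for illustration, and the --account line is an assumption about how jobs are charged to a project; see Job Scripts for the authoritative templates.

```shell
#!/bin/bash
# Sketch of a normal job script. ec00 is a placeholder project and
# my_program a hypothetical executable; replace both with your own.
#SBATCH --job-name=normal-example
#SBATCH --account=ec00
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2G
# No --partition or --qos is given: normal is the default job type.

set -o errexit   # stop at the first failing command
set -o nounset   # treat unset variables as errors

# Slurm sets SLURM_CPUS_PER_TASK inside the job; fall back to 1 elsewhere.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "Running with $OMP_NUM_THREADS thread(s) on ${HOSTNAME:-unknown host}"
# ./my_program
```

Saved as, say, normal.sh, the script would be submitted with sbatch normal.sh.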
Accel
- Allocation units: cpus, memory and GPUs
- Job Limits:
  - maximum 128 units
- Maximum walltime: 1 day
- Priority: normal
- Available resources: 4 nodes with 96 or 64 cpus, 1006 or 503 GiB RAM and 4 GPUs, in total 320 cpus, 3018 GiB RAM and 16 GPUs.
- Parameters for sbatch/salloc: --partition=accel --gpus=N, with N being the number of GPUs.
- Job Scripts: Accel jobs
Accel jobs give access to the GPUs. They can be combined with --qos=devel to get higher priority, in which case the maximum walltime (2 hours) and resource limits of devel apply.
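The parameters above could be used in a job script along the following lines. This is only a sketch: the project name ec00 and the cpu and memory sizes are placeholders, and the script assumes that Slurm exports CUDA_VISIBLE_DEVICES for GPU jobs, which is typical but depends on the site configuration.

```shell
#!/bin/bash
# Sketch of an accel job script. ec00 is a placeholder project name.
#SBATCH --job-name=accel-example
#SBATCH --account=ec00
#SBATCH --partition=accel
#SBATCH --gpus=1
#SBATCH --time=12:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=4G

# Assumption: Slurm sets CUDA_VISIBLE_DEVICES to the GPUs given to the job.
visible_gpus=${CUDA_VISIBLE_DEVICES:-none}
echo "GPUs visible to this job: $visible_gpus"

# List the allocated GPUs if the NVIDIA tools are available on the node.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --list-gpus || true
fi
```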
If you need to run GPU jobs for longer than 24 hours, you can use the accel_long job type instead. Alternatively, you can set the environment variable FOX_ACCEL_AUTO_LONG to 1, for instance by adding the line export FOX_ACCEL_AUTO_LONG=1 to the file ~/.bash_profile (creating the file if needed). Accel jobs asking for more than 24 hours of walltime will then automatically be changed into accel_long jobs. Please note that accel_long jobs only have access to a subset of the GPU nodes, so do not specify more than 24 hours unless really needed.
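One way to add the export line without duplicating it on repeated runs is sketched below. The function name enable_accel_auto_long is hypothetical and only wraps standard grep and shell redirection; the variable name and the file come from the text above.

```shell
#!/bin/bash
# Hypothetical helper: append the export line to a profile file once.
# With no argument it targets ~/.bash_profile.
enable_accel_auto_long() {
    local profile=${1:-$HOME/.bash_profile}
    local line='export FOX_ACCEL_AUTO_LONG=1'
    # Append only if the exact line is not already present;
    # the >> redirection creates the file when it does not exist.
    if ! grep -qxF "$line" "$profile" 2>/dev/null; then
        echo "$line" >> "$profile"
    fi
}

enable_accel_auto_long
```

The setting is read when a new login shell starts, so log in again (or run the export line in the current shell) before submitting jobs.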
Accel_long
- Allocation units: cpus, memory and GPUs
- Job Limits:
  - maximum 128 units
- Maximum walltime: 7 days
- Priority: normal
- Available resources: 2 nodes with 96 cpus, 1006 GiB RAM and 4 GPUs, in total 192 cpus, 2012 GiB RAM and 8 GPUs.
- Parameters for sbatch/salloc: --partition=accel_long --gpus=N, with N being the number of GPUs.
- Job Scripts: Accel_long jobs
Accel_long jobs give access to the GPUs for long jobs. They only get access to a subset of the GPU nodes, so please use accel_long only if needed. They can be combined with --qos=devel to get higher priority, in which case the maximum walltime (2 hours) and resource limits of devel apply.
Devel
- Allocation units: cpus, memory and GPUs
- Job Limits:
  - maximum 32 units per job
  - maximum 128 units in use at the same time
  - maximum 2 running jobs per user
- Maximum walltime: 2 hours
- Priority: high
- Available resources: devel jobs can run on any node on Fox
- Parameters for sbatch/salloc: --qos=devel
- Job Scripts: Devel jobs
This is meant for small, short development or test jobs. Devel jobs get higher priority for them to run as soon as possible. On the other hand, there are limits on the size and number of devel jobs.
Can be combined with --partition=accel.
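As an illustration of combining devel with accel, a short GPU test job might be set up roughly as follows. The project name ec00 and the resource sizes are placeholders, and the script assumes that Slurm exports SLURM_JOB_QOS and SLURM_JOB_PARTITION inside the job; treat it as a sketch rather than a tested template.

```shell
#!/bin/bash
# Sketch of a devel job on a GPU node. ec00 is a placeholder project name.
#SBATCH --job-name=devel-gpu-test
#SBATCH --account=ec00
#SBATCH --qos=devel
#SBATCH --partition=accel
#SBATCH --gpus=1
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=2G

# Report which QoS and partition the job ended up in (assumed Slurm
# variables; both fall back to "unknown" outside a job).
qos=${SLURM_JOB_QOS:-unknown}
partition=${SLURM_JOB_PARTITION:-unknown}
echo "qos=$qos partition=$partition"
```

The requested walltime stays within the 2 hour devel limit; a longer --time would exceed what devel allows.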
If you have temporary development needs that cannot be fulfilled by the devel job type, please contact us at ec-drift@uio.no.
Footnotes
[^1]: It is possible to combine devel with accel.
CC Attribution: This page is maintained by the University of Oslo IT FFU-BT group. It has either been modified from, or is a derivative of, "Job Types on Saga" by NRIS under CC-BY-4.0. Changes: Removed non-applicable sections "Bigmem" and "Optimist". Added datatable.