# Research HPC Cluster (Aoraki)

Shared computing resources available to Otago researchers include high-performance computing, fast storage, GPUs and virtual servers.

## Otago Resources
The RTIS Research cluster provides researchers with access to shared resources such as CPUs, GPUs, and high-speed storage. Also available are specialised software and libraries optimised for scientific and data-science computing.

If you need special software or configurations, please contact the RTIS team at rtis.solutions@otago.ac.nz.
## Cluster Overview

We offer a variety of SLURM partitions for different resource needs. The default partition provides balanced compute and memory; additional partitions are optimised for GPU use or expanded memory capacity. On every cluster node, 2 cores are reserved for the OS (WEKA storage), reducing the available compute cores by 2.
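The partition table below can be reproduced on the login node with `sinfo`; the format string here is a suggested set of standard SLURM format flags, not an official site command:

```shell
# List each partition with its time limit, node count, nodelist,
# CPUs per node, memory (MB) per node, and generic resources (GRES).
sinfo --format="%P|%l|%D|%N|%c|%m|%G"
```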
Partition | Time Limit [DD-HH:MM:SS] | Nodes | Nodelist | Max Cores per Node | Max Memory [MB] per Node | GRES per Node
---|---|---|---|---|---|---
aoraki* | 7-00:00:00 | 9 | aoraki[01-09] | 126 | 1030000 | —
aoraki_bigcpu | 14-00:00:00 | 5 | aoraki[15,20-23] | 254 | 1500000 | —
aoraki_bigmem | 14-00:00:00 | 5 | aoraki[14,17,24-26] | 126 | 2000000 | —
aoraki_long | 30-00:00:00 | 7 | aoraki[20-26] | 126 or 254 | 2000000 or 1500000 | —
aoraki_short | 1-00:00:00 | 3 | aoraki[11-12,16] | 80 | 600000 | —
aoraki_small | 7-00:00:00 | 4 | aoraki[18-19,27-28] | 40 | 800000 | —
aoraki_gpu | 7-00:00:00 | 2 | aoraki[11-12] | 126 | 1030000 | gpu:A100:2
aoraki_gpu_H100 | 7-00:00:00 | 1 | aoraki16 | 110 | 1030000 | gpu:H100:2
aoraki_gpu_L40 | 7-00:00:00 | 2 | aoraki[18-19] | 62 | 1030000 | gpu:L40:3
aoraki_gpu_A100_80GB | 7-00:00:00 | 2 | aoraki[11-12] | 126 | 1030000 | gpu:A100:2
aoraki_gpu_A100_40GB | 7-00:00:00 | 2 | aoraki[27-28] | 62 | 1030000 | gpu:A100:2
- Partition: Name of the partition, with an asterisk (*) denoting the default partition. The aoraki_small and aoraki_short partitions use typically idle CPU cores on GPU nodes and are designed to handle small or short-duration jobs efficiently.
- Time Limit: Maximum time a job can run in that partition. The time limit for a running job can be extended upon request; in such cases, the extended time limit may exceed the partition's standard wall time.
- Nodes: Number of nodes available in the partition.
- Nodelist: The specific nodes allocated to that partition.
- Max Cores per Node: Maximum number of CPU cores available on a node.
- Max Memory (MB) per Node: The maximum amount of memory (in MB) available on each node in the partition.
- GRES per Node: Generic Resources (e.g., GPUs) available in the partition.
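A batch script ties these fields together: it names a partition, stays within the partition's time limit, and requests GPUs via the GRES syntax. The sketch below is illustrative; the job name, resource sizes, and `train.py` are placeholders, not site defaults:

```shell
#!/bin/bash
#SBATCH --job-name=example           # placeholder job name
#SBATCH --partition=aoraki_gpu_L40   # one of the partitions listed above
#SBATCH --time=1-00:00:00            # must not exceed the partition's time limit
#SBATCH --cpus-per-task=8            # illustrative CPU request
#SBATCH --mem=32G                    # illustrative memory request
#SBATCH --gres=gpu:L40:1             # GRES syntax: gpu:<type>:<count>

srun python train.py                 # placeholder workload
```

Submit with `sbatch <scriptname>` and monitor with `squeue --me`.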
Number | Node Type | CPU | RAM | Extra
---|---|---|---|---
1 | aoraki-login | 2x 64-core AMD EPYC 7763 | 1TB DDR4 3200 MT/s | —
9 | aoraki[01-09] | 2x 64-core AMD EPYC 7763 | 1TB DDR4 3200 MT/s | —
2 | aoraki[11,12] | 2x 64-core AMD EPYC 7763 | 1TB DDR4 3200 MT/s | 2x A100 80GB PCIe GPU per node, CUDA 12.5
2 | aoraki[27,28] | 2x 32-core AMD EPYC 7543 | 1TB DDR4 3200 MT/s | 2x A100 40GB PCIe GPU per node, CUDA 12.5
5 | aoraki[14,17,24-26] | 2x 64-core AMD EPYC 7763 | 2TB DDR4 2933 MT/s | —
5 | aoraki[15,20-23] | 2x 128-core AMD EPYC 9754 | 1.5TB DDR5 4800 MT/s | —
1 | aoraki16 | 2x 56-core Intel Xeon 8480+ | 1TB DDR5 4800 MT/s | 4x H100 80GB HBM3 GPU per node, CUDA 12.4
2 | aoraki[18,19] | 2x 32-core AMD EPYC 7543 | 1TB DDR4 3200 MT/s | 3x L40 48GB PCIe GPU per node, CUDA 12.5
2 | standalone | 32-core AMD Ryzen Threadripper PRO 3975WX | 128GB DDR4 3200 MT/s | 1x RTX 3090 24GB, CUDA 12.5
3 | standalone | 16-core AMD Ryzen 9 5950X | 64GB DDR4 3200 MT/s | 1x RTX 3090 24GB, CUDA 12.5
2 | standalone | 2x 6-core Intel Xeon E5-2620 v3 | 256GB DDR4 3200 MT/s | 2x RTX A6000 48GB GPU per node
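For quick interactive work on one of the GPU nodes above, a standard SLURM interactive session can be requested with `srun`; the resource sizes here are illustrative:

```shell
# Request an interactive shell on an H100 node with one GPU
# for up to two hours.
srun --partition=aoraki_gpu_H100 --gres=gpu:H100:1 \
     --cpus-per-task=4 --mem=16G --time=02:00:00 --pty bash

# Once on the node, confirm the allocated GPU and driver version.
nvidia-smi
```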