Sol Hardware - How to Request

Overview

Sol is a homogeneous supercomputer: its processors and interconnects share the same type, brand, and architecture, which simplifies system management and optimization. This page describes the hardware within Sol for reference:

| Node Type | CPU | Memory | Accelerator |
| --- | --- | --- | --- |
| Standard Compute | 128 cores (2x AMD EPYC 7713 Zen3) | 512 GiB | N/A |
| High Memory | 128 cores (2x AMD EPYC 7713 Zen3) | 2048 GiB | N/A |
| GPU A100 | 48 cores (2x AMD EPYC 7413 Zen3) | 512 GiB | 4x NVIDIA A100 80 GiB |
| GPU A30 | 48 cores (2x AMD EPYC 7413 Zen3) | 512 GiB | 3x NVIDIA A30 24 GiB |
| GPU MIG | 48 cores (2x AMD EPYC 7413 Zen3) | 512 GiB | 16x NVIDIA A100 sliced into 20 GiB and 10 GiB |
| Xilinx FPGA | 48 cores (2x AMD EPYC 7443 Zen3) | 256 GiB | 1x Xilinx U280 |
| BittWare FPGA | 52 cores (Intel Xeon Gold 6230R) | 376 GiB | 1x BittWare 520N-MX |
| NEC FPGA | 48 cores (2x AMD EPYC 9274F Zen4) | 512 GiB | 1x NEC Vector Engine |
| GraceHopper | 72 cores (NVIDIA Grace CPU, aarch64) | 512 GiB | 1x NVIDIA GH200 480GB |
| GPU MI200 | 24 cores (AMD EPYC 9254) | 77 GiB | 2x AMD MI200 |

ℹ️ There is privately-owned hardware that may have slightly different specs. See the Sol Status Page for the full features of every node.

⚠️ Requesting too many resources leads to longer job queueing times (wait time until start). Checking the efficiency of a completed test job can help determine an appropriate amount of resources to request, as in the example below.
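
One way to check that efficiency after a test job finishes is with the standard Slurm accounting tools (the job ID 123456 is a placeholder, and the seff summary script is a common Slurm add-on that may not be installed on every system):

seff 123456                                                      # CPU and memory efficiency summary for the finished job
sacct -j 123456 --format=JobID,Elapsed,TotalCPU,MaxRSS,ReqMem    # raw accounting fields for the same job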

Requesting Resources

Requesting CPUs

To request a given number of CPUs sharing the same node, you can use the following in your sbatch script:

#SBATCH -N 1  # Number of Nodes
#SBATCH -c 5 # Number of Cores per task
or
interactive -N 1 -c 5

This will create a job with 5 CPU cores on one node.

To request a given number of CPUs spread across multiple nodes, you can use the following:

#SBATCH -N 2-4    # number of nodes to allow tasks to spread across (MIN & MAX)
#SBATCH -n 10 # number of TASKS
#SBATCH -c 5 # CPUs per TASK
or
interactive -N 2-4 -n 10 -c 5

The above example will allocate a total of 50 cores spread across as few as 2 nodes or as many as 4 nodes.

Take note of the inclusion or omission of -N:

#SBATCH -c 5     # CPUs per TASK
#SBATCH -n 10 # number of TASKS
or
interactive -n 10 -c 5

This reduced example will still allocate 50 cores (5 cores per task) across any number of available nodes. Note that unless you are using MPI-aware software, you will likely prefer to always add -N to ensure that each job worker has sufficient connectivity.

-c and -n have similar effects in Slurm in that both allocate cores, but -n is the number of tasks and -c is the number of cores per task. MPI processes bind to a task, so the general rule of thumb is: MPI jobs allocate tasks, serial jobs allocate cores, and hybrid jobs allocate both.
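
For example, a hybrid MPI+OpenMP job might allocate both tasks and cores per task (the executable name below is a placeholder):

#SBATCH -N 2                                 # two nodes
#SBATCH -n 8                                 # 8 MPI ranks (tasks)
#SBATCH -c 4                                 # 4 cores (threads) per rank
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK  # let OpenMP use the cores assigned to each task
srun ./my_hybrid_app                         # placeholder executable, launched as one process per task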

See the official Slurm documentation for more information: Slurm Workload Manager - sbatch

Requesting Memory

Cores and memory are de-coupled: if you need only a single CPU core but ample memory, you can request that like this:

#SBATCH -c 1
#SBATCH -N 1
#SBATCH --mem=120G
or
interactive -N 1 -c 1 --mem=120G

If you do not specify --mem, you will be allocated 2GiB per CPU core OR 24GiB per GPU.
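
If you would rather tie memory to the number of cores instead of setting a job-wide total, Slurm also accepts a per-core request (the 4G value below is just an illustrative amount):

#SBATCH -c 8
#SBATCH --mem-per-cpu=4G   # 8 cores x 4 GiB = 32 GiB total for the job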

To request more than 512GiB of memory, you will need to use the highmem partition.

#SBATCH -p highmem
#SBATCH --mem=1400G

To request all available memory on a node:

❌ This will allocate the entire node, all of its CPU cores and memory (up to 2 TiB depending on the node), to your job. This will prevent any other jobs from landing on that node. Only use this if you truly need that much memory.

#SBATCH --exclusive
#SBATCH --mem=0

Requesting GPUs

To request a GPU, you can specify the -G option within your job request.

This will allocate the first available GPU that fits your job request:

#SBATCH -G 1
or
interactive -G 1

To request multiple GPUs specify a number greater than 1:

#SBATCH -G 4
or
interactive -G 4

To request a specific number of GPUs per node when running multi-node:

#SBATCH -N 2              # Request two nodes
#SBATCH --gpus-per-node=2 # Four total GPUs, two per node

To request a specific type of GPU (a100 for example):

#SBATCH -G a100:1
or
interactive -G a100:1
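
Once the job starts, you can confirm which GPU was assigned from inside the allocation (nvidia-smi is expected to be available on the NVIDIA GPU nodes):

nvidia-smi   # lists the GPU(s) visible to this job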

GPU Varieties Available

Below is a table listing the available GPU types and instance sizes you can allocate:

| GPU Name | GPU Memory | Count per Node |
| --- | --- | --- |
| a100 | 80 GB, 40 GB | 4 per node, NVLinked |
| a30 | 24 GB | 4 per node, NVLinked |
| 1g.20gb | 20 GB | 4 per node |
| 2g.20gb | 20 GB | 12 per node |
| h100 | 96 GB | 4-8 per node, NVLinked (privately owned) |
| mi200 | 64 GB | 2 per node |
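
The MIG slices (1g.20gb and 2g.20gb) should be requestable with the same -G syntax as full GPUs; assuming they are exposed under the GRES names listed above, a request would look like:

#SBATCH -G 1g.20gb:1
or
interactive -G 1g.20gb:1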

The a100s can come in two varieties, as seen above.

To guarantee an 80GB a100, include this feature constraint: #SBATCH -C a100_80. This can also be done with interactive -C a100_80 (a100_40 is also provided). To request more than one a100 while specifying the variety:

$ interactive -G a100:2 -C a100_80
or
#SBATCH -G a100:2
#SBATCH -C a100_80

Using the MI200

$ salloc -G mi200:1 -q public -Lmi200 -p general
$ ml mamba
$ source activate pytorch-2.5.1-rocm
$ python
>>> import torch
>>> torch.cuda.get_device_name(0)
'AMD Instinct MI200'
>>> torch.cuda.is_available()
True

ℹ️ The MI200 is not CUDA-compatible. However, some software packages such as PyTorch offer compatibility with this GPU through the cuda interface, as shown in the above example using a ROCm-built mamba environment.
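
You can also verify the GPU from the shell inside the allocation, assuming the rocm-smi utility from the ROCm stack is available on the node:

$ rocm-smi   # lists the AMD GPU(s) visible to the job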

Requesting FPGAs

Sol has two nodes with a Field Programmable Gate Array (FPGA) accelerator. One is an Intel-based node with a BittWare 520N-MX FPGA; the other is an AMD-based node with a Xilinx U280. Because there is only one FPGA per node, it is recommended to allocate the entire node.

BittWare:

#SBATCH --exclusive
#SBATCH -L bittware

or

interactive --exclusive -L bittware

Xilinx:

#SBATCH --exclusive
#SBATCH -L xilinx

or

interactive --exclusive -L xilinx

NEC Vector Engine:

#SBATCH -L vector
#SBATCH -G ve
#SBATCH -p general
#SBATCH -q public
#SBATCH --mem=0
#SBATCH -c 48

or

interactive -Lvector -c 48 --mem=0 -G ve -p general -q public

Or via the web portal using the “Additional Sbatch options” section:

Note there should not be a space between “-L” and the FPGA name on the web portal.
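
For example, to target the Xilinx node through the portal, the options box could contain something like the following (exact flags depend on your job):

-Lxilinx --exclusive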


Requesting the Grace Hopper (ARM) Node

#SBATCH --exclusive
#SBATCH -p highmem
#SBATCH -L gracehopper
#SBATCH -G 1

or

interactive --exclusive -L gracehopper -G 1 -p highmem

This node uses ARM architecture (aarch64) and is not compatible with x86_64 binaries.
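
If you are unsure which architecture your shell is running on, you can check from inside the allocation:

uname -m   # prints aarch64 on the Grace Hopper node and x86_64 on the other Sol nodes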


Additional Help

If you require further assistance, contact the Research Computing Team.

We also offer Educational Opportunities and Workshops.