MPI Jobs on GRAVITON
GRAVITON supports MPI-based parallel execution for both single-node and multi-node jobs. Depending on the number of cores requested, the job will be automatically directed to the appropriate partition.
Partition and QOS Mapping
Jobs requesting 56 cores or fewer are considered single-node jobs. These jobs are automatically assigned to the ``serial`` partition. To run this type of job, you must specify one of the following QOS options:

- ``cosmo``
- ``hep``
- ``std``

Each QOS corresponds to a specific user group or scientific domain. See the architecture section for details on their intended use and limits.
Jobs requesting more than 56 cores are treated as multi-node jobs, and must explicitly request the ``lattice`` QOS to access the ``parallel`` partition, which enables high-speed inter-node communication via InfiniBand.

The system will determine the partition and resource allocation based on your QOS and core count. Users must not manually specify ``--nodes`` or ``--partition``, as these are automatically managed.
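Before submitting, you can check which partitions and QOS definitions are visible to your account using standard SLURM query tools. A minimal sketch (the format fields shown are generic SLURM options, not GRAVITON-specific):

# Summary of partitions, their state, and node counts
sinfo -s

# QOS definitions known to the accounting database
sacctmgr show qos format=Name,Priority,MaxWall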
Example 1: MPI Job on a Single Node
This example runs an MPI job using up to 56 cores on a single node.
#!/bin/bash
#SBATCH --job-name=mpi_single
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=hep
#SBATCH --ntasks=32
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
# Use the OpenMPI installation provided on GRAVITON
export PATH=/usr/mpi/gcc/openmpi-4.1.7rc1/bin:$PATH
export LD_LIBRARY_PATH=/usr/mpi/gcc/openmpi-4.1.7rc1/lib:$LD_LIBRARY_PATH

# Launch the MPI program with the tasks allocated by SLURM
srun ./my_mpi_program
Explanation
- ``--ntasks=32``: 32 MPI processes will be launched on a single node.
- ``--qos=hep``: suitable for serial-partition jobs (≤ 56 cores).
- ``srun``: preferred launcher for MPI on GRAVITON.
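To submit and monitor this job, the standard SLURM workflow applies. A short sketch, assuming the script above is saved as ``mpi_single.sh`` (a hypothetical file name), and noting that the ``slurm_logs`` directory is not created automatically:

mkdir -p slurm_logs      # the --output/--error directory must exist before the job writes to it
sbatch mpi_single.sh     # submit the job; SLURM prints the assigned job ID
squeue -u "$USER"        # check the state of your queued and running jobs
sacct -j <jobid> --format=JobID,State,Elapsed,NNodes,NCPUS   # accounting summary (replace <jobid> with the ID printed by sbatch)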
Example 2: MPI Job on Multiple Nodes
This example shows how to launch a multi-node MPI job using more than 56 cores.
#!/bin/bash
#SBATCH --job-name=mpi_multi
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=lattice
#SBATCH --ntasks=128
#SBATCH --cpus-per-task=1
#SBATCH --time=02:00:00
# Use the OpenMPI installation provided on GRAVITON
export PATH=/usr/mpi/gcc/openmpi-4.1.7rc1/bin:$PATH
export LD_LIBRARY_PATH=/usr/mpi/gcc/openmpi-4.1.7rc1/lib:$LD_LIBRARY_PATH

# Launch the MPI program with the tasks allocated by SLURM
srun ./my_mpi_program
Explanation
- ``--ntasks=128``: the job will span multiple nodes with 128 MPI processes.
- ``--qos=lattice``: required for access to the ``parallel`` partition, which enables InfiniBand-based high-performance communication across nodes.
- ``srun``: ensures tight integration with SLURM's resource allocation.
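To confirm how the tasks were spread across nodes, a quick sanity check can be added to the batch script before the main ``srun`` line. A sketch; it launches one ``hostname`` task per allocated MPI slot and counts tasks per node:

# Optional: show how many tasks landed on each node
srun hostname | sort | uniq -c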
Guidelines
- Use ``--qos=hep`` for small/medium jobs that can fit within a single node (≤ 56 cores).
- Use ``--qos=lattice`` for large-scale MPI jobs requiring multiple nodes.
- Never manually specify ``--nodes`` or ``--partition``.
- Use ``srun`` instead of ``mpirun`` for better SLURM integration.
Using ``srun`` vs ``mpirun``
On GRAVITON, jobs that use MPI can be launched either using ``srun`` (the native SLURM launcher) or ``mpirun`` (the MPI runtime). However, it is strongly recommended to use ``srun`` whenever possible, as it provides better integration with the SLURM scheduler and resource allocation system.
Recommended usage:
srun ./my_program
This command will automatically launch the correct number of processes based on the value of ``--ntasks`` and manage the communication environment accordingly.
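``srun`` can also be used to inspect or adjust this behaviour within the allocation. For example (a sketch; ``my_mpi_program`` is the placeholder used throughout this page):

srun --mpi=list              # list the MPI/PMI plugin types supported by this SLURM build
srun -n 8 ./my_mpi_program   # launch a smaller job step using 8 of the allocated tasks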
Why prefer ``srun``?
- ``srun`` integrates directly with SLURM's resource allocation.
- It ensures that task placement, environment variables, and resource constraints are applied correctly.
- It avoids potential mismatches between what SLURM allocated and what MPI tries to use.
- It supports better logging and accounting within SLURM.
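In practice, ``srun`` reads the environment that SLURM exports for the allocation. To see exactly what was granted, you can print a few of the standard variables inside your batch script (a sketch; these variables are set by SLURM when the corresponding options are requested):

echo "Tasks allocated:   ${SLURM_NTASKS}"
echo "CPUs per task:     ${SLURM_CPUS_PER_TASK}"
echo "Nodes in this job: ${SLURM_JOB_NODELIST}"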
When to use ``mpirun``?
In some advanced cases, or for programs compiled with specific MPI implementations that tightly couple to their own runtime (``mpirun``/``mpiexec``), it may still be necessary to use:
mpirun ./my_program
If using ``mpirun``, make sure it is from the same OpenMPI version provided by GRAVITON, and that your environment is correctly set:
export PATH=/usr/mpi/gcc/openmpi-4.1.7rc1/bin:$PATH
export LD_LIBRARY_PATH=/usr/mpi/gcc/openmpi-4.1.7rc1/lib:$LD_LIBRARY_PATH
In both cases, SLURM will still enforce the resources defined in your ``sbatch`` script.
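If you do use ``mpirun``, one way to keep it consistent with the SLURM allocation is to derive the rank count from SLURM's environment rather than hard-coding it. A sketch (recent OpenMPI builds with SLURM support can often detect the allocation even without ``-np``):

# Launch as many ranks as SLURM allocated tasks for this job
mpirun -np "${SLURM_NTASKS}" ./my_mpi_program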
Summary
| Launcher | Recommended Use Case |
|---|---|
| ``srun`` | Preferred for most MPI jobs on GRAVITON |
| ``mpirun`` | Use only if your program requires it or has issues when launched via ``srun`` |