Non-MPI Jobs
In GRAVITON, non-MPI jobs are those that do not require inter-node communication or distributed-memory execution. These are typically serial jobs (a single process) or multi-process jobs that run entirely within a single node or across isolated CPU cores.
Examples include:
- Serial programs that run on a single core
- Shared-memory parallel jobs using OpenMP or multi-threading (e.g., Python multiprocessing, ROOT TThread)
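For the first case, a purely serial program, a minimal batch script requests a single task on a single core. The sketch below follows the same conventions as the examples later in this section and uses the s6h QOS described in the next subsection; my_serial_program and the --time value are placeholders:
#!/bin/bash
#SBATCH --job-name=serial_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=s6h
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=1:00:00
./my_serial_program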
Recommended QOS
To submit non-MPI jobs, you must use one of the QOS types reserved for this purpose: s6h (normal priority) or s24h (low priority). These QOS values determine the job’s scheduling behavior and resource limits. The differences between them are described in the architecture section of the documentation.
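The scheduling limits attached to each QOS can also be inspected directly with SLURM’s accounting utility (a standard SLURM command; the format fields shown here are illustrative):
sacctmgr show qos format=Name,Priority,MaxWall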
Example 1: Multiple Independent Tasks
Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job:
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=s6h
#SBATCH --ntasks=15
#SBATCH --cpus-per-task=1
#SBATCH --time=5:00:00
srun ./my_program
Explanation
- --ntasks=15: Requests 15 independent processes, each running ./my_program once (useful for embarrassingly parallel workloads).
- --cpus-per-task=1: Each task will use a single CPU core.
- --qos=s6h: Assigns the job to a serial-friendly partition.
- --time=5:00:00: Defines the job’s maximum runtime.
In this example, SLURM will allocate enough resources (e.g., a node or part of a node) to launch 15 tasks, and srun will execute ./my_program 15 times concurrently (one per task).
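If each task should work on different data, the environment variable SLURM_PROCID, which srun sets to the task’s rank (0 through 14 in this example), can be used to select a distinct input per task. Below is a minimal sketch; the wrapper name and the input_*.dat naming scheme are hypothetical:
#!/bin/bash
# wrapper.sh -- hypothetical per-task wrapper.
# srun sets SLURM_PROCID to this task's rank (0..ntasks-1),
# so each task processes its own (hypothetical) input file.
./my_program "input_${SLURM_PROCID}.dat"
In the batch script, the last line would then become srun ./wrapper.sh (after chmod +x wrapper.sh).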
Submitting the job:
sbatch job.sh
SLURM will manage the task distribution internally, and all standard output/error will be written to the files specified with --output and --error.
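Note that SLURM does not create the slurm_logs directory referenced by --output and --error; if it does not exist, the job’s log files cannot be written. Create it once before submitting:
mkdir -p slurm_logs
sbatch job.sh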
Example 2: Single Task Using Multiple CPU Cores (Multithreaded or OpenMP)
Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job that uses multiple CPU cores within a single task (e.g., via OpenMP, multithreading, or Python multiprocessing):
#!/bin/bash
#SBATCH --job-name=my_multicore_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=s24h
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=25
#SBATCH --time=10:00:00
./my_program
Explanation
- --ntasks=1: Only one task is launched.
- --cpus-per-task=25: That single task is allocated 25 CPU cores to run a multi-threaded or parallelized workload.
- --qos=s24h and --time= as in previous examples.
This pattern is ideal for programs that:
- Use multi-threading (OpenMP, TBB, etc.); see the sketch after this list
- Spawn multiple processes within one job (e.g., Python multiprocessing)
- Internally manage task parallelism (e.g., MadGraph, Pythia with internal multicore support)
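OpenMP programs do not automatically detect the SLURM allocation, so it is good practice to set the thread count explicitly from SLURM_CPUS_PER_TASK, which SLURM exports whenever --cpus-per-task is given. A minimal sketch, assuming my_program honors the standard OMP_NUM_THREADS variable; the last line of the Example 2 script would become:
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
./my_program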
Job submission remains the same:
sbatch job.sh
Example 3: Double Memory per Core
Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job that uses multiple CPU cores and requests double memory per core:
#!/bin/bash
#SBATCH --job-name=my_highmem_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=s24h
#SBATCH --ntasks=25
#SBATCH --cpus-per-task=1
#SBATCH --constraint=double_mem
#SBATCH --time=10:00:00
srun ./my_program
Explanation
- --ntasks=25 and --cpus-per-task=1: Launches 25 independent tasks, each using a single CPU core; as in Example 1, srun runs ./my_program once per task.
- --constraint=double_mem: Requests double the default memory per core, which in this case results in 7.6 GB × 25 cores = 190 GB.
- --qos=s24h: Indicates the job will run in the serial partition, since the core count does not exceed 56.
Submit the job as usual:
sbatch job.sh
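After the job completes, you can check how much memory it actually used by querying SLURM’s accounting records (assuming job accounting is enabled on GRAVITON; MaxRSS reports the peak resident memory per job step, and <jobid> is a placeholder for the ID printed by sbatch):
sacct -j <jobid> --format=JobID,MaxRSS,Elapsed,State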