Non-MPI Jobs
In GRAVITON, non-MPI jobs refer to those that do not require inter-node communication or distributed memory execution. These are typically serial jobs (single process), or multi-process jobs that run entirely within a single node or across isolated CPU cores.
Examples include:
- Serial programs that run on a single core
- Shared-memory parallel jobs using OpenMP or multi-threading (e.g., Python multiprocessing, ROOT TThread)
Recommended QOS
To submit non-MPI jobs, you must use one of the QOS types reserved for this purpose: cosmo, hep, or std. These QOS values determine the job's scheduling behavior and resource limits. The differences between them are described in the architecture section of the documentation.
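If you want to check the exact limits attached to these QOS values on the cluster itself, a generic SLURM accounting query along the following lines usually works (this assumes sacctmgr is available to regular users; the available fields may differ with the site's configuration):
# List name, priority, wall-time limit and per-user resource limits for the three QOS values
sacctmgr show qos cosmo,hep,std format=Name,Priority,MaxWall,MaxTRESPU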
Example 1: Multiple Independent Tasks
Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job:
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=cosmo
#SBATCH --ntasks=25
#SBATCH --cpus-per-task=1
#SBATCH --time=10:00:00
# srun launches one copy of the program per task (25 concurrent copies)
srun ./my_program
Explanation
- --ntasks=25: Requests 25 independent processes, each running ./my_program once (useful for embarrassingly parallel workloads).
- --cpus-per-task=1: Each task will use a single CPU core.
- --qos=cosmo: Assigns the job to a serial-friendly partition.
- --time=: Defines the job's maximum runtime.
In this example, SLURM allocates enough resources (e.g., a node or part of a node) for 25 tasks, and srun executes ./my_program 25 times concurrently (one per task).
Submitting the job:
sbatch job.sh
SLURM will manage the task distribution internally, and all standard output/error will be written to the files specified with --output and --error.
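If each of the 25 tasks should process its own input, a common pattern is to use the per-task index that SLURM exports as SLURM_PROCID. The sketch below assumes hypothetical input files named input_0.dat through input_24.dat; replace the command line with whatever your program expects:
# Each task picks a different file based on its task index (0-24)
srun bash -c './my_program input_${SLURM_PROCID}.dat'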
Example 2: Single Task Using Multiple CPU Cores (Multithreaded or OpenMP)
Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job that uses multiple CPU cores within a single task (e.g., via OpenMP, multithreading, or Python multiprocessing):
#!/bin/bash
#SBATCH --job-name=my_multicore_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=hep
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=25
#SBATCH --time=10:00:00
./my_program
Explanation
- --ntasks=1: Only one task is launched.
- --cpus-per-task=25: That single task is allocated 25 CPU cores to run a multi-threaded or parallelized workload.
- --qos=hep and --time=: As in the previous example.
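For OpenMP-based codes it is good practice to tell the program explicitly how many threads it may use. Below is a minimal sketch of the launch portion of the script above (OMP_NUM_THREADS is specific to OpenMP; other frameworks read their own variables or arguments):
# Match the thread count to the SLURM allocation (25 cores in this example)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_program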
This pattern is ideal for programs that:
- Use multi-threading (OpenMP, TBB, etc.)
- Spawn multiple processes within one job (e.g., Python multiprocessing)
- Internally manage task parallelism (e.g., MadGraph, Pythia with internal multicore support)
Job submission remains the same:
sbatch job.sh
Example 3: Double Memory
Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job that uses multiple CPU cores and requests double memory per core:
#!/bin/bash
#SBATCH --job-name=my_highmem_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=hep
#SBATCH --ntasks=25
#SBATCH --cpus-per-task=1
#SBATCH --constraint=double_mem
#SBATCH --time=10:00:00
# srun launches one copy of the program per task (25 concurrent copies)
srun ./my_program
Explanation
- --ntasks=25 and --cpus-per-task=1: Launches 25 independent tasks, each using a single CPU core.
- --constraint=double_mem: Requests double the default memory per core, which in this case results in 7.6 GB × 25 cores = 190 GB.
- --qos=hep: Indicates the job will run in the serial partition, since the core count does not exceed 56.
Submit the job as usual:
sbatch job.sh
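To verify which nodes provide the double_mem feature and how much memory they expose, a generic SLURM query such as the one below usually suffices (columns: partition, CPUs per node, memory in MB, feature tags):
# Show partition, CPU count, memory and feature tags for each node type
sinfo -o "%P %c %m %f"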