Non-MPI Jobs

In GRAVITON, non-MPI jobs refer to those that do not require inter-node communication or distributed memory execution. These are typically serial jobs (single process), or multi-process jobs that run entirely within a single node or across isolated CPU cores.

Examples include:

  • Serial programs that run on a single core

  • Shared-memory parallel jobs using OpenMP or multi-threading (e.g., Python multiprocessing, ROOT TThread)

Example 1: Multiple Independent Tasks

Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job:

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=cosmo
#SBATCH --ntasks=25
#SBATCH --cpus-per-task=1
#SBATCH --time=10:00:00

# srun launches one copy of the program per requested task (25 here);
# without srun, the script body would run only once.
srun ./my_program

Explanation

  • --ntasks=25: Requests 25 independent tasks; srun launches ./my_program once per task (useful for embarrassingly parallel workloads).

  • --cpus-per-task=1: Each task will use a single CPU core.

  • --qos=cosmo: Submits the job under the cosmo QOS, which maps to a serial-friendly partition.

  • --time=10:00:00: Sets the job’s maximum walltime (here, 10 hours).

In this example, SLURM allocates enough resources (e.g., a node or part of a node) for 25 tasks, and srun executes ./my_program 25 times concurrently (one per task).
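If each task should process a different input, the task rank that SLURM exposes to every process can select it. Below is a minimal sketch of the job step line only; the inputs/file_N.dat naming scheme is hypothetical:

# $SLURM_PROCID takes a distinct value (0-24) in each task;
# single quotes defer expansion until each task starts.
srun bash -c './my_program inputs/file_${SLURM_PROCID}.dat'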

Submitting the job:

sbatch job.sh
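After submission, the job’s progress can be followed in the queue:

squeue -u $USER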

SLURM will manage the task distribution internally, and all standard output/error will be written to the files specified with --output and --error.
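If the interleaved output of 25 concurrent tasks becomes hard to read, srun can write one file per task instead; in the pattern below (replacing the srun line in the script above), %j expands to the job ID and %t to the task ID (0-24):

srun --output=slurm_logs/job_%j_task_%t.out ./my_program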

Example 2: Single Task Using Multiple CPU Cores (Multithreaded or OpenMP)

Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job that uses multiple CPU cores within a single task (e.g., via OpenMP, multithreading, or Python multiprocessing):

#!/bin/bash
#SBATCH --job-name=my_multicore_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=hep
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=25
#SBATCH --time=10:00:00

./my_program

Explanation

  • --ntasks=1: Only one task is launched.

  • --cpus-per-task=25: That single task is allocated 25 CPU cores to run a multi-threaded or parallelized workload (see the sketch after this list for pinning the thread count).

  • --qos=hep and --time=10:00:00: Select the QOS and the maximum walltime; the syntax is the same as in Example 1.
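For threaded codes it is usually worth pinning the thread count to the allocation, since OpenMP runtimes otherwise tend to default to every core on the node. A minimal sketch for the body of the script, assuming my_program honours the standard OMP_NUM_THREADS variable:

# SLURM_CPUS_PER_TASK mirrors the --cpus-per-task request (25 here).
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_program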

This pattern is ideal for programs that:

  • Use multi-threading (OpenMP, TBB, etc.)

  • Spawn multiple processes within one job (e.g., Python multiprocessing)

  • Internally manage task parallelism (e.g., MadGraph, Pythia with internal multicore support)

Job submission remains the same:

sbatch job.sh

Example 3: Double Memory per Core

Below is a GRAVITON-compliant SLURM batch script for running a non-MPI job that uses multiple CPU cores and requests double memory per core:

#!/bin/bash
#SBATCH --job-name=my_highmem_job
#SBATCH --output=slurm_logs/job_%j.out
#SBATCH --error=slurm_logs/job_%j.err
#SBATCH --qos=hep
#SBATCH --ntasks=25
#SBATCH --cpus-per-task=1
#SBATCH --constraint=double_mem
#SBATCH --time=10:00:00

# As in Example 1, srun runs one copy of the program per task.
srun ./my_program

Explanation

  • --ntasks=25 and --cpus-per-task=1: Launches 25 single-core tasks (25 CPU cores in total).

  • --constraint=double_mem: Requests double the default memory per core, which in this case results in 7.6 GB × 25 cores = 190 GB.

  • --qos=hep: Indicates the job will run in the serial partition, since the core count does not exceed 56.
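While the job is pending or running, the memory actually granted can be checked in the job record (field names vary slightly between SLURM versions):

scontrol show job <jobid> | grep -iE 'mem|tres'

On a generic SLURM installation without a double_mem feature, an equivalent request could presumably be expressed with --mem-per-cpu=7600M; on GRAVITON, the constraint shown above is the supported route.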

Submit the job as usual:

sbatch job.sh
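After the job completes, the accounting records show whether the extra memory was actually used; MaxRSS is the peak resident memory per step:

sacct -j <jobid> --format=JobID,Elapsed,MaxRSS,State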