Worker Nodes use
This guide gives an overview of, and a practical introduction to, the SLURM job scheduler on the GRAVITON cluster.
In high-performance computing (HPC), efficient job scheduling is critical to maximizing resource utilization and throughput. GRAVITON uses SLURM (originally an acronym for Simple Linux Utility for Resource Management), a highly scalable, widely adopted open-source workload manager that allocates compute resources, manages job queues, and coordinates the execution of batch and parallel jobs across large clusters.
SLURM is particularly well-suited for scientific workloads requiring fine-grained control over resource allocation, parallel execution, and performance monitoring. It offers users powerful tools to submit, monitor, and manage jobs effectively, while providing system administrators with robust control over scheduling policies and cluster usage.
This documentation covers the key commands, job submission techniques, and best practices for working with SLURM on GRAVITON, from simple batch jobs to multi-node parallel workloads.
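As a first orientation, a simple batch job could look like the minimal sketch below. The `#SBATCH` options shown are standard SLURM directives, but the job name, resource values, and the helper function `describe_job` are illustrative assumptions, not GRAVITON defaults. Site-specific settings such as partition or account names are deliberately left out because they are not given here.

```shell
#!/bin/bash
# Illustrative values only; adjust to the limits that apply on your cluster.
# Job name shown in the queue:
#SBATCH --job-name=hello
# One node, one task:
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit (HH:MM:SS):
#SBATCH --time=00:05:00
# Output file; %j expands to the job ID:
#SBATCH --output=hello_%j.out

# Hypothetical helper: SLURM sets SLURM_JOB_ID inside a running job.
# Outside SLURM the variable is unset, so a placeholder is printed instead.
describe_job() {
    echo "Job ${SLURM_JOB_ID:-not-under-slurm} on host $(uname -n)"
}

describe_job
```

Assuming the script is saved as `hello.sh` (a hypothetical file name), it would be submitted with `sbatch hello.sh`, checked with `squeue -u $USER`, and cancelled with `scancel` followed by the job ID. Because `#SBATCH` lines are ordinary comments to the shell, the script also runs unchanged outside the scheduler, which is convenient for a quick syntax check.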