All jobs on the general purpose cluster request resources via SLURM. SLURM is open-source software that allocates resources to users for their computations, provides a framework for starting, executing, and monitoring compute jobs, and arbitrates contention for resources by managing a queue of pending work. SLURM is widely used in the high-performance computing (HPC) landscape, and you are likely to encounter it outside of our systems.
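For orientation, here is a minimal sketch of a SLURM batch script; the job name, output file, resource values, and program name are placeholders, not site requirements:

```
#!/bin/bash
#SBATCH --job-name=example       # hypothetical job name
#SBATCH --partition=batch        # the general purpose partition described below
#SBATCH --ntasks=1               # run a single task
#SBATCH --cpus-per-task=4        # CPUs allocated to that task
#SBATCH --mem=8G                 # memory for the whole job
#SBATCH --time=01:00:00          # walltime limit (HH:MM:SS)
#SBATCH --output=example-%j.out  # stdout/stderr file (%j expands to the job ID)

# Replace with your actual program
srun ./my_program
```

Save the script (e.g., as myjob.sh) and submit it to the queue with `sbatch myjob.sh`.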
General Purpose Computing
...
head.arcc.albany.edu
headnode7.arcc.albany.edu
Or from the large memory machine:
lmm.rit.albany.edu
Resource information
All users have access to the "batch" partition for general purpose computing.
Info: The batch partition comprises 544 CPUs across 21 compute nodes. Note that a single job can request at most 3 nodes and may run for at most 14 days. If you need an exception to this, please contact arcc@albany.edu.
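As an illustration of those limits, a job could request up to the maximum allowed nodes and walltime with directives like these (the values shown are the caps, not recommendations):

```
#SBATCH --partition=batch
#SBATCH --nodes=3            # at most 3 nodes per job
#SBATCH --time=14-00:00:00   # at most 14 days (days-HH:MM:SS)
```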
```
$ sinfo -p batch -o "%P, %n, %c, %m" | sort
PARTITION, HOSTNAMES, CPUS, MEMORY
batch*, rhea-01, 24, 64133
batch*, rhea-02, 24, 64133
batch*, rhea-03, 24, 64133
batch*, rhea-04, 32, 96411
batch*, rhea-05, 32, 96411
batch*, rhea-06, 32, 96411
batch*, rhea-07, 40, 128619
batch*, rhea-08, 40, 128619
batch*, rhea-09, 48, 257627
batch*, rhea-10, 48, 257566
batch*, uagc12-01, 12, 64166
batch*, uagc12-02, 12, 64166
batch*, uagc12-03, 12, 64166
batch*, uagc12-04, 12, 64166
batch*, uagc12-05, 32, 128703
batch*, uagc19-01, 20, 94956
batch*, uagc19-02, 20, 94956
batch*, uagc19-03, 20, 94956
batch*, uagc19-04, 20, 94956
batch*, uagc19-05, 20, 94956
batch*, uagc19-06, 20, 94956
```
...
Info: This job ran on rhea-09, and its maximum memory usage was ~52 GB. Note that I requested 60000 MB, so I could refine this job to request slightly less memory. It ran for 14:50:14 and used about 350 CPU hours.
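One way to gather statistics like these yourself is SLURM's accounting tool sacct (assuming job accounting is enabled on the cluster); the job ID below is a placeholder:

```
# Show elapsed time, peak memory, CPU time, and node list for a finished job
$ sacct -j 123456 --format=JobID,Elapsed,MaxRSS,TotalCPU,NodeList
```

Comparing MaxRSS against the memory you requested is a simple way to right-size future submissions.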
Can I restrict my job to a certain CPU architecture?
Yes! Use the --constraint flag with #SBATCH. To view the available architectures (node features) on individual nodes, use scontrol show node.
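For example, you can pin a job to nodes that advertise a given feature tag; the feature name used here ("intel") is hypothetical, so replace it with one actually listed by scontrol show node on our system:

```
# List each node's feature tags (look for the AvailableFeatures field)
$ scontrol show node | grep -E "NodeName|AvailableFeatures"

# Then, in your job script, restrict scheduling to nodes with that feature
#SBATCH --constraint=intel   # "intel" is a hypothetical feature name
```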
...