High Performance Computing

This guide provides the documentation you need to get up and running on our systems as quickly as possible.

batch is a cluster of 60 compute nodes of varying age, CPU, and memory configuration, ranging from 12 to 48 CPUs and roughly 64 to 256 GB of RAM per node. CPU architecture also varies, spanning the sse4_2, avx, avx2, and avx512 instruction sets, so a program built for one node may run incorrectly or fail to start on another; keep this in mind when submitting a script. Please contact us if you are having trouble getting a program to compile or execute on a specific node.
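If you are unsure which instruction sets a particular node supports, you can check its CPU flags once you have a shell on that node. This is a quick sketch using the Linux kernel's flag names (avx512f is one of several avx512* flags you may see):

    # Show which vector instruction sets this node's CPU reports
    grep -o -w -E 'sse4_2|avx|avx2|avx512f' /proc/cpuinfo | sort -u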

Gaining Access

To get access to the computing and storage resources, send an email with your NetID to askIT@albany.edu.

Operating System

batch runs the Oracle RHEL 7 Linux operating system, so all software must be Linux-compatible to run on this system. All compute nodes use the same Linux image and base packages. If you are not familiar with Linux, please schedule a consultation or follow one of the many introductory Linux courses and videos available on YouTube.
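If you want to confirm which release your current session is running on, a quick check from any node is:

    # Print the distribution name and version of the node you are logged into
    cat /etc/os-release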

Connecting to batch

Access to batch is provided by first connecting to a headnode over Secure Shell (SSH). SSH can also be used to copy data to or from a given directory on batch, using scp or sftp. Two headnodes are provided for accessing storage and computational resources: head.arcc.albany.edu and headnode7.arcc.albany.edu. The headnodes should not be used to run computations; all computations must be submitted to batch via the scheduler. On macOS or Linux, you can connect via SSH through the terminal; on Windows, you can use PuTTY or X2Go. For more information, see the How-to: Connect via SSH (PuTTY, macOS terminal) guide.
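For example, from a macOS or Linux terminal (replace your_netid with your own NetID; the file names are placeholders):

    # Open an interactive session on a headnode
    ssh your_netid@head.arcc.albany.edu

    # Copy a local file to your home directory on batch
    scp results.csv your_netid@head.arcc.albany.edu:/network/rit/home/your_netid/

    # Copy a file from batch back to the current local directory
    scp your_netid@head.arcc.albany.edu:/network/rit/home/your_netid/results.csv .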

batch is only accessible from within the University at Albany network. To connect from an external network, you must use the VPN. Please read more at the VPN (Virtual Private Network) page. 

For a live view of cluster usage, please see: https://computing.app.arcc.albany.edu/

Storage Space

Each user has a $HOME (/network/rit/home/[netid]) directory with a quota of 10 GB. The $HOME directory is backed up with snapshots: daily snapshots are kept for 21 days and hourly snapshots for the past 23 hours. If you need to restore a file, please see the How-to: Restoring a deleted file or directory guide.

Each research faculty member can apply for a $LAB (/network/rit/lab/[lab_name]) directory. The $LAB directory has a default quota of 10 TB and can be shared with members of the university as well as external users at the request of the directory owner. The $LAB directory is also backed up with snapshots: daily snapshots are kept for 21 days and hourly snapshots for the past 23 hours. To request a $LAB directory, please fill out the Research Storage Request form. To see your storage footprint, including your snapshot space, please visit https://storage.app.arcc.albany.edu.
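The storage portal above is the authoritative view of your footprint (including snapshot space), but for a quick command-line estimate of the live files in a directory you can use du (lab_name below is a placeholder):

    # Total size of your home directory
    du -sh /network/rit/home/$USER

    # Size of each top-level item in a lab share
    du -sh /network/rit/lab/lab_name/*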

If you would like to access your storage drives from off campus, please see the How-to: Mapping to a network drive guide.

Note that snapshot space counts toward the overall directory quota. If you need snapshots purged, please contact askIT@albany.edu.

Running jobs

To run code on batch, you must submit a job to the scheduler; no computations should be performed on the headnodes. batch uses the SLURM scheduler to assign resources to each job and manage the job queue. The smallest resource request is 1 CPU and 100 MB of memory; the largest is three compute nodes. If you need help optimizing your job scheduling, please contact askIT@albany.edu. For more information, please see How-to: Scheduling via SLURM.
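As a starting point, a minimal SLURM submission script might look like the sketch below; the job name, resource amounts, time limit, and program name are illustrative placeholders, and site-specific options such as partition names are omitted:

    #!/bin/bash
    #SBATCH --job-name=example_job     # Name shown in the queue
    #SBATCH --ntasks=1                 # Run a single task (process)
    #SBATCH --cpus-per-task=1          # Smallest possible CPU request
    #SBATCH --mem=100M                 # Smallest possible memory request
    #SBATCH --time=01:00:00            # Walltime limit (hh:mm:ss)
    #SBATCH --output=example_%j.out    # Standard output; %j expands to the job ID

    # Everything below runs on the compute node(s) SLURM assigns
    ./my_program input.dat

Save the script (for example as example_job.sh), then submit it and check its status:

    sbatch example_job.sh    # Submit the job to the scheduler
    squeue -u $USER          # Show your jobs in the queue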
