April 2016
As mentioned in the previous section, you typically cannot run code directly on an HPC cluster; instead, you submit a request to run that code to a job scheduler. The job scheduler identifies appropriate compute resources for the application and runs the code on those nodes.
This level of indirection introduces some overhead, but it also ensures that every user gets a fair share of supercomputer time, that job priorities are enforced, and that the many cores are kept busy.
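To make the idea of indirection concrete, here is a minimal sketch of a toy scheduler in Python. It is not how PBS or HTCondor work internally; the `ToyScheduler` class, its method names, and the priority convention (lower number means more urgent) are all assumptions made for illustration. Jobs are submitted to a queue rather than run directly, and the scheduler starts them in priority order for as long as enough cores are free.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Job:
    # Jobs are ordered by (priority, seq); lower priority numbers run first
    # (an assumption of this sketch), and seq breaks ties in submission order.
    priority: int
    seq: int
    name: str = field(compare=False)
    cores: int = field(compare=False)


class ToyScheduler:
    """Hypothetical scheduler: queue jobs, start them when cores are free."""

    def __init__(self, free_cores):
        self.free_cores = free_cores
        self._queue = []      # heap of pending jobs
        self._seq = count()   # submission counter used as a tie-breaker

    def submit(self, name, cores, priority=10):
        # Submitting only records the request; nothing runs yet.
        heapq.heappush(self._queue, Job(priority, next(self._seq), name, cores))

    def dispatch(self):
        # Start jobs from the head of the queue while they fit.
        # The head job blocks everything behind it (no backfilling).
        started = []
        while self._queue and self._queue[0].cores <= self.free_cores:
            job = heapq.heappop(self._queue)
            self.free_cores -= job.cores
            started.append(job.name)
        return started

    def finish(self, cores):
        # A finished job hands its cores back to the pool.
        self.free_cores += cores


sched = ToyScheduler(free_cores=8)
sched.submit("analysis", cores=4, priority=5)
sched.submit("urgent-fix", cores=4, priority=1)
sched.submit("big-sim", cores=8, priority=5)

print(sched.dispatch())   # ['urgent-fix', 'analysis']
sched.finish(4)
sched.finish(4)
print(sched.dispatch())   # ['big-sim']
```

The sketch deliberately ignores fair-share accounting, wall-time limits, and node topology, all of which real schedulers take into account. It only shows why a submitted job may wait: `big-sim` cannot start until the two smaller jobs have released their cores.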
The following figure shows the basic components of a job scheduler (for example, PBS or HTCondor) as well as the sequence of events from job submission to execution:
First, let's look at a few definitions: