How to Specify Resources when Running Jobs on Roar
Jobs run when dedicated resources become available on the compute nodes. Roar uses Moab as the scheduler and Torque as the resource manager. Jobs can be run in either batch or interactive mode; both are submitted using the qsub command.
Both batch and interactive jobs must provide a list of requested resources to the scheduler so that they are placed on a compute node with the correct resources available. These requests are given either in the submission script or on the command line. In a submission script, they must come before any non-PBS command.
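As a minimal sketch of the ordering rule above, the script below places every #PBS line ahead of the first executable command. The resource values and the final echo are placeholders chosen for illustration, not recommendations, and -A open assumes the job is going to the open queue.

```shell
#!/bin/bash
# All #PBS directives come first. The values are illustrative placeholders.
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -l pmem=1gb
#PBS -A open
#PBS -j oe

# First non-PBS command. Every directive must appear above this point.
start_msg="Job started on $(hostname)"
echo "$start_msg"
```

Saved under a hypothetical name such as example_job.pbs, the script would be submitted with qsub example_job.pbs. The same requests can instead be given on the command line, for example qsub -l nodes=1:ppn=1 -l walltime=00:10:00 -A open example_job.pbs. Replacing the script name with the -I flag (qsub -I -l nodes=1:ppn=1 -l walltime=00:10:00 -A open) requests an interactive session with the same resources.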
Typical PBS Directives
PBS Directive | Description |
---|---|
#PBS -l walltime=HH:MM:SS | Specifies the maximum wall time (real time, not CPU time) that the job should take. If this limit is exceeded, PBS will stop the job. Keeping this limit close to the actual expected run time can allow a job to start more quickly than if the maximum wall time is always requested. |
#PBS -l pmem=SIZEgb | Specifies the maximum amount of physical memory used by any processor assigned to the job. For example, if the job runs on four processors and each uses up to 2 GB (gigabytes) of memory, the directive would read #PBS -l pmem=2gb. The default for this directive is 1 GB of memory. |
#PBS -l mem=SIZEgb | Specifies the maximum amount of physical memory used in total by the job. This should be used for single-node jobs only. |
#PBS -l nodes=N:ppn=M | Specifies the number of nodes (nodes=N) and the number of processors per node (ppn=M) that the job should use. PBS treats a processor core as a processor, so a system with eight cores per compute node can have ppn=8 as its maximum ppn request. Unless a job has some inherent parallelism of its own through something like MPI or OpenMP, requesting more than a single processor on a single node is usually wasteful and can delay the job start time. |
#PBS -l nodes=N:ppn=M:O | Specifies the node type (O). You can only specify the node type when using the "Open Queue". The node types available on Roar are listed under Node Types Available on Roar. |
#PBS -A allocName | Identifies the account to which the job's resource consumption should be charged (SponsorID_collab). This flag is required for all job submissions. For jobs submitted to a system's open queue, use -A open. |
#PBS -j oe | Normally, when a command runs, it prints its output to the screen. This output usually consists of normal output and error output. This directive tells PBS to put both normal output and error output into the same output file. |
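Because pmem is a per-processor limit, the total memory a job may use grows with the processor count. The sketch below works through the four-processor, 2 GB example from the table, assuming the total is simply nodes × ppn × pmem. The variable names and the job.pbs file name are hypothetical.

```shell
#!/bin/bash
# Illustrative values from the pmem example: 4 processors, 2 GB each.
nodes=1
ppn=4
pmem_gb=2

# Total memory implied by a per-processor (pmem) request.
total_gb=$(( nodes * ppn * pmem_gb ))

# Build the resource options that would be passed to qsub.
resource_opts="-l nodes=${nodes}:ppn=${ppn} -l pmem=${pmem_gb}gb"

echo "Total memory requested: ${total_gb} GB"
# Only print the command; job.pbs is a hypothetical script name.
echo "qsub ${resource_opts} -A open job.pbs"
```

For a single-node job, the same 8 GB total could be requested with mem=8gb instead of pmem=2gb.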
Node Types Available on Roar
Node Types | Specifications |
---|---|
Basic | 2.2 GHz Intel Xeon Processor, 24 CPU/server, 128 GB RAM, 40 Gbps Ethernet |
Standard | 2.8 GHz Intel Xeon Processor, 20 CPU/server, 256 GB RAM, FDR Infiniband, 40 Gbps Ethernet |
High | 2.2 GHz Intel Xeon Processor, 40 CPU/server, 1 TB RAM, FDR Infiniband, 10 Gbps Ethernet |
GPU | 2.5 GHz Intel Xeon Processor, 2 Nvidia Tesla K80 computing modules/server, 24 CPU/server, Double Precision, FDR Infiniband, 10 Gbps Ethernet |
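The CPU counts in this table bound the ppn value a single node can satisfy. The following sketch checks a requested ppn against those counts, assuming "CPU/server" is the number of cores that ppn can draw on. The function names and the lowercase labels are hypothetical helpers for this example, not scheduler keywords.

```shell
#!/bin/bash
# CPU counts per server copied from the node type table.
# The lowercase labels are hypothetical names for this sketch only.
max_ppn() {
  case "$1" in
    basic)    echo 24 ;;
    standard) echo 20 ;;
    high)     echo 40 ;;
    gpu)      echo 24 ;;
    *)        echo "unknown node type: $1" >&2; return 1 ;;
  esac
}

# Check a requested ppn against the limit for the chosen node type.
check_ppn() {
  local limit
  limit=$(max_ppn "$1") || return 1
  if [ "$2" -le "$limit" ]; then
    echo "ok: ppn=$2 fits on a $1 node (max $limit)"
  else
    echo "too large: ppn=$2 exceeds $limit on a $1 node"
    return 1
  fi
}

check_ppn standard 20
check_ppn basic 32 || true
```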
PBS Environmental Variables
Submitted jobs automatically have several PBS environment variables set, which can be used within the job submission script and in scripts run within the job. A full list of PBS environment variables can be obtained by viewing the output of
env | grep PBS > log.pbsEnvVars
run within a submitted job.
Variable Name | Description |
---|---|
PBS_O_WORKDIR | The directory in which the qsub command was issued. |
PBS_JOBID | The job's id. |
PBS_JOBNAME | The job's name. |
PBS_NODEFILE | A file in which all relevant node hostnames are stored for a job. |
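A short sketch of how these variables might be used inside a job script follows. The fallback values (such as no_job_id) are assumptions added so the snippet also runs outside a PBS job; within a submitted job, PBS supplies the real values.

```shell
#!/bin/bash
# Fallbacks let this sketch run outside a PBS job; inside a job PBS sets these.
workdir="${PBS_O_WORKDIR:-$PWD}"
jobid="${PBS_JOBID:-no_job_id}"
jobname="${PBS_JOBNAME:-no_job_name}"

# Move to the directory from which qsub was issued.
cd "$workdir" || exit 1

# Count the distinct hosts listed in the node file, if one exists.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
  num_hosts=$(( $(sort -u "$PBS_NODEFILE" | wc -l) ))
else
  num_hosts=0
fi

echo "Job ${jobname} (${jobid}) running in ${workdir} on ${num_hosts} distinct host(s)"
```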