Cluster Jobs

  1. Queueing Policy
  2. Job Prioritization
  3. Limitations
  4. Partitions

Queueing Policy

All jobs are scheduled on a first-in, first-out (FIFO) basis with backfill. The backfill scheduler will start a lower-priority job early if doing so does not delay the expected start time of any higher-priority job; essentially, it uses smaller jobs to fill holes in the resource allocation plan. For example, a short single-core job may start ahead of earlier-submitted work while the scheduler accumulates the nodes needed for a large parallel job.

Job Prioritization

All jobs in the queue are assigned a priority that increases over time; the higher the priority, the sooner a job is scheduled to run. This priority is a weighted sum of several factors, including fairshare and job age. Fairshare weights jobs based on the computing resources allocated to a group and the resources that group has consumed in the past. Job age is simply the amount of time a job has been eligible to run but has waited for resources to become available.
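
In SLURM's multifactor priority plugin, the weighted sum takes roughly the following form. The weight names below come from the plugin itself; their actual values are set by the cluster administrators and are not shown here:

Job_priority = (PriorityWeightAge       * age_factor)
             + (PriorityWeightFairshare * fairshare_factor)
             + (PriorityWeightJobSize   * job_size_factor)
             + (PriorityWeightPartition * partition_factor)
             + (PriorityWeightQOS       * QOS_factor)

Each factor is normalized to a value between 0.0 and 1.0 before weighting, and a job's nice value, if set, is subtracted from the result.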

To view a job's priority, use the scontrol or sprio commands:

$ scontrol show job 580425
JobId=580425 Name=fac1000
  UserId=schelcj(966674) GroupId=biostat-users(1001)
  Priority=24944 Account=biostat QOS=biostat
  JobState=RUNNING Reason=None Dependency=(null)
  Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
  RunTime=00:58:06 TimeLimit=16-00:00:00 TimeMin=N/A
  SubmitTime=2012-03-22T13:02:57 EligibleTime=2012-03-22T13:02:57
  StartTime=2012-03-22T13:08:08 EndTime=2012-04-07T13:08:08
  PreemptTime=None SuspendTime=None SecsPreSuspend=0
  Partition=biostat-default AllocNode:Sid=idran:17807
  ReqNodeList=(null) ExcNodeList=(null)
  NodeList=cn029
  BatchHost=cn029
  NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqS:C:T=*:*:*
  MinCPUsNode=1 MinMemoryNode=2000M MinTmpDiskNode=0
  Features=(null) Gres=SAS Reservation=(null)
  Shared=OK Contiguous=0 Licenses=(null) Network=(null)
  Command=/home/schelcj/debug/sas/foo/./job.txt
  WorkDir=/home/schelcj/debug/sas/foo

$ sprio -j 580427 -l
JOBID     USER   PRIORITY        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS   NICE
580427     foo        561         49        513          0          0          0      0
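
To see the priority breakdown for all of your own pending jobs at once, pass a username to sprio (foo here stands in for your own username):

$ sprio -l -u foo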

An additional factor may eventually be applied to users whose faculty advisor has contributed nodes, but no such factor is applied at this time.

Limitations

  • Maximum wallclock time of 28 days for all jobs.
  • Maximum of 1000 jobs in the queue per user, regardless of state (i.e., running or pending).
  • Unless otherwise requested, each job defaults to one CPU core and 1GB of memory. You can request more with various options to salloc/srun/sbatch; see the example script after this list.
  • A time limit (--time=) should always be specified; otherwise a default limit of 10 minutes is applied to your job.
  • Maximum core usage, in aggregate, by a single user is 50% of the total cores available.
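
As an illustration, the following minimal batch script overrides these defaults. The job name, program, and resource values are placeholders to adapt to your own work:

$ cat job.sh
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --time=02:00:00        # wallclock limit (default 10 minutes, maximum 28 days)
#SBATCH --cpus-per-task=4      # CPU cores per task (default is one core)
#SBATCH --mem=8000             # memory per node in MB (default is 1GB)

./my_program

$ sbatch job.sh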

Partitions

Partitions in SLURM are groupings of compute nodes, similar to queues in other schedulers, that serve as logical collections of resources. We have several partitions defined and available for your jobs; which partition you use depends on your job requirements and research group.

Name             Nodes        Access Group     Description
biostat-default  cn0[01-34]   biostat-users    Default partition for all users
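
To route a job to a specific partition, pass the --partition option to sbatch (job.sh here is a placeholder script name):

$ sbatch --partition=biostat-default job.sh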