abstract:mills:runjobs:queues

Each investing-entity on a cluster has four owner queues that exclusively use the investing-entity's compute nodes. (They do not use any nodes belonging to others.) Grid Engine allows those queues to be selected only by members of the investing-entity's group.

There are also node-wise queues: standby, standby-4h, spillover-24core, spillover-48core and idle. Through these queues, Grid Engine allows users to use nodes belonging to other investing-entities. (The idle queue is currently disabled.)

When submitting a batch job to Grid Engine, you specify the resources you need or want for your job. You don't actually specify the name of the queue. Instead, you include a set of directives that specify your job's characteristics. Grid Engine then chooses the most appropriate queue that meets those needs.
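
As a minimal sketch of such directives, the job script below requests a parallel environment and a maximum run time without naming any queue. The job name, the slot count, the two-hour limit and the use of OMP_NUM_THREADS are illustrative assumptions, not site requirements; this page itself names only the threads parallel environment and the -l h_rt option.

```shell
#!/bin/bash
# Lines beginning with "#$" are read by Grid Engine as qsub options.
# Job name (hypothetical, chosen for this sketch).
#$ -N directives_sketch
# Run the job from the directory where qsub was invoked.
#$ -cwd
# Request 4 slots in the threads parallel environment (slot count is an assumption).
#$ -pe threads 4
# Request at most 2 hours of elapsed (wall-clock) time (value is an assumption).
#$ -l h_rt=02:00:00

# NSLOTS is set by Grid Engine at run time; fall back to 1 outside the scheduler.
slots="${NSLOTS:-1}"

# Assumption: the program is threaded with OpenMP and reads OMP_NUM_THREADS.
export OMP_NUM_THREADS="$slots"
echo "Running with $OMP_NUM_THREADS thread(s)"
```

Assuming the script is saved as directives_sketch.qs (a hypothetical file name), it would be submitted with qsub directives_sketch.qs, and Grid Engine would pick the queue from the requested resources.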

The queue to which a job is assigned depends primarily on six factors:

  • Whether the job is serial or parallel
  • Which parallel environment (e.g., openmpi, threads) is needed
  • Which or how much of a resource is needed (e.g., max clock time, max memory)
  • Whether the job can be suspended and restarted by the system
  • Whether the job is non-interactive or interactive
  • Whether you want to use idle nodes belonging to others

For each investing-entity, the owner-queue names start with the investing-entity's name:

«investing_entity».q+: The default queue for non-interactive serial or parallel jobs, and the primary queue for long-running jobs. These jobs must be able to be suspended and restarted by Grid Engine. They can be preempted by jobs submitted to the development queue. Examples: all serial (single-core) jobs, Open MPI jobs, OpenMP jobs or other jobs using the threads parallel environment.
«investing_entity».q: A special queue for non-suspendable parallel jobs, such as MPICH jobs. These jobs will not be preempted by others' job submissions.
«investing_entity»-qrsh.q: A special queue for interactive jobs only. Jobs are scheduled to this queue when you use Grid Engine's qlogin command.
standby.q: A special queue that spans all nodes and allows at most 240 slots per user. Submissions have a lower priority than jobs submitted to owner queues, and standby jobs are started only on lightly loaded nodes. These jobs will not be preempted by others' job submissions. Jobs are terminated with notification after running for 8 hours of elapsed (wall-clock) time. Also see the standby-4h.q entry. You must specify -l standby=1 as a qsub option. You must also use the -notify option if your job traps the USR2 termination signal.
standby-4h.q: A special queue that spans all nodes and allows at most 816 slots per user. Submissions have a lower priority than jobs submitted to owner queues, and standby jobs are started only on lightly loaded nodes. These jobs will not be preempted by others' job submissions. Jobs are terminated with notification after running for 4 hours of elapsed (wall-clock) time. You must specify -l standby=1 as a qsub option. If more than 240 slots are requested, you must also specify a maximum run time of 4 hours or less via the -l h_rt=hh:mm:ss option. Finally, use the -notify option if your job traps the USR2 termination signal.
spillover-24core.q: A special queue that spans all standard nodes (24 cores). Grid Engine maps jobs to it when the requested resources are unavailable on standard nodes in owner queues, for example because of a node failure or because standby jobs are using owner resources. Implemented on February 29, 2016 according to the Mills End-of-Life Policy.
spillover-48core.q: A special queue that spans all 4-socket nodes (48 cores). Grid Engine maps jobs to it when the requested resources are unavailable on 48-core nodes in owner queues, for example because of a node failure or because standby jobs are using owner resources. Owners of only 48-core nodes will not spill over to standard nodes. Implemented on February 29, 2016 according to the Mills End-of-Life Policy.
spare.q: A special queue that spans all nodes kept in reserve as replacements for failed owner nodes. Temporary access to the spare nodes is granted by request. When access is granted, the spare nodes augment your owner nodes. Jobs on the spare nodes will not be preempted by others' job submissions, but may need to be killed by IT. The owner of a job running on a spare node will be notified by email two hours before IT kills the job.
Be considerate in your use of the development queue. It may preempt 'q+' jobs being run by other users in your group if those jobs' computational resources are needed.
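
The standby requirements in the queue list above can be sketched as a job script. This is only an illustration under stated assumptions: the job name, the work loop, the MAX_STEPS variable and the "saving state" message are hypothetical, and a real job would replace them with its own work and checkpoint logic. The -l standby=1, -l h_rt and -notify options are the ones named in the standby.q and standby-4h.q entries.

```shell
#!/bin/bash
# Job name (hypothetical, chosen for this sketch).
#$ -N standby_sketch
#$ -cwd
# Required for any standby submission.
#$ -l standby=1
# A run-time limit of 4 hours or less is required when more than 240 slots
# are requested; it is included here only to show the syntax.
#$ -l h_rt=04:00:00
# Ask Grid Engine to notify the job before terminating it, because this
# script traps the USR2 termination signal.
#$ -notify

# Flag set by the handler once the termination warning arrives.
terminate_requested=0

on_usr2() {
    terminate_requested=1
    # Assumption: a real job would write a checkpoint here.
    echo "USR2 received: saving state before termination"
}
trap on_usr2 USR2

# Hypothetical work loop: stops early if the termination warning was received.
step=0
max_steps="${MAX_STEPS:-3}"
while [ "$step" -lt "$max_steps" ] && [ "$terminate_requested" -eq 0 ]; do
    step=$((step + 1))
    echo "finished step $step"
done
```

Checking the flag between units of work keeps the handler itself short, so the job can stop at a point where its state is consistent.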

Most compute nodes on a Community Cluster are owned by investing entities (faculty and staff). Clusters generally contain a small number of spare nodes that act as temporary replacements for owned nodes undergoing repair or replacement. Jobs are usually not assigned to these nodes since at any time they may be needed in this capacity.

Community Cluster users can make use of these otherwise idle nodes by special request. For example, a user publishing a paper may need to quickly execute a few follow-up calculations that were prompted by the peer review process. The user has just two days in which to run the jobs. In this case, the user could send a request to IT for access to a cluster's spare nodes for the next two days.

Investing entity stakeholders can also request access to spare nodes on behalf of their entire group of users.

Of course, if IT needs the spare nodes during that time to stand in for offline owned nodes, jobs running on the spare nodes may need to be killed. So while the spare nodes represent on-demand resources that can be used for jobs with a deadline, the user runs the risk that jobs will be interrupted and may not finish before that deadline.

If jobs running on spare nodes do need to be killed, IT will provide two hours' notice via email to the jobs' owners.

Access to spare nodes can be requested by submitting a Research Computing Help Request: specify 'High-Performance Computing', select the appropriate cluster, and provide the following information in the problem details:

  • the reason for requesting spare nodes
  • the cluster on which you will run your jobs
  • a brief description of the jobs that will be run
  • the date range during which the jobs will be run

For the example cited above, the user might write the following:

I am writing a paper for the Journal of Physical Chemistry that includes simulations of ammonia dissolved in water at high pressure. A reviewer has questioned my results and I need to run two more short simulations to refute his claims.

I would like to use the spare nodes on the Mills cluster starting as soon as possible and lasting two days. The simulations will run via Open MPI across four nodes and should each last about 12 hours.

Job requests are reviewed by IT before access to spare nodes is granted.

Once access is granted, the spare nodes augment the owned nodes to which the user has access. No additional flags need to be specified in the user's job scripts or on the command line when the jobs are submitted. The spare nodes will behave as though they are owned nodes when the user's jobs are scheduled.

Interactive jobs will not be scheduled on spare nodes. Only batch jobs are permissible.

  • Spare node resources shall be granted on a per-request basis for a limited time period.
  • Access to spare nodes shall be granted following an IT review of the request.
  • Spare nodes augment those nodes already available to the user. Batch jobs will be scheduled on spare nodes without authorized users' having to explicitly request it.
  • Spare nodes may be repurposed at any time to replace owned nodes being repaired or replaced. In these instances, IT will e-mail users running jobs on the affected spare nodes two hours prior to killing those jobs.

Last modified: 2018-06-19 23:06 by anita