Differences

This shows you the differences between two versions of the page.

--- abstract:farber:runjobs:runjobs [2017-12-05 09:28] – sraskar
+++ abstract:farber:runjobs:runjobs [2018-08-09 15:01] (current) – [What is a Job?] anita
@@ Line 1: / Line 1: @@
-<booktoc/>
+====== Running applications on Farber ======
-====== Running applications ======
-<booktoc/>
 ====== Introduction ======
@@ Line 20: / Line 17: @@
   * a set of environment variables
-For an //[[20_interactive|interactive job]]//, the user manually types the sequence of commands once the job is eligible for execution.  If the necessary resources for the job are not immediately available, then the user must wait; when resources are available, the user must be present at his/her computer in order to type the commands.  Since the job scheduler does not care about the time of day, this could happen anytime, day or night.
+For an //[[abstract:farber:runjobs:schedule_jobs#interactive-jobs-qlogin|interactive job]]//, the user manually types the sequence of commands once the job is eligible for execution.  If the necessary resources for the job are not immediately available, then the user must wait; when resources are available, the user must be present at his/her computer in order to type the commands.  Since the job scheduler does not care about the time of day, this could happen anytime, day or night.
-By comparison, a //[[30_batch|batch job]]// does not require the user be awake and at his or her computer:  the sequence of commands is saved to a file, and that file is given to the job scheduler.  A file containing a sequence of shell commands is also known as a //script//, so in order to run batch jobs a user must become familiar with //shell scripting//.  The benefits of using batch jobs are significant:
+By comparison, a //[[abstract:farber:runjobs:schedule_jobs#batch-jobs-qsub|batch job]]// does not require the user to be awake and at his or her computer:  the sequence of commands is saved to a file, and that file is given to the job scheduler.  A file containing a sequence of shell commands is also known as a //script//, so in order to run batch jobs a user must become familiar with //shell scripting//.  The benefits of using batch jobs are significant:
   * a //job script// can be reused (versus repeatedly having to type the same sequence of commands for each job)
@@ Line 37: / Line 34: @@
 When submitting a job to Grid Engine, a user can explicitly specify which queue to use:  doing so will place that queue's resource restrictions (e.g. maximum execution time, maximum memory) on the job, even if they are not appropriate.  Usually it is easier if the user specifies what resources his or her job requires and lets Grid Engine choose an appropriate queue.
-===== Getting Help =====
+===== Job scheduling system =====
-Grid Engine includes man pages for all of the commands that will be reviewed in this document.  When logged-in to a cluster, type
+A job scheduling system is used to manage and control the computing resources for all jobs submitted to a cluster. This includes load balancing, limiting resources, reconciling requests for memory and processor cores with availability of those resources, suspending and restarting jobs, and managing jobs with different priorities.
-<code bash>
+Each investing-entity's group (workgroup) has owner queues that allow the use of a fixed number of slots to match the total number of cores purchased.  If a job is submitted that would use more than the slots allowed, the job will wait until enough slots are made available by completed jobs.  There is no time limit imposed on owner queue jobs.  All users can see running and waiting jobs, which allows groups to work out policies for managing purchased nodes.
-[traine@mills ~]$ man qstat
-</code>
-to learn more about a Grid Engine command (in this case, ''qstat'').  Most commands will also respond to the ''-help'' command-line option to provide a succinct usage summary:
+The standby queues are available for projects requiring more slots than purchased, or to take advantage of idle nodes when a job would have to wait in the owner queue.  Other workgroup nodes will be used, so standby jobs have a time limit, and users are limited to a total number of cores for all of their standby jobs.  Generally, users can use 10 nodes for an 8-hour standby job or 40 nodes for a 4-hour standby job.
-<code bash>
+A spillover queue may be available for the case where a job is submitted to the owner queue, and there are standby jobs consuming needed slots. Instead of waiting, the jobs will be sent to the spillover queue to start on a similar idle node.
-[traine@mills ~]$ qstat -help
-usage: qstat [options]
-        [-cb]                             view additional binding specific parameters
-        [-ext]                            view additional attributes
-           :
-</code>
-//This section uses the wiki's [[00_conventions|documentation conventions]].//
-===== Introduction =====
-The Grid Engine job scheduling system is used to manage and control the computing resources for all jobs submitted to a cluster. This includes load balancing, reconciling requests for memory and processor cores with availability of those resources, suspending and restarting jobs, and managing jobs with different priorities. Grid Engine on Farber is Univa Grid Engine but still referred to as SGE.
-[[:general:jobsched:grid-engine:start|Grid Engine job scheduling system]] provides an excellent overview of Grid Engine which is the job scheduling system used on Farber.
-In order to schedule any job (interactively or batch) on a cluster, you must set your [[general/userguide/04_compute_environ?&#using-workgroup-and-directories|workgroup]] to define your cluster group or //investing-entity// compute nodes.
+==== Grid Engine ====
-See [[general:/userguide:06_runtime_environ?&#scheduling-jobs|Scheduling Jobs]] and [[general:/userguide:06_runtime_environ?&#managing-jobs|Managing Jobs]] for general information about getting started with scheduling and managing jobs on a cluster using Grid Engine.
+The Grid Engine job scheduling system is used to manage and control the computing resources for all jobs submitted to a cluster. This includes load balancing, reconciling requests for memory and processor cores with availability of those resources, suspending and restarting jobs, and managing jobs with different priorities. Grid Engine on Farber is Univa Grid Engine but still referred to as SGE.
+In order to schedule any job (interactively or batch) on a cluster, you must set your [[abstract/farber/system_access/system_access#logging-on-to-farber|workgroup]] to define your cluster group or //investing-entity// compute nodes.
+See [[abstract/farber/runjobs/schedule_jobs|Scheduling Jobs]] and [[abstract/farber/runjobs/job_status|Managing Jobs]] on the <html><span style="color:#ffffff;background-color:#2fa4e7;padding:3px 7px !important;border-radius:4px;">sidebar</span></html> for general information about getting started with scheduling and managing jobs on a cluster using Grid Engine.
 ===== Runtime environment =====
@@ Line 79: / Line 68: @@
   - include lines in the file to be submitted to qsub.
 </note>
-===== Job scheduling system =====
-A job scheduling system is used to manage and control the computing resources for all jobs submitted to a cluster. This includes load balancing, limiting resources, reconciling requests for memory and processor cores with availability of those resources, suspending and restarting jobs, and managing jobs with different priorities.
-Each investing-entity's group (workgroup) has owner queues that allow the use of a fixed number of slots to match the total number of cores purchased.  If a job is submitted that would use more than the slots allowed, the job will wait until enough slots are made available by completed jobs.  There is no time limit imposed on owner queue jobs.  All users can see running and waiting jobs, which allows groups to work out policies for managing purchased nodes.
-The standby queues are available for projects requiring more slots than purchased, or to take advantage of idle nodes when a job would have to wait in the owner queue.  Other workgroup nodes will be used, so standby jobs have a time limit, and users are limited to a total number of cores for all of their standby jobs.  Generally, users can use 10 nodes for an 8 hour standby job or 40 nodes for a 4 hour standby job.
+===== Getting Help =====
-A spillover queue may be available for the case where a job is submitted to the owner queue, and there are standby jobs consuming needed slots. Instead of waiting, the jobs will be sent to the spillover queue to start on a similar idle node.
+Grid Engine includes man pages for all of the commands that will be reviewed in this document.  When logged in to a cluster, type
-A spare queue may be on a cluster to make spare nodes available to users, by special request.
+<code bash>
+[traine@farber ~]$ man qstat
-Each cluster is configured with a particular job scheduling system. General documentation is available for all [[:general:start#job-scheduling-systems|job scheduling systems]] currently in use.
+</code>
+to learn more about a Grid Engine command (in this case, ''qstat'').  Most commands will also respond to the ''-help'' command-line option to provide a succinct usage summary:
+<code bash>
+[traine@farber ~]$ qstat -help
+usage: qstat [options]
+        [-cb]                             view additional binding specific parameters
+        [-ext]                            view additional attributes
+           :
+</code>
+//This section uses the wiki's [[http://docs-dev.hpc.udel.edu/doku.php#documentation-conventions|documentation conventions]].//
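
The owner-queue and standby rules described under "Job scheduling system" can be sketched as two small shell functions. This is only an illustration of the stated guidelines, not how Grid Engine implements them: ''standby_fits'' and ''owner_job_waits'' are hypothetical helper names, and the sketch assumes the general limits quoted in the text (10 nodes for up to 8 hours, 40 nodes for up to 4 hours), whereas the scheduler's actual limits are counted in cores and may differ.

```shell
#!/bin/bash
# Illustrative sketch only: neither function is a Grid Engine command.

# standby_fits NODES HOURS
# Returns 0 if the request fits the general standby guideline:
#   up to 40 nodes for a job of at most 4 hours, or
#   up to 10 nodes for a job of at most 8 hours.
standby_fits() {
    local nodes=$1 hours=$2
    if [ "$hours" -le 4 ] && [ "$nodes" -le 40 ]; then
        return 0
    fi
    if [ "$hours" -le 8 ] && [ "$nodes" -le 10 ]; then
        return 0
    fi
    return 1
}

# owner_job_waits PURCHASED IN_USE REQUESTED
# Returns 0 (the job waits) when the slots already in use plus the
# slots requested would exceed the slots the workgroup purchased.
owner_job_waits() {
    local purchased=$1 in_use=$2 requested=$3
    [ $((in_use + requested)) -gt "$purchased" ]
}
```

For example, ''standby_fits 10 8'' succeeds while ''standby_fits 40 8'' fails, and ''owner_job_waits 20 16 8'' succeeds because 24 slots would exceed the 20 purchased.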