====== Running applications on Farber ======
  
====== Introduction ======
  
The Grid Engine job scheduling system is used to manage and control the resources available to computational tasks.  The job scheduler considers each job's resource requests (memory, disk space, processor cores) and executes it as those resources become available.  The order in which jobs are submitted and a //scheduling priority// also dictate how soon the job will be eligible to execute.  The job scheduler may suspend (and later restart) some jobs in order to more quickly complete jobs with higher scheduling priority.
  
Without a job scheduler, a cluster user would need to manually search for the resources required by his or her job, perhaps by randomly logging-in to nodes and checking for other users' programs already executing thereon.  The user would have to "sign-out" the nodes he or she wishes to use in order to notify the other cluster users of resource availability((Historically, this is actually how some clusters were managed!)).  A computer will perform this kind of chore more quickly and efficiently than a human can, and with far greater sophistication.
  
An outdated but still mostly relevant description of Grid Engine and job scheduling can be found in the first chapter of the [[http://docs.oracle.com/cd/E19957-01/820-0699/chp1-1|Sun N1™ Grid Engine 6.1 User's Guide]].
  
===== What is a Job? =====
  
In this context, a //job// consists of:
  
  * a sequence of commands to be executed
  * a list of resource requirements and other properties affecting scheduling of the job
  * a set of environment variables
  
For an //[[abstract:farber:runjobs:schedule_jobs#interactive-jobs-qlogin|interactive job]]//, the user manually types the sequence of commands once the job is eligible for execution.  If the necessary resources for the job are not immediately available, then the user must wait; when resources are available, the user must be present at his/her computer in order to type the commands.  Since the job scheduler does not care about the time of day, this could happen anytime, day or night.
  
By comparison, a //[[abstract:farber:runjobs:schedule_jobs#batch-jobs-qsub|batch job]]// does not require the user be awake and at his or her computer:  the sequence of commands is saved to a file, and that file is given to the job scheduler.  A file containing a sequence of shell commands is also known as a //script//, so in order to run batch jobs a user must become familiar with //shell scripting//.  The benefits of using batch jobs are significant:
  
  * a //job script// can be reused (versus repeatedly having to type the same sequence of commands for each job)
  * when resources are granted to the job it will execute immediately (day or night), yielding increased job throughput
  
An individual's increased job throughput is good for all users of the cluster!
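
In batch work, the three ingredients above are captured in a //job script//.  The sketch below is illustrative only: the name ''myproject.qs'', the mail address, and the directive values are placeholders, and ready-made prototypes are available in **/opt/templates** on the cluster.  Lines beginning with ''#$'' are directives read by Grid Engine, not executed by the shell:

<code bash>
#!/bin/bash
#
# Illustrative job script "myproject.qs" -- directives are placeholders.
#$ -N myproject
#$ -m eas
#$ -M my_address@mail.server.com

# The sequence of commands to be executed:
msg="Task starting on host $HOSTNAME"
echo "$msg"
</code>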
  
===== Queues =====
  
At its most basic, a //queue// represents a collection of computing entities (call them nodes) on which jobs can be executed.  Each queue has properties that restrict what jobs are eligible to execute within it:  a queue may not accept interactive jobs; a queue may place an upper limit on how long the job will be allowed to execute or how much memory it can use; or specific users may be granted or denied permission to execute jobs in a queue.
  
<note>Grid Engine uses a //cluster queue// to embody the common set of properties that define the behavior of a queue.  The cluster queue acts as a template for the //queue instances// that exist for each node that executes jobs for the queue.  The term //queue// can refer to either of these, but in this documentation it will most often imply a //cluster queue//.</note>
  
When submitting a job to Grid Engine, a user can explicitly specify which queue to use:  doing so will place that queue's resource restrictions (e.g. maximum execution time, maximum memory) on the job, even if they are not appropriate.  Usually it is easier if the user specifies what resources his or her job requires and lets Grid Engine choose an appropriate queue.
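
The contrast might look like the following (the script name and resource values are illustrative; ''standby.q'' and the ''-l''/''-q'' options appear elsewhere in this documentation):

<code bash>
# Explicitly naming a queue places that queue's limits on the job:
qsub -q standby.q myproject.qs

# Usually preferable: request the resources your job needs and let
# Grid Engine choose an appropriate queue:
qsub -l h_rt=4:00:00 myproject.qs
</code>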
  
===== Job scheduling system =====
  
A job scheduling system is used to manage and control the computing resources for all jobs submitted to a cluster. This includes load balancing, limiting resources, reconciling requests for memory and processor cores with availability of those resources, suspending and restarting jobs, and managing jobs with different priorities.
  
Each investing-entity's group (workgroup) has owner queues that allow the use of a fixed number of slots to match the total number of cores purchased.  If a job is submitted that would use more than the slots allowed, the job will wait until enough slots are made available by completed jobs.  There is no time limit imposed on owner queue jobs.  All users can see running and waiting jobs, which allows groups to work out policies for managing purchased nodes.
  
The standby queues are available for projects requiring more slots than purchased, or to take advantage of idle nodes when a job would have to wait in the owner queue.  Other workgroup nodes will be used, so standby jobs have a time limit, and users are limited to a total number of cores for all of their standby jobs.  Generally, users can use 10 nodes for an 8 hour standby job or 40 nodes for a 4 hour standby job.
  
A spillover queue may be available for the case where a job is submitted to the owner queue, and there are standby jobs consuming needed slots. Instead of waiting, the job will be sent to the spillover queue to start on a similar idle node.
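
For example, to submit a batch job to the standby queues (the script name is illustrative; the ''standby'' resource and ''h_rt'' limit are the standard submission options for these queues):

<code bash>
# Standby job (waits for idle nodes; 8-hour wall-clock limit):
qsub -l standby=1 myproject.qs

# Four-hour standby job:
qsub -l standby=1,h_rt=4:00:00 myproject.qs
</code>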
  
  
==== Grid Engine ====
  
The Grid Engine job scheduling system is used to manage and control the computing resources for all jobs submitted to a cluster. This includes load balancing, reconciling requests for memory and processor cores with availability of those resources, suspending and restarting jobs, and managing jobs with different priorities. Grid Engine on Farber is Univa Grid Engine but still referred to as SGE.
  
In order to schedule any job (interactively or batch) on a cluster, you must set your [[abstract/farber/system_access/system_access#logging-on-to-farber|workgroup]] to define your cluster group or //investing-entity// compute nodes.
  
See [[abstract/farber/runjobs/schedule_jobs|Scheduling Jobs]] and [[abstract/farber/runjobs/job_status|Managing Jobs]] on the <html><span style="color:#ffffff;background-color:#2fa4e7;padding:3px 7px !important;border-radius:4px;">sidebar</span></html> for general information about getting started with scheduling and managing jobs on a cluster using Grid Engine.
  
===== Runtime environment =====
  
Generally, your runtime environment (path, environment variables, etc.) should be the same as your compile-time environment. Usually, the best way to achieve this is to put the relevant VALET commands in shell scripts. You can reuse common sets of commands by storing them in a shell script file that can be //sourced// from within other shell script files.
  
<note important>
If you are writing an executable script that does not have the **-l** option on the **bash** command, and you want to include VALET commands in your script, then you should include the line:
<code bash>
source /etc/profile.d/valet.sh
</code>
You do not need this command when you
  - type commands, or source the command file,
  - include lines in the file to be submitted via **qsub**.
</note>
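
For example, an executable script run without ''bash -l'' might begin like this (the package name passed to ''vpkg_require'' is only a placeholder for whatever your job actually needs):

<code bash>
#!/bin/bash
# Not a login shell, so make VALET available first:
source /etc/profile.d/valet.sh
# Load needed packages (package name here is an example):
vpkg_require example-package
</code>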
  
  
  
===== Getting Help =====
  
Grid Engine includes man pages for all of the commands that will be reviewed in this document.  When logged-in to a cluster, type
  
<code bash>
[traine@farber ~]$ man qstat
</code>
  
to learn more about a Grid Engine command (in this case, ''qstat'').  Most commands will also respond to the ''-help'' command-line option to provide a succinct usage summary:
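
<code bash>
[traine@farber ~]$ qstat -help
</code>

(The usage summary itself is omitted here.)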
- +
-Grid Engine uses these environment variables' values when creating the job's output files:  +
- +
-^ File name patter ^ Description ^ +
-| [$JOB_NAME].o[$JOB_ID] | Default **output** filename | +
-| [$JOB_NAME].e[$JOB_ID] | **error** filename (when not joined to output) | +
-| [$JOB_NAME].po[$JOB_ID] | Parallel job **output** output (Empty for most queues) | +
-| [$JOB_NAME].pe[$JOB_ID] | Parallel job **error** filename (Usually empty) | +
- +
- +
-=== Command options for qsub === +
- +
-The most commonly used **qsub** options fall into two categories: //operational// and //resource-management//. The operational options deal with naming the output files, mail notification of the processing steps, sequencing of series of jobs, and establishing the UNIX environment. The resource-management options deal with the specific system resources you desire or need, such as parallel programming environments, number of processor cores, maximum CPU time, and virtual memory needed. +
- +
-The table below lists **qsub**'s common operational options. +
- +
-^ Option / Argument ^ Function ^ +
-| ''-N'' <<//job_name//>> | Names the job <//job_name//>. Default: the job script's full filename. | +
-| ''-m'' {b%%|%%e%%|%%a%%|%%s%%|%%n} | Specifies when e-mail notifications of the job's status should be sent: **b**eginning, **e**nd, **a**bort, **s**uspend. Default: **n**ever | +
-| ''-M'' <<//email_address//>> | Specifies the email address to use for notifications. | +
-| ''-j'' {y%%|%%n} | Joins (redirects) the //STDERR// results to //STDOUT//. Default: **y**(//yes//) | +
-| ''-o'' <<//output_file//>> | Directs job output //STDOUT// to <//output_file//>. Default: see [[#grid-engine-environment-variables|Grid Engine environment variables]] | +
-| ''-e'' <<//error_file//>> | Directs job errors (//STDERR//) to <//error_file//>. File is only produced when the **qsub** option **–j n** is used. | +
-| ''-hold_jid'' <//job_list//>  | Holds job until the jobs named in <//job_list//> are completed. Job may be listed as a list of comma-separated numeric job ids or job names. | +
-| ''-t'' <<//task_id_range//>> | Used for //array// jobs.  See [[#array-jobs|Array jobs]] for details. | +
-^ Special notes for IT clusters: ^^ +
-| ''-cwd'' | Default. Uses current directory as the job's working directory. | +
-| ''-V'' | Ignored. Generallythe login node's environment is not appropriate to pass to a compute node. Instead, you must define the environment variables directly in the job script. | +
-''-q'' <<//queue_name//>> | Not need in most cases. Your choice of resource-management options determine the queue. | +
-^ The resource-management options for ''qsub'' have two common forms: ^^ +
-| ''-l'' <<//resource//>>''=''<<//value//>> || +
-| ''-pe ''<<//parallel_environment//>> <<//Nproc//>> || +
- +
-For example, putting the lines  +
-<file> +
-#$ -l h_cpu=1:30:00 +
-#$ –pe threads 12 +
-</file> +
-in the job script tells Grid Engine to set a hard limit of 1.5 hours on the CPU time resource for the job, and to assign 12 processors for your job. +
- +
-Grid Engine tries to satisfy all of the resource-management options you specify in a job script or as qsub command-line options. If there is a queue already defined that accepts jobs having that particular combination of requests, Grid Engine assigns your job to that queue. +
- +
-==== Array jobs ==== +
- +
-An [[:general:jobsched:grid-engine:35_parallelism#array-jobs|array job]] essentially runs the same job by generating a new repeated task many times. Each time, the environment variable **SGE_TASK_ID** is set to a sequence number by Grid Engine and its value provides input to the job submission script. +
- +
-<note tip> +
-The ''$SGE_TASK_ID'' is the key to make the array jobs useful.  Use it in your bash script, or pass it as a parameter so your program can decide how to complete the assigned task. +
-   +
-For example, the ''$SGE_TASK_ID'' sequence values of 2, 4, 6, ... , 5000 might be passed as an initial data value to 2500 repetitions of a simulation model. Alternatively, each iteration (taskof a job might use a different data file with filenames of ''data$SGE_TASK_ID'' (i.e., data1, data2, data3, ', data2000). +
-</note> +
- +
-The general form of the **qsub** option is: +
- +
--t   //start_value// - //stop_value// : //step_size// +
- +
-with a default step_size of 1. For these examples, the option would be: +
- +
--t 2-5000:    and     -t 1-2000 +
- +
-Additional simple how-to examples for [[http://wiki.gridengine.info/wiki/index.php/Simple-Job-Array-Howto|array jobs]]. +
- +
-==== Chaining jobs ==== +
- +
-If you have a multiple jobs where you want to automatically run other job(s) after the execution of another job, then you can use chaining. When you chain jobs, remember to check the status of the other job to determine if it successfully completed. This will prevent the system from flooding the scheduler with failed jobs.  Here is a simple chaining example with three job scripts ''doThing1.qs'', ''doThing2.qs'' and ''doThing3.qs''+
- +
-<code - doThing1.qs> +
- +
-#$ -N doThing1 +
-+
-# If you want an email message to be sent to you when your job ultimately +
-# finishes, edit the -M line to have your email address and change the +
-# next two lines to start with #$ instead of just # +
-# -m eas +
-# -M my_address@mail.server.com +
-+
-# Setup the environment; add vpkg_require commands after this +
-# line: +
- +
-# Now append all of your shell commands necessary to run your program +
-# after this line: +
- ./dotask1 +
-</code> +
- +
-<code - doThing2.qs> +
- +
-#$ -N doThing2 +
-#$ -hold_jid doThing1 +
-+
-# If you want an email message to be sent to you when your job ultimately +
-# finishes, edit the -M line to have your email address and change the +
-# next two lines to start with #$ instead of just # +
-# -m eas +
-# -M my_address@mail.server.com +
-+
-# Setup the environment; add vpkg_require commands after this +
-# line: +
- +
-# Now append all of your shell commands necessary to run your program +
-# after this line: +
- +
-# Here is where you should add a test to make sure +
-# that dotask1 successfully completed before running +
-# ./dotask2 +
-# You might check if a specific file(s) exists that you would +
-# expect after a successful dotask1 run, something like this +
-#  if [ -e dotask1.log ]  +
-#      then ./dotask2 +
-#  fi +
-# If dotask1.log does not exist it will do nothing. +
-# If you don't need a test, then you would run the task. +
- ./dotask2 +
-</code> +
- +
-<code - doThing3.qs> +
- +
-#$ -N doThing3 +
-#$ -hold_jid doThing2 +
-+
-# If you want an email message to be sent to you when your job ultimately +
-# finishes, edit the -M line to have your email address and change the +
-# next two lines to start with #$ instead of just # +
-# -m eas +
-# -M my_address@mail.server.com +
-+
-# Setup the environment; add vpkg_require commands after this +
-# line: +
- +
-# Now append all of your shell commands necessary to run your program +
-# after this line: +
-# Here is where you should add a test to make sure +
-# that dotask2 successfully completed before running +
-# ./dotask3 +
-# You might check if a specific file(s) exists that you would +
-# expect after a successful dotask2 run, something like this +
-#  if [ -e dotask2.log ]  +
-#      then ./dotask3 +
-#  fi +
-# If dotask2.log does not exist it will do nothing. +
-# If you don't need a test, then just run the task. +
- ./dotask3 +
-</code> +
- +
-Now submit all three job scripts. In this example, we are using account ''traine'' in workgroup ''it_css'' on Mills. +
- +
-<code> +
-[(it_css:traine)@mills ~]$ qsub doThing1.qs +
-[(it_css:traine)@mills ~]$ qsub doThing2.qs +
-[(it_css:traine)@mills ~]$ qsub doThing3.qs +
-</code> +
- +
-The basic flow is ''doThing2'' will wait until ''doThing1'' finishes, and ''doThing3'' will wait until ''doThing2'' finishes.  If you test for success, then ''doThing2'' will check to make sure that ''doThing1'' was successful before running, and ''doThing3'' will check to make sure that ''doThing2'' was successful before running. +
- +
-You might also want to have ''doThing1'' and ''doThing2'' execute at the same time, and only run ''doThing3'' after they finish.  In this case you will need to change ''doThing2'' and ''doThing3'' scripts and tests. +
- +
-<code doThing2.qs> +
- +
-#$ -N doThing2 +
-+
-# If you want an email message to be sent to you when your job ultimately +
-# finishes, edit the -M line to have your email address and change the +
-# next two lines to start with #$ instead of just # +
-# -m eas +
-# -M my_address@mail.server.com +
-+
-# Setup the environment; add vpkg_require commands after this +
-# line: +
- +
-# Now append all of your shell commands necessary to run your program +
-# after this line: +
- ./dotask2 +
-</code> +
- +
-<code - doThing3.qs> +
- +
-#$ -N doThing3 +
-#$ -hold_jid doThing1,doThing2 +
-+
-# If you want an email message to be sent to you when your job ultimately +
-# finishes, edit the -M line to have your email address and change the +
-# next two lines to start with #$ instead of just # +
-# -m eas +
-# -M my_address@mail.server.com +
-+
-# Setup the environment; add vpkg_require commands after this +
-# line: +
- +
-# Now append all of your shell commands necessary to run your program +
-# after this line: +
-# Here is where you should add test to make sure +
-# that dotask1 and dotask2 successfully completed before running +
-# ./dotask3 +
-# You might check if a specific file(s) exists that you would +
-# expect after a successful dotask1 and dotask2 run, something like this +
-#  if [ -e dotask1.log -a -e dotask2.log ]; +
-#      then ./dotask3 +
-#  fi +
-# If both files do not exist it will do nothing. +
-# If you don't need a test, then just run the task. +
- ./dotask3 +
-</code> +
- +
-Now submit all three jobs again. However this time ''doThing1'' and ''doThing2'' will run at the same time, and only when they are both finished, will ''doThing3'' run.  ''doThing3'' will check to make sure ''doThing1'' and ''doThing2'' are successful  +
-before running. +
- +
-==== Resource-management options ==== +
- +
-Any large cluster will have many nodes with perhaps differing resources, e.g., cores, memory, disk space and accelerators. +
-The ones you can request come in three categories. +
- +
-  - Fixed resources by the configuration - slots and installed memory, +
-  - Set by load sensor - CPU load averages, memory usage +
-  - Managed by job scheduler internal bookkeeping to ensure availability - available memory and floating software licenses.  +
- +
-**Details by cluster** +
- +
-   * [[clusters:mills:runapps#resource-management-options|Mills]] +
-   * [[clusters:farber:runapps#resource-management-options|Farber]] +

===== Managing Jobs =====
==== Checking job status ====

Use the **qstat** command to check the status of queued jobs.  Use the ''qstat -help'' or ''man qstat'' commands on the login node to view a complete description of available options.  Some of the most often-used options are summarized here:

^ Option ^ Result ^
| ''-j'' <<//job_id_list//>> | Displays information for the specified job(s) |
| ''-u'' <<//user_list//>> | Displays information for jobs associated with the specified user(s) |
| ''-ext'' | Displays extended information about jobs |
| ''-t'' | Shows additional information about subtasks |
| ''-r'' | Shows resource requirements of jobs |

For example, to list the information for job 62900, type
<code>
qstat -j 62900
</code>

To list a table of jobs assigned to user //traine// that displays the resource requirements for each job, type
<code>
qstat -u traine -r
</code>

With no options, **qstat** defaults to ''qstat -u $USER'', so you get a table of your own jobs.  With the ''-u'' option the **qstat** command uses //Reduced Format// with the following columns.

^ Column header ^ Description ^
| ''job-ID'' | job id assigned to the job |
| ''user'' | user who owns the job |
| ''name'' | job name |
| ''state'' | current job status, including **qw**(aiting), **s**(uspended), **r**(unning), **h**(old), **E**(rror), **d**(eletion) |
| ''submit/start at'' | submit time (waiting jobs) or start time (running jobs) |
| ''queue'' | name of the queue the job is assigned to (for running or suspended jobs only) |
| ''slots'' | number of slots assigned to the job |

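For scripting, the Reduced Format is easy to post-process.  A minimal sketch, run here against a saved copy of a sample listing in the format described above (on the cluster you would pipe the live ''qstat -u $USER'' output instead):

<code bash>
# Saved copy of a reduced-format listing (format as documented above).
qstat_output='job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
  99154 0.50661 openmpi-pg traine       qw    11/12/2012 14:33:49                                  144'

# Skip the two header lines and print the job-ID of every waiting (qw) job.
echo "$qstat_output" | awk 'NR > 2 && $5 == "qw" {print $1}'
</code>

This prints ''99154'', the one waiting job in the sample.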
=== A more concise listing ===

The IT-supplied **qjobs** command provides a more convenient listing of job status.

^ Command ^ Description ^
| ''qjobs'' | Displays the status of jobs submitted by you |
| ''qjobs -g'' | Displays the status of jobs submitted by your research group |
| ''qjobs -g'' <<//investing_entity//>> | Displays the status of jobs submitted by members of the named investing-entity |
| ''qjobs -a'' | Displays the status of jobs submitted by **a**ll users |

In all cases the JobID, Owner, State and Name are listed in a table.

=== Job status is qw ===

When your job status is ''qw'' it means your job is queued and waiting to execute.  When you check with ''qstat'' you might see something like this:
  
<code base>
[(it_css:traine)@mills it_css]$ qstat -u traine
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
  99154 0.50661 openmpi-pg traine       qw    11/12/2012 14:33:49                                  144
</code>

-Sometimes your job is stuck and remains in the ''qw'' state and never starts running.  You can use **qalter** to poke at the job scheduler to see why your job is not running.  For example, to see the last 10 lines of the job scheduler validation for job 99154, you can type  +//This section uses the wiki's [[http://docs-dev.hpc.udel.edu/doku.php#documentation-conventions|documentation conventions]].//
- +
-<code base> +
-[(it_css:traine)@mills it_css]$ qalter -w p 99154 | tail -10 +
-Job 99154 has no permission for cluster queue "puleo-qrsh.q" +
-Job 99154 has no permission for cluster queue "capsl.q+" +
-Job 99154 has no permission for cluster queue "spare.q" +
-Job 99154 has no permission for cluster queue "it_nss-qrsh.q" +
-Job 99154 has no permission for cluster queue "it_nss.q" +
-Job 99154 has no permission for cluster queue "it_nss.q+" +
-Job 99154 Jobs cannot run because only 72 of 144 requested slots are available +
-Job 99154 Jobs can not run in PE "openmpi" because the resource requirements can not be satified +
-verification: no suitable queues +
-</code> +
- +
-In this example, we asked for 144 slots, but only 72 slots are available for workgroup ''it_css'' nodes. +

<code base>
[(it_css:traine)@mills it_css]$ qstatgrp
CLUSTER QUEUE                   CQLOAD   USED    RES  AVAIL  TOTAL aoACDPS  cdsuE
it_css-dev.q                      0.00      0      0     72     72      0      0
it_css-qrsh.q                     0.00      0      0     72     72      0      0
it_css.q                          0.00      0      0     72     72      0      0
it_css.q+                         0.00      0      0     72     72      0      0
standby-4h.q                      0.27      0      0   4968   5064      0     96
standby.q                         0.27     12      0   4932   5064      0    120
</code>

Use **qalter** to change the attributes of the pending job, such as reducing the number of slots requested to fit within the workgroup ''it_css'' nodes, or changing the resources specified to the [[general:jobsched:standby|standby queue]] so the job can run. For example, let's change the number of slots requested from 144 to 48 by typing

<code base>
[(it_css:traine)@mills it_css]$ qalter -pe openmpi 48 99154
modified parallel environment of job 99154
modified slot range of job 99154
[(it_css:traine)@mills it_css]$ qstat -u traine
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
  99154 0.50661 openmpi-pg traine       qw    11/12/2012 14:33:49                                  48
</code>

Another way to get this job running would be to change its resource list so the job runs in the standby queue.  To do this you must specify all resources, since ''qalter'' completely replaces any resource list previously specified for the job. In this example, we alter the job to run in the standby queue by typing
<code base>
[(it_css:traine)@mills it_css]$ qalter -l idle=0,standby=1 99154
modified hard resource list of job 99154
[(it_css:traine)@mills it_css]$ qstat -u traine
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
  99154 0.50661 openmpi-pg traine       r     11/12/2012 15:23:52 standby.q@n016                   144
</code>

<note important>''qalter'' can only be used to alter jobs that you own!</note>

=== Job status is Eqw ===

When your job status is ''Eqw'' it means an error occurred when Grid Engine attempted to schedule the job, so it has been returned to the ''qw'' state.  When you check with ''qstat'' you might see something like this for user ''traine''

<code base>
[(it_css:traine)@mills it_css]$ qstat -u traine
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 686924 0.50509 openmpi-pg traine       Eqw   08/12/2014 19:38:53                                                              1
</code>

If the state shows ''Eqw'', then use ''qstat -j //job_id// | grep error'' to check for the error. Here is an example of what you might see

<code base>
[traine@mills ~]$ qstat -j 686924 | grep error
error reason    1:          08/12/2014 22:08:27 [1208:60529]: error: can't chdir to /archive/it_css/traine/ex-openmpi: No such file or directory
</code>

This error indicates that a directory or file cannot be found.  Verify that the file or directory in question exists, i.e., you haven't forgotten to create it, and that you can see it from the head node and the compute nodes. If it appears to be okay, then the job may have suffered a transient condition such as a failed NFS automount, a temporarily unavailable NFS server, or some other filesystem error.

If you understand the reason and have fixed the problem, use ''qmod -cj //job_id//'' to clear the error state like this:

<code base>
[traine@mills ~]$ qmod -cj 686924
</code>

and the job should eventually run.

==== Checking queue status ====

The **qstat** command can also be used to get the status of all queues on the system.

^ Option ^ Result ^
| ''-f'' | Displays summary information for all queues |
| ''-ne'' | Suppresses the display of empty queues |
| ''-qs'' {a%%|%%c%%|%%d%%|%%o%%|%%s%%|%%u%%|%%A%%|%%C%%|%%D%%|%%E%%|%%S} | Selects queues to be displayed according to state |

With the ''-f'' option, **qstat** uses //full format//, which includes the following columns.

^ Column header ^ Description ^
| ''queuename'' | name of the queue |
| ''resv/used/total'' | number of slots reserved/used/total |
| ''states'' | current queue status, including **a**(larm), **s**(uspended), **d**(isabled), **h**(old),\\ **E**(rror), **P**(reempted) |

Examples:

List all queues that are unavailable because they are disabled or the slotwise preemption limits have been reached.
<code bash>
qstat -f -qs dP
</code>

List the queues associated with the investing entity //it_css//.
<code bash>
qstat -f | egrep '(queuename|it_css)'
</code>

==== Checking overall queue and node information ====

You can determine overall queue and node information using the ''qstatgrp'', ''qconf'', ''qnodes'' and ''qhostgrp'' commands. Use a command's ''-h'' option to see its command syntax. To obtain information about a group other than your current group, use the ''-g'' option.

^ Command ^ Illustrative example ^
| ''qstatgrp'' | ''qstatgrp'' shows a summary of the status of the owner-group queues of your current workgroup. |
| ''qstatgrp -j'' | ''qstatgrp -j'' shows the status of each job in the owner-group queues that members\\ of your current workgroup submitted. |
| ''qstatgrp -g'' <<//investing_entity//>> | ''qstatgrp -g it_css'' shows the status of all the owner-group queues for the\\ //it_css// investing-entity. |
| ''qstatgrp -j -g'' <<//investing_entity//>> | ''qstatgrp -j -g it_css'' shows the status of each job in the owner-group queues that\\ members of the //it_css// investing-entity submitted. |
| ''qconf -sql'' | **S**hows all **q**ueues as a **l**ist. |
| ''qconf -sq'' <<//queue_name//>> | ''qconf -sq it_css*'' displays the configuration of each owner-group queue for the\\ //it_css// investing-entity. |
| ''qnodes'' | ''qnodes'' displays the names of your owner-group's nodes. |
| ''qnodes -g'' <<//investing_entity//>> | ''qnodes -g it_css'' displays the names of the nodes owned by the\\ //it_css// investing-entity. |
| ''qhostgrp'' | ''qhostgrp'' displays the current status of your owner-group's nodes. |
| ''qhostgrp -g'' <<//investing_entity//>> | ''qhostgrp -g it_css'' displays the current status of the nodes owned by the\\ //it_css// investing-entity. |
| ''qhostgrp -j -g'' <<//investing_entity//>> | ''qhostgrp -j -g it_css'' shows all jobs running (including [[general/jobsched/standby|standby]] and spillover) in the owner-group nodes for the //it_css// investing-entity. |

==== Checking overall usage of resource quotas ====

Resource quotas are used to help control the standby and spillover queues.  Each user has a quota based on the limits set by the [[general/jobsched/standby|standby]] queue specifications for each cluster, and each workgroup has a per_workgroup quota based on the number of slots purchased by the research group.

^ Command ^ Illustrative example ^
| ''qquota -u'' <<//username//>> ''| grep standby'' | ''qquota -u traine | grep standby'' displays the current usage of slots by user\\ //traine// in the standby resources. |
| ''qquota -u \* | grep'' <<//investing_entity//>> | ''qquota -u \* | grep it_css'' displays the current usage of slots being used by all\\ members of the //it_css// investing-entity, the per_workgroup quota. |

The example below gives a snapshot of the slots being used by user ''traine'' in the standby queues and the slots being used by all members of the workgroup ''it_css''.

<code>
$ qquota -u traine | grep standby
standby_limits/4h  slots=80/800         users traine queues standby-4h.q
standby_cumulative/default slots=80/800         users traine queues standby.q,standby-4h.q
$ qquota -u \* | grep it_css
per_workgroup/it_css slots=141/200        users @it_css queues it_css.q,spillover.q
</code>

<note important>If there are no jobs running as part of your workgroup, then your per_workgroup quota (of 0 out of N slots) is not displayed at all.</note>
==== Deleting a job ====

Use the **qdel** <<//job_id//>> command to remove pending and running jobs from the queue.

For example, to delete job 28000, type
<code bash>
  qdel 28000
</code>

<note important>**Your job is not deleted**

If you have a job that remains in a deletion state, even after you try to delete it with the
**qdel** command, then try a force deletion with
<code bash>
  qdel -f 28000
</code>
This forgets about the job without attempting any cleanup on the node(s) being used.

</note>
  