====== Computational models for running Matlab on a shared cluster ======

By default, MATLAB uses multiple computational threads. From the MATLAB R2011b documentation:

> ''matlab -singleCompThread'' limits MATLAB to a single computational thread. By default, MATLAB makes use of the multithreading capabilities of the computer on which it is running.

The default, multiple computational threads, is never a good option when you are sharing a node. Either start MATLAB with the ''-singleCompThread'' option, or schedule the MATLAB job with the exclusive option of the cluster's job scheduler, such as ''-l exclusive=1'' for Grid Engine or ''#SBATCH --exclusive'' for Slurm.

Exclusive access to a node does not mean MATLAB will use all the cores and memory. Monitor the job to see how many cores and how much memory it actually needs. To take advantage of multiple cores you must use the built-in matrix functions. You should see CPU utilization above 100% while the matrix functions are being executed.

With the distributed computing toolbox, MATLAB can also create a parallel pool of workers to be dispatched in parallel.

===== Multiple computational threads on one node =====

MATLAB makes use of the multithreading capabilities of the computer on which it is running. MATLAB uses MKL as its BLAS and LAPACK backend. The versions can be determined with the MATLAB commands

  version -blas
  version -lapack

To make full use of the MKL computational threads you need to use the built-in matrix functions. The work needed to execute a built-in function is distributed to multiple cores using MKL threads, which are compatible with OpenMP threads. All the cores share the same memory, so this is also called the shared memory model for parallel computing.

A simple model of how the total MATLAB job performs is

  CPU = (p*20 + (1-p))*WALL

Here ''p'' is the fraction of the wall-clock time spent in multithreaded code running on all 20 cores of a node, and ''1-p'' is the fraction spent on a single core.

The actual number of computational threads is not explicitly mentioned in the Unix documentation.
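As a rough illustration, the simple model above can be solved for ''p'' once a job has finished and ''qacct'' has reported its ''cpu'' and ''ru_wallclock'' values. The sketch below is only an example under stated assumptions: the helper name ''estimate_p'' is made up for this page, ''p'' is read as the fraction of wall-clock time during which all cores are busy, and the sample numbers are the ''qacct'' values of the exclusive-access job 425422 reported on this page.

```shell
#!/bin/sh
# Solve CPU = (p*N + (1-p))*WALL for p, where N is the number of cores.
# estimate_p is a hypothetical helper name, not part of MATLAB or Grid Engine.
estimate_p() {
    # $1 = cpu seconds, $2 = wall-clock seconds, $3 = cores on the node (default 20)
    awk -v cpu="$1" -v wall="$2" -v n="${3:-20}" \
        'BEGIN { printf "%.3f\n", (cpu / wall - 1) / (n - 1) }'
}

# qacct values of job 425422: cpu 53089.736, ru_wallclock 8037.427
estimate_p 53089.736 8037.427 20
```

With these numbers the estimate is about 0.295, so under this model roughly 30% of the wall-clock time was spent with all 20 cores busy.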
For Windows, the documentation specifies that MATLAB will use all the cores on the machine. This is clearly not appropriate for Unix clusters. Observations on Mills show that MATLAB may use all the cores, but on average it uses far fewer. To use more than one core, the MATLAB job must be written to use the standard high-performance libraries (MKL) linked into the MATLAB executable. This works well, but it is not optimized for the Mills processors or threading libraries.

==== Test batch jobs using GridEngine ====

Several copies of the same MATLAB script were submitted to run simultaneously. Only the batch script directives varied.

=== Batch job with exclusive access (only job on node) ===

Part of batch script file:

  $ tail -4 batche.qs
  #$ -l exclusive=1
  vpkg_require matlab/r2014b
  matlab -nodisplay -nojvm -r 'script'

CGROUP report from batch output file:

  $ grep CGROUPS *.o425422
  [CGROUPS] No /cgroup/memory/UGE/425422.1 exists for this job
  [CGROUPS] UD Grid Engine cgroup setup commencing
  [CGROUPS] Setting none bytes (vmem none bytes) on n171 (master)
  [CGROUPS] with 20 cores =
  [CGROUPS] done.

Memory and timing results:

  $ qacct -h n171 -j 425422 | egrep '(start|maxvmem|maxrss|cpu|wallclock|failed)'
  start_time 02/16/2016 13:52:16.213
  failed 0
  ru_wallclock 8037.427
  ru_maxrss 658584
  cpu 53089.736
  maxvmem 2.882G
  maxrss 644.949M

=== Batch job with 5 slots 370 MB per core (1.85 GB total) ===

Part of batch script file:

  $ tail -6 batch5.qs
  #$ -pe threads 5
  #$ -l mem_total=1.9G
  #$ -l m_mem_free=370M
  vpkg_require matlab/r2015a
  matlab -nodisplay -nojvm -r 'script'

CGROUP report from batch output file:

  $ grep CGROUPS *.o428562
  [CGROUPS] UD Grid Engine cgroup setup commencing
  [CGROUPS] Setting 388050944 bytes (vmem none bytes) on n139 (master)
  [CGROUPS] with 5 cores =
  [CGROUPS] done.
Memory and timing results:

  $ qacct -h n139 -j 428562 | egrep '(start|maxvmem|maxrss|cpu|wallclock|failed)'
  start_time 02/17/2016 18:22:54.254
  failed 0
  ru_wallclock 5.297
  ru_maxrss 165232
  cpu 3.090
  maxvmem 1017.906M
  maxrss 155.109M

=== Batch job with 4 slots 1 GB per core (4 GB total) ===

Part of batch script file:

  $ cat batch.qs
  #$ -pe threads 4
  #$ -l m_mem_free=1G
  vpkg_require matlab/r2014b
  matlab -nodisplay -nojvm -r 'script'

CGROUP report from batch output file:

  $ grep CGROUPS *.o418695
  [CGROUPS] UD Grid Engine cgroup setup commencing
  [CGROUPS] Setting 1073741824 bytes (vmem none bytes) on n036 (master)
  [CGROUPS] with 4 cores = 0 2 4 6
  [CGROUPS] done.

This job is sharing the node with the previous job on cores 5-8.

Memory and timing results:

  $ qacct -h n036 -j 418695 | egrep '(maxvmem|maxrss|cpu|wallclock|failed)'
  failed 0
  ru_wallclock 826.759
  ru_maxrss 595188
  cpu 1629.194
  maxvmem 1.801G
  maxrss 583.039M

=== Batch job with 3 slots 1 GB per core (3 GB total) ===

Part of batch script file:

  $ cat batch.qs
  #$ -pe threads 3
  #$ -l m_mem_free=1G
  vpkg_require matlab/r2014b
  matlab -nodisplay -nojvm -r 'script'

CGROUP report from batch output file:

  $ grep CGROUPS *.o408597
  [CGROUPS] UD Grid Engine cgroup setup commencing
  [CGROUPS] Setting 3221225472 bytes (vmem 9223372036854775807 bytes) on n039 (master)
  [CGROUPS] with 3 cores = 0-2
  [CGROUPS] done.
Memory and timing results:

  $ qacct -h n039 -j 408597 | egrep '(maxvmem|maxrss|cpu|wallclock)'
  ru_wallclock 13877.991
  ru_maxrss 2089812
  cpu 90776.109
  maxvmem 4.180G
  maxrss 0.000

=== Batch job with 2 slots 3.1 GB per core (6.2 GB total) ===

3.1 GB per core on a 20-core node is 62 GB, which fills all 20 cores (10 jobs of this size) and leaves 2 GB to spare for system overhead.

Part of batch script file:

  $ cat batch.qs
  #$ -pe threads 2
  #$ -l m_mem_free=3.1G
  vpkg_require matlab/r2014b
  matlab -nodisplay -nojvm -r 'script'

CGROUP report from batch output file:

  $ grep CGROUPS *.o408598
  [CGROUPS] UD Grid Engine cgroup setup commencing
  [CGROUPS] Setting 6657200128 bytes (vmem 9223372036854775807 bytes) on n039 (master)
  [CGROUPS] with 2 cores = 3-4
  [CGROUPS] done.

This job is sharing the node with the previous job and runs on cores 3-4.

Memory and timing results:

  $ qacct -h n039 -j 408598 | egrep '(maxvmem|maxrss|cpu|wallclock)'
  ru_wallclock 13904.972
  ru_maxrss 2152212
  cpu 92110.859
  maxvmem 4.208G
  maxrss 0.000

=== Batch job with 1 slot 3.1 GB per core (3.1 GB total) ===

3.1 GB per core on a 20-core node is 62 GB, which allows 20 jobs to fit with 2 GB to spare for system overhead.

Part of batch script file:

  $ cat batch.qs
  #$ -l m_mem_free=3.1G
  vpkg_require matlab/r2014b
  matlab -nodisplay -nojvm -r 'script'

CGROUP report from batch output file:

  $ grep CGROUPS *.o408599
  [CGROUPS] UD Grid Engine cgroup setup commencing
  [CGROUPS] Setting 3328602112 bytes (vmem 9223372036854775807 bytes) on n036 (master)
  [CGROUPS] with 1 core = 0
  [CGROUPS] done.
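The sizing rule quoted for this job, 3.1 GB per core on a 20-core node, can be reproduced with a small calculation. The sketch below assumes a node with 64 GB of memory and a 2 GB reserve for system overhead, which is how the 62 GB figure is read here. The helper name ''mem_per_slot'' is made up for illustration.

```shell
#!/bin/sh
# mem_per_slot is a hypothetical helper: it divides the memory left after a
# system reserve evenly over the cores and prints a value in the form used
# by the m_mem_free directive (for example 3.1G).
mem_per_slot() {
    # $1 = node memory in GB, $2 = GB reserved for the system, $3 = cores
    awk -v node="$1" -v reserve="$2" -v cores="$3" \
        'BEGIN { printf "%.1fG\n", (node - reserve) / cores }'
}

# Assumed 64 GB node, 2 GB reserve, 20 cores
mem_per_slot 64 2 20
```

For the assumed values this prints ''3.1G'', matching the ''m_mem_free=3.1G'' request in the batch script.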
Memory and timing results:

  $ qacct -h n036 -j 408599 | egrep '(maxvmem|maxrss|cpu|wallclock)'
  ru_wallclock 8607.872
  ru_maxrss 1935860
  cpu 51805.427
  maxvmem 4.036G
  maxrss 0.000

==== Table ====

^ ^^ requested ^^ used memory and time ^^^
^ jobid ^ host ^ cores ^ memory ^ maxvmem ^ cpu ^ wallclock ^
| 408594 | n038 | all 20 | all <64GB | 4.155G | 51321.533 | 8613.132 |
| 408595 | n037 | 5 | 5G | 4.043G | 86578.676 | 13051.171 |
| 408596 | n037 | 4 | 4G | 4.301G | 86330.547 | 13067.863 |
| 408597 | n039 | 3 | 3G | 4.180G | 90776.109 | 13877.991 |
| 408598 | n039 | 2 | 6.2G | 4.208G | 92110.859 | 13904.972 |
| 408599 | n031 | default 1 | 3.1G | 4.036G | 51805.427 | 8607.872 |

==== Table new spread over nodes ====

^ ^^ requested ^^ used memory and time ^^^
^ jobid ^ host ^ cores ^ memory ^ maxvmem ^ cpu ^ wallclock ^
| 418705 | n172 | all 20 | all <64GB | 2.904G | 5553.820 | 1089.789 |
| 418704 | n039 | 5 | 5G | 1.874G | 1778.309 | 804.490 |
| 418695 | n036 | 4 | 4G | 1.801G | 1629.194 | 826.759 |
| 418693 | n037 | 3 | 3G | 1.735G | 1475.837 | 863.386 |
| 418691 | n040 | 2 | 6.2G | 1.662G | 1334.752 | 944.711 |
| 418690 | n038 | default 1 | 1G | 1.536G | 1164.087 | 1173.832 |

==== Table new same node ====

^ ^^ requested ^^ used memory and time ^^^^
^ jobid ^ host ^ cores ^ memory ^ maxvmem ^ maxrss ^ cpu ^ wallclock ^
| 418768 | n172 | all 20 | all <64GB | 3.805G | 1.633G | 5246.490 | 882.568 |
| 418773 | n036 | 5 | 5G | 1.852G | 578.457M | 1953.868 | 930.284 |
| 418772 | n036 | 4 | 4G | 1.779G | 579.109M | 1800.191 | 949.475 |
| 418771 | n036 | 3 | 3G | 1.709G | 570.246M | 1660.543 | 996.545 |
| 418770 | n036 | 2 | 6.2G | 1.640G | 557.363M | 1543.664 | 1106.315 |
| 418769 | n036 | default 1 | 1G | 1.514G | 564.840M | 1356.694 | 1356.256 |

==== Graphs ====

As the number of cores increases, both the CPU time and the memory usage increase linearly. The increased memory is easily explained by the need for //private memory//, memory that is not shared.
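One way to read the tables above is to divide ''cpu'' by ''wallclock'', which gives the average number of cores a job kept busy. The sketch below does this for the rows of the "same node" table. The helper name ''busy_cores'' and the four-column input layout are assumptions made for this example only.

```shell
#!/bin/sh
# busy_cores is a hypothetical helper: it reads "jobid cores cpu wallclock"
# rows on stdin and prints jobid, requested cores, and cpu/wallclock,
# that is, the average number of cores kept busy by the job.
busy_cores() {
    awk '{ printf "%s %s %.2f\n", $1, $2, $3 / $4 }'
}

# Rows copied from the "same node" table: jobid, requested cores, cpu, wallclock
busy_cores <<'EOF'
418768 20 5246.490 882.568
418773 5 1953.868 930.284
418772 4 1800.191 949.475
418771 3 1660.543 996.545
418770 2 1543.664 1106.315
418769 1 1356.694 1356.256
EOF
```

For the 20-core job the ratio is about 5.94, so on average fewer than 6 of the 20 cores were busy, while the single-core job stays at 1.00.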
Sometimes parallel algorithms achieve a faster wall-clock time by recalculating some values, and thus the total CPU time increases.

{{:clusters:matlab:maxeigcpu.png?nolink&640|}}
{{:clusters:matlab:maxeigmem.png?640|}}

Both CPU time and memory are costs of running your algorithm, since they limit the number of other users that can use the node. To chart both, consider a simple cost of CPU*Memory in GB hours. Thus we have two objectives:

  * Reduce the run time
  * Reduce the cost

{{:clusters:matlab:maxeigcost.png?640|}}

The two extremes on the Pareto optimization curve are good choices. Using all the cores gives the fastest run time, and using one core is the least costly (so you can run 20 jobs simultaneously). The 4-core job is a good compromise.

==== Commands while running ====

  $ n=n182

**''ps''** command

  $ ssh $n ps -eo pid,ruser,pcpu,pmem,thcount,stime,time,command | egrep '(COMMAND|matlab)'
  PID RUSER %CPU %MEM THCNT STIME TIME COMMAND
  96970 traine 182 0.8 10 13:52 05:51:25 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  96971 traine 160 0.8 9 13:52 05:09:03 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  96972 traine 119 0.8 7 13:52 03:50:15 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  96974 traine 141 0.8 8 13:52 04:33:14 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  97005 traine 99.5 0.8 5 13:52 03:11:43 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  97130 traine 99.4 0.8 5 13:52 03:11:27 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -singleCompThread -r script -nojvm

**''ps''** command to get threads for one PID

  $ ssh $n ps -eLf | egrep '(PID|96970)' | grep -v ' 0 '
  UID PID PPID LWP C NLWP STIME TTY TIME CMD
  traine 96970 96222 97281 95 10 13:52 ? 03:04:20 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96970 96222 97314 21 10 13:52 ? 00:41:58 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96970 96222 97315 21 10 13:52 ? 00:41:43 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96970 96222 97316 21 10 13:52 ? 00:40:54 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96970 96222 97317 22 10 13:52 ? 00:43:43 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm

  $ ssh $n ps -eLf | egrep '(PID|96971)' | grep -v ' 0 '
  UID PID PPID LWP C NLWP STIME TTY TIME CMD
  traine 96971 96223 97283 95 9 13:52 ? 03:04:30 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96971 96223 97310 21 9 13:52 ? 00:42:07 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96971 96223 97311 21 9 13:52 ? 00:41:39 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96971 96223 97312 21 9 13:52 ? 00:42:18 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm

  $ ssh $n ps -eLf | egrep '(PID|96972)' | grep -v ' 0 '
  UID PID PPID LWP C NLWP STIME TTY TIME CMD
  traine 96972 96278 97284 97 7 13:52 ? 03:09:31 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm
  traine 96972 96278 97308 21 7 13:52 ? 00:41:50 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm

  $ ssh $n ps -eLf | egrep '(PID|97005)' | grep -v ' 0 '
  UID PID PPID LWP C NLWP STIME TTY TIME CMD
  traine 97005 96342 97275 99 5 13:52 ? 03:13:49 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -r script -nojvm

  $ ssh $n ps -eLf | egrep '(PID|97130)' | grep -v ' 0 '
  UID PID PPID LWP C NLWP STIME TTY TIME CMD
  traine 97130 96443 97282 99 5 13:52 ? 03:13:53 /home/software/matlab/r2014b/bin/glnxa64/MATLAB -nodisplay -singleCompThread -r script -nojvm

**''top''** command

  $ ssh $n top -H -b -n 1 | egrep '(COMMAND|MATLAB)' | grep -v 'S 0'
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
  97281 traine 20 0 1785m 577m 73m R 101.2 0.9 185:50.23 MATLAB
  97276 traine 20 0 1646m 572m 73m R 101.2 0.9 185:11.74 MATLAB
  97275 traine 20 0 1452m 562m 73m R 101.2 0.9 194:31.12 MATLAB
  97284 traine 20 0 1572m 562m 73m R 99.2 0.9 190:36.63 MATLAB
  97282 traine 20 0 1452m 562m 73m R 99.2 0.9 194:14.58 MATLAB
  97283 traine 20 0 1716m 575m 73m R 85.6 0.9 185:40.60 MATLAB
  97316 traine 20 0 1785m 577m 73m S 62.3 0.9 41:28.48 MATLAB
  97317 traine 20 0 1785m 577m 73m S 62.3 0.9 44:10.42 MATLAB
  97315 traine 20 0 1785m 577m 73m S 60.3 0.9 42:15.26 MATLAB
  97314 traine 20 0 1785m 577m 73m S 58.4 0.9 42:25.24 MATLAB
  97311 traine 20 0 1716m 575m 73m S 33.1 0.9 42:02.23 MATLAB
  97310 traine 20 0 1716m 575m 73m S 17.5 0.9 42:29.42 MATLAB
  97312 traine 20 0 1716m 575m 73m S 17.5 0.9 42:34.32 MATLAB
  97308 traine 20 0 1572m 562m 73m R 9.7 0.9 41:57.47 MATLAB

**''mpstat''** command

  $ ssh $n mpstat -P ALL 1 2
  Linux 2.6.32-504.30.3.el6.x86_64 (n182) 02/16/2016 _x86_64_ (20 CPU)

  05:08:25 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
  05:08:26 PM all 48.50 0.00 0.50 0.00 0.00 0.05 0.00 0.00 50.95
  05:08:26 PM 0 99.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00
  05:08:26 PM 1 14.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 86.00
  05:08:26 PM 2 54.55 0.00 0.00 0.00 0.00 0.00 0.00 0.00 45.45
  05:08:26 PM 3 50.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 48.00
  05:08:26 PM 4 53.47 0.00 0.99 0.00 0.00 0.00 0.00 0.00 45.54
  05:08:26 PM 5 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:26 PM 6 44.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 56.00
  05:08:26 PM 7 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:26 PM 8 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:26 PM 9 53.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 46.00
  05:08:26 PM 10 74.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 26.00
  05:08:26 PM 11 52.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 47.00
  05:08:26 PM 12 9.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 87.00
  05:08:26 PM 13 53.54 0.00 1.01 0.00 0.00 0.00 0.00 0.00 45.45
  05:08:26 PM 14 0.99 0.00 0.99 0.00 0.00 0.00 0.00 0.00 98.02
  05:08:26 PM 15 11.88 0.00 0.99 0.00 0.00 0.00 0.00 0.00 87.13
  05:08:26 PM 16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  05:08:26 PM 17 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.00
  05:08:26 PM 18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  05:08:26 PM 19 99.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00

  05:08:26 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
  05:08:27 PM all 49.50 0.00 0.55 0.00 0.00 0.00 0.00 0.00 49.95
  05:08:27 PM 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:27 PM 1 12.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 88.00
  05:08:27 PM 2 58.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 41.00
  05:08:27 PM 3 31.68 0.00 0.99 0.00 0.00 0.00 0.00 0.00 67.33
  05:08:27 PM 4 63.64 0.00 0.00 0.00 0.00 0.00 0.00 0.00 36.36
  05:08:27 PM 5 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:27 PM 6 26.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 73.74
  05:08:27 PM 7 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:27 PM 8 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:27 PM 9 57.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 41.00
  05:08:27 PM 10 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  05:08:27 PM 11 60.40 0.00 1.98 0.00 0.00 0.00 0.00 0.00 37.62
  05:08:27 PM 12 11.00 0.00 3.00 0.00 0.00 0.00 0.00 0.00 86.00
  05:08:27 PM 13 57.43 0.00 0.99 0.00 0.00 0.00 0.00 0.00 41.58
  05:08:27 PM 14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  05:08:27 PM 15 12.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 88.00
  05:08:27 PM 16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  05:08:27 PM 17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  05:08:27 PM 18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  05:08:27 PM 19 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

  Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
  Average: all 49.00 0.00 0.53 0.00 0.00 0.03 0.00 0.00 50.45
  Average: 0 99.50 0.00 0.00 0.00 0.00 0.50 0.00 0.00 0.00
  Average: 1 13.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 87.00
  Average: 2 56.28 0.00 0.50 0.00 0.00 0.00 0.00 0.00 43.22
  Average: 3 40.80 0.00 1.49 0.00 0.00 0.00 0.00 0.00 57.71
  Average: 4 58.50 0.00 0.50 0.00 0.00 0.00 0.00 0.00 41.00
  Average: 5 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  Average: 6 35.18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 64.82
  Average: 7 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  Average: 8 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  Average: 9 55.00 0.00 1.50 0.00 0.00 0.00 0.00 0.00 43.50
  Average: 10 87.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 13.00
  Average: 11 56.22 0.00 1.49 0.00 0.00 0.00 0.00 0.00 42.29
  Average: 12 10.00 0.00 3.50 0.00 0.00 0.00 0.00 0.00 86.50
  Average: 13 55.50 0.00 1.00 0.00 0.00 0.00 0.00 0.00 43.50
  Average: 14 0.50 0.00 0.50 0.00 0.00 0.00 0.00 0.00 99.00
  Average: 15 11.94 0.00 0.50 0.00 0.00 0.00 0.00 0.00 87.56
  Average: 16 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  Average: 17 0.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00 99.50
  Average: 18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
  Average: 19 99.50 0.00 0.50 0.00 0.00 0.00 0.00 0.00 0.00

**''qhost''** command

  $ qhost -h $n
  HOSTNAME ARCH NCPU NSOC NCOR NTHR NLOAD MEMTOT MEMUSE SWAPTO SWAPUS
  ----------------------------------------------------------------------------------------------
  global - - - - - - - - - -
  n182 lx-amd64 20 2 20 20 0.38 62.8G 5.1G 2.0G 11.5M

===== Multiple distributed workers =====

===== Single computational threads =====

===== Monitoring Tools =====

There are several tools you can use to monitor the computational threads on your node. In this example n093 is running several MATLAB jobs.
  * Ganglia (real time) ''http://mills.hpc.udel.edu/ganglia/?c=mills.hpc&h=n093''
  * top
  * ps

==== Using top ====

  [dnairn@mills dnairn]$ ssh n093 top -b -n 1 | egrep '(COMMAND|MATLAB)'
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
  8209 matusera 20 0 12.7g 6.2g 62m S 1103.5 9.9 202:33.96 MATLAB
  2622 matusera 20 0 6917m 256m 62m S 0.0 0.4 9783:37 MATLAB
  4386 matusera 20 0 6928m 231m 62m S 0.0 0.4 2850:19 MATLAB
  14939 matusera 20 0 6926m 230m 62m S 0.0 0.4 20139:22 MATLAB
  16308 matusera 20 0 6930m 242m 62m S 0.0 0.4 24928:39 MATLAB

==== Using ps command ====

  [dnairn@mills dnairn]$ ssh n093 ps -eo pid,ruser,pcpu,pmem,thcount,stime,time,command | egrep '(COMMAND|matlab)'
  PID RUSER %CPU %MEM THCNT STIME TIME COMMAND
  2622 matusera 21.1 0.3 90 Jul29 6-19:03:37 /home/software/matlab/R2011b/bin/glnxa64/MATLAB
  4386 matusera 4.7 0.3 90 Jul19 1-23:30:19 /home/software/matlab/R2011b/bin/glnxa64/MATLAB
  8209 matusera 1019 6.9 90 13:18 02:34:48 /home/software/matlab/R2011b/bin/glnxa64/MATLAB
  14939 matusera 27.3 0.3 90 Jul10 13-23:39:21 /home/software/matlab/R2011b/bin/glnxa64/MATLAB
  16308 matusera 46.6 0.3 90 Jul24 17-07:28:38 /home/software/matlab/R2011b/bin/glnxa64/MATLAB

Description of the custom column values from the ps man page:

  pid       PID      process ID number of the process.
  ruser     RUSER    real user ID. This will be the textual user ID, if it can be
                     obtained and the field width permits, or a decimal
                     representation otherwise.
  %cpu      %CPU     cpu utilization of the process in "##.#" format. Currently,
                     it is the CPU time used divided by the time the process has
                     been running (cputime/realtime ratio), expressed as a
                     percentage. It will not add up to 100% unless you are lucky.
                     (alias pcpu).
  %mem      %MEM     ratio of the process's resident set size to the physical
                     memory on the machine, expressed as a percentage.
                     (alias pmem).
  thcount   THCNT    see nlwp. (alias nlwp). number of kernel threads owned by
                     the process.
  bsdstart  START    time the command started. If the process was started less
                     than 24 hours ago, the output format is " HH:MM", else it is
                     "mmm dd" (where mmm is the three letters of the month).
                     See also lstart, start, start_time, and stime.
  time      TIME     cumulative CPU time, "[dd-]hh:mm:ss" format. (alias cputime).
  args      COMMAND  command with all its arguments as a string. Modifications
                     to the arguments may be shown. The output in this column
                     may contain spaces. A process marked <defunct> is partly
                     dead, waiting to be fully destroyed by its parent.
                     Sometimes the process args will be unavailable; when this
                     happens, ps will instead print the executable name in
                     brackets. (alias cmd, command). See also the comm format
                     keyword, the -f option, and the c option.
                     When specified last, this column will extend to the edge of
                     the display. If ps can not determine display width, as when
                     output is redirected (piped) into a file or another
                     command, the output width is undefined. (it may be 80,
                     unlimited, determined by the TERM variable, and so on) The
                     COLUMNS environment variable or --cols option may be used
                     to exactly determine the width in this case. The w or -w
                     option may be also be used to adjust width.

==== ps for threads ====

Select the threads of PID 12035 that show some activity, that is, those whose C value is not 0.
  [dnairn@mills dnairn]$ ssh n093 ps -eLf | egrep '(PID|12035)' | grep -v ' 0 '
  UID PID PPID LWP C NLWP STIME TTY TIME CMD
  matusera 12035 11918 12082 98 90 16:39 pts/2 00:43:21 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12132 67 90 16:39 pts/2 00:29:49 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12133 67 90 16:39 pts/2 00:29:42 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12134 67 90 16:39 pts/2 00:29:43 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12135 67 90 16:39 pts/2 00:29:34 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12136 67 90 16:39 pts/2 00:29:47 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12137 67 90 16:39 pts/2 00:29:50 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12138 67 90 16:39 pts/2 00:29:48 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12139 67 90 16:39 pts/2 00:29:45 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12140 67 90 16:39 pts/2 00:29:40 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12141 67 90 16:39 pts/2 00:29:33 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop
  matusera 12035 11918 12142 67 90 16:39 pts/2 00:29:32 /home/software/matlab/R2011b/bin/glnxa64/MATLAB -nosplash -nodesktop

Twelve of the 90 threads are doing computation. These are the computational threads.
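The count of active threads can also be taken directly from the ''ps -eLf'' output rather than by reading the listing. The sketch below is one possible way to do it, assuming the column order shown above (PID in column 2, C in column 5). The helper name ''count_busy_threads'' and the small sample input are made up for illustration.

```shell
#!/bin/sh
# count_busy_threads is a hypothetical helper: it reads `ps -eLf` output on
# stdin and counts the threads (LWPs) of one PID whose C column is non-zero.
# Assumed column order: UID PID PPID LWP C NLWP STIME TTY TIME CMD
count_busy_threads() {
    # $1 = process ID to inspect
    awk -v pid="$1" '$2 == pid && $5 > 0 { n++ } END { print n + 0 }'
}

# Made-up sample in the same layout; on the cluster one would instead pipe
# the output of `ssh n093 ps -eLf` into the function.
count_busy_threads 12035 <<'EOF'
UID PID PPID LWP C NLWP STIME TTY TIME CMD
matusera 12035 11918 12082 98 90 16:39 pts/2 00:43:21 MATLAB
matusera 12035 11918 12132 67 90 16:39 pts/2 00:29:49 MATLAB
matusera 12035 11918 12090 0 90 16:39 pts/2 00:00:00 MATLAB
matusera 12036 11918 12200 55 4 16:39 pts/2 00:10:00 MATLAB
EOF
```

Matching on fields instead of filtering with ''grep -v ' 0 ''' avoids dropping a busy thread whose line happens to contain a standalone 0 in some other column.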