  
=== Getting files off Caviness ===
  * Make files already on ''/lustre/scratch'' visible on your Globus endpoint by moving them to ''/lustre/scratch/globus/home/<uid#>'' (using ''mv'' is very fast; duplicating the files using ''cp'' will take longer).
  * Copy files from home or workgroup storage to ''/lustre/scratch/globus/home/<uid#>'' (using ''rsync'' or ''cp'') to make them visible on your Globus endpoint (see the example after this list).
  * **Please note:** do not create symbolic links in ''/lustre/scratch/globus/home/<uid#>'' to files or directories that reside in your home or workgroup storage.  They will not work on the Globus endpoint.
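
For example, a minimal sketch assuming a hypothetical uid number of ''1234'' and a hypothetical workgroup directory (substitute your own values):

<code bash>
# Move scratch files into the staging directory (same filesystem, so this is fast)
$ mv /lustre/scratch/it_nss/results /lustre/scratch/globus/home/1234/

# Duplicate a directory from home storage into the staging directory
$ rsync -ar ~/project1/ /lustre/scratch/globus/home/1234/project1/
</code>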
  
=== Moving data to Caviness ===
  * The ''/lustre/scratch/globus/home/<uid#>'' directory is writable by Globus, so data can be copied from a remote collection to your Caviness collection.
  * Files copied to ''/lustre/scratch/globus/home/<uid#>'' via Globus can be moved elsewhere on ''/lustre/scratch'' (again, ''mv'' is very fast and duplication using ''cp'' will be slower) or copied to home and workgroup storage (see the example below).
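
For instance, with the same hypothetical uid number, and using ''$WORKDIR'' to denote your workgroup storage path as in the examples later on this page:

<code bash>
# Move received files out of the staging directory to workgroup scratch (fast)
$ mv /lustre/scratch/globus/home/1234/upload /lustre/scratch/it_nss/

# Or duplicate them into workgroup storage (slower; leaves the originals in place)
$ cp -r /lustre/scratch/globus/home/1234/upload $WORKDIR/
</code>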
  
  
===== Request a Globus home staging point =====
<WRAP center round alert 60%>
Only available to UD accounts (UDelNet ID), **not** Guest accounts (//hpcguest//<<//uid//>>).
</WRAP>

Please submit a [[https://services.udel.edu/TDClient/32/Portal/Requests/ServiceDet?ID=23|Research Computing High Performance Computing (HPC) Clusters Help Request]], click on the green **Request Service** button, complete the form, and in the problem details indicate you are requesting a Globus home staging point on Caviness.
  
===== Where is my Globus home? =====
You can find your uid number using the ''id'' command:
  
<code bash>
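# Illustrative output: the uid number (1234 here), user name, and groups will differ
$ id
uid=1234(traine) gid=1002(everyone) groups=1002(everyone),1055(it_nss)
# The Globus home staging directory for this user would be /lustre/scratch/globus/home/1234
</code>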
To access your staging directory in the Globus web interface, log in at [[https://www.globus.org|globus.org]]:
  * Choose the **University of Delaware** as your organization and click the **Continue** button; on the next page enter your UDelNet ID and Password and continue on to the dashboard.
  
Click the “File Manager” button {{:software:globus:filemanagerbutton.jpg}} on the left side-panel.  In the search box at the top of the page, enter "Caviness" and click the magnifying glass icon:

{{:software:globus:caviness-stagingarea.jpg}}
  
 The "**UD CAVINESS Cluster - Staging Directory**" collection should be visible.  Click on the collection; you will be asked to authenticate using your UDelNet Id and Password if it is the first time you access the collection. The "**UD CAVINESS Cluster - Staging Directory**" collection should be visible.  Click on the collection; you will be asked to authenticate using your UDelNet Id and Password if it is the first time you access the collection.
  
If successful (and you have a Globus home directory) the list of files in ''/lustre/scratch/globus/home/<uid#>'' should appear:

{{:software:globus:caviness-filemanager.jpg}}
  
===== Guest collections =====
To share your files with others via Globus, you can create a Guest Collection, which enables sharing a directory in your Globus Home directory.
To create a Guest Collection, you first need to go to the collection details. You can either click the three dots next to the collection:
{{:software:globus:caviness-guestcollections-1.jpg}}
  
Or right-click on the directory you would like to share and select “Share”:
{{:software:globus:caviness-guestcollections-2.jpg}}
  
Click “Add Guest Collection” in the “Collections” tab:
{{:software:globus:caviness-addguestcollection-1.jpg}}
  
Make sure you at least set the “Directory” and “Display Name” properly. It’s a good idea to fill in the other fields as well:
{{:software:globus:caviness-addguestcollection-2.jpg}}
  
Once the guest collection is created, you may want to set its permissions, e.g. sharing with specific individuals, making it public, or allowing read-only access.
  
===== Examples =====
  
==== Data Upload Repository ====
In this example, a directory is created under ''$GLOBUS_HOME'' that will hold a dataset composed of many small (less than 4 GB) files.  By default, Lustre will assign each file to a unique Object Storage Target (OST) as it is created, effectively spreading the dataset across the Lustre storage pools:

<code bash>
$ mkdir -p $GLOBUS_HOME/incoming/20200103T1535
</code>

A series of transfers is then scheduled in the Globus web interface, placing the files in this directory.
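
If you prefer the command line, the same transfers can be scheduled with the separately installed Globus CLI.  The endpoint UUIDs and source path below are placeholders, not real identifiers; one way to find the real ones is ''globus endpoint search Caviness'':

<code bash>
$ SRC=11111111-2222-3333-4444-555555555555   # hypothetical source collection UUID
$ DST=66666666-7777-8888-9999-000000000000   # hypothetical Caviness staging collection UUID
$ globus transfer --recursive --label "dataset upload" \
>     "$SRC:/data/20200103T1535" \
>     "$DST:/incoming/20200103T1535"
</code>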

After the files have been uploaded, the user moves them from this staging directory to a shared location their workgroup will be able to access:
<code bash>
$ mkdir -p /lustre/scratch/it_nss/datasets
$ mv $GLOBUS_HOME/incoming/20200103T1535 /lustre/scratch/it_nss/datasets
</code>

Since the source and destination paths are both on the ''/lustre/scratch'' filesystem, the ''mv'' is nearly instantaneous.

==== Moving to Workgroup Storage ====
If the final destination of the incoming dataset is workgroup storage (e.g. a path under ''$WORKDIR''), the ''cp'' or ''rsync'' commands should be used.  The ''rsync'' command is especially useful if you are updating a previously-copied dataset with new data:
<code bash>
$ rsync -ar $GLOBUS_HOME/incoming/20200103T1535/ $WORKDIR/datasets/2019-2020/
</code>

The trailing slashes ("''/''") on the source and destination directories are significant; do not forget them.  Any files present in both source and destination that have not been modified will be skipped, and any files in the source directory that are not present in the destination directory will be fully copied.  Files present in the destination directory but not in the source directory will NOT be removed (add the ''%%--%%delete'' option to remove them).
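
As a sketch using the same paths as above, ''%%--%%dry-run'' (''-n'') can be combined with ''%%--%%delete'' to preview what would be copied or removed without actually modifying the destination:

<code bash>
# -n (--dry-run) lists the actions rsync would take without performing them
$ rsync -arnv --delete $GLOBUS_HOME/incoming/20200103T1535/ $WORKDIR/datasets/2019-2020/
</code>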

See the man page for ''rsync'' for additional options.

==== Data Upload Repository (Large Files) ====

In the previous example the dataset consisted of many small (less than 4 GB) files.  For very large files, performance can be improved on Lustre by splitting the file itself across multiple OSTs:  the first 1 GB goes to OST 1, the next 1 GB to OST 2, etc.  File striping is configured at the directory level, with all files created inside a directory inheriting its striping behavior.

The default striping on ''/lustre/scratch'' uses a single randomly-chosen OST for all of a file's content.  In this example, we will create a directory whose files will stripe across two OSTs in 1 GB chunks:

<code bash>
$ cd $GLOBUS_HOME/incoming
$ mkdir 20200103T1552-2x1G
$ lfs setstripe --stripe-count 2 --stripe-size 1G 20200103T1552-2x1G
$ ls -ld 20200103T1552-2x1G
drwxr-xr-x 2 frey everyone 95744 Jan  3 15:55 20200103T1552-2x1G
$ lfs getstripe 20200103T1552-2x1G
20200103T1552-2x1G
stripe_count:  2 stripe_size:   1073741824 stripe_offset: -1
</code>

The choice of 1 GB was somewhat arbitrary; in practice, the stripe size is often chosen to reflect an inherent record size associated with the file format.

Globus is used to transfer a data file to this directory.  To see how the file was broken across OSTs, use the ''lfs getstripe'' command:

<code bash>
$ cd $GLOBUS_HOME/incoming/20200103T1552-2x1G
$ ls -l
total 38
-rw-r--r-- 1 frey everyone 10485760000 Jan  3 15:57 db-20191231.sqlite3db
$ lfs getstripe db-20191231.sqlite3db
db-20191231.sqlite3db
lmm_stripe_count:  2
lmm_stripe_size:   1073741824
lmm_pattern:       1
lmm_layout_gen:    0
lmm_stripe_offset: 2
 obdidx objid objid group
      2      611098866    0x246ca0f2              0
      3      607328154    0x2433179a              0
</code>

When this file (or directory) is moved to another directory on ''/lustre/scratch'', the striping is retained:

<code bash>
$ mv $GLOBUS_HOME/incoming/20200103T1552-2x1G/db-20191231.sqlite3db \
>    /lustre/scratch/it_nss/datasets
$ lfs getstripe /lustre/scratch/it_nss/datasets/db-20191231.sqlite3db
/lustre/scratch/it_nss/datasets/db-20191231.sqlite3db
lmm_stripe_count:  2
lmm_stripe_size:   1073741824
lmm_pattern:       1
lmm_layout_gen:    0
lmm_stripe_offset: 2
 obdidx objid objid group
      2      611098866    0x246ca0f2              0
      3      607328154    0x2433179a              0
</code>