====== Using Mills' filesystems ======

===== Permanent filesystems =====

==== Home ====

Each user has 2 GB of disk space reserved for personal use on the home filesystem. Users' home directories are in ''/home''. Use the ''my_quotas'' command (see //Quotas and usage// below) to check your current usage.

An 8 TB permanent filesystem is provided on the login node (head node), mills.hpc.udel.edu.

==== Archive ====

Each research group has 1 TB of shared group storage on the archive filesystem (''/archive''). The group's directory is identified by its research-group identifier (//investing-entity//).

The 60 TB permanent archive filesystem uses 3 TB enterprise-class SATA drives in a triple-parity RAID configuration for high reliability and availability. The filesystem is accessible to the head node via 10 Gbit/s Ethernet and to the compute nodes via 1 Gbit/s Ethernet.

===== High-performance filesystem =====

==== Lustre ====

User storage is available on a high-performance Lustre-based filesystem having 172 TB of usable space. It is used for the input files, supporting data files, work files, output files, source code, and executables associated with computational tasks run on the cluster. The filesystem is accessible to all of the processor cores via QDR InfiniBand.

The Lustre filesystem is not backed up. However, it is a robust RAID-6 system; the filesystem can survive the concurrent failure of two independent hard drives and still rebuild its contents automatically.

The /lustre filesystem is partitioned as shown below:

^ Directory ^ Description ^
| work | Private work directories for individual investor-groups |
| scratch | Public scratch space for all users |
| sysadmin | System administration use |

Each investing-entity has a private work directory under ''/lustre/work''. It is group-writable, and this is where you should create and store most of your files. The system does **//not//** automatically delete files from these directories. The default group-ownership for a file created in a private work directory is the investing-entity's group, and the default permissions are 644 (owner read-write, group read-only).

Anyone may use the public scratch directory (''/lustre/scratch''). IT may purge aged files or directories in ''/lustre/scratch'' as needed to maintain acceptable filesystem performance.

**Note**: A full filesystem inhibits use for everyone.

===== Local filesystem =====

==== Node scratch ====

Each compute node has its own 1-2 TB local hard drive, which is needed for time-critical tasks such as managing virtual memory.

===== Quotas and usage =====

To help users maintain awareness of quotas and their usage on ''/home'', ''/archive'', and ''/lustre'', use the ''my_quotas'' command. It reports, for each filesystem on which you have a quota, the space in use and the space still available.

For example:

<code>
$ my_quotas
Type  Path                        In-use / kiB  Available / kiB  Pct
----- --------------------------- ------------- ---------------- ----
user  /home/...                   ...           ...              ...
group /archive/...                ...           ...              ...
group /lustre/...                 ...           ...              ...
</code>
==== Home ====
Each user's home directory has a hard quota limit of 2 GB.

==== Archive ====
Each group's archive directory has a quota of 1 TB.

==== Lustre ====

Each investing-entity originally had an informal quota for its private work directory in ''/lustre/work''; use ''my_quotas'' to see your group's current quota and usage.

To determine usage for user ''traine'', run ''my_quotas'':

<code>
[traine@mills ~]$ my_quotas
Type  Path                        In-use / kiB  Available / kiB  Pct
----- --------------------------- ------------- ---------------- ----
user  /home/...                   ...           ...              ...
group /archive/...                ...           ...              ...
group /lustre/...                 ...           ...              ...
</code>

To determine all usage on ''/lustre'' by all users, use ''df'':

<code>
[traine@mills ~]$ df -H /lustre
Filesystem           Size  Used Avail Use% Mounted on
mds1-ib@o2ib:...      ...   ...   ...   ... /lustre
[traine@mills ~]$
</code>

<note important>
Files are automatically cleaned up in ''/lustre/scratch'' as needed.
</note>

<note warning>Remember that the Lustre filesystem is not backed up.</note>

==== Node scratch ====
The node scratch is mounted on ''/scratch'' on each compute node.

For example, the command
<code>
ssh n017 df -h /scratch
</code>
shows 197 MB used from the total filesystem size of 793 GB.
<code>
Filesystem      Size  Used Avail Use% Mounted on
/dev/...        793G  197M   ...   ... /scratch
</code>
This output is for node ''n017''.

<note warning>Files in a node's scratch directory are not visible to the head node or to other compute nodes.</note>

We strongly recommend that you refer to the node scratch by using the environment variable ''$TMPDIR''.
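
As an illustration, here is a minimal sketch of a batch-job fragment that stages temporary files in node scratch via ''$TMPDIR''. The work-directory path, input file, and application name are placeholders, and the sketch assumes the scheduler sets ''TMPDIR'' to the job's private ''/scratch'' subdirectory (see //Node scratch directory// in the usage summary below):

<code bash>
#!/bin/bash
# Sketch only: WORKDIR, input.dat, and my_app are placeholders.
# Assumes the scheduler sets TMPDIR to this job's private node-scratch directory.

WORKDIR=/lustre/work/<<investing_entity>>/users/$USER/myjob

cp "$WORKDIR/input.dat" "$TMPDIR/"          # stage input onto fast node-local disk
cd "$TMPDIR"
"$WORKDIR/my_app" input.dat > output.dat    # run against the local copies
cp output.dat "$WORKDIR/"                   # copy results back before the job ends;
                                            # the TMPDIR contents are deleted afterwards
</code>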

===== Recovering files =====
==== Home backups ====

Files in your home directory and all sub-directories are backed up using the campus backup system. The **recover** command is for browsing the index of saved files and recovering selected files from the backup system.

  * Go to the original directory using the **cd** command.
  * Start an interactive recover session using the **recover** command.
  * Mark each file for recovery with the command: ''add //filename//''
  * Schedule the file recovery with the command: ''recover''

Here is a sample session in which the file ''sourceme-gcc'' is accidentally removed and then recovered:
<code>
[traine@mills ex0]$ rm sourceme-gcc
[traine@mills ex0]$ recover
Current working directory is /.../ex0
recover> add sourceme-gcc
/.../ex0/sourceme-gcc
1 file(s) marked for recovery
recover> recover
Recovering 1 file into its original location
Volumes needed (all on-line):
        d08.RO at /...
Total estimated disk space needed for recover is 4 KB.
Requesting 1 file(s), this may take a while...
Requesting 1 recover session(s) from server.
./sourceme-gcc
Received 1 file(s) from NSR server `owell-3.nss.udel.edu'
Recover completion time: Mon 20 Aug 2012 02:54:59 PM EDT
recover> quit
[dnairn@mills ex0]$ head -1 sourceme-gcc
example='...'
</code>

- | |||
- | ==== Archive snapshots ==== | ||
- | |||
- | Snapshots are read-only images of the filesystem at the time the snapshot is taken. They are available under the '' | ||
- | |||
- | |||
- | When an initial snapshot is taken, no space is used as it is a read-only reference for the current filesystem image. However, as the filesystem changes, copy-on-write of data blocks is done and will cause snapshots to use space. These new blocks used by snapshots do not count against the 1TB limit that the group' | ||
- | |||
- | Some example uses of snapshots for users are: | ||
- | |||
- | * If a file is deleted or modified during the afternoon you can go to the '' | ||
- | * If a file was deleted on Friday and you do not realize until Monday you can use the '' | ||
- | |||
- | |||
- | |||
- | |||
- | |||
===== Recommended practices =====

Generally, the /lustre filesystem provides better overall performance than the /home and /archive filesystems. This is especially true for input and output files needed or generated by jobs. The /lustre filesystem is accessible to all processor cores via (40 Gb/s) QDR InfiniBand. In comparison, the compute nodes access the /home and /archive filesystems over 1 Gb/s Ethernet.

The /archive filesystem has less space available for your group, but it has both regular snapshots and off-site duplication for recovery. The filesystem is especially useful for building applications for your group to use. The main compilers and build tools are available on the head node, and the head node can access the /archive filesystem over 10 Gb/s Ethernet.

The /home filesystem is limited in storage, and it is used by many applications to store user preference files and caches.

**Private work directories**

All members of an investing-entity share their group-writable, private work directories under ''/lustre/work'' and ''/archive''. All members of the group have full access to add, move (rename), or remove directories and files in these group-writable directories. Be careful not to move or remove any files or directories you do not own (you own the directories you created). Your fellow researchers will appreciate your good "citizenship."

You should create a personal subdirectory within any group-writable directory for your own group-related files. That will reduce the chance of others accidentally modifying or deleting your files. You will own this new personal directory, with full access for you and read-only access for your group. Fellow researchers can copy your files, but not modify them. Researchers not in your group can never see or copy your work, because the investing-entity work directory is only open to your group.

<note important>
This describes the way ownership and access are set using the default shell environment. You can make some changes with standard UNIX commands such as **chmod**, if you must. However, you can't give users outside of your group access to your files. Use the public scratch directories for sharing files.
</note>

**Public scratch directories**

All members of the cluster community share the world-writable public scratch directory, ''/lustre/scratch''. The sticky bit is set on this directory, so only a file's owner (or the directory's owner) may remove or rename files in it.

You should create a personal directory there that will be owned by you, with full access for you and read-only access for every other user. This is where you store files you are willing to share. This is also where you store files that require a large amount of disk space for a short period.

**Work directory structure**

Your group should initially consider how to organize the group's work directory. The layout might match the group's organization: some groups are organized per user, others are more project-oriented, and some groups may want a combination of these. One possibility for the Lustre work directory might look like this:

<code text>
/lustre/work/<<investing_entity>>/
    projects/
        ...
        ...
    users/
        ...
        ...
        ...
</code>

where the ''projects'' directory holds work shared by the group and the ''users'' directory holds each member's personal work.

If your research group chooses to follow this structure, we suggest the following procedure: the stakeholder should first create the **projects** and **users** directories, each group-writable and with the sticky bit set (mode 1770).

<code bash>
cd /lustre/work/<<investing_entity>>
mkdir -m 1770 projects users
</code>

==== Summary of usage recommendations ====

**Private work directory**: Use this to store the files associated with your jobs, such as your applications, job scripts, input files, and output files. Batch job scripts provide a record of what you did, but for interactive work you need to take notes as a record of your work.

**Private archive directory**: Use this for files your group wants to keep long term, such as applications built for the group; it is protected by regular snapshots and off-site duplication.

**Public scratch directory**: Use this for files you are willing to share with all users, and for files that require a large amount of disk space for a short period.

**Home directory**: Many applications store preference and cache files in your home directory. Generally, keep this directory free of large files and only use it for files needed to configure your environment, such as your shell startup files.

**Node scratch directory**: A subdirectory is created in ''/scratch'' on each node assigned to a job to hold that job's temporary files; refer to it with ''$TMPDIR''. When the job is complete, the subdirectory and its contents are deleted. This process automatically frees up the local scratch storage that others may need. Files in node scratch directories are not available to the head node or to other compute nodes.