User storage is available on a high-performance Lustre-based filesystem having 257 TB of usable space. This is used for temporary input files, supporting data files, work files, and output files associated with computational tasks run on the cluster. The filesystem is accessible to all of the processor cores via 56 Gbps (FDR) InfiniBand. The default stripe count is set to 1, so by default each file is written as a single stripe on one OST, with files distributed across all available OSTs on Lustre. See [[https://www.nas.nasa.gov/hecc/support/kb/lustre-best-practices_226.html|Lustre Best Practices]] from NASA.

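Striping can be tuned per directory with the standard ''lfs'' client commands when files will be accessed in parallel (the directory path and stripe count below are illustrative):

<code>
# Show the current stripe settings for a directory
lfs getstripe /lustre/scratch/$USER/myproject

# Stripe new files created in this directory across 4 OSTs
lfs setstripe -c 4 /lustre/scratch/$USER/myproject
</code>

Leave the default single-stripe layout for small files; a larger stripe count mainly helps large files with parallel I/O.
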
<note warning>Source code and executables must be stored in and executed from Home (''$HOME'') or Workgroup (''$WORKDIR'') storage. No executables are permitted to run from the Lustre filesystem.</note>

The Lustre filesystem is not backed up. However, it is a robust RAID-6 system; thus, the filesystem can survive up to two simultaneous disk failures per array.
<note important>Remember all of the Lustre filesystem is temporary disk storage. Your workflow should start by copying needed data files to the high-performance Lustre filesystem, ''/lustre/scratch'', and finish by copying results back to your private ''/home'' or shared ''/home/work'' directory. Please clean up (delete) all remaining files in ''/lustre/scratch'' that are no longer needed by using the custom [[abstract:farber:filesystems:lustre#lustre-utilities|Lustre utilities]]. If you do not clean up properly, then files will be purged from ''/lustre/scratch'' by the regular cleanup procedures.</note>
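The staging workflow described in the note can be sketched as follows (the workgroup ''it_css'' and the file names are placeholders):

<code>
# Stage input data onto the Lustre filesystem
mkdir -p /lustre/scratch/$USER/myrun
cp /home/work/it_css/data/input.dat /lustre/scratch/$USER/myrun/

# ... run the job against /lustre/scratch/$USER/myrun ...

# Copy results back to workgroup storage, then clean up
cp /lustre/scratch/$USER/myrun/results.out /home/work/it_css/results/
rm -r /lustre/scratch/$USER/myrun
</code>
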

**Note**: A full filesystem inhibits use for everyone.

===== Local filesystem =====

===== Filesystem mount points =====

A //mount point// is the location where the computer attaches a filesystem within the hierarchical file tree on the cluster. All the filesystems appear as one unified filesystem structure. Here is a list of mount points on Farber:

^ Mount point ^ Backed up ^ Description ^

<code>
Type  Path                       In-use / kiB  Available / kiB  Pct
----- -------------------------- ------------  ---------------  ----
user  /home/1201                      1691648         20971520   8%
group /home/work/it_css              39649280       1048576000   4%
</code>

<code>
[traine@farber ~]$ workgroup -g it_css
[(it_css:traine)@farber ~]$ df -h $WORKDIR
Filesystem            Size  Used Avail Use% Mounted on
storage-nfs1:/export/work/it_css
</code>

**Home directory**: Applications commonly store configuration, history, and cache files in your home directory. Generally, keep this directory free and use it for files needed to configure your environment. For example, add [[http://en.wikipedia.org/wiki/Symbolic_link#POSIX_and_Unix-like_operating_systems|symbolic links]] in your home directory to point to files in any of the other directories. The ''/home'' filesystem is backed up with [[#home-workgroup-snapshots|snapshots]].

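For example, a symbolic link can make a shared workgroup file appear in your home directory without duplicating it (the target path is illustrative):

<code>
ln -s /home/work/it_css/sw/setup.sh ~/setup.sh
</code>
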
**Workgroup directory**: Use the [[#workgroup|workgroup]] directory (/home/work/<<//investing_entity//>>) to build applications for you or your group to use, as well as to store important data, modified source, or any other files that need to be shared by your research group. See the [[:abstract:farber:app_dev:app_dev|Application development]] section for information on building applications. You should create a VALET package for your fellow researchers to access applications you want to share. A typical workflow is to copy the files needed from ''/home/work'' to ''/lustre/scratch'' for the actual run. The ''/home/work'' filesystem is backed up with [[#home-workgroup-snapshots|snapshots]].

**Public scratch directory**: Use the public [[#lustre|Lustre]] scratch directory (/lustre/scratch) for files where high performance is required. Store files produced as intermediate work files, and remove them when your current project is done. That will free up the public scratch workspace others also need. This is also a good place for sharing files and data with all users. Files in this directory are not backed up and are subject to removal. Use [[abstract:farber:filesystems:lustre#lustre-utilities|Lustre utilities]] from a compute node to check disk usage and remove files no longer needed.

**Node scratch directory**: Use the [[#node-scratch|node scratch]] directory (/scratch) for temporary files. The job scheduler software (Grid Engine) creates a subdirectory in /scratch specifically for each job's temporary files. This is done on each node assigned to the job. When the job is complete, the subdirectory and its contents are deleted. This process automatically frees up the local scratch storage that others may need. Files in node scratch directories are not available to the head node or other compute nodes.

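In a Grid Engine queue script, the per-job subdirectory is available through the ''$TMPDIR'' environment variable (''my_program'' and the file names below are placeholders):

<code>
# $TMPDIR points to this job's private subdirectory under /scratch
# on each assigned node; it is deleted when the job completes
cd $TMPDIR
my_program < "$SGE_O_WORKDIR/input.dat" > output.dat

# Copy anything worth keeping before the job ends
cp output.dat "$SGE_O_WORKDIR/"
</code>
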
**Lustre workgroup directory**: Use the [[#lustre|Lustre]] work directory (/lustre/work/<<//investing_entity//>>), if purchased by your research group, for files where high performance is required. Keep just the files needed for your job, such as scripts and large data sets used for input or created as output. Remember this disk is not backed up and files are subject to removal if space is needed, so be prepared to rebuild the files if necessary. With batch jobs, the queue script is a record of what you did, but for interactive work, you need to take notes as a record of your work. Use [[abstract:farber:filesystems:lustre#lustre-utilities|Lustre utilities]] from a compute node to check disk usage and remove files no longer needed.
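
In addition to the custom utilities, the standard ''lfs find'' client command can locate candidates for cleanup efficiently (the path is illustrative):

<code>
# List your files under the Lustre work directory that have not
# been modified in the last 30 days
lfs find /lustre/work/it_css -user $USER -mtime +30 -type f
</code>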