abstract:caviness:filesystems:lustre
With four disks being used in parallel (example (b) above), the block writing overlaps and takes just 8 cycles to complete.
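The cycle arithmetic behind striping can be sketched as follows. The block count and cycles-per-block figures here are illustrative assumptions chosen to match the 8-cycle result above, not values taken from the diagram:

```python
import math

def write_cycles(n_blocks, cycles_per_block, n_disks):
    """Total cycles to write n_blocks when blocks are assigned
    round-robin across n_disks and the disks write in parallel."""
    # Each disk sequentially writes ceil(n_blocks / n_disks) blocks.
    return math.ceil(n_blocks / n_disks) * cycles_per_block

# One disk: 8 blocks x 4 cycles each, strictly sequential.
print(write_cycles(8, 4, 1))   # 32 cycles
# Four disks: the same 8 blocks overlap, finishing in 8 cycles.
print(write_cycles(8, 4, 4))   # 8 cycles
```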
  
Parallel use of multiple disks is the key behind many higher-performance disk technologies.  RAID (Redundant Array of Independent Disks) level 6 uses three or more disks to improve I/O performance while retaining //parity// copies of data((The two parity copies in RAID-6 imply that given //N// 2 TB disks, only //N-2// actually store data.  E.g. a three-disk RAID-6 volume has a capacity of 2 TB.)).  Should one or two of the constituent disks fail, the missing data can be reconstructed using the parity copies.  It is RAID-6 that forms the basic building block of the Lustre filesystem on our clusters.
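The reconstruction idea can be illustrated with a single XOR parity block.  This is a simplified sketch: real RAID-6 keeps //two// independently computed parity blocks (which is what lets it survive two concurrent failures), but the recovery principle is the same:

```python
def xor_parity(blocks):
    """XOR equal-length blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

# Three data disks' blocks plus one parity block (single-parity sketch).
data = [b"disk0blk", b"disk1blk", b"disk2blk"]
parity = xor_parity(data)

# Disk 1 fails: XORing the survivors with the parity rebuilds its data.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]
```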
  
===== A Storage Node =====
  
Each rack in the Caviness cluster contains multiple //object storage targets//, each holding many hard disks.  For example, ''ost0'' contains 11 SATA hard disks (TB each) managed as a ZFS storage pool, plus an SSD acting as a read cache:
  
{{ :abstract:caviness:filesystems:caviness-lustre-oss_ost.png?400 |Example image of Caviness Lustre OSS/OST. }}
  
Each of the six OST (Object Storage Target) units can survive the concurrent failure of one or two hard disks at the expense of storage space:  the raw capacity of ''storage1'' is 72 TB, but the data resilience afforded by RAID-6 costs a full third of that capacity (leaving 48 TB).
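The capacity cost works out from the figures above (36 disks of 2 TB arranged as six RAID-6 units, with 2 disks' worth of parity per unit):

```python
disks = 36        # SATA disks in storage1
disk_tb = 2       # TB per disk (from the RAID-6 footnote)
units = 6         # RAID-6 OST units
per_unit = disks // units              # 6 disks per unit

raw = disks * disk_tb                  # raw capacity
# RAID-6 dedicates 2 disks' worth of space per unit to parity.
usable = units * (per_unit - 2) * disk_tb

print(raw, usable)   # 72 48
```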
  * File system capacity is not limited by hard disk size
  
The capacity of a Lustre filesystem is the sum of its constituent OSTs, so a Lustre filesystem's capacity can be grown by the addition of OSTs (and possibly OSSs to serve them).  For example, should the 172 TB Lustre filesystem begin to reach its capacity, additional capacity could be added with zero downtime by buying and installing another OSS pair.
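A minimal sketch of why expansion needs no downtime from a client's point of view: the filesystem's capacity is simply a sum over its registered OSTs, so adding OSTs grows the same filesystem in place.  The per-OST sizes below are illustrative assumptions, chosen only so the total matches the 172 TB figure:

```python
# Hypothetical usable capacities of the current OSTs, in TB.
osts = [43, 43, 43, 43]
print(sum(osts))      # 172 TB filesystem today

# Installing another OSS pair contributes its OSTs to the same sum;
# existing files and mounts are untouched.
osts += [43, 43]
print(sum(osts))      # 258 TB after expansion
```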
  
<note important>Creating extremely large filesystems has one drawback:  traversing the filesystem takes so much time that it becomes impossible to create off-site backups for further data resilience.  For this reason Lustre filesystems are most often treated as volatile/scratch storage.</note>
  • abstract/caviness/filesystems/lustre.txt
  • Last modified: 2020-05-29 16:19
  • by frey