diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index af1dd600cff..f7cc2bc8417 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -1311,6 +1311,9 @@ Release 2.7.0 - UNRELEASED
     HDFS-7824. GetContentSummary API and its namenode implementation for
     Storage Type Quota/Usage. (Xiaoyu Yao via Arpit Agarwal)
 
+    HDFS-7700. Document quota support for storage types. (Xiaoyu Yao via
+    Arpit Agarwal)
+
 Release 2.6.1 - UNRELEASED
 
   INCOMPATIBLE CHANGES
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
index 191b5bc7ca1..bdb051ba875 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
@@ -307,8 +307,8 @@ Usage:
           [-refreshNodes]
           [-setQuota ...]
           [-clrQuota ...]
-          [-setSpaceQuota ...]
-          [-clrSpaceQuota ...]
+          [-setSpaceQuota [-storageType ] ...]
+          [-clrSpaceQuota [-storageType ] ...]
           [-setStoragePolicy ]
           [-getStoragePolicy ]
           [-finalizeUpgrade]
@@ -342,8 +342,8 @@ Usage:
 | `-refreshNodes` | Re-read the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned. |
 | `-setQuota` \ \...\ | See [HDFS Quotas Guide](../hadoop-hdfs/HdfsQuotaAdminGuide.html#Administrative_Commands) for the detail. |
 | `-clrQuota` \...\ | See [HDFS Quotas Guide](../hadoop-hdfs/HdfsQuotaAdminGuide.html#Administrative_Commands) for the detail. |
-| `-setSpaceQuota` \ \...\ | See [HDFS Quotas Guide](../hadoop-hdfs/HdfsQuotaAdminGuide.html#Administrative_Commands) for the detail. |
-| `-clrSpaceQuota` \...\ | See [HDFS Quotas Guide](../hadoop-hdfs/HdfsQuotaAdminGuide.html#Administrative_Commands) for the detail. |
+| `-setSpaceQuota` \ `[-storageType ]` \...\ | See [HDFS Quotas Guide](../hadoop-hdfs/HdfsQuotaAdminGuide.html#Administrative_Commands) for the detail. |
+| `-clrSpaceQuota` `[-storageType ]` \...\ | See [HDFS Quotas Guide](../hadoop-hdfs/HdfsQuotaAdminGuide.html#Administrative_Commands) for the detail. |
 | `-setStoragePolicy` \ \ | Set a storage policy to a file or a directory. |
 | `-getStoragePolicy` \ | Get the storage policy of a file or a directory. |
 | `-finalizeUpgrade` | Finalize upgrade of HDFS. Datanodes delete their previous version working directories, followed by Namenode doing the same. This completes the upgrade process. |
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsQuotaAdminGuide.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsQuotaAdminGuide.md
index a1bcd78ebcf..7c15bb11ed6 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsQuotaAdminGuide.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsQuotaAdminGuide.md
@@ -19,6 +19,7 @@ HDFS Quotas Guide
     * [Overview](#Overview)
     * [Name Quotas](#Name_Quotas)
     * [Space Quotas](#Space_Quotas)
+    * [Storage Type Quotas](#Storage_Type_Quotas)
     * [Administrative Commands](#Administrative_Commands)
     * [Reporting Command](#Reporting_Command)
 
@@ -41,6 +42,17 @@ The space quota is a hard limit on the number of bytes used by files in the tree
 
 Quotas are persistent with the fsimage. When starting, if the fsimage is immediately in violation of a quota (perhaps the fsimage was surreptitiously modified), a warning is printed for each of such violations. Setting or removing a quota creates a journal entry.
 
+Storage Type Quotas
+-------------------
+
+The storage type quota is a hard limit on the usage of a specific storage type (SSD, DISK, ARCHIVE) by files in the tree rooted at the directory. It works similarly to the storage space quota in many respects but offers fine-grained control over cluster storage space usage. To set a storage type quota on a directory, a storage policy must be configured on the directory so that files can be stored on different storage types according to that policy. See the [HDFS Storage Policy Documentation](./ArchivalStorage.html) for more information.
+
+The storage type quota can be combined with the space quotas and name quotas to efficiently manage the cluster storage usage. For example,
+
+1. For directories with a storage policy configured, the administrator should set storage type quotas for resource-constrained storage types such as SSD, and leave the quotas for other storage types and the overall space quota at either less restrictive values or the default of unlimited. HDFS deducts usage from both the quota of the target storage type, as determined by the storage policy, and the overall space quota.
+2. For directories without a storage policy configured, the administrator should not configure a storage type quota. A storage type quota can be configured even when the specific storage type is unavailable (or available but not configured properly with storage type information). However, the overall space quota is recommended in this case, because the storage type information is either unavailable or inaccurate for storage type quota enforcement.
+3. Storage type quotas on DISK are of limited use except when DISK is not the dominant storage medium (e.g. a cluster with predominantly ARCHIVE storage).
+
 Administrative Commands
 -----------------------
 
@@ -77,17 +89,40 @@ Quotas are managed by a set of commands available only to the administrator.
     directory, with faults reported if the directory does not exist or
     it is a file. It is not a fault if the directory has no quota.
 
+
+* `hdfs dfsadmin -setSpaceQuota -storageType ...`
+
+    Set the storage type quota to be N bytes of the storage type specified for each directory.
+    This is a hard limit on total storage type usage for all the files under the directory tree.
+    The storage type quota usage reflects the intended usage based on storage policy. For example,
+    one GB of data with replication of 3 and ALL_SSD storage policy consumes 3GB of SSD quota. N
+    can also be specified with a binary prefix for convenience, e.g. 50g for 50
+    gigabytes and 2t for 2 terabytes. Best effort for each
+    directory, with faults reported if N is neither zero nor a positive
+    integer, the directory does not exist or it is a file, or the
+    directory would immediately exceed the new quota.
+
+* `hdfs dfsadmin -clrSpaceQuota -storageType ...`
+
+    Remove the storage type quota specified for each directory. Best effort
+    for each directory, with faults reported if the directory does not exist or
+    it is a file. It is not a fault if the directory has no storage type quota
+    for the storage type specified.
+
 Reporting Command
 -----------------
 
 An an extension to the count command of the HDFS shell reports quota values and the current count of names and bytes in use.
 
-* `hadoop fs -count -q [-h] [-v] ...`
+* `hadoop fs -count -q [-h] [-v] [-t [comma-separated list of storagetypes]] ...`
 
     With the -q option, also report the name quota value set for each
     directory, the available name quota remaining, the space quota value
     set, and the available space quota remaining. If the directory does
     not have a quota set, the reported values are `none` and `inf`.
     The -h option shows sizes in human readable format.
-    The -v option displays a header line.
-
+    The -v option displays a header line. The -t option displays the
+    per-storage-type quota set and the available quota remaining for each directory.
+    If specific storage types are given after the -t option, only the quota and
+    remaining quota of the types specified will be displayed. Otherwise, the quota and
+    remaining quota of all storage types that support quota will be displayed.
\ No newline at end of file
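
Note on the quota arithmetic documented above: the following is a minimal sketch, not Hadoop code, that reproduces the "one GB with replication of 3 and ALL_SSD consumes 3GB of SSD quota" example and the binary-prefix sizes (`50g`, `2t`). The helper names `parse_quota` and `storage_type_usage` are hypothetical, and only the `g` and `t` suffixes named in the guide are handled.

```python
# Illustrative only: these helpers are hypothetical and are not part of Hadoop.

# Binary multipliers for the two size suffixes the guide mentions (50g, 2t).
_SUFFIXES = {"g": 2**30, "t": 2**40}


def parse_quota(text):
    """Convert a quota string such as '50g', '2t' or '1024' to a byte count."""
    text = text.strip().lower()
    if text and text[-1] in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    return int(text)


def storage_type_usage(file_bytes, replicas_on_type):
    """Bytes charged against one storage type quota for a single file.

    replicas_on_type is the number of replicas the storage policy places on
    that storage type (3 for ALL_SSD with a replication factor of 3).
    """
    return file_bytes * replicas_on_type


one_gb = parse_quota("1g")
ssd_charge = storage_type_usage(one_gb, replicas_on_type=3)
print(ssd_charge == parse_quota("3g"))   # True: 1 GB x 3 replicas = 3 GB of SSD quota
print(ssd_charge <= parse_quota("50g"))  # True: the file fits under a 50g SSD quota
```

The charge is computed from the intended placement under the storage policy, which is why the replica count rather than the physical location of blocks drives the result.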