From 1364847fdd9e03de3d7fe2240b695342398552d4 Mon Sep 17 00:00:00 2001 From: PHILO-HE Date: Wed, 8 Dec 2021 20:22:40 +0800 Subject: [PATCH] HDFS-15788. Correct the statement for pmem cache to reflect cache persistence support (#3761) --- .../src/site/markdown/CentralizedCacheManagement.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md index da3d45c2060..09ef0370f7f 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md +++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/CentralizedCacheManagement.md @@ -32,7 +32,8 @@ Centralized cache management in HDFS has many significant advantages. 4. Centralized caching can improve overall cluster memory utilization. When relying on the OS buffer cache at each DataNode, repeated reads of a block will result in all *n* replicas of the block being pulled into buffer cache. With centralized cache management, a user can explicitly pin only *m* of the *n* replicas, saving *n-m* memory. -5. HDFS supports non-volatile storage class memory (SCM, also known as persistent memory) cache in Linux platform. User can enable either memory cache or SCM cache for a DataNode. Memory cache and SCM cache can coexist among DataNodes. In the current implementation, the cache data in SCM will be cleaned up when DataNode restarts. Persistent HDFS cache support on SCM will be considered in the future. +5. HDFS supports non-volatile storage class memory (SCM, also known as persistent memory) cache in Linux platform. User can enable either DRAM cache or SCM cache for a DataNode. DRAM cache and SCM cache can coexist among DataNodes. In addition, cache persistence is supported by SCM cache. The status of cache persisted in SCM will be recovered +during the start of DataNode if `dfs.datanode.pmem.cache.recovery` is set to true. Otherwise, previously persisted cache will be dropped and data need to be re-cached. Use Cases --------- @@ -260,7 +261,7 @@ The following properties are not required, but may be specified for tuning: * dfs.datanode.pmem.cache.recovery - This parameter is used to determine whether to recover the status for previous cache on persistent memory during the start of DataNode. If it is enabled, DataNode will recover the status for previously cached data on persistent memory. Thus, re-caching data will be avoided. If this property is not enabled, DataNode will clean up the previous cache, if any, on persistent memory. This property can only work when persistent memory is enabled, i.e., `dfs.datanode.pmem.cache.dirs` is configured. + This parameter is used to determine whether to recover the status for previous cache on persistent memory during the start of DataNode. If it is enabled, DataNode will recover the status for previously cached data on persistent memory. Thus, re-caching is avoided. If this property is not enabled, DataNode will drop cache, if any, on persistent memory. This property can only work when persistent memory cache is enabled, i.e., `dfs.datanode.pmem.cache.dirs` is configured. ### OS Limits