title: Metrics
menu: main

HDFS Ozone Metrics

Overview

This page describes the container metrics used in HDFS Ozone.

Storage Container Metrics

The metrics for various storage container operations in HDFS Ozone.

The storage container service is optional and can be enabled by setting 'ozone.enabled' to true. These metrics are only available when Ozone is enabled.

Storage Container Metrics maintains a set of generic metrics for all container RPC calls that can be made to a datanode/container.

Along with the total number of RPC calls, containers maintain a set of metrics for each RPC call. The following counters are maintained for each RPC operation.

Total number of operations - We maintain an array which counts how many times a specific operation has been performed. E.g., NumCreateContainer tells us how many times create container has been invoked on this datanode.

Total number of pending operations - This is an array which counts how many requests for a specific operation are waiting to be processed, from the client point of view. E.g., NumPendingCreateContainer tells us how many create container requests are waiting to be processed.

Average latency of each pending operation in nanoseconds - The average latency of the operation from the client point of view. E.g., CreateContainerLatencyAvgTime tells us the average latency of Create Container from the client point of view.

Number of bytes involved in a specific command - This is an array that is maintained for all operations, but makes sense only for read and write operations.

While it is possible to read the byte count for update container, it carries no meaning, since no data stream is involved. Users are advised to use this metric only for operations that transfer data. E.g., BytesReadChunk tells us how many bytes have been read from this datanode using the Read Chunk operation.

Average latency of each operation - The average latency of the operation. E.g., LatencyCreateContainerAvgTime tells us the average latency of Create Container.

Quantiles for each of these operations - The 50th, 75th, 90th, 95th and 99th percentile latencies are reported for all RPCs. E.g., CreateContainerNanos60s50thPercentileLatency gives the 50th percentile latency of create container operations at 1 minute granularity.
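To make the percentile counters concrete, here is a minimal sketch of how the five reported percentiles could be derived from the latencies seen in one 1-minute window. It uses the simple nearest-rank method on a sorted list, which is an assumption made for illustration and not necessarily the estimator Ozone uses. The function names and the sample values are hypothetical.

```python
# Percentiles reported for every RPC, as listed in the text above.
PERCENTILES = (50, 75, 90, 95, 99)


def nearest_rank(sorted_samples, percentile):
    """Return the nearest-rank percentile of an ascending list of samples."""
    if not sorted_samples:
        raise ValueError("no samples in this window")
    # Integer ceiling of percentile/100 * n, avoiding floating point rounding.
    rank = (percentile * len(sorted_samples) + 99) // 100
    return sorted_samples[max(rank, 1) - 1]


def window_percentiles(latencies_nanos):
    """Map each reported percentile to a latency for one window of samples."""
    ordered = sorted(latencies_nanos)
    return {p: nearest_rank(ordered, p) for p in PERCENTILES}


# Hypothetical create container latencies (nanoseconds) from one window.
samples = [1200, 800, 1500, 950, 40000, 1100, 1000, 1300, 900, 1050]
summary = window_percentiles(samples)
for p in PERCENTILES:
    print(f"CreateContainerNanos60s{p}thPercentileLatency = {summary[p]}")
```

The single slow request (40000 ns) only shows up in the 95th and 99th percentiles, which is why the high percentiles are reported alongside the averages.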

As a result, containers report these counters for each of the following RPC operations.

| Name | Description |
|------|-------------|
| NumOps | Total number of container operations |
| CreateContainer | Create container operation |
| ReadContainer | Read container operation |
| UpdateContainer | Update container operations |
| DeleteContainer | Delete container operations |
| ListContainer | List container operations |
| PutKey | Put key operations |
| GetKey | Get key operations |
| DeleteKey | Delete key operations |
| ListKey | List key operations |
| ReadChunk | Read chunk operations |
| DeleteChunk | Delete chunk operations |
| WriteChunk | Write chunk operations |
| ListChunk | List chunk operations |
| CompactChunk | Compact chunk operations |
| PutSmallFile | Put small file operations |
| GetSmallFile | Get small file operations |
| CloseContainer | Close container operations |
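The counter names quoted in this section appear to follow a fixed pattern per operation. The sketch below expands an operation name into the counter names implied by those examples (NumCreateContainer, NumPendingCreateContainer, CreateContainerLatencyAvgTime, BytesReadChunk, LatencyCreateContainerAvgTime, CreateContainerNanos60s50thPercentileLatency). Applying the same pattern to the other operations is an assumption, and the `counter_names` helper and its dictionary keys are hypothetical.

```python
PERCENTILES = (50, 75, 90, 95, 99)


def counter_names(operation):
    """Expand one RPC operation name into its per-operation counter names.

    The patterns are extrapolated from the examples in this page and are
    not taken from the Ozone source code.
    """
    return {
        "total": f"Num{operation}",
        "pending": f"NumPending{operation}",
        "client_avg_latency": f"{operation}LatencyAvgTime",
        "bytes": f"Bytes{operation}",
        "avg_latency": f"Latency{operation}AvgTime",
        "quantiles": [
            f"{operation}Nanos60s{p}thPercentileLatency" for p in PERCENTILES
        ],
    }


names = counter_names("CreateContainer")
print(names["total"])         # NumCreateContainer
print(names["quantiles"][0])  # CreateContainerNanos60s50thPercentileLatency
```

A helper like this can be handy when building dashboards or alerts that need the same set of counters for several operations.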

Storage Container Manager Metrics

The metrics for containers that are managed by Storage Container Manager.

Storage Container Manager (SCM) is a master service which keeps track of replicas of storage containers. It also manages all data nodes and their states, dealing with container reports and dispatching commands for execution.

The following counters are maintained for containers:

| Name | Description |
|------|-------------|
| LastContainerReportSize | Total size in bytes of all containers in the latest container report that SCM received from a datanode |
| LastContainerReportUsed | Total number of bytes used by all containers in the latest container report that SCM received from a datanode |
| LastContainerReportKeyCount | Total number of keys in all containers in the latest container report that SCM received from a datanode |
| LastContainerReportReadBytes | Total number of bytes read from all containers in the latest container report that SCM received from a datanode |
| LastContainerReportWriteBytes | Total number of bytes written into all containers in the latest container report that SCM received from a datanode |
| LastContainerReportReadCount | Total number of times containers have been read from in the latest container report that SCM received from a datanode |
| LastContainerReportWriteCount | Total number of times containers have been written to in the latest container report that SCM received from a datanode |
| ContainerReportSize | Total size in bytes of all containers over the whole cluster |
| ContainerReportUsed | Total number of bytes used by all containers over the whole cluster |
| ContainerReportKeyCount | Total number of keys in all containers over the whole cluster |
| ContainerReportReadBytes | Total number of bytes read from all containers over the whole cluster |
| ContainerReportWriteBytes | Total number of bytes written into all containers over the whole cluster |
| ContainerReportReadCount | Total number of times containers have been read from over the whole cluster |
| ContainerReportWriteCount | Total number of times containers have been written to over the whole cluster |
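The difference between the LastContainerReport* and ContainerReport* counters can be illustrated with a small sketch. It assumes that the cluster-wide values are the sum of the most recent report from each datanode, which is one plausible reading of the descriptions above and not a statement about how SCM is implemented. The class, method and datanode names are hypothetical.

```python
# Suffixes shared by both counter families in the table above.
FIELDS = ("Size", "Used", "KeyCount", "ReadBytes",
          "WriteBytes", "ReadCount", "WriteCount")


class ContainerReportStats:
    """Track the latest container report and cluster-wide totals (sketch)."""

    def __init__(self):
        self.latest_by_datanode = {}  # datanode id -> totals of its last report
        self.last_report = {}         # totals of the most recent report overall

    def on_report(self, datanode_id, containers):
        """Record one container report; each container is a dict of FIELDS."""
        totals = {f: sum(c.get(f, 0) for c in containers) for f in FIELDS}
        self.latest_by_datanode[datanode_id] = totals
        self.last_report = totals

    def metrics(self):
        """Return both counter families as a flat name -> value mapping."""
        out = {}
        for f in FIELDS:
            out[f"LastContainerReport{f}"] = self.last_report.get(f, 0)
            out[f"ContainerReport{f}"] = sum(
                totals[f] for totals in self.latest_by_datanode.values()
            )
        return out


stats = ContainerReportStats()
stats.on_report("dn-1", [{"Size": 100, "Used": 40, "KeyCount": 3},
                         {"Size": 100, "Used": 10, "KeyCount": 1}])
stats.on_report("dn-2", [{"Size": 200, "Used": 50, "KeyCount": 5}])
snapshot = stats.metrics()
print(snapshot["LastContainerReportSize"], snapshot["ContainerReportSize"])
```

In this example the LastContainerReport* values reflect only the report from "dn-2", while the ContainerReport* values cover both datanodes.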

Key Space Metrics

The metrics for various Ozone Manager operations in HDFS Ozone.

The Ozone Manager (OM) is a service similar to the Namenode in HDFS. In the current design, OM maintains the metadata of all volumes, buckets and keys. These metrics are only available when Ozone is enabled.

The following counters are maintained for each key space operation.

Total number of operations - We maintain an array which counts how many times a specific operation has been performed. E.g., NumVolumeCreate tells us how many times create volume has been invoked in OM.

Total number of failed operations - The counterpart of the counter above: it counts how many times a specific operation has failed. E.g., NumVolumeCreateFails tells us how many times create volume has failed in OM.
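The relationship between the two counters can be sketched as follows: every invocation increments the Num counter, and only invocations that raise an error also increment the matching Fails counter. This is an illustrative model, not the OM implementation. The `KeySpaceCounters` class and its `run` method are hypothetical.

```python
from collections import Counter


class KeySpaceCounters:
    """Count invocations and failures per key space operation (sketch)."""

    def __init__(self):
        self.values = Counter()

    def run(self, operation, action):
        """Invoke action() and update the counters for the given operation."""
        self.values[f"Num{operation}"] += 1
        try:
            return action()
        except Exception:
            # A failed call is counted in both the total and the Fails counter.
            self.values[f"Num{operation}Fails"] += 1
            raise


def failing_create():
    raise IOError("volume already exists")


counters = KeySpaceCounters()
counters.run("VolumeCreate", lambda: "vol1")
try:
    counters.run("VolumeCreate", failing_create)
except IOError:
    pass

print(counters.values["NumVolumeCreate"], counters.values["NumVolumeCreateFails"])
```

Under this model the number of successful calls is the difference between the two counters.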

The following counters are maintained for the key space operations.

| Name | Description |
|------|-------------|
| VolumeCreate | Create volume operation |
| VolumeUpdates | Update volume property operation |
| VolumeInfos | Get volume information operation |
| VolumeCheckAccesses | Check volume access operation |
| VolumeDeletes | Delete volume operation |
| VolumeLists | List volume operation |
| BucketCreates | Create bucket operation |
| BucketInfos | Get bucket information operation |
| BucketUpdates | Update bucket property operation |
| BucketDeletes | Delete bucket operation |
| BucketLists | List bucket operation |
| KeyAllocate | Allocate key operation |
| KeyLookup | Look up key operation |
| KeyDeletes | Delete key operation |
| KeyLists | List key operation |