mirror of https://github.com/apache/druid.git (synced 2025-02-17 07:25:02 +00:00)
Minor processor quota computation fix + docs (#11783)
* cpu/cpuset cgroup and procfs data gathering
* Renames and default values
* Formatting
* Trigger Build
* Add cgroup monitors
* Return 0 if no period
* Update
* Minor processor quota computation fix + docs
* Address comments
* Address comments
* Fix spellcheck

Co-authored-by: arunramani-imply <84351090+arunramani-imply@users.noreply.github.com>
parent 42e44269be
commit b6b42d3936
@@ -65,12 +65,24 @@ public class CgroupCpuMonitor extends FeedDefiningMonitor
     emitter.emit(builder.build("cgroup/cpu/shares", cpuSnapshot.getShares()));
     emitter.emit(builder.build(
         "cgroup/cpu/cores_quota",
-        cpuSnapshot.getPeriodUs() == 0
-        ? 0
-        : ((double) cpuSnapshot.getQuotaUs()
-        ) / cpuSnapshot.getPeriodUs()
+        computeProcessorQuota(cpuSnapshot.getQuotaUs(), cpuSnapshot.getPeriodUs())
     ));

     return true;
   }
+
+  /**
+   * Calculates the total cores allocated through quotas. A negative value indicates that no quota has been specified.
+   * We use -1 because that's the default value used in the cgroup.
+   *
+   * @param quotaUs  the cgroup quota value.
+   * @param periodUs the cgroup period value.
+   * @return the calculated processor quota, -1 if no quota or period set.
+   */
+  public static double computeProcessorQuota(long quotaUs, long periodUs)
+  {
+    return quotaUs < 0 || periodUs == 0
+           ? -1
+           : (double) quotaUs / periodUs;
+  }
 }
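The quota rule introduced by `computeProcessorQuota` above can be tried outside Druid with a small sketch. This is not the monitor's actual file-reading code: the class name `CgroupQuotaSketch` and the helpers `readLong` and `coresQuota` are hypothetical, and a temporary directory stands in for a cgroup v1 `cpu` controller directory under `/sys/fs/cgroup`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CgroupQuotaSketch
{
  // Same rule as in the diff: a negative quota or a zero period means "no quota", reported as -1.
  public static double computeProcessorQuota(long quotaUs, long periodUs)
  {
    return quotaUs < 0 || periodUs == 0 ? -1 : (double) quotaUs / periodUs;
  }

  // Hypothetical helper: reads one long value from a control file, ignoring the trailing newline.
  public static long readLong(Path file) throws IOException
  {
    return Long.parseLong(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
  }

  // Hypothetical helper: cpuDir is assumed to contain cpu.cfs_quota_us and cpu.cfs_period_us.
  public static double coresQuota(Path cpuDir) throws IOException
  {
    long quotaUs = readLong(cpuDir.resolve("cpu.cfs_quota_us"));
    long periodUs = readLong(cpuDir.resolve("cpu.cfs_period_us"));
    return computeProcessorQuota(quotaUs, periodUs);
  }

  public static void main(String[] args) throws IOException
  {
    // Fake cgroup directory so the sketch runs on any machine.
    Path dir = Files.createTempDirectory("cgroup-cpu");
    Files.write(dir.resolve("cpu.cfs_quota_us"), "150000\n".getBytes(StandardCharsets.UTF_8));
    Files.write(dir.resolve("cpu.cfs_period_us"), "100000\n".getBytes(StandardCharsets.UTF_8));
    System.out.println(coresQuota(dir)); // 150000 / 100000, prints 1.5
  }
}
```

Returning -1 rather than 0 for the "no quota" case keeps an unlimited cgroup distinguishable from one whose quota is explicitly 0.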
@@ -79,4 +79,14 @@ public class CgroupCpuMonitorTest
     Assert.assertEquals("cgroup/cpu/cores_quota", coresEvent.get("metric"));
     Assert.assertEquals(3.0D, coresEvent.get("value"));
   }
+
+  @Test
+  public void testQuotaCompute()
+  {
+    Assert.assertEquals(-1, CgroupCpuMonitor.computeProcessorQuota(-1, 100000), 0);
+    Assert.assertEquals(0, CgroupCpuMonitor.computeProcessorQuota(0, 100000), 0);
+    Assert.assertEquals(-1, CgroupCpuMonitor.computeProcessorQuota(100000, 0), 0);
+    Assert.assertEquals(2.0D, CgroupCpuMonitor.computeProcessorQuota(200000, 100000), 0);
+    Assert.assertEquals(0.5D, CgroupCpuMonitor.computeProcessorQuota(50000, 100000), 0);
+  }
 }
@@ -362,12 +362,15 @@ The following monitors are available:
 |----|-----------|
 |`org.apache.druid.client.cache.CacheMonitor`|Emits metrics (to logs) about the segment results cache for Historical and Broker processes. Reports typical cache statistics, including hits, misses, rates, and size (bytes and number of entries), as well as timeouts and errors.|
 |`org.apache.druid.java.util.metrics.SysMonitor`|Reports on various system activities and statuses using the [SIGAR library](https://github.com/hyperic/sigar). Requires execute privileges on files in `java.io.tmpdir`. Do not set `java.io.tmpdir` to `noexec` when using `SysMonitor`.|
-|`org.apache.druid.server.metrics.HistoricalMetricsMonitor`|Reports statistics on Historical processes. Available only on Historical processes.|
 |`org.apache.druid.java.util.metrics.JvmMonitor`|Reports various JVM-related statistics.|
 |`org.apache.druid.java.util.metrics.JvmCpuMonitor`|Reports statistics of CPU consumption by the JVM.|
 |`org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor`|Reports consumed CPU as per the cpuacct cgroup.|
 |`org.apache.druid.java.util.metrics.JvmThreadsMonitor`|Reports Thread statistics in the JVM, like numbers of total, daemon, started, died threads.|
+|`org.apache.druid.java.util.metrics.CgroupCpuMonitor`|Reports CPU shares and quotas as per the `cpu` cgroup.|
+|`org.apache.druid.java.util.metrics.CgroupCpuSetMonitor`|Reports CPU core/HT and memory node allocations as per the `cpuset` cgroup.|
+|`org.apache.druid.java.util.metrics.CgroupMemoryMonitor`|Reports memory statistics as per the `memory` cgroup.|
 |`org.apache.druid.server.metrics.EventReceiverFirehoseMonitor`|Reports how many events have been queued in the EventReceiverFirehose.|
+|`org.apache.druid.server.metrics.HistoricalMetricsMonitor`|Reports statistics on Historical processes. Available only on Historical processes.|
 |`org.apache.druid.server.metrics.QueryCountStatsMonitor`|Reports how many queries have been successful/failed/interrupted.|
 |`org.apache.druid.server.emitter.HttpEmittingMonitor`|Reports internal metrics of `http` or `parametrized` emitter (see below). Must not be used with another emitter type. See the description of the metrics here: https://github.com/apache/druid/pull/4973.|
 |`org.apache.druid.server.metrics.TaskCountStatsMonitor`|Reports how many ingestion tasks are currently running/pending/waiting and also the number of successful/failed tasks per emission period.|
@@ -325,8 +325,8 @@ These metrics are only available if the SysMonitor module is included.
 |`sys/swap/pageOut`|Paged out swap.||Varies.|
 |`sys/disk/write/count`|Writes to disk.|fsDevName, fsDirName, fsTypeName, fsSysTypeName, fsOptions.|Varies.|
 |`sys/disk/read/count`|Reads from disk.|fsDevName, fsDirName, fsTypeName, fsSysTypeName, fsOptions.|Varies.|
-|`sys/disk/write/size`|Bytes written to disk. Can we used to determine how much paging is occurring with regards to segments.|fsDevName, fsDirName, fsTypeName, fsSysTypeName, fsOptions.|Varies.|
-|`sys/disk/read/size`|Bytes read from disk. Can we used to determine how much paging is occurring with regards to segments.|fsDevName, fsDirName, fsTypeName, fsSysTypeName, fsOptions.|Varies.|
+|`sys/disk/write/size`|Bytes written to disk. One indicator of the amount of paging occurring for segments.|`fsDevName`, `fsDirName`, `fsTypeName`, `fsSysTypeName`, `fsOptions`.|Varies.|
+|`sys/disk/read/size`|Bytes read from disk. One indicator of the amount of paging occurring for segments.|`fsDevName`, `fsDirName`, `fsTypeName`, `fsSysTypeName`, `fsOptions`.|Varies.|
 |`sys/net/write/size`|Bytes written to the network.|netName, netAddress, netHwaddr|Varies.|
 |`sys/net/read/size`|Bytes read from the network.|netName, netAddress, netHwaddr|Varies.|
 |`sys/fs/used`|Filesystem bytes used.|fsDevName, fsDirName, fsTypeName, fsSysTypeName, fsOptions.|< max|
@@ -336,3 +336,17 @@ These metrics are only available if the SysMonitor module is included.
 |`sys/storage/used`|Disk space used.|fsDirName.|Varies.|
 |`sys/cpu`|CPU used.|cpuName, cpuTime.|Varies.|

+## Cgroup
+
+These metrics are available on operating systems with the cgroup kernel feature. All the values are derived by reading from `/sys/fs/cgroup`.
+
+|Metric|Description|Dimensions|Normal Value|
+|------|-----------|----------|------------|
+|`cgroup/cpu/shares`|Relative value of CPU time available to this process. Read from `cpu.shares`.||Varies.|
+|`cgroup/cpu/cores_quota`|Number of cores available to this process. Derived from `cpu.cfs_quota_us`/`cpu.cfs_period_us`.||Varies. A value of -1 indicates there is no explicit quota set.|
+|`cgroup/memory/*`|Memory stats for this process (e.g. `cache`, `total_swap`, etc.). Each stat produces a separate metric. Read from `memory.stat`.||Varies.|
+|`cgroup/memory_numa/*/pages`|Memory stats, per NUMA node, for this process (e.g. `total`, `unevictable`, etc.). Each stat produces a separate metric. Read from `memory.numa_stat`.|`numaZone`|Varies.|
+|`cgroup/cpuset/cpu_count`|Total number of CPUs available to the process. Derived from `cpuset.cpus`.||Varies.|
+|`cgroup/cpuset/effective_cpu_count`|Total number of active CPUs available to the process. Derived from `cpuset.effective_cpus`.||Varies.|
+|`cgroup/cpuset/mems_count`|Total number of memory nodes available to the process. Derived from `cpuset.mems`.||Varies.|
+|`cgroup/cpuset/effective_mems_count`|Total number of active memory nodes available to the process. Derived from `cpuset.effective_mems`.||Varies.|
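The `cgroup/cpuset/*_count` metrics in the table above are described as derived from files such as `cpuset.cpus`, which the kernel writes in list format (for example `0-3,8`). As a rough illustration of how such a list can be turned into a count, here is a minimal sketch. It is an assumption about the derivation, not Druid's implementation: the class `CpuSetCountSketch` and the method `countIds` are hypothetical, and the input is assumed to contain only non-overlapping IDs and ranges.

```java
public class CpuSetCountSketch
{
  // Hypothetical helper: counts the IDs in a cpuset list such as "0-3,8,10-11".
  // Assumes well-formed input with non-overlapping entries; an empty list counts as 0.
  public static int countIds(String list)
  {
    String trimmed = list.trim();
    if (trimmed.isEmpty()) {
      return 0;
    }
    int count = 0;
    for (String part : trimmed.split(",")) {
      // Each entry is either a single ID ("8") or an inclusive range ("0-3").
      String[] bounds = part.trim().split("-");
      int start = Integer.parseInt(bounds[0]);
      int end = bounds.length > 1 ? Integer.parseInt(bounds[1]) : start;
      count += end - start + 1;
    }
    return count;
  }

  public static void main(String[] args)
  {
    System.out.println(countIds("0-3,8,10-11")); // 4 + 1 + 2, prints 7
  }
}
```

The same parsing would apply to `cpuset.mems`, since memory node lists use the same list format.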
@@ -1713,6 +1713,7 @@ LoggingEmitter
 Los_Angeles
 MDC
 NoopServiceEmitter
+NUMA
 ONLY_EVENTS
 P1D
 P1W