---
id: statsd
title: "StatsD Emitter"
---
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `statsd-emitter` in the extensions load list.
## Introduction
This extension emits Druid metrics to a StatsD-compatible server, such as [StatsD](https://github.com/etsy/statsd) or [statsite](https://github.com/armon/statsite).
## Configuration
All the configuration parameters for the StatsD emitter are under `druid.emitter.statsd`.
|property|description|required?|default|
|--------|-----------|---------|-------|
|`druid.emitter.statsd.hostname`|The hostname of the StatsD server.|yes|none|
|`druid.emitter.statsd.port`|The port of the StatsD server.|yes|none|
|`druid.emitter.statsd.prefix`|Optional metric name prefix.|no|""|
|`druid.emitter.statsd.separator`|Metric name separator.|no|.|
|`druid.emitter.statsd.includeHost`|Flag to include the hostname as part of the metric name.|no|false|
|`druid.emitter.statsd.dimensionMapPath`|Path to a JSON file defining the StatsD type and desired dimensions for every Druid metric.|no|Default mapping provided. See below.|
|`druid.emitter.statsd.blankHolder`|Replacement for blank characters, because StatsD does not support blank characters in metric paths.|no|"-"|
|`druid.emitter.statsd.queueSize`|Maximum number of unprocessed messages in the message queue.|no|Default value of the StatsD client (4096)|
|`druid.emitter.statsd.poolSize`|Network packet buffer pool size.|no|Default value of the StatsD client (512)|
|`druid.emitter.statsd.processorWorkers`|The number of processor worker threads assembling buffers for submission.|no|Default value of the StatsD client (1)|
|`druid.emitter.statsd.senderWorkers`|The number of sender worker threads submitting buffers to the socket.|no|Default value of the StatsD client (1)|
|`druid.emitter.statsd.dogstatsd`|Flag to enable [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/) support. Causes dimensions to be included as tags, not as a part of the metric name. `convertRange` fields will be ignored.|no|false|
|`druid.emitter.statsd.dogstatsdConstantTags`|If `druid.emitter.statsd.dogstatsd` is true, the tags in the JSON list of strings will be sent with every event.|no|[]|
|`druid.emitter.statsd.dogstatsdServiceAsTag`|If `druid.emitter.statsd.dogstatsd` and `druid.emitter.statsd.dogstatsdServiceAsTag` are both true, the Druid service (e.g. `druid/broker`, `druid/coordinator`) is reported as a tag (e.g. `druid_service:druid/broker`) instead of being included in the metric name (e.g. `druid.broker.query.time`), and `druid` is used as the metric prefix (e.g. `druid.query.time`).|no|false|
|`druid.emitter.statsd.dogstatsdEvents`|If `druid.emitter.statsd.dogstatsd` and `druid.emitter.statsd.dogstatsdEvents` are true, [Alert events](../../operations/alerts.md) are reported to DogStatsD.|no|false|
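
The effect of the `dogstatsd` and `dogstatsdServiceAsTag` flags on metric names can be illustrated with a small Python sketch. This is not the emitter's implementation: the function `build_metric`, its parameters, and the ordering of name components are assumptions made for illustration, and the `prefix` and `includeHost` options are left out.

```python
def build_metric(service, metric, dims, *, separator=".", blank_holder="-",
                 dogstatsd=False, service_as_tag=False):
    """Return (metric_name, tags) for one Druid metric event.

    Hypothetical helper: component ordering is an assumption, not taken
    from the emitter source.
    """
    tags = []
    if dogstatsd and service_as_tag:
        # The service becomes a tag and "druid" is used as the metric prefix.
        parts = ["druid", metric]
        tags.append(f"druid_service:{service}")
    else:
        # The service stays part of the metric name.
        parts = [service, metric]

    if dogstatsd:
        # Dimensions are sent as key:value tags.
        tags.extend(f"{key}:{value}" for key, value in dims.items())
    else:
        # Dimensions are encoded into the metric name (assumed to come last).
        parts.extend(dims.values())

    name = separator.join(parts)
    # Slashes become separators and blanks become the blank holder.
    name = name.replace("/", separator).replace(" ", blank_holder)
    return name, tags


dims = {"dataSource": "wiki"}
print(build_metric("druid/broker", "query/time", dims))
print(build_metric("druid/broker", "query/time", dims, dogstatsd=True))
print(build_metric("druid/broker", "query/time", dims,
                   dogstatsd=True, service_as_tag=True))
```

With plain StatsD the dimension value ends up in the name, while the two DogStatsD variants keep the name short and move the dimension, and optionally the service, into tags.
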
### Druid to StatsD Event Converter
Each metric sent to StatsD must specify a type, one of `[timer, counter, gauge]`. The StatsD emitter expects this mapping to
be provided as a JSON file. The mapping also specifies which dimensions should be included for each metric.
StatsD expects metric values to be integers, but Druid emits some metrics with values in the range 0 to 1. To accommodate these metrics, they are converted
into the range 0 to 100. This conversion is enabled by setting the optional `convertRange` field to `true` in the JSON mapping file.
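
The range conversion described above can be sketched in Python as follows. The function names are hypothetical, and rounding to the nearest integer is an assumption; the emitter may truncate or round differently. The sketch also reflects the documented rule that `convertRange` is ignored when DogStatsD support is enabled.

```python
def convert_range(value):
    """Scale a value from the range 0..1 into the integer range 0..100.

    Rounding to the nearest integer is an assumption for this sketch.
    """
    return round(value * 100)


def statsd_value(value, convert_range_enabled=False, dogstatsd=False):
    """Return the value to send for a metric (illustrative only)."""
    if convert_range_enabled and not dogstatsd:
        return convert_range(value)
    # convertRange is ignored in DogStatsD mode, or simply not requested.
    return value


print(statsd_value(0.25, convert_range_enabled=True))
print(statsd_value(0.25, convert_range_enabled=True, dogstatsd=True))
```
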
If the user does not specify their own JSON file, a default mapping is used. All
metrics are expected to be mapped. Metrics which are not mapped will log an error.
Each entry in the mapping file uses the following schema:
`<druid metric name> : { "dimensions" : <dimension list>, "type" : <StatsD type>, "convertRange" : true/false}`
e.g.
`"query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer"}`
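
Before pointing `druid.emitter.statsd.dimensionMapPath` at a custom file, it can help to check that every entry follows the schema above. The following Python sketch does that under stated assumptions: `validate_mapping` is a hypothetical helper, `example/hitRate` is a made-up metric name, and treating `dimensions` as a required list is an assumption based on the schema shown, not on the emitter's parser.

```python
import json

ALLOWED_TYPES = {"timer", "counter", "gauge"}


def validate_mapping(mapping):
    """Return a list of problems; an empty list means the mapping looks well-formed."""
    problems = []
    for metric, spec in mapping.items():
        if spec.get("type") not in ALLOWED_TYPES:
            problems.append(f"{metric}: type must be one of {sorted(ALLOWED_TYPES)}")
        if not isinstance(spec.get("dimensions"), list):
            problems.append(f"{metric}: dimensions must be a list")
        # convertRange is optional and treated as false when absent.
        if not isinstance(spec.get("convertRange", False), bool):
            problems.append(f"{metric}: convertRange must be true or false")
    return problems


# "example/hitRate" is a hypothetical metric used only for illustration.
MAPPING_JSON = """
{
  "query/time": {"dimensions": ["dataSource", "type"], "type": "timer"},
  "example/hitRate": {"dimensions": [], "type": "gauge", "convertRange": true}
}
"""

mapping = json.loads(MAPPING_JSON)
print(validate_mapping(mapping))
```

A misspelled type such as `guage` would be reported as a problem instead of surfacing later as an unmapped or misrouted metric.
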
For metrics which are emitted from multiple services with different dimensions, the metric name is prefixed with
the service name.
e.g.
`"druid/coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
"druid/historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" }`
For most use-cases, the default mapping is sufficient.
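
The service-prefixed keys described above suggest a two-step lookup, sketched here in Python. The helper `find_mapping` and the lookup order (plain metric name first, then `<service>-<metric>`) are assumptions for illustration; only the key format comes from the examples in this document.

```python
def find_mapping(mapping, service, metric):
    """Look up a metric by plain name, then by '<service>-<metric>'.

    Returns None when the metric is unmapped. The lookup order is an
    assumption made for this sketch.
    """
    if metric in mapping:
        return mapping[metric]
    return mapping.get(f"{service}-{metric}")


MAPPING = {
    "query/time": {"dimensions": ["dataSource", "type"], "type": "timer"},
    "druid/coordinator-segment/count": {
        "dimensions": ["dataSource"], "type": "gauge"},
    "druid/historical-segment/count": {
        "dimensions": ["dataSource", "tier", "priority"], "type": "gauge"},
}

print(find_mapping(MAPPING, "druid/historical", "segment/count"))
print(find_mapping(MAPPING, "druid/broker", "segment/count"))
```

The same metric name, `segment/count`, resolves to different dimension lists depending on the emitting service, and a service without an entry gets no mapping at all.
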