mirror of https://github.com/apache/druid.git
API reference refactor (#14372)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>
parent
fc08617e9e
commit
579b93f282
|
@ -1,7 +1,7 @@
|
||||||
---
|
---
|
||||||
id: api-reference
|
id: api-reference
|
||||||
title: HTTP API endpoints reference
|
title: API reference
|
||||||
sidebar_label: API endpoints reference
|
sidebar_label: Overview
|
||||||
---
|
---
|
||||||
|
|
||||||
<!--
|
<!--
|
||||||
|
@ -24,864 +24,21 @@ sidebar_label: API endpoints reference
|
||||||
-->
|
-->
|
||||||
|
|
||||||
|
|
||||||
This topic documents all of the API endpoints for each Druid service type.
|
This topic is an index to the Apache Druid API documentation.
|
||||||
|
|
||||||
## Common
|
## HTTP APIs
|
||||||
|
* [Druid SQL queries](./sql-api.md) to submit SQL queries using the Druid SQL API.
|
||||||
All processes support the following endpoints.
|
* [SQL-based ingestion](./sql-ingestion-api.md) to submit SQL-based batch ingestion requests.
|
||||||
|
* [JSON querying](./json-querying-api.md) to submit JSON-based native queries.
|
||||||
### Process information
|
* [Tasks](./tasks-api.md) to manage data ingestion operations.
|
||||||
|
* [Supervisors](./supervisor-api.md) to manage supervisors for data ingestion lifecycle and data processing.
|
||||||
`GET /status`
|
* [Retention rules](./retention-rules-api.md) to define and manage data retention rules across datasources.
|
||||||
|
* [Data management](./data-management-api.md) to manage data segments.
|
||||||
Returns the Druid version, loaded extensions, memory used, total memory, and other useful information about the process.
|
* [Automatic compaction](./automatic-compaction-api.md) to optimize segment sizes after ingestion.
|
||||||
|
* [Lookups](./lookups-api.md) to manage and modify key-value datasources.
|
||||||
`GET /status/health`
|
* [Service status](./service-status-api.md) to monitor components within the Druid cluster.
|
||||||
|
* [Dynamic configuration](./dynamic-configuration-api.md) to configure the behavior of the Coordinator and Overlord processes.
|
||||||
Always returns a boolean `true` value with a 200 OK response, useful for automated health checks.
|
* [Legacy metadata](./legacy-metadata-api.md) to retrieve datasource metadata.
|
||||||
|
|
||||||
`GET /status/properties`
|
## Java APIs
|
||||||
|
* [SQL JDBC driver](./sql-jdbc.md) to connect to Druid and make Druid SQL queries using the Avatica JDBC driver.
|
||||||
Returns the current configuration properties of the process.
|
|
||||||
|
|
||||||
`GET /status/selfDiscovered/status`
|
|
||||||
|
|
||||||
Returns a JSON map of the form `{"selfDiscovered": true/false}`, indicating whether the node has received a confirmation
|
|
||||||
from the central node discovery mechanism (currently ZooKeeper) of the Druid cluster that the node has been added to the
|
|
||||||
cluster. It is recommended to not consider a Druid node "healthy" or "ready" in automated deployment/container
|
|
||||||
management systems until it returns `{"selfDiscovered": true}` from this endpoint. This is because a node may be
|
|
||||||
isolated from the rest of the cluster due to network issues and it doesn't make sense to consider nodes "healthy" in
|
|
||||||
this case. Also, when nodes such as Brokers use ZooKeeper segment discovery for building their view of the Druid cluster
|
|
||||||
(as opposed to HTTP segment discovery), they may be unusable until the ZooKeeper client is fully initialized and starts
|
|
||||||
to receive data from the ZooKeeper cluster. `{"selfDiscovered": true}` is a proxy event indicating that the ZooKeeper
|
|
||||||
client on the node has started to receive data from the ZooKeeper cluster and it's expected that all segments and other
|
|
||||||
nodes will be discovered by this node in a timely manner from this point on.
|
|
||||||
|
|
||||||
`GET /status/selfDiscovered`
|
|
||||||
|
|
||||||
Similar to `/status/selfDiscovered/status`, but returns 200 OK response with empty body if the node has discovered itself
|
|
||||||
and 503 SERVICE UNAVAILABLE if the node hasn't discovered itself yet. This endpoint might be useful because some
|
|
||||||
monitoring checks such as AWS load balancer health checks are not able to look at the response body.
|
|
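As a minimal sketch of how a readiness probe could interpret the two self-discovery endpoints described above: the helper names below are hypothetical and not part of Druid, and the actual HTTP call is left to whatever client the deployment already uses.

```python
import json


def is_self_discovered(body: str) -> bool:
    """Interpret the body of GET /status/selfDiscovered/status.

    Returns True only when the JSON map is {"selfDiscovered": true}.
    Anything unparseable is treated as "not ready" (an assumption of this sketch).
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("selfDiscovered") is True


def is_ready_from_status_code(status_code: int) -> bool:
    """Interpret GET /status/selfDiscovered, which signals only through the status code.

    200 means the node has discovered itself; 503 (or anything else) is treated as not ready.
    """
    return status_code == 200
```

The second helper suits checks that cannot read a response body, such as the load balancer health checks mentioned above.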
||||||
|
|
||||||
## Master server
|
|
||||||
|
|
||||||
This section documents the API endpoints for the processes that reside on Master servers (Coordinators and Overlords)
|
|
||||||
in the suggested [three-server configuration](../design/processes.md#server-types).
|
|
||||||
|
|
||||||
### Coordinator
|
|
||||||
|
|
||||||
#### Leadership
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/leader`
|
|
||||||
|
|
||||||
Returns the current leader Coordinator of the cluster.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/isLeader`
|
|
||||||
|
|
||||||
Returns a JSON object with a `leader` field, either true or false, indicating whether this server is the current leader
|
|
||||||
Coordinator of the cluster. In addition, returns HTTP 200 if the server is the current leader and HTTP 404 if not.
|
|
||||||
This is suitable for use as a load balancer status check if you only want the active leader to be considered in-service
|
|
||||||
at the load balancer.
|
|
||||||
|
|
||||||
|
|
||||||
<a name="coordinator-segment-loading"></a>
|
|
||||||
|
|
||||||
#### Segment loading
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadstatus`
|
|
||||||
|
|
||||||
Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadstatus?simple`
|
|
||||||
|
|
||||||
Returns the number of segments left to load until segments that should be loaded in the cluster are available for queries. This does not include segment replication counts.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadstatus?full`
|
|
||||||
|
|
||||||
Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available. This includes segment replication counts.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadstatus?full&computeUsingClusterView`
|
|
||||||
|
|
||||||
Returns the number of segments not yet loaded for each tier until all segments loading in the cluster are available.
|
|
||||||
The result includes segment replication counts. It also factors in the number of available nodes that are of a service type that can load the segment when computing the number of segments remaining to load.
|
|
||||||
A segment is considered fully loaded when:
|
|
||||||
- Druid has replicated it the number of times configured in the corresponding load rule.
|
|
||||||
- Or the number of replicas for the segment in each tier where it is configured to be replicated equals the available nodes of a service type that are currently allowed to load the segment in the tier.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadqueue`
|
|
||||||
|
|
||||||
Returns the ids of segments to load and drop for each Historical process.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadqueue?simple`
|
|
||||||
|
|
||||||
Returns the number of segments to load and drop, as well as the total segment load and drop size in bytes for each Historical process.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/loadqueue?full`
|
|
||||||
|
|
||||||
Returns the serialized JSON of segments to load and drop for each Historical process.
|
|
||||||
|
|
||||||
#### Segment loading by datasource
|
|
||||||
|
|
||||||
Note that all _interval_ query parameters are ISO 8601 strings—for example, 2016-06-27/2016-06-28.
|
|
||||||
Also note that these APIs only guarantee that the segments are available at the time of the call.
|
|
||||||
Segments can still become missing afterward because of Historical process failures or for other reasons.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?forceMetadataRefresh={boolean}&interval={myInterval}`
|
|
||||||
|
|
||||||
Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster for the given
|
|
||||||
datasource over the given interval (or last 2 weeks if interval is not given). `forceMetadataRefresh` is required to be set.
|
|
||||||
* Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
|
|
||||||
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
|
|
||||||
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
|
|
||||||
* Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
|
|
||||||
If no used segments are found for the given inputs, this API returns `204 No Content`.
|
|
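The query string for the per-datasource load status endpoints can be assembled as in the sketch below. The function name and its parameters are illustrative assumptions; only the path, `forceMetadataRefresh`, `interval`, and the `simple`/`full` flags come from the text above.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def datasource_loadstatus_path(
    datasource: str,
    force_metadata_refresh: bool,
    interval: Optional[str] = None,
    mode: Optional[str] = None,
) -> str:
    """Build the path and query for the datasource loadstatus endpoint.

    force_metadata_refresh is mandatory because the API requires it to be set.
    mode may be None, "simple", or "full" (value-less flags in the query string).
    """
    if mode not in (None, "simple", "full"):
        raise ValueError("mode must be None, 'simple', or 'full'")
    params = {"forceMetadataRefresh": "true" if force_metadata_refresh else "false"}
    if interval is not None:
        # The "/" in an ISO 8601 interval is percent-encoded by urlencode.
        params["interval"] = interval
    query = urlencode(params)
    if mode:
        query = f"{mode}&{query}"
    return f"/druid/coordinator/v1/datasources/{quote(datasource, safe='')}/loadstatus?{query}"
```

A caller should also be prepared for a `204 No Content` response, which carries no body to parse.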
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?simple&forceMetadataRefresh={boolean}&interval={myInterval}`
|
|
||||||
|
|
||||||
Returns the number of segments left to load until segments that should be loaded in the cluster are available for the given datasource
|
|
||||||
over the given interval (or last 2 weeks if interval is not given). This does not include segment replication counts. `forceMetadataRefresh` is required to be set.
|
|
||||||
* Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
|
|
||||||
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
|
|
||||||
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
|
|
||||||
* Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
|
|
||||||
If no used segments are found for the given inputs, this API returns `204 No Content`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?full&forceMetadataRefresh={boolean}&interval={myInterval}`
|
|
||||||
|
|
||||||
Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available for the given datasource over the given interval (or last 2 weeks if interval is not given). This includes segment replication counts. `forceMetadataRefresh` is required to be set.
|
|
||||||
* Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
|
|
||||||
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
|
|
||||||
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
|
|
||||||
* Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
|
|
||||||
|
|
||||||
You can pass the optional query parameter `computeUsingClusterView` to factor in the available cluster services when calculating
|
|
||||||
the segments left to load. See [Coordinator Segment Loading](#coordinator-segment-loading) for details.
|
|
||||||
If no used segments are found for the given inputs, this API returns `204 No Content`.
|
|
||||||
|
|
||||||
#### Metadata store information
|
|
||||||
|
|
||||||
> Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
|
|
||||||
> [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) table.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/segments`
|
|
||||||
|
|
||||||
Returns a list of all segments for each datasource enabled in the cluster.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/segments?datasources={dataSourceName1}&datasources={dataSourceName2}`
|
|
||||||
|
|
||||||
Returns a list of all segments for one or more specific datasources enabled in the cluster.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus`
|
|
||||||
|
|
||||||
Returns a list of all segments for each datasource with the full segment metadata and extra fields `overshadowed` and `replicationFactor`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&datasources={dataSourceName1}&datasources={dataSourceName2}`
|
|
||||||
|
|
||||||
Returns a list of all segments for one or more specific datasources with the full segment metadata and extra fields `overshadowed` and `replicationFactor`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources`
|
|
||||||
|
|
||||||
Returns a list of the names of datasources with at least one used segment in the cluster, retrieved from the metadata database. Users should call this API to get the eventual state that the system will be in.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources?includeUnused`
|
|
||||||
|
|
||||||
Returns a list of the names of datasources, regardless of whether there are used segments belonging to those datasources in the cluster or not.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources?includeDisabled`
|
|
||||||
|
|
||||||
Returns a list of the names of datasources, regardless of whether the datasource is disabled or not.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources?full`
|
|
||||||
|
|
||||||
Returns a list of all datasources with at least one used segment in the cluster. Returns all metadata about those datasources as stored in the metadata store.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}`
|
|
||||||
|
|
||||||
Returns full metadata for a datasource as stored in the metadata store.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments`
|
|
||||||
|
|
||||||
Returns a list of all segments for a datasource as stored in the metadata store.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments?full`
|
|
||||||
|
|
||||||
Returns a list of all segments for a datasource with the full segment metadata as stored in the metadata store.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments/{segmentId}`
|
|
||||||
|
|
||||||
Returns full segment metadata for a specific segment as stored in the metadata store, if the segment is used. If the
|
|
||||||
segment is unused, or is unknown, a 404 response is returned.
|
|
||||||
|
|
||||||
##### POST
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments`
|
|
||||||
|
|
||||||
Returns a list of all segments, overlapping with any of the given intervals, for a datasource as stored in the metadata store. The request body is an array of ISO 8601 interval strings like `[interval1, interval2,...]`, for example, `["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]`.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments?full`
|
|
||||||
|
|
||||||
Returns a list of all segments, overlapping with any of the given intervals, for a datasource with the full segment metadata as stored in the metadata store. The request body is an array of ISO 8601 interval strings like `[interval1, interval2,...]`, for example, `["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]`.
|
|
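The interval-filtered segment listing under the POST heading takes a JSON array of interval strings as its body. The sketch below builds such a request with the standard library. The helper name is hypothetical, the base URL is whatever address the Coordinator listens on, and the `Content-Type` header is an assumption of this sketch. The request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Iterable


def segments_by_intervals_request(
    base_url: str, datasource: str, intervals: Iterable[str], full: bool = False
) -> urllib.request.Request:
    """Build a POST request whose body is a JSON array of ISO 8601 intervals."""
    path = f"/druid/coordinator/v1/metadata/datasources/{datasource}/segments"
    if full:
        path += "?full"  # ask for full segment metadata
    body = json.dumps(list(intervals)).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.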
||||||
|
|
||||||
<a name="coordinator-datasources"></a>
|
|
||||||
|
|
||||||
#### Datasources
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`—for example, `2016-06-27_2016-06-28`.
|
|
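Because these endpoints place the interval in the URL path, the usual `/` separator has to be swapped for `_`. A small illustrative helper (the name is an assumption, not a Druid API):

```python
def interval_for_url_path(interval: str) -> str:
    """Convert "start/end" into the "start_end" form used in URL path parameters."""
    start, sep, end = interval.partition("/")
    if not sep or not start or not end or "/" in end:
        raise ValueError(f"expected an interval of the form start/end, got {interval!r}")
    return f"{start}_{end}"
```

For the example above, `interval_for_url_path("2016-06-27/2016-06-28")` yields `2016-06-27_2016-06-28`.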
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources`
|
|
||||||
|
|
||||||
Returns a list of datasource names found in the cluster as seen by the coordinator. This view is updated every [`druid.coordinator.period`](../configuration/index.md#coordinator-operation).
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources?simple`
|
|
||||||
|
|
||||||
Returns a list of JSON objects containing the name and properties of datasources found in the cluster. Properties include segment count, total segment byte size, replicated total segment byte size, minTime, and maxTime.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources?full`
|
|
||||||
|
|
||||||
Returns a list of datasource names found in the cluster with all metadata about those datasources.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}`
|
|
||||||
|
|
||||||
Returns a JSON object containing the name and properties of a datasource. Properties include segment count, total segment byte size, replicated total segment byte size, minTime, and maxTime.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}?full`
|
|
||||||
|
|
||||||
Returns full metadata for a datasource.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals`
|
|
||||||
|
|
||||||
Returns a set of segment intervals.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals?simple`
|
|
||||||
|
|
||||||
Returns a map of an interval to a JSON object containing the total byte size of segments and number of segments for that interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals?full`
|
|
||||||
|
|
||||||
Returns a map of an interval to a map of segment metadata to a set of server names that contain the segment for that interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}`
|
|
||||||
|
|
||||||
Returns a set of segment ids for an interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}?simple`
|
|
||||||
|
|
||||||
Returns a map of segment intervals contained within the specified interval to a JSON object containing the total byte size of segments and number of segments for an interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}?full`
|
|
||||||
|
|
||||||
Returns a map of segment intervals contained within the specified interval to a map of segment metadata to a set of server names that contain the segment for an interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}/serverview`
|
|
||||||
|
|
||||||
Returns a map of segment intervals contained within the specified interval to information about the servers that contain the segment for an interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments`
|
|
||||||
|
|
||||||
Returns a list of all segments for a datasource in the cluster.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments?full`
|
|
||||||
|
|
||||||
Returns a list of all segments for a datasource in the cluster with the full segment metadata.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
|
|
||||||
|
|
||||||
Returns full segment metadata for a specific segment in the cluster.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/datasources/{dataSourceName}/tiers`
|
|
||||||
|
|
||||||
Returns the tiers that a datasource exists in.
|
|
||||||
|
|
||||||
#### Note for Coordinator's POST and DELETE APIs
|
|
||||||
|
|
||||||
While segments may be enabled by issuing POST requests for the datasources, the Coordinator may again disable segments if they match any configured [drop rules](../operations/rule-configuration.md#drop-rules). Even if segments are enabled by these APIs, you must configure a [load rule](../operations/rule-configuration.md#load-rules) to load them onto Historical processes. If an indexing or kill task runs at the same time these APIs are invoked, the behavior is undefined. Some segments might be killed and others might be enabled. It's also possible that all segments might be disabled, but the indexing task can still read data from those segments and succeed.
|
|
||||||
|
|
||||||
> Avoid using indexing or kill tasks and these APIs at the same time for the same datasource and time chunk.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/datasources/{dataSourceName}`
|
|
||||||
|
|
||||||
Marks as used all segments belonging to a datasource. Returns a JSON object of the form
|
|
||||||
`{"numChangedSegments": <number>}` with the number of segments in the database whose state has been changed (that is,
|
|
||||||
the segments were marked as used) as the result of this API call.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
|
|
||||||
|
|
||||||
Marks as used a segment of a datasource. Returns a JSON object of the form `{"segmentStateChanged": <boolean>}` with
|
|
||||||
the boolean indicating if the state of the segment has been changed (that is, the segment was marked as used) as the
|
|
||||||
result of this API call.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/datasources/{dataSourceName}/markUsed`
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/datasources/{dataSourceName}/markUnused`
|
|
||||||
|
|
||||||
Marks segments (un)used for a datasource by interval or set of segment IDs. When marking segments as used, only segments that are not overshadowed are updated.
|
|
||||||
|
|
||||||
The request payload contains the interval or set of segment IDs to be marked unused.
|
|
||||||
Provide either an interval or a set of segment IDs. If both or neither are provided in the payload, the API returns an error (400 BAD REQUEST).
|
|
||||||
|
|
||||||
The interval specifies the start and end times as ISO 8601 strings in the form `interval=(start/end)`, where both start and end are inclusive. Only segments completely contained within the specified interval are disabled; partially overlapping segments are not affected.
|
|
||||||
|
|
||||||
JSON Request Payload:
|
|
||||||
|
|
||||||
|Key|Description|Example|
|
|
||||||
|----------|-------------|---------|
|
|
||||||
|`interval`|The interval for which to mark segments unused|`"2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z"`|
|
|
||||||
|`segmentIds`|Set of segment IDs to be marked unused|`["segmentId1", "segmentId2"]`|
|
|
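Since the payload must carry exactly one of `interval` or `segmentIds`, a client can enforce that rule before calling the API and avoid the 400 response. This is a sketch under that reading of the text; the function name is hypothetical.

```python
import json
from typing import Iterable, Optional


def mark_segments_payload(
    interval: Optional[str] = None, segment_ids: Optional[Iterable[str]] = None
) -> str:
    """Build the JSON payload for the markUsed / markUnused endpoints.

    Exactly one of interval or segment_ids must be given; otherwise the
    server would answer 400 BAD REQUEST, so fail early on the client side.
    """
    if (interval is None) == (segment_ids is None):
        raise ValueError("provide exactly one of interval or segment_ids")
    if interval is not None:
        return json.dumps({"interval": interval})
    return json.dumps({"segmentIds": list(segment_ids)})
```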
||||||
|
|
||||||
`DELETE /druid/coordinator/v1/datasources/{dataSourceName}`
|
|
||||||
|
|
||||||
Marks as unused all segments belonging to a datasource. Returns a JSON object of the form
|
|
||||||
`{"numChangedSegments": <number>}` with the number of segments in the database whose state has been changed (that is,
|
|
||||||
the segments were marked as unused) as the result of this API call.
|
|
||||||
|
|
||||||
`DELETE /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}`
|
|
||||||
`@Deprecated. /druid/coordinator/v1/datasources/{dataSourceName}?kill=true&interval={myInterval}`
|
|
||||||
|
|
||||||
Runs a [Kill task](../ingestion/tasks.md) for a given interval and datasource.
|
|
||||||
|
|
||||||
`DELETE /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
|
|
||||||
|
|
||||||
Marks as unused a segment of a datasource. Returns a JSON object of the form `{"segmentStateChanged": <boolean>}` with
|
|
||||||
the boolean indicating if the state of the segment has been changed (that is, the segment was marked as unused) as the
|
|
||||||
result of this API call.
|
|
||||||
|
|
||||||
#### Retention rules
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` as in `2016-06-27_2016-06-28`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules`
|
|
||||||
|
|
||||||
Returns all rules as JSON objects for all datasources in the cluster including the default datasource.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules/{dataSourceName}`
|
|
||||||
|
|
||||||
Returns all rules for a specified datasource.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules/{dataSourceName}?full`
|
|
||||||
|
|
||||||
Returns all rules for a specified datasource and includes default datasource.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules/history?interval=<interval>`
|
|
||||||
|
|
||||||
Returns audit history of rules for all datasources. Default value of interval can be specified by setting `druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Coordinator `runtime.properties`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules/history?count=<n>`
|
|
||||||
|
|
||||||
Returns last `n` entries of audit history of rules for all datasources.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules/{dataSourceName}/history?interval=<interval>`
|
|
||||||
|
|
||||||
Returns audit history of rules for a specified datasource. Default value of interval can be specified by setting `druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Coordinator `runtime.properties`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/rules/{dataSourceName}/history?count=<n>`
|
|
||||||
|
|
||||||
Returns last `n` entries of audit history of rules for a specified datasource.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/rules/{dataSourceName}`
|
|
||||||
|
|
||||||
POST with a list of rules in JSON form to update rules.
|
|
||||||
|
|
||||||
Optional Header Parameters for auditing the config change can also be specified.
|
|
||||||
|
|
||||||
|Header Param Name| Description | Default |
|
|
||||||
|----------|-------------|---------|
|
|
||||||
|`X-Druid-Author`| Author making the config change|`""`|
|
|
||||||
|`X-Druid-Comment`| Comment describing the change being done|`""`|
|
|
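A rule update with the optional audit headers could be prepared as follows. The helper and its return shape are assumptions made for illustration; the header names and the endpoint path come from the text above. Headers are only attached when a value is supplied, which matches the documented empty-string defaults.

```python
import json
from typing import Dict, List, Tuple


def rules_update_request(
    datasource: str, rules: List[dict], author: str = "", comment: str = ""
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (path, headers, body) for POST /druid/coordinator/v1/rules/{dataSourceName}."""
    path = f"/druid/coordinator/v1/rules/{datasource}"
    headers = {"Content-Type": "application/json"}  # assumed for a JSON body
    if author:
        headers["X-Druid-Author"] = author
    if comment:
        headers["X-Druid-Comment"] = comment
    body = json.dumps(rules).encode("utf-8")
    return path, headers, body
```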
||||||
|
|
||||||
#### Intervals
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` as in `2016-06-27_2016-06-28`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/intervals`
|
|
||||||
|
|
||||||
Returns all intervals for all datasources with total size and count.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/intervals/{interval}`
|
|
||||||
|
|
||||||
Returns aggregated total size and count for all intervals that intersect given ISO interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/intervals/{interval}?simple`
|
|
||||||
|
|
||||||
Returns total size and count for each interval within given ISO interval.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/intervals/{interval}?full`
|
|
||||||
|
|
||||||
Returns total size and count for each datasource for each interval within given ISO interval.
|
|
||||||
|
|
||||||
#### Dynamic configuration
|
|
||||||
|
|
||||||
See [Coordinator Dynamic Configuration](../configuration/index.md#dynamic-configuration) for details.
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
|
||||||
as in `2016-06-27_2016-06-28`.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/config`
|
|
||||||
|
|
||||||
Retrieves current coordinator dynamic configuration.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/config/history?interval={interval}&count={count}`
|
|
||||||
|
|
||||||
Retrieves history of changes to Coordinator dynamic configuration. Accepts `interval` and `count` query string parameters
|
|
||||||
to filter by interval and limit the number of results respectively.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/config`
|
|
||||||
|
|
||||||
Updates the Coordinator dynamic configuration.
|
|
||||||
|
|
||||||
#### Automatic compaction status
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/compaction/progress?dataSource={dataSource}`
|
|
||||||
|
|
||||||
Returns the total size of segments awaiting compaction for the given dataSource. The specified dataSource must have [automatic compaction](../data-management/automatic-compaction.md) enabled.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/compaction/status`
|
|
||||||
|
|
||||||
Returns the status and statistics from the auto-compaction run of all dataSources which have auto-compaction enabled in the latest run. The response payload includes a list of `latestStatus` objects. Each `latestStatus` represents the status for a dataSource (which has/had auto-compaction enabled).
|
|
||||||
|
|
||||||
The `latestStatus` object has the following keys:
|
|
||||||
* `dataSource`: name of the datasource for this status information
|
|
||||||
* `scheduleStatus`: auto-compaction scheduling status. Possible values are `NOT_ENABLED` and `RUNNING`. Returns `RUNNING` if the dataSource has an active auto-compaction config submitted. Otherwise, returns `NOT_ENABLED`.
|
|
||||||
* `bytesAwaitingCompaction`: total bytes of this datasource waiting to be compacted by the auto-compaction (only consider intervals/segments that are eligible for auto-compaction)
|
|
||||||
* `bytesCompacted`: total bytes of this datasource that are already compacted with the spec set in the auto-compaction config
|
|
||||||
* `bytesSkipped`: total bytes of this datasource that are skipped (not eligible for auto-compaction) by the auto-compaction
|
|
||||||
* `segmentCountAwaitingCompaction`: total number of segments of this datasource waiting to be compacted by the auto-compaction (only consider intervals/segments that are eligible for auto-compaction)
|
|
||||||
* `segmentCountCompacted`: total number of segments of this datasource that are already compacted with the spec set in the auto-compaction config
|
|
||||||
* `segmentCountSkipped`: total number of segments of this datasource that are skipped (not eligible for auto-compaction) by the auto-compaction
|
|
||||||
* `intervalCountAwaitingCompaction`: total number of intervals of this datasource waiting to be compacted by the auto-compaction (only consider intervals/segments that are eligible for auto-compaction)
|
|
||||||
* `intervalCountCompacted`: total number of intervals of this datasource that are already compacted with the spec set in the auto-compaction config
|
|
||||||
* `intervalCountSkipped`: total number of intervals of this datasource that are skipped (not eligible for auto-compaction) by the auto-compaction
|
|
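The keys listed above are enough to derive a rough per-datasource progress figure. The sketch below assumes the response is a JSON object whose `latestStatus` key holds the list of status objects; the progress ratio itself is an illustrative metric, not something the API returns.

```python
import json
from typing import Dict, Optional


def compaction_progress(response_body: str) -> Dict[str, Optional[float]]:
    """Map each dataSource to bytesCompacted / (bytesCompacted + bytesAwaitingCompaction).

    Skipped bytes are left out because they are not eligible for auto-compaction.
    Returns None for a datasource with no eligible bytes.
    """
    payload = json.loads(response_body)
    progress: Dict[str, Optional[float]] = {}
    for status in payload.get("latestStatus", []):
        compacted = status["bytesCompacted"]
        awaiting = status["bytesAwaitingCompaction"]
        eligible = compacted + awaiting
        progress[status["dataSource"]] = compacted / eligible if eligible else None
    return progress
```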
||||||
|
|
||||||
`GET /druid/coordinator/v1/compaction/status?dataSource={dataSource}`
|
|
||||||
|
|
||||||
Similar to the API `/druid/coordinator/v1/compaction/status` above but filters response to only return information for the dataSource given.
|
|
||||||
The dataSource must have auto-compaction enabled.
|
|
||||||
|
|
||||||
#### Automatic compaction configuration
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/config/compaction`
|
|
||||||
|
|
||||||
Returns all automatic compaction configs.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/config/compaction/{dataSource}`
|
|
||||||
|
|
||||||
Returns an automatic compaction config of a dataSource.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/config/compaction/{dataSource}/history?interval={interval}&count={count}`
|
|
||||||
|
|
||||||
Returns the history of the automatic compaction config for a dataSource. Optionally accepts `interval` and `count`
|
|
||||||
query string parameters to filter by interval and limit the number of results respectively. If the dataSource does not
|
|
||||||
exist or there is no compaction history for the dataSource, an empty list is returned.
|
|
||||||
|
|
||||||
The response contains a list of objects with the following keys:
|
|
||||||
* `globalConfig`: A JSON object containing automatic compaction config that applies to the entire cluster.
|
|
||||||
* `compactionConfig`: A JSON object containing the automatic compaction config for the datasource.
|
|
||||||
* `auditInfo`: A JSON object that contains information about the change made, such as `author`, `comment`, and `ip`.
|
|
||||||
* `auditTime`: The date and time when the change was made.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/config/compaction/taskslots?ratio={someRatio}&max={someMaxSlots}`
|
|
||||||
|
|
||||||
Update the capacity for compaction tasks. `ratio` and `max` are used to limit the max number of compaction tasks.
|
|
||||||
`ratio` is the fraction of the total task slots that compaction tasks may use, and `max` is the maximum number of task slots for compaction tasks. The actual max number of compaction tasks is `min(max, ratio * total task slots)`.
|
|
||||||
Note that `ratio` and `max` are optional and can be omitted. If they are omitted, default values (0.1 and unbounded)
|
|
||||||
will be set for them.
|
|
||||||
|
|
||||||
`POST /druid/coordinator/v1/config/compaction`
|
|
||||||
|
|
||||||
Creates or updates the [automatic compaction](../data-management/automatic-compaction.md) config for a dataSource. See [Automatic compaction dynamic configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) for configuration details.
|
|
||||||
|
|
||||||
`DELETE /druid/coordinator/v1/config/compaction/{dataSource}`
|
|
||||||
|
|
||||||
Removes the automatic compaction config for a dataSource.
|
|
||||||
|
|
||||||
#### Server information
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/servers`
|
|
||||||
|
|
||||||
Returns a list of server URLs using the format `{hostname}:{port}`. Note that
|
|
||||||
processes that run with different types will appear multiple times with different
|
|
||||||
ports.
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/servers?simple`
|
|
||||||
|
|
||||||
Returns a list of server data objects in which each object has the following keys:
|
|
||||||
* `host`: host URL in the format `{hostname}:{port}`
|
|
||||||
* `type`: process type (`indexer-executor`, `historical`)
|
|
||||||
* `currSize`: storage size currently used
|
|
||||||
* `maxSize`: maximum storage size
|
|
||||||
* `priority`
|
|
||||||
* `tier`
|
|
||||||
|
|
||||||
### Overlord
|
|
||||||
|
|
||||||
#### Leadership
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/leader`
|
|
||||||
|
|
||||||
Returns the current leader Overlord of the cluster. If you have multiple Overlords, just one is leading at any given time. The others are on standby.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/isLeader`
|
|
||||||
|
|
||||||
This returns a JSON object with field `leader`, either true or false. In addition, this call returns HTTP 200 if the
|
|
||||||
server is the current leader and HTTP 404 if not. This is suitable for use as a load balancer status check if you
|
|
||||||
only want the active leader to be considered in-service at the load balancer.
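
As a rough illustration of the status check described above, the following Python sketch interprets a response from `/druid/indexer/v1/isLeader`. The helper name and the way the status code and body are passed in are assumptions made for this example; only the 200/404 behavior and the `leader` field come from the description above.

```python
import json


def overlord_is_leader(status_code, body):
    """Interpret an isLeader response (hypothetical helper, not part of Druid).

    status_code: HTTP status returned by the endpoint.
    body: response body as a JSON string, for example '{"leader": true}'.
    """
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    # Any other status is unexpected; fall back to the JSON field if present.
    try:
        return bool(json.loads(body).get("leader", False))
    except (ValueError, AttributeError):
        return False
```

A load balancer only needs the status code; the JSON fallback is included here purely for clients that want to read the body as well.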
|
|
||||||
|
|
||||||
#### Tasks
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
|
||||||
as in `2016-06-27_2016-06-28`.
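
The delimiter rule above can be sketched as a small conversion step. The function name is hypothetical, and the sketch assumes the input is a standard `start/end` ISO 8601 interval string.

```python
def to_url_interval(interval):
    """Convert 'start/end' into the 'start_end' form used for interval URL parameters."""
    start, sep, end = interval.partition("/")
    if not sep or not start or not end:
        raise ValueError("expected an interval of the form 'start/end'")
    return f"{start}_{end}"
```

For example, `to_url_interval("2016-06-27/2016-06-28")` yields `2016-06-27_2016-06-28`.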
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/tasks`
|
|
||||||
|
|
||||||
Retrieve list of tasks. Accepts query string parameters `state`, `datasource`, `createdTimeInterval`, `max`, and `type`.
|
|
||||||
|
|
||||||
|Query parameter|Description|
|---|---|
|`state`|Filter the list of tasks by task state. Valid options are `running`, `complete`, `waiting`, and `pending`.|
|`datasource`|Return tasks filtered by Druid datasource.|
|`createdTimeInterval`|Return tasks created within the specified interval.|
|`max`|Maximum number of `"complete"` tasks to return. Only applies when `state` is set to `"complete"`.|
|`type`|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
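
The query parameters listed above can be combined into a request URL. The following is a minimal sketch using only the Python standard library; the helper name and the base URL in the usage note are assumptions, not part of the Druid API.

```python
from urllib.parse import urlencode

# Query parameters documented for GET /druid/indexer/v1/tasks.
ALLOWED = ("state", "datasource", "createdTimeInterval", "max", "type")


def tasks_url(base_url, **filters):
    """Build the task list URL from the documented filters (hypothetical helper)."""
    unknown = set(filters) - set(ALLOWED)
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    # Emit parameters in a stable order so the URL is predictable.
    params = [(key, filters[key]) for key in ALLOWED if key in filters]
    url = f"{base_url}/druid/indexer/v1/tasks"
    return f"{url}?{urlencode(params)}" if params else url
```

Assuming an Overlord reachable at `http://localhost:8081`, `tasks_url("http://localhost:8081", state="complete", max=5)` produces a URL ending in `/druid/indexer/v1/tasks?state=complete&max=5`.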
|
|
||||||
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/completeTasks`
|
|
||||||
|
|
||||||
Retrieve list of complete tasks. Equivalent to `/druid/indexer/v1/tasks?state=complete`.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/runningTasks`
|
|
||||||
|
|
||||||
Retrieve list of running tasks. Equivalent to `/druid/indexer/v1/tasks?state=running`.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/waitingTasks`
|
|
||||||
|
|
||||||
Retrieve list of waiting tasks. Equivalent to `/druid/indexer/v1/tasks?state=waiting`.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/pendingTasks`
|
|
||||||
|
|
||||||
Retrieve list of pending tasks. Equivalent to `/druid/indexer/v1/tasks?state=pending`.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/task/{taskId}`
|
|
||||||
|
|
||||||
Retrieve the 'payload' of a task.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/task/{taskId}/status`
|
|
||||||
|
|
||||||
Retrieve the status of a task.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/task/{taskId}/segments`
|
|
||||||
|
|
||||||
> This API is deprecated and will be removed in future releases.
|
|
||||||
|
|
||||||
Retrieve information about the segments of a task.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/task/{taskId}/reports`
|
|
||||||
|
|
||||||
Retrieve a [task completion report](../ingestion/tasks.md#task-reports) for a task. Only works for completed tasks.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/task`
|
|
||||||
|
|
||||||
Endpoint for submitting tasks and supervisor specs to the Overlord. Returns the taskId of the submitted task.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/task/{taskId}/shutdown`
|
|
||||||
|
|
||||||
Shuts down a task.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/datasources/{dataSource}/shutdownAllTasks`
|
|
||||||
|
|
||||||
Shuts down all tasks for a dataSource.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/taskStatus`
|
|
||||||
|
|
||||||
Retrieve a list of task status objects for the list of task ID strings in the request body.
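
A request for this endpoint might be assembled as in the sketch below. The helper name, the base URL, and the `Content-Type` header are assumptions for illustration; the text above only states that the body is a list of task ID strings. The sketch builds the request without sending it.

```python
import json
from urllib.request import Request


def build_task_status_request(base_url, task_ids):
    """Build (but do not send) a POST request carrying a JSON list of task IDs."""
    body = json.dumps(list(task_ids)).encode("utf-8")
    return Request(
        f"{base_url}/druid/indexer/v1/taskStatus",
        data=body,
        # Assumed header; the source text does not specify one.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to the Overlord.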
|
|
||||||
|
|
||||||
`DELETE /druid/indexer/v1/pendingSegments/{dataSource}`
|
|
||||||
|
|
||||||
Manually cleans up the pending segments table in metadata storage for the given `dataSource`. Returns a JSON object with
`numDeleted`, the count of rows deleted from the pending segments table. The
`druid.coordinator.kill.pendingSegments.on` [coordinator setting](../configuration/index.md#coordinator-operation)
uses this API to run the cleanup periodically.
|
|
||||||
|
|
||||||
#### Supervisors
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor`
|
|
||||||
|
|
||||||
Returns a list of strings of the currently active supervisor ids.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor?full`
|
|
||||||
|
|
||||||
Returns a list of objects of the currently active supervisors.
|
|
||||||
|
|
||||||
|Field|Type|Description|
|---|---|---|
|`id`|String|supervisor unique identifier|
|`state`|String|basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. Check [Kafka Docs](../development/extensions-core/kafka-supervisor-operations.md) for details.|
|`detailedState`|String|supervisor specific state. See documentation of the specific supervisor for details: [Kafka](../development/extensions-core/kafka-ingestion.md) or [Kinesis](../development/extensions-core/kinesis-ingestion.md)|
|`healthy`|Boolean|true or false indicator of overall supervisor health|
|`spec`|SupervisorSpec|JSON specification of supervisor|
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor?state=true`
|
|
||||||
|
|
||||||
Returns a list of objects of the currently active supervisors and their current state.
|
|
||||||
|
|
||||||
|Field|Type|Description|
|---|---|---|
|`id`|String|supervisor unique identifier|
|`state`|String|basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. Check [Kafka Docs](../development/extensions-core/kafka-supervisor-operations.md) for details.|
|`detailedState`|String|supervisor specific state. See documentation of the specific supervisor for details: [Kafka](../development/extensions-core/kafka-ingestion.md) or [Kinesis](../development/extensions-core/kinesis-ingestion.md)|
|`healthy`|Boolean|true or false indicator of overall supervisor health|
|`suspended`|Boolean|true or false indicator of whether the supervisor is in suspended state|
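
To illustrate how the fields above might be consumed, the following sketch picks out supervisors that report themselves as unhealthy. The function name and the sample records are made up for this example; only the field names come from the table.

```python
def unhealthy_supervisor_ids(supervisors):
    """Return the IDs of supervisors whose `healthy` field is not true."""
    return [s["id"] for s in supervisors if not s.get("healthy", False)]


# Hypothetical response data shaped like the documented fields.
sample = [
    {"id": "wikiticker", "state": "RUNNING", "healthy": True, "suspended": False},
    {"id": "clicks", "state": "UNHEALTHY_TASKS", "healthy": False, "suspended": False},
]
```

With the sample data, only the `clicks` supervisor is returned.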
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor/<supervisorId>`
|
|
||||||
|
|
||||||
Returns the current spec for the supervisor with the provided ID.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor/<supervisorId>/status`
|
|
||||||
|
|
||||||
Returns the current status of the supervisor with the provided ID.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor/history`
|
|
||||||
|
|
||||||
Returns an audit history of specs for all supervisors (current and past).
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/supervisor/<supervisorId>/history`
|
|
||||||
|
|
||||||
Returns an audit history of specs for the supervisor with the provided ID.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor`
|
|
||||||
|
|
||||||
Create a new supervisor or update an existing one.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/<supervisorId>/suspend`
|
|
||||||
|
|
||||||
Suspend the currently running supervisor with the provided ID. Responds with updated SupervisorSpec.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/suspendAll`
|
|
||||||
|
|
||||||
Suspend all supervisors at once.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/<supervisorId>/resume`
|
|
||||||
|
|
||||||
Resume indexing tasks for a supervisor. Responds with updated SupervisorSpec.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/resumeAll`
|
|
||||||
|
|
||||||
Resume all supervisors at once.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/<supervisorId>/reset`
|
|
||||||
|
|
||||||
Reset the specified supervisor.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/<supervisorId>/terminate`
|
|
||||||
|
|
||||||
Terminate the supervisor with the provided ID.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/terminateAll`
|
|
||||||
|
|
||||||
Terminate all supervisors at once.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/supervisor/<supervisorId>/shutdown`
|
|
||||||
|
|
||||||
> This API is deprecated and will be removed in future releases.
|
|
||||||
> Please use the equivalent `terminate` instead.
|
|
||||||
|
|
||||||
Shutdown a supervisor.
|
|
||||||
|
|
||||||
#### Dynamic configuration
|
|
||||||
|
|
||||||
See [Overlord Dynamic Configuration](../configuration/index.md#overlord-dynamic-configuration) for details.
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
|
||||||
as in `2016-06-27_2016-06-28`.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/worker`
|
|
||||||
|
|
||||||
Retrieves current overlord dynamic configuration.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/worker/history?interval={interval}&count={count}`
|
|
||||||
|
|
||||||
Retrieves history of changes to overlord dynamic configuration. Accepts `interval` and `count` query string parameters
|
|
||||||
to filter by interval and limit the number of results respectively.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/workers`
|
|
||||||
|
|
||||||
Retrieves a list of all the worker nodes in the cluster along with their metadata.
|
|
||||||
|
|
||||||
`GET /druid/indexer/v1/scaling`
|
|
||||||
|
|
||||||
Retrieves overlord scaling events if auto-scaling runners are in use.
|
|
||||||
|
|
||||||
`POST /druid/indexer/v1/worker`
|
|
||||||
|
|
||||||
Update overlord dynamic worker configuration.
|
|
||||||
|
|
||||||
## Data server
|
|
||||||
|
|
||||||
This section documents the API endpoints for the processes that reside on Data servers (MiddleManagers/Peons and Historicals)
|
|
||||||
in the suggested [three-server configuration](../design/processes.md#server-types).
|
|
||||||
|
|
||||||
### MiddleManager
|
|
||||||
|
|
||||||
`GET /druid/worker/v1/enabled`
|
|
||||||
|
|
||||||
Check whether a MiddleManager is in an enabled or disabled state. Returns JSON object keyed by the combined `druid.host`
|
|
||||||
and `druid.port` with the boolean state as the value.
|
|
||||||
|
|
||||||
```json
{"localhost:8091":true}
```
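
Because the response is keyed by the combined host and port, a client has to iterate over the object rather than read a fixed field. A minimal sketch, with a hypothetical helper name:

```python
import json


def parse_enabled(body):
    """Return sorted (host_and_port, enabled) pairs from the response body."""
    return sorted(json.loads(body).items())
```

For the example response shown above, this returns a single pair for `localhost:8091` with the value `True`.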
|
|
||||||
|
|
||||||
`GET /druid/worker/v1/tasks`
|
|
||||||
|
|
||||||
Retrieve a list of active tasks being run on MiddleManager. Returns JSON list of taskid strings. Normal usage should
|
|
||||||
prefer to use the `/druid/indexer/v1/tasks` [Overlord API](#overlord) or one of its task state specific variants instead.
|
|
||||||
|
|
||||||
```json
["index_wikiticker_2019-02-11T02:20:15.316Z"]
```
|
|
||||||
|
|
||||||
`GET /druid/worker/v1/task/{taskid}/log`
|
|
||||||
|
|
||||||
Retrieve task log output stream by task id. Normal usage should prefer to use the `/druid/indexer/v1/task/{taskId}/log`
|
|
||||||
[Overlord API](#overlord) instead.
|
|
||||||
|
|
||||||
`POST /druid/worker/v1/disable`
|
|
||||||
|
|
||||||
Disable a MiddleManager, causing it to stop accepting new tasks but complete all existing tasks. Returns JSON object
|
|
||||||
keyed by the combined `druid.host` and `druid.port`:
|
|
||||||
|
|
||||||
```json
{"localhost:8091":"disabled"}
```
|
|
||||||
|
|
||||||
`POST /druid/worker/v1/enable`
|
|
||||||
|
|
||||||
Enable a MiddleManager, allowing it to accept new tasks again if it was previously disabled. Returns JSON object
|
|
||||||
keyed by the combined `druid.host` and `druid.port`:
|
|
||||||
|
|
||||||
```json
{"localhost:8091":"enabled"}
```
|
|
||||||
|
|
||||||
`POST /druid/worker/v1/task/{taskid}/shutdown`
|
|
||||||
|
|
||||||
Shutdown a running task by `taskid`. Normal usage should prefer to use the `/druid/indexer/v1/task/{taskId}/shutdown`
|
|
||||||
[Overlord API](#overlord) instead. Returns JSON:
|
|
||||||
|
|
||||||
```json
{"task":"index_kafka_wikiticker_f7011f8ffba384b_fpeclode"}
```
|
|
||||||
|
|
||||||
|
|
||||||
### Peon
|
|
||||||
|
|
||||||
`GET /druid/worker/v1/chat/{taskId}/rowStats`
|
|
||||||
|
|
||||||
Retrieve a live row stats report from a Peon. See [task reports](../ingestion/tasks.md#task-reports) for more details.
|
|
||||||
|
|
||||||
`GET /druid/worker/v1/chat/{taskId}/unparseableEvents`
|
|
||||||
|
|
||||||
Retrieve an unparseable events report from a Peon. See [task reports](../ingestion/tasks.md#task-reports) for more details.
|
|
||||||
|
|
||||||
### Historical
|
|
||||||
|
|
||||||
#### Segment loading
|
|
||||||
|
|
||||||
`GET /druid/historical/v1/loadstatus`
|
|
||||||
|
|
||||||
Returns JSON of the form `{"cacheInitialized":<value>}`, where value is either `true` or `false` indicating if all
|
|
||||||
segments in the local cache have been loaded. This can be used to know when a Historical process is ready
|
|
||||||
to be queried after a restart.
|
|
||||||
|
|
||||||
`GET /druid/historical/v1/readiness`
|
|
||||||
|
|
||||||
Similar to `/druid/historical/v1/loadstatus`, but instead of returning JSON with a flag, responds with 200 OK if segments
in the local cache have been loaded, and 503 SERVICE UNAVAILABLE if they haven't.
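
One way to use the readiness endpoint after a restart is a simple polling loop. The sketch below is an assumption about how a client might do this, not something Druid provides: `get_status` is a caller-supplied function that performs the HTTP request and returns the status code.

```python
import time


def wait_until_ready(get_status, attempts=10, delay_seconds=1.0):
    """Poll until get_status() returns 200, or give up after `attempts` tries."""
    for attempt in range(attempts):
        if get_status() == 200:
            return True
        # Sleep between attempts, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Injecting `get_status` keeps the loop independent of any particular HTTP client and makes it easy to exercise without a running Historical.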
|
|
||||||
|
|
||||||
|
|
||||||
## Query server
|
|
||||||
|
|
||||||
This section documents the API endpoints for the processes that reside on Query servers (Brokers) in the suggested [three-server configuration](../design/processes.md#server-types).
|
|
||||||
|
|
||||||
### Broker
|
|
||||||
|
|
||||||
#### Datasource information
|
|
||||||
|
|
||||||
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
|
||||||
as in `2016-06-27_2016-06-28`.
|
|
||||||
|
|
||||||
> Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
|
|
||||||
> [`INFORMATION_SCHEMA.TABLES`](../querying/sql-metadata-tables.md#tables-table),
|
|
||||||
> [`INFORMATION_SCHEMA.COLUMNS`](../querying/sql-metadata-tables.md#columns-table), and
|
|
||||||
> [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) tables.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources`
|
|
||||||
|
|
||||||
Returns a list of queryable datasources.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}`
|
|
||||||
|
|
||||||
Returns the dimensions and metrics of the datasource. Optionally, you can provide request parameter "full" to get list of served intervals with dimensions and metrics being served for those intervals. You can also provide request param "interval" explicitly to refer to a particular interval.
|
|
||||||
|
|
||||||
If no interval is specified, a default interval spanning a configurable period before the current time will be used. The default duration of this interval is specified in ISO 8601 duration format via: `druid.query.segmentMetadata.defaultHistory`
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}/dimensions`
|
|
||||||
|
|
||||||
> This API is deprecated and will be removed in future releases. Please use [SegmentMetadataQuery](../querying/segmentmetadataquery.md) instead
|
|
||||||
> which provides more comprehensive information and supports all dataSource types including streaming dataSources. It's also encouraged to use [INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md)
|
|
||||||
> if you're using SQL.
|
|
||||||
>
|
|
||||||
Returns the dimensions of the datasource.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}/metrics`
|
|
||||||
|
|
||||||
> This API is deprecated and will be removed in future releases. Please use [SegmentMetadataQuery](../querying/segmentmetadataquery.md) instead
|
|
||||||
> which provides more comprehensive information and supports all dataSource types including streaming dataSources. It's also encouraged to use [INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md)
|
|
||||||
> if you're using SQL.
|
|
||||||
|
|
||||||
Returns the metrics of the datasource.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}/candidates?intervals={comma-separated-intervals}&numCandidates={numCandidates}`
|
|
||||||
|
|
||||||
Returns segment information lists including server locations for the given datasource and intervals. If "numCandidates" is not specified, it will return all servers for each interval.
|
|
||||||
|
|
||||||
#### Load Status
|
|
||||||
|
|
||||||
`GET /druid/broker/v1/loadstatus`
|
|
||||||
|
|
||||||
Returns a flag indicating if the Broker knows about all segments in the cluster. This can be used to know when a Broker process is ready to be queried after a restart.
|
|
||||||
|
|
||||||
`GET /druid/broker/v1/readiness`
|
|
||||||
|
|
||||||
Similar to `/druid/broker/v1/loadstatus`, but instead of returning JSON, responds with 200 OK if the Broker is ready and 503 SERVICE UNAVAILABLE otherwise.
|
|
||||||
|
|
||||||
#### Queries
|
|
||||||
|
|
||||||
`POST /druid/v2/`
|
|
||||||
|
|
||||||
The endpoint for submitting queries. Accepts an optional `?pretty` query parameter that pretty-prints the results.
|
|
||||||
|
|
||||||
`POST /druid/v2/candidates/`
|
|
||||||
|
|
||||||
Returns segment information lists including server locations for the given query.
|
|
||||||
|
|
||||||
### Router
|
|
||||||
|
|
||||||
> Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
|
|
||||||
> [`INFORMATION_SCHEMA.TABLES`](../querying/sql-metadata-tables.md#tables-table),
|
|
||||||
> [`INFORMATION_SCHEMA.COLUMNS`](../querying/sql-metadata-tables.md#columns-table), and
|
|
||||||
> [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) tables.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources`
|
|
||||||
|
|
||||||
Returns a list of queryable datasources.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}`
|
|
||||||
|
|
||||||
Returns the dimensions and metrics of the datasource.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}/dimensions`
|
|
||||||
|
|
||||||
Returns the dimensions of the datasource.
|
|
||||||
|
|
||||||
`GET /druid/v2/datasources/{dataSourceName}/metrics`
|
|
||||||
|
|
||||||
Returns the metrics of the datasource.
|
|
|
@ -0,0 +1,91 @@
|
||||||
|
---
|
||||||
|
id: automatic-compaction-api
|
||||||
|
title: Automatic compaction API
|
||||||
|
sidebar_label: Automatic compaction
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes status and configuration API endpoints for [automatic compaction](../data-management/automatic-compaction.md) in Apache Druid.
|
||||||
|
|
||||||
|
## Automatic compaction status
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/compaction/progress?dataSource={dataSource}`
|
||||||
|
|
||||||
|
Returns the total size of segments awaiting compaction for the given dataSource. The specified dataSource must have [automatic compaction](../data-management/automatic-compaction.md) enabled.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/compaction/status`
|
||||||
|
|
||||||
|
Returns the status and statistics from the auto-compaction run of all dataSources which have auto-compaction enabled in the latest run. The response payload includes a list of `latestStatus` objects. Each `latestStatus` represents the status for a dataSource (which has/had auto-compaction enabled).
|
||||||
|
|
||||||
|
The `latestStatus` object has the following keys:
|
||||||
|
* `dataSource`: name of the datasource for this status information
|
||||||
|
* `scheduleStatus`: auto-compaction scheduling status. Possible values are `NOT_ENABLED` and `RUNNING`. Returns `RUNNING` if the dataSource has an active auto-compaction config submitted. Otherwise, returns `NOT_ENABLED`.
|
||||||
|
* `bytesAwaitingCompaction`: total bytes of this datasource waiting to be compacted by the auto-compaction (only consider intervals/segments that are eligible for auto-compaction)
|
||||||
|
* `bytesCompacted`: total bytes of this datasource that are already compacted with the spec set in the auto-compaction config
|
||||||
|
* `bytesSkipped`: total bytes of this datasource that are skipped (not eligible for auto-compaction) by the auto-compaction
|
||||||
|
* `segmentCountAwaitingCompaction`: total number of segments of this datasource waiting to be compacted by the auto-compaction (only consider intervals/segments that are eligible for auto-compaction)
|
||||||
|
* `segmentCountCompacted`: total number of segments of this datasource that are already compacted with the spec set in the auto-compaction config
|
||||||
|
* `segmentCountSkipped`: total number of segments of this datasource that are skipped (not eligible for auto-compaction) by the auto-compaction
|
||||||
|
* `intervalCountAwaitingCompaction`: total number of intervals of this datasource waiting to be compacted by the auto-compaction (only consider intervals/segments that are eligible for auto-compaction)
|
||||||
|
* `intervalCountCompacted`: total number of intervals of this datasource that are already compacted with the spec set in the auto-compaction config
|
||||||
|
* `intervalCountSkipped`: total number of intervals of this datasource that are skipped (not eligible for auto-compaction) by the auto-compaction
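
As an example of working with these keys, the following sketch derives the fraction of eligible bytes already compacted from one `latestStatus` object. The function name and the sample values are hypothetical, and the sketch assumes the byte counts can be converted with `int()`. Skipped bytes are left out because they are not eligible for auto-compaction.

```python
def compaction_progress(status):
    """Fraction of eligible bytes already compacted, in the range 0.0 to 1.0."""
    compacted = int(status["bytesCompacted"])
    awaiting = int(status["bytesAwaitingCompaction"])
    eligible = compacted + awaiting
    # Assumption: with nothing eligible, treat the datasource as fully compacted.
    return compacted / eligible if eligible else 1.0


# Hypothetical status values for illustration only.
sample_status = {
    "dataSource": "wikiticker",
    "bytesAwaitingCompaction": 250,
    "bytesCompacted": 750,
    "bytesSkipped": 100,
}
```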
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/compaction/status?dataSource={dataSource}`
|
||||||
|
|
||||||
|
Similar to the API `/druid/coordinator/v1/compaction/status` above but filters response to only return information for the dataSource given.
|
||||||
|
The dataSource must have auto-compaction enabled.
|
||||||
|
|
||||||
|
## Automatic compaction configuration
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/config/compaction`
|
||||||
|
|
||||||
|
Returns all automatic compaction configs.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/config/compaction/{dataSource}`
|
||||||
|
|
||||||
|
Returns an automatic compaction config of a dataSource.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/config/compaction/{dataSource}/history?interval={interval}&count={count}`
|
||||||
|
|
||||||
|
Returns the history of the automatic compaction config for a dataSource. Optionally accepts `interval` and `count`
|
||||||
|
query string parameters to filter by interval and limit the number of results respectively. If the dataSource does not
|
||||||
|
exist or there is no compaction history for the dataSource, an empty list is returned.
|
||||||
|
|
||||||
|
The response contains a list of objects with the following keys:
|
||||||
|
* `globalConfig`: A json object containing automatic compaction config that applies to the entire cluster.
|
||||||
|
* `compactionConfig`: A json object containing the automatic compaction config for the datasource.
|
||||||
|
* `auditInfo`: A json object that contains information about the change made - like `author`, `comment` and `ip`.
|
||||||
|
* `auditTime`: The date and time when the change was made.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/config/compaction/taskslots?ratio={someRatio}&max={someMaxSlots}`
|
||||||
|
|
||||||
|
Update the capacity for compaction tasks. `ratio` and `max` are used to limit the max number of compaction tasks.
|
||||||
|
`ratio` is the fraction of the total task slots that compaction tasks may use, and `max` is the maximum number of task slots for compaction tasks. The actual max number of compaction tasks is `min(max, ratio * total task slots)`.
|
||||||
|
Note that `ratio` and `max` are optional and can be omitted. If they are omitted, default values (0.1 and unbounded)
|
||||||
|
will be set for them.
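
The formula `min(max, ratio * total task slots)` can be worked through in a few lines. This is a sketch only: the function and parameter names are made up, and rounding the product down to a whole number of slots is an assumption, since the text does not say how fractional results are handled.

```python
import math


def max_compaction_tasks(total_task_slots, ratio=0.1, max_slots=None):
    """Apply min(max, ratio * total task slots); max_slots=None means unbounded."""
    # Assumption: fractional slot counts are rounded down.
    by_ratio = math.floor(ratio * total_task_slots)
    return by_ratio if max_slots is None else min(max_slots, by_ratio)
```

With 20 total task slots and a ratio of 0.5, the ratio alone allows 10 compaction tasks; adding a `max` of 4 caps the result at 4.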
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/config/compaction`
|
||||||
|
|
||||||
|
Creates or updates the [automatic compaction](../data-management/automatic-compaction.md) config for a dataSource. See [Automatic compaction dynamic configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) for configuration details.
|
||||||
|
|
||||||
|
`DELETE /druid/coordinator/v1/config/compaction/{dataSource}`
|
||||||
|
|
||||||
|
Removes the automatic compaction config for a dataSource.
|
|
@ -0,0 +1,79 @@
|
||||||
|
---
|
||||||
|
id: data-management-api
|
||||||
|
title: Data management API
|
||||||
|
sidebar_label: Data management
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the data management API endpoints for Apache Druid. This includes information on how to mark segments as `used` or `unused` and delete them from Druid.
|
||||||
|
|
||||||
|
## Note for Coordinator's POST and DELETE APIs
|
||||||
|
|
||||||
|
While segments may be enabled by issuing POST requests for the datasources, the Coordinator may again disable segments if they match any configured [drop rules](../operations/rule-configuration.md#drop-rules). Even if segments are enabled by these APIs, you must configure a [load rule](../operations/rule-configuration.md#load-rules) to load them onto Historical processes. If an indexing or kill task runs at the same time these APIs are invoked, the behavior is undefined. Some segments might be killed and others might be enabled. It's also possible that all segments might be disabled, but the indexing task can still read data from those segments and succeed.
|
||||||
|
|
||||||
|
> Avoid using indexing or kill tasks and these APIs at the same time for the same datasource and time chunk.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/datasources/{dataSourceName}`
|
||||||
|
|
||||||
|
Marks as used all segments belonging to a datasource. Returns a JSON object of the form
|
||||||
|
`{"numChangedSegments": <number>}` with the number of segments in the database whose state has been changed (that is,
|
||||||
|
the segments were marked as used) as the result of this API call.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
|
||||||
|
|
||||||
|
Marks as used a segment of a datasource. Returns a JSON object of the form `{"segmentStateChanged": <boolean>}` with
|
||||||
|
the boolean indicating if the state of the segment has been changed (that is, the segment was marked as used) as the
|
||||||
|
result of this API call.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/datasources/{dataSourceName}/markUsed`
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/datasources/{dataSourceName}/markUnused`
|
||||||
|
|
||||||
|
Marks segments (un)used for a datasource by interval or set of segment IDs. When marking segments as used, only segments that are not overshadowed are updated.
|
||||||
|
|
||||||
|
The request payload contains the interval or set of segment IDs to be marked unused.
|
||||||
|
Provide either the interval or the segment IDs. If the payload contains both or neither, the API returns an error (400 BAD REQUEST).
|
||||||
|
|
||||||
|
Interval specifies the start and end times as ISO 8601 strings, in the form `interval=(start/end)`, where start and end are both inclusive. Only the segments completely contained within the specified interval are disabled; partially overlapping segments are not affected.
|
||||||
|
|
||||||
|
JSON Request Payload:
|
||||||
|
|
||||||
|
|Key|Description|Example|
|----------|-------------|---------|
|`interval`|The interval for which to mark segments unused|`"2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z"`|
|`segmentIds`|Set of segment IDs to be marked unused|`["segmentId1", "segmentId2"]`|
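
The either/or rule for the payload can be checked on the client before sending the request. The following sketch builds the JSON payload as a Python dictionary; the function name is hypothetical, and the keys are taken from the table above.

```python
def mark_payload(interval=None, segment_ids=None):
    """Build a markUsed/markUnused payload with exactly one of the two keys."""
    if (interval is None) == (segment_ids is None):
        # Mirrors the documented 400 BAD REQUEST for both or neither.
        raise ValueError("provide exactly one of interval or segment_ids")
    if interval is not None:
        return {"interval": interval}
    return {"segmentIds": list(segment_ids)}
```

Serializing the result with `json.dumps` gives a request body in the documented shape.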
|
||||||
|
|
||||||
|
`DELETE /druid/coordinator/v1/datasources/{dataSourceName}`
|
||||||
|
|
||||||
|
Marks as unused all segments belonging to a datasource. Returns a JSON object of the form
|
||||||
|
`{"numChangedSegments": <number>}` with the number of segments in the database whose state has been changed (that is,
|
||||||
|
the segments were marked as unused) as the result of this API call.
|
||||||
|
|
||||||
|
`DELETE /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}`
|
||||||
|
`@Deprecated. /druid/coordinator/v1/datasources/{dataSourceName}?kill=true&interval={myInterval}`
|
||||||
|
|
||||||
|
Runs a [Kill task](../ingestion/tasks.md) for a given interval and datasource.
|
||||||
|
|
||||||
|
`DELETE /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
|
||||||
|
|
||||||
|
Marks as unused a segment of a datasource. Returns a JSON object of the form `{"segmentStateChanged": <boolean>}` with
|
||||||
|
the boolean indicating if the state of the segment has been changed (that is, the segment was marked as unused) as the
|
||||||
|
result of this API call.
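As an illustration, the request for this endpoint and the handling of its response could look like the sketch below. The Coordinator address `http://localhost:8081` and both helper names are assumptions made for the example; the request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

COORDINATOR = "http://localhost:8081"  # assumed address, adjust for your cluster


def build_delete_segment_request(datasource, segment_id):
    """Build (without sending) the DELETE request that marks one segment unused."""
    path = "/druid/coordinator/v1/datasources/{}/segments/{}".format(
        urllib.parse.quote(datasource, safe=""),
        urllib.parse.quote(segment_id, safe=""),
    )
    return urllib.request.Request(COORDINATOR + path, method="DELETE")


def segment_was_changed(response_text):
    """Read the documented `segmentStateChanged` flag from the response body."""
    return bool(json.loads(response_text)["segmentStateChanged"])
```

To send the request, pass it to `urllib.request.urlopen` against a running cluster.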
|
|
@ -0,0 +1,75 @@
|
||||||
|
---
|
||||||
|
id: dynamic-configuration-api
|
||||||
|
title: Dynamic configuration API
|
||||||
|
sidebar_label: Dynamic configuration
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints to retrieve and manage the dynamic configurations for the [Coordinator](../configuration/index.md#dynamic-configuration) and [Overlord](../configuration/index.md#overlord-dynamic-configuration) in Apache Druid.
|
||||||
|
|
||||||
|
## Coordinator dynamic configuration
|
||||||
|
|
||||||
|
See [Coordinator Dynamic Configuration](../configuration/index.md#dynamic-configuration) for details.
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
||||||
|
as in `2016-06-27_2016-06-28`.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/config`
|
||||||
|
|
||||||
|
Retrieves current coordinator dynamic configuration.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/config/history?interval={interval}&count={count}`
|
||||||
|
|
||||||
|
Retrieves history of changes to Coordinator dynamic configuration. Accepts `interval` and `count` query string parameters
|
||||||
|
to filter by interval and limit the number of results respectively.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/config`
|
||||||
|
|
||||||
|
Updates the Coordinator dynamic configuration.
|
||||||
|
|
||||||
|
|
||||||
|
## Overlord dynamic configuration
|
||||||
|
|
||||||
|
See [Overlord Dynamic Configuration](../configuration/index.md#overlord-dynamic-configuration) for details.
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
||||||
|
as in `2016-06-27_2016-06-28`.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/worker`
|
||||||
|
|
||||||
|
Retrieves current overlord dynamic configuration.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/worker/history?interval={interval}&count={count}`
|
||||||
|
|
||||||
|
Retrieves history of changes to overlord dynamic configuration. Accepts `interval` and `count` query string parameters
|
||||||
|
to filter by interval and limit the number of results respectively.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/workers`
|
||||||
|
|
||||||
|
Retrieves a list of all the worker nodes in the cluster along with their metadata.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/scaling`
|
||||||
|
|
||||||
|
Retrieves overlord scaling events if auto-scaling runners are in use.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/worker`
|
||||||
|
|
||||||
|
Updates the Overlord dynamic worker configuration.
|
|
@ -0,0 +1,36 @@
|
||||||
|
---
|
||||||
|
id: json-querying-api
|
||||||
|
title: JSON querying API
|
||||||
|
sidebar_label: JSON querying
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints to submit JSON-based [native queries](../querying/querying.md) to Apache Druid.
|
||||||
|
|
||||||
|
## Queries
|
||||||
|
|
||||||
|
`POST /druid/v2/`
|
||||||
|
|
||||||
|
The endpoint for submitting queries. Accepts an optional `?pretty` query parameter that pretty-prints the results.
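A minimal sketch of submitting a native query to this endpoint follows. The base URL `http://localhost:8888`, the `wikipedia` datasource, and the helper name are assumptions for illustration; the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8888"  # assumed address of a Router or Broker


def build_native_query_request(query, pretty=False):
    """Build (without sending) a POST request carrying a native JSON query."""
    url = BASE_URL + "/druid/v2/" + ("?pretty" if pretty else "")
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example timeseries query; the datasource name is a placeholder.
query = {
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "granularity": "all",
    "intervals": ["2016-06-27/2016-06-28"],
    "aggregations": [{"type": "count", "name": "rows"}],
}
request = build_native_query_request(query, pretty=True)
```

To run the query, pass `request` to `urllib.request.urlopen` against a running cluster and decode the JSON response.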
|
||||||
|
|
||||||
|
`POST /druid/v2/candidates/`
|
||||||
|
|
||||||
|
Returns segment information lists including server locations for the given query.
|
|
@ -0,0 +1,315 @@
|
||||||
|
---
|
||||||
|
id: legacy-metadata-api
|
||||||
|
title: Legacy metadata API
|
||||||
|
sidebar_label: Legacy metadata
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the legacy API endpoints to retrieve datasource metadata from Apache Druid. Use the [SQL metadata tables](../querying/sql-metadata-tables.md) to retrieve datasource metadata instead.
|
||||||
|
|
||||||
|
## Segment loading
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadstatus`
|
||||||
|
|
||||||
|
Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadstatus?simple`
|
||||||
|
|
||||||
|
Returns the number of segments left to load until segments that should be loaded in the cluster are available for queries. This does not include segment replication counts.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadstatus?full`
|
||||||
|
|
||||||
|
Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available. This includes segment replication counts.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadstatus?full&computeUsingClusterView`
|
||||||
|
|
||||||
|
Returns the number of segments not yet loaded for each tier until all segments loading in the cluster are available.
|
||||||
|
The result includes segment replication counts. It also factors in the number of available nodes that are of a service type that can load the segment when computing the number of segments remaining to load.
|
||||||
|
A segment is considered fully loaded when:
|
||||||
|
- Druid has replicated it the number of times configured in the corresponding load rule.
|
||||||
|
- Or the number of replicas for the segment in each tier where it is configured to be replicated equals the available nodes of a service type that are currently allowed to load the segment in the tier.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadqueue`
|
||||||
|
|
||||||
|
Returns the ids of segments to load and drop for each Historical process.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadqueue?simple`
|
||||||
|
|
||||||
|
Returns the number of segments to load and drop, as well as the total segment load and drop size in bytes for each Historical process.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/loadqueue?full`
|
||||||
|
|
||||||
|
Returns the serialized JSON of segments to load and drop for each Historical process.
|
||||||
|
|
||||||
|
## Segment loading by datasource
|
||||||
|
|
||||||
|
Note that all _interval_ query parameters are ISO 8601 strings—for example, 2016-06-27/2016-06-28.
|
||||||
|
Also note that these APIs only guarantee that the segments are available at the time of the call.
|
||||||
|
Segments can still go missing afterward because of Historical process failures or for other reasons.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?forceMetadataRefresh={boolean}&interval={myInterval}`
|
||||||
|
|
||||||
|
Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster for the given
|
||||||
|
datasource over the given interval (or last 2 weeks if interval is not given). `forceMetadataRefresh` is required to be set.
|
||||||
|
* Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
|
||||||
|
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
|
||||||
|
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
|
||||||
|
* Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
|
||||||
|
If no used segments are found for the given inputs, this API returns `204 No Content`
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?simple&forceMetadataRefresh={boolean}&interval={myInterval}`
|
||||||
|
|
||||||
|
Returns the number of segments left to load until segments that should be loaded in the cluster are available for the given datasource
|
||||||
|
over the given interval (or last 2 weeks if interval is not given). This does not include segment replication counts. `forceMetadataRefresh` is required to be set.
|
||||||
|
* Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
|
||||||
|
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
|
||||||
|
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
|
||||||
|
* Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
|
||||||
|
If no used segments are found for the given inputs, this API returns `204 No Content`
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?full&forceMetadataRefresh={boolean}&interval={myInterval}`
|
||||||
|
|
||||||
|
Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available for the given datasource over the given interval (or last 2 weeks if interval is not given). This includes segment replication counts. `forceMetadataRefresh` is required to be set.
|
||||||
|
* Setting `forceMetadataRefresh` to true will force the coordinator to poll latest segment metadata from the metadata store
|
||||||
|
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms
|
||||||
|
of the load on the metadata store but can be necessary to make sure that we verify all the latest segments' load status)
|
||||||
|
* Setting `forceMetadataRefresh` to false will use the metadata cached on the coordinator from the last force/periodic refresh.
|
||||||
|
|
||||||
|
You can pass the optional query parameter `computeUsingClusterView` to factor in the available cluster services when calculating
|
||||||
|
the segments left to load. See [Coordinator Segment Loading](#segment-loading) for details.
|
||||||
|
If no used segments are found for the given inputs, this API returns `204 No Content`
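The three per-datasource `loadstatus` variants differ only in their query string, so one helper can build all of them. This is a sketch with hypothetical helper names. It makes `forceMetadataRefresh` a required argument and treats `204 No Content` as "no used segments". The JSON body passed to `parse_loadstatus` is whatever the endpoint returns; no particular shape is assumed here.

```python
import json
import urllib.parse


def build_loadstatus_url(datasource, force_metadata_refresh, interval=None, mode=None):
    """Return the path and query string for a datasource loadstatus call.

    `mode` is None, "simple", or "full". `interval` uses the ISO 8601
    start/end form and is percent-encoded for use in a query string.
    """
    if mode not in (None, "simple", "full"):
        raise ValueError("mode must be None, 'simple', or 'full'")
    params = [("forceMetadataRefresh", "true" if force_metadata_refresh else "false")]
    if interval is not None:
        params.append(("interval", interval))
    query = urllib.parse.urlencode(params)
    if mode:
        query = mode + "&" + query
    path = "/druid/coordinator/v1/datasources/{}/loadstatus".format(
        urllib.parse.quote(datasource, safe="")
    )
    return path + "?" + query


def parse_loadstatus(status_code, body_text):
    """Return None for 204 No Content, otherwise the decoded JSON body."""
    if status_code == 204:
        return None
    return json.loads(body_text)
```

Prefer `force_metadata_refresh=False` for routine polling, since a forced refresh reloads the metadata cache for all datasources.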
|
||||||
|
|
||||||
|
## Metadata store information
|
||||||
|
|
||||||
|
> Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
|
||||||
|
> [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) table.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/segments`
|
||||||
|
|
||||||
|
Returns a list of all segments for each datasource enabled in the cluster.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/segments?datasources={dataSourceName1}&datasources={dataSourceName2}`
|
||||||
|
|
||||||
|
Returns a list of all segments for one or more specific datasources enabled in the cluster.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus`
|
||||||
|
|
||||||
|
Returns a list of all segments for each datasource with the full segment metadata and an extra field `overshadowed`.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&datasources={dataSourceName1}&datasources={dataSourceName2}`
|
||||||
|
|
||||||
|
Returns a list of all segments for one or more specific datasources with the full segment metadata and an extra field `overshadowed`.
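The `datasources` parameter is repeated once per datasource rather than comma-separated. A short sketch of building such a query string with the standard library follows; the helper name and the datasource names are placeholders.

```python
import urllib.parse


def build_segments_url(datasources=(), include_overshadowed_status=False):
    """Return the metadata/segments path with repeated `datasources` parameters."""
    parts = []
    if include_overshadowed_status:
        # Flag-style parameter: present without a value.
        parts.append("includeOvershadowedStatus")
    if datasources:
        # A list of pairs yields one `datasources=` entry per name.
        parts.append(
            urllib.parse.urlencode([("datasources", name) for name in datasources])
        )
    path = "/druid/coordinator/v1/metadata/segments"
    return path + ("?" + "&".join(parts) if parts else "")
```

For example, `build_segments_url(["wikipedia", "koalas"])` yields a query string with two `datasources` entries.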
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources`
|
||||||
|
|
||||||
|
Returns a list of the names of datasources with at least one used segment in the cluster, retrieved from the metadata database. Users should call this API to get the eventual state that the system will be in.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources?includeUnused`
|
||||||
|
|
||||||
|
Returns a list of the names of datasources, regardless of whether there are used segments belonging to those datasources in the cluster or not.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources?includeDisabled`
|
||||||
|
|
||||||
|
Returns a list of the names of datasources, regardless of whether the datasource is disabled or not.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources?full`
|
||||||
|
|
||||||
|
Returns a list of all datasources with at least one used segment in the cluster. Returns all metadata about those datasources as stored in the metadata store.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}`
|
||||||
|
|
||||||
|
Returns full metadata for a datasource as stored in the metadata store.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments`
|
||||||
|
|
||||||
|
Returns a list of all segments for a datasource as stored in the metadata store.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments?full`
|
||||||
|
|
||||||
|
Returns a list of all segments for a datasource with the full segment metadata as stored in the metadata store.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments/{segmentId}`
|
||||||
|
|
||||||
|
Returns full segment metadata for a specific segment as stored in the metadata store, if the segment is used. If the
|
||||||
|
segment is unused, or is unknown, a 404 response is returned.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments`
|
||||||
|
|
||||||
|
Returns a list of all segments, overlapping with any of the given intervals, for a datasource as stored in the metadata store. The request body is an array of ISO 8601 interval strings like `[interval1, interval2,...]`, for example, `["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]`.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments?full`
|
||||||
|
|
||||||
|
Returns a list of all segments, overlapping with any of the given intervals, for a datasource with the full segment metadata as stored in the metadata store. The request body is an array of ISO 8601 interval strings like `[interval1, interval2,...]`, for example, `["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]`.
|
||||||
|
|
||||||
|
<a name="coordinator-datasources"></a>
|
||||||
|
|
||||||
|
## Datasources
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`—for example, `2016-06-27_2016-06-28`.
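Because a `/` inside a path segment would be read as a path separator, intervals in the URL path use `_` between start and end. A small sketch of the conversion follows; both helper names are hypothetical.

```python
import urllib.parse


def to_url_interval(iso_interval):
    """Convert 'start/end' to the 'start_end' form used in URL paths."""
    start, end = iso_interval.split("/")
    return f"{start}_{end}"


def datasource_interval_path(datasource, iso_interval):
    """Return the path of the per-interval datasource endpoint."""
    return "/druid/coordinator/v1/datasources/{}/intervals/{}".format(
        urllib.parse.quote(datasource, safe=""),
        to_url_interval(iso_interval),
    )
```

`to_url_interval` raises `ValueError` if the input does not contain exactly one `/`, which catches malformed intervals early.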
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources`
|
||||||
|
|
||||||
|
Returns a list of datasource names found in the cluster as seen by the coordinator. This view is updated every [`druid.coordinator.period`](../configuration/index.md#coordinator-operation).
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources?simple`
|
||||||
|
|
||||||
|
Returns a list of JSON objects containing the name and properties of datasources found in the cluster. Properties include segment count, total segment byte size, replicated total segment byte size, minTime, and maxTime.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources?full`
|
||||||
|
|
||||||
|
Returns a list of datasource names found in the cluster with all metadata about those datasources.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}`
|
||||||
|
|
||||||
|
Returns a JSON object containing the name and properties of a datasource. Properties include segment count, total segment byte size, replicated total segment byte size, minTime, and maxTime.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}?full`
|
||||||
|
|
||||||
|
Returns full metadata for a datasource.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals`
|
||||||
|
|
||||||
|
Returns a set of segment intervals.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals?simple`
|
||||||
|
|
||||||
|
Returns a map of an interval to a JSON object containing the total byte size of segments and number of segments for that interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals?full`
|
||||||
|
|
||||||
|
Returns a map of an interval to a map of segment metadata to a set of server names that contain the segment for that interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}`
|
||||||
|
|
||||||
|
Returns a set of segment ids for an interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}?simple`
|
||||||
|
|
||||||
|
Returns a map of segment intervals contained within the specified interval to a JSON object containing the total byte size of segments and number of segments for an interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}?full`
|
||||||
|
|
||||||
|
Returns a map of segment intervals contained within the specified interval to a map of segment metadata to a set of server names that contain the segment for an interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}/serverview`
|
||||||
|
|
||||||
|
Returns a map of segment intervals contained within the specified interval to information about the servers that contain the segment for an interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments`
|
||||||
|
|
||||||
|
Returns a list of all segments for a datasource in the cluster.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments?full`
|
||||||
|
|
||||||
|
Returns a list of all segments for a datasource in the cluster with the full segment metadata.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
|
||||||
|
|
||||||
|
Returns full segment metadata for a specific segment in the cluster.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/datasources/{dataSourceName}/tiers`
|
||||||
|
|
||||||
|
Returns the tiers that a datasource exists in.
|
||||||
|
|
||||||
|
## Intervals
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` as in `2016-06-27_2016-06-28`.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/intervals`
|
||||||
|
|
||||||
|
Returns all intervals for all datasources with total size and count.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/intervals/{interval}`
|
||||||
|
|
||||||
|
Returns the aggregated total size and count for all intervals that intersect the given ISO 8601 interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/intervals/{interval}?simple`
|
||||||
|
|
||||||
|
Returns the total size and count for each interval within the given ISO 8601 interval.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/intervals/{interval}?full`
|
||||||
|
|
||||||
|
Returns the total size and count for each datasource for each interval within the given ISO 8601 interval.
|
||||||
|
|
||||||
|
## Server information
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/servers`
|
||||||
|
|
||||||
|
Returns a list of server URLs using the format `{hostname}:{port}`. Note that
|
||||||
|
processes that run with different types will appear multiple times with different
|
||||||
|
ports.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/servers?simple`
|
||||||
|
|
||||||
|
Returns a list of server data objects in which each object has the following keys:
|
||||||
|
* `host`: host URL, including the port (`{hostname}:{port}`)
|
||||||
|
* `type`: process type (`indexer-executor`, `historical`)
|
||||||
|
* `currSize`: storage size currently used
|
||||||
|
* `maxSize`: maximum storage size
|
||||||
|
* `priority`
|
||||||
|
* `tier`
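As an illustration of consuming this response, the sketch below computes storage utilization per server from `currSize` and `maxSize`. The sample data is made up for the example, and the helper name is hypothetical.

```python
import json

# Made-up sample using only the keys documented above.
sample_response = """
[
  {"host": "localhost:8083", "type": "historical",
   "currSize": 250, "maxSize": 1000, "priority": 0, "tier": "_default_tier"}
]
"""


def storage_utilization(servers):
    """Map each host to currSize / maxSize, or 0.0 when maxSize is zero."""
    return {
        server["host"]: (
            server["currSize"] / server["maxSize"] if server["maxSize"] else 0.0
        )
        for server in servers
    }


utilization = storage_utilization(json.loads(sample_response))
```

The zero check guards against servers that report no storage capacity.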
|
||||||
|
|
||||||
|
|
||||||
|
## Query server
|
||||||
|
|
||||||
|
This section documents the API endpoints for the processes that reside on Query servers (Brokers) in the suggested [three-server configuration](../design/processes.md#server-types).
|
||||||
|
|
||||||
|
### Broker
|
||||||
|
|
||||||
|
#### Datasource information
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
||||||
|
as in `2016-06-27_2016-06-28`.
|
||||||
|
|
||||||
|
> Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
|
||||||
|
> [`INFORMATION_SCHEMA.TABLES`](../querying/sql-metadata-tables.md#tables-table),
|
||||||
|
> [`INFORMATION_SCHEMA.COLUMNS`](../querying/sql-metadata-tables.md#columns-table), and
|
||||||
|
> [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) tables.
|
||||||
|
|
||||||
|
`GET /druid/v2/datasources`
|
||||||
|
|
||||||
|
Returns a list of queryable datasources.
|
||||||
|
|
||||||
|
`GET /druid/v2/datasources/{dataSourceName}`
|
||||||
|
|
||||||
|
Returns the dimensions and metrics of the datasource. Optionally, provide the `full` request parameter to get a list of served intervals with the dimensions and metrics served for those intervals. You can also provide the `interval` request parameter to refer to a particular interval.
|
||||||
|
|
||||||
|
If no interval is specified, a default interval spanning a configurable period before the current time will be used. The default duration of this interval is specified in ISO 8601 duration format via: `druid.query.segmentMetadata.defaultHistory`
|
||||||
|
|
||||||
|
`GET /druid/v2/datasources/{dataSourceName}/dimensions`
|
||||||
|
|
||||||
|
> This API is deprecated and will be removed in future releases. Please use [SegmentMetadataQuery](../querying/segmentmetadataquery.md) instead
|
||||||
|
> which provides more comprehensive information and supports all dataSource types including streaming dataSources. It's also encouraged to use [INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md)
|
||||||
|
> if you're using SQL.
|
||||||
|
>
|
||||||
|
Returns the dimensions of the datasource.
|
||||||
|
|
||||||
|
`GET /druid/v2/datasources/{dataSourceName}/metrics`
|
||||||
|
|
||||||
|
> This API is deprecated and will be removed in future releases. Please use [SegmentMetadataQuery](../querying/segmentmetadataquery.md) instead
|
||||||
|
> which provides more comprehensive information and supports all dataSource types including streaming dataSources. It's also encouraged to use [INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md)
|
||||||
|
> if you're using SQL.
|
||||||
|
|
||||||
|
Returns the metrics of the datasource.
|
||||||
|
|
||||||
|
`GET /druid/v2/datasources/{dataSourceName}/candidates?intervals={comma-separated-intervals}&numCandidates={numCandidates}`
|
||||||
|
|
||||||
|
Returns segment information lists including server locations for the given datasource and intervals. If `numCandidates` is not specified, the response includes all servers for each interval.
|
|
@ -0,0 +1,278 @@
|
||||||
|
---
|
||||||
|
id: lookups-api
|
||||||
|
title: Lookups API
|
||||||
|
sidebar_label: Lookups
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints to configure, update, retrieve, and manage lookups for Apache Druid.
|
||||||
|
|
||||||
|
## Configure lookups
|
||||||
|
|
||||||
|
### Bulk update
|
||||||
|
Lookups can be updated in bulk by posting a JSON object to `/druid/coordinator/v1/lookups/config`. The format of the JSON object is as follows:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"<tierName>": {
|
||||||
|
"<lookupName>": {
|
||||||
|
"version": "<version>",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "<someExtractorFactoryType>",
|
||||||
|
"<someExtractorField>": "<someExtractorValue>"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Note that `version` is an arbitrary string assigned by the user. When updating an existing lookup, you must specify a lexicographically higher version.
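Lexicographic ordering compares strings character by character, which can surprise anyone expecting numeric ordering: `v10` sorts before `v9`. The sketch below shows the pitfall and one way around it with zero-padded versions. It uses Python string comparison as a stand-in; whether that matches Druid's comparison in every edge case is an assumption, and both helper names are hypothetical.

```python
def is_higher_version(new_version, current_version):
    """Return True if new_version sorts after current_version as a string."""
    return new_version > current_version


def padded_version(number, width=4):
    """Format a numeric version as a zero-padded string such as 'v0009'."""
    return "v" + str(number).zfill(width)


# "v10" is not higher than "v9" when compared as strings.
naive_ok = is_higher_version("v10", "v9")
# Zero padding keeps string order aligned with numeric order.
padded_ok = is_higher_version(padded_version(10), padded_version(9))
```

Timestamps in a fixed-width format are another common choice for versions that always sort upward.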
|
||||||
|
|
||||||
|
For example, a config might look something like:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"__default": {
|
||||||
|
"country_code": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"77483": "United States"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"site_id": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "cachedNamespace",
|
||||||
|
"extractionNamespace": {
|
||||||
|
"type": "jdbc",
|
||||||
|
"connectorConfig": {
|
||||||
|
"createTables": true,
|
||||||
|
"connectURI": "jdbc:mysql:\/\/localhost:3306\/druid",
|
||||||
|
"user": "druid",
|
||||||
|
"password": "diurd"
|
||||||
|
},
|
||||||
|
"table": "lookupTable",
|
||||||
|
"keyColumn": "country_id",
|
||||||
|
"valueColumn": "country_name",
|
||||||
|
"tsColumn": "timeColumn"
|
||||||
|
},
|
||||||
|
"firstCacheTimeout": 120000,
|
||||||
|
"injective": true
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"site_id_customer1": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"847632": "Internal Use Only"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"site_id_customer2": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"AHF77": "Home"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"realtime_customer1": {
|
||||||
|
"country_code": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"77483": "United States"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"site_id_customer1": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"847632": "Internal Use Only"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"realtime_customer2": {
|
||||||
|
"country_code": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"77483": "United States"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"site_id_customer2": {
|
||||||
|
"version": "v0",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"AHF77": "Home"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
All entries in the map will UPDATE existing entries. No entries will be deleted.
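One way to picture this behavior is as a per-tier merge: lookups named in the posted object replace the stored ones, and everything else is left in place. The sketch below is a conceptual model only, not Druid's implementation, and the function name is hypothetical.

```python
import copy


def apply_bulk_update(current, update):
    """Model a bulk update: overwrite or add named lookups, delete nothing."""
    merged = copy.deepcopy(current)
    for tier, lookups in update.items():
        merged.setdefault(tier, {}).update(copy.deepcopy(lookups))
    return merged


current = {
    "__default": {
        "country_code": {"version": "v0"},
        "site_id": {"version": "v0"},
    }
}
update = {"__default": {"country_code": {"version": "v1"}}}
merged = apply_bulk_update(current, update)
# `site_id` is still present; only `country_code` changed.
```

To remove a lookup or a tier, use the delete endpoints instead of omitting entries from a bulk update.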
|
||||||
|
|
||||||
|
### Update lookup
|
||||||
|
|
||||||
|
A `POST` to a particular lookup extractor factory via `/druid/coordinator/v1/lookups/config/{tier}/{id}` creates or updates that specific extractor factory.
|
||||||
|
|
||||||
|
For example, a post to `/druid/coordinator/v1/lookups/config/realtime_customer1/site_id_customer1` might contain the following:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"version": "v1",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"847632": "Internal Use Only"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
This replaces the `site_id_customer1` lookup in the `realtime_customer1` tier with the definition above.
|
||||||
|
|
||||||
|
Assign a unique version identifier each time you update a lookup extractor factory. Otherwise the call will fail.
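A minimal sketch of building this update request follows. The Coordinator address `http://localhost:8081` and the helper name are assumptions for illustration; the request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

COORDINATOR = "http://localhost:8081"  # assumed address, adjust for your cluster


def build_update_lookup_request(tier, lookup_id, spec):
    """Build (without sending) the POST that creates or updates one lookup."""
    path = "/druid/coordinator/v1/lookups/config/{}/{}".format(
        urllib.parse.quote(tier, safe=""),
        urllib.parse.quote(lookup_id, safe=""),
    )
    return urllib.request.Request(
        COORDINATOR + path,
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


spec = {
    "version": "v1",
    "lookupExtractorFactory": {
        "type": "map",
        "map": {"847632": "Internal Use Only"},
    },
}
request = build_update_lookup_request("realtime_customer1", "site_id_customer1", spec)
```

Pass `request` to `urllib.request.urlopen` to apply the update, and change `version` on every subsequent update.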
|
||||||
|
|
||||||
|
### Get all lookups
|
||||||
|
|
||||||
|
A `GET` to `/druid/coordinator/v1/lookups/config/all` will return all known lookup specs for all tiers.
|
||||||
|
|
||||||
|
### Get lookup
|
||||||
|
|
||||||
|
To retrieve a particular lookup extractor factory, send a `GET` to `/druid/coordinator/v1/lookups/config/{tier}/{id}`.
|
||||||
|
|
||||||
|
Using the prior example, a `GET` to `/druid/coordinator/v1/lookups/config/realtime_customer2/site_id_customer2` should return:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"version": "v1",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"AHF77": "Home"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Delete lookup
|
||||||
|
|
||||||
|
A `DELETE` to `/druid/coordinator/v1/lookups/config/{tier}/{id}` removes that lookup from the cluster. If it was the last lookup in the tier, the tier is deleted as well.
|
||||||
|
|
||||||
|
### Delete tier
|
||||||
|
|
||||||
|
A `DELETE` to `/druid/coordinator/v1/lookups/config/{tier}` will remove that tier from the cluster.
|
||||||
|
|
||||||
|
### List tier names
|
||||||
|
|
||||||
|
A `GET` to `/druid/coordinator/v1/lookups/config` will return a list of known tier names in the dynamic configuration.
|
||||||
|
To discover a list of tiers currently active in the cluster in addition to ones known in the dynamic configuration, the parameter `discover=true` can be added as per `/druid/coordinator/v1/lookups/config?discover=true`.
|
||||||
|
|
||||||
|
### List lookup names
|
||||||
|
|
||||||
|
A `GET` to `/druid/coordinator/v1/lookups/config/{tier}` will return a list of known lookup names for that tier.
|
||||||
|
|
||||||
|
Use the following lookup status endpoints to get the propagation status of configured lookups to processes that use lookups, such as Historicals.
|
||||||
|
|
||||||
|
## Lookup status
|
||||||
|
|
||||||
|
### List load status of all lookups
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/lookups/status` with optional query parameter `detailed`.
|
||||||
|
|
||||||
|
### List load status of lookups in a tier
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/lookups/status/{tier}` with optional query parameter `detailed`.
|
||||||
|
|
||||||
|
### List load status of single lookup
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/lookups/status/{tier}/{lookup}` with optional query parameter `detailed`.
|
||||||
|
|
||||||
|
### List lookup state of all processes
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/lookups/nodeStatus` with optional query parameter `discover` to discover tiers advertised by other Druid nodes, or by default, returning all configured lookup tiers. The default response will also include the lookups which are loaded, being loaded, or being dropped on each node, for each tier, including the complete lookup spec. Add the optional query parameter `detailed=false` to only include the 'version' of the lookup instead of the complete spec.
|
||||||
|
|
||||||
|
### List lookup state of processes in a tier
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/lookups/nodeStatus/{tier}`
|
||||||
|
|
||||||
|
### List lookup state of single process
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/lookups/nodeStatus/{tier}/{host:port}`
|
||||||
|
|
||||||
|
## Internal API
|
||||||
|
|
||||||
|
The Peon, Router, Broker, and Historical processes all have the ability to consume lookup configuration.
|
||||||
|
There is an internal API these processes use to list/load/drop their lookups starting at `/druid/listen/v1/lookups`.
|
||||||
|
These follow the same convention for return values as the cluster-wide dynamic configuration. The following endpoints
|
||||||
|
can be used for debugging purposes but not otherwise.
|
||||||
|
|
||||||
|
### Get lookups
|
||||||
|
|
||||||
|
A `GET` to the process at `/druid/listen/v1/lookups` returns a JSON map of all the lookups currently active on the process.
|
||||||
|
The return value is a JSON map of the lookups to their extractor factories.
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"site_id_customer2": {
|
||||||
|
"version": "v1",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"AHF77": "Home"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Get lookup
|
||||||
|
|
||||||
|
A `GET` to the process at `/druid/listen/v1/lookups/some_lookup_name` will return the LookupExtractorFactory for the lookup identified by `some_lookup_name`.
|
||||||
|
The return value is the JSON representation of the factory.
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"version": "v1",
|
||||||
|
"lookupExtractorFactory": {
|
||||||
|
"type": "map",
|
||||||
|
"map": {
|
||||||
|
"AHF77": "Home"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
|
@ -0,0 +1,69 @@
|
||||||
|
---
|
||||||
|
id: retention-rules-api
|
||||||
|
title: Retention rules API
|
||||||
|
sidebar_label: Retention rules
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints for managing retention rules in Apache Druid.
|
||||||
|
|
||||||
|
## Retention rules
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` as in `2016-06-27_2016-06-28`.
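The interval convention above is easy to get wrong when intervals come from code that uses the standard `/` separator. A minimal sketch of the conversion, using a hypothetical helper named `to_url_interval`, might look like this:

```python
def to_url_interval(interval: str) -> str:
    """Convert an ISO 8601 interval 'start/end' to the 'start_end' URL form."""
    start, sep, end = interval.partition("/")
    if not sep or not start or not end:
        raise ValueError(f"expected 'start/end', got {interval!r}")
    return f"{start}_{end}"


print(to_url_interval("2016-06-27/2016-06-28"))  # prints 2016-06-27_2016-06-28
```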
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules`
|
||||||
|
|
||||||
|
Returns all rules as JSON objects for all datasources in the cluster including the default datasource.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules/{dataSourceName}`
|
||||||
|
|
||||||
|
Returns all rules for a specified datasource.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules/{dataSourceName}?full`
|
||||||
|
|
||||||
|
Returns all rules for a specified datasource and includes default datasource.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules/history?interval=<interval>`
|
||||||
|
|
||||||
|
Returns audit history of rules for all datasources. Default value of interval can be specified by setting `druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Coordinator `runtime.properties`.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules/history?count=<n>`
|
||||||
|
|
||||||
|
Returns last `n` entries of audit history of rules for all datasources.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules/{dataSourceName}/history?interval=<interval>`
|
||||||
|
|
||||||
|
Returns audit history of rules for a specified datasource. Default value of interval can be specified by setting `druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Coordinator `runtime.properties`.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/rules/{dataSourceName}/history?count=<n>`
|
||||||
|
|
||||||
|
Returns last `n` entries of audit history of rules for a specified datasource.
|
||||||
|
|
||||||
|
`POST /druid/coordinator/v1/rules/{dataSourceName}`
|
||||||
|
|
||||||
|
POST with a list of rules in JSON form to update rules.
|
||||||
|
|
||||||
|
Optional Header Parameters for auditing the config change can also be specified.
|
||||||
|
|
||||||
|
|Header Param Name| Description | Default |
|
||||||
|
|----------|-------------|---------|
|
||||||
|
|`X-Druid-Author`| Author making the config change|`""`|
|
||||||
|
|`X-Druid-Comment`| Comment describing the change being done|`""`|
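
To show how the audit headers fit together with the rules payload, here is a hedged Python sketch that builds (without sending) the `POST` request. The helper name `build_rules_update`, the base URL, the datasource name, and the single `loadForever` rule are placeholders chosen for illustration; see the retention rules documentation for the rule syntax you actually need.

```python
import json
import urllib.request


def build_rules_update(base_url, datasource, rules, author="", comment=""):
    """Build (without sending) a POST that replaces a datasource's rules."""
    request = urllib.request.Request(
        f"{base_url}/druid/coordinator/v1/rules/{datasource}",
        data=json.dumps(rules).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Both audit headers are optional and default to an empty string.
    if author:
        request.add_header("X-Druid-Author", author)
    if comment:
        request.add_header("X-Druid-Comment", comment)
    return request


# Placeholder rule list, used only to give the request a body.
example_rules = [{"type": "loadForever", "tieredReplicants": {"_default_tier": 2}}]
request = build_rules_update(
    "http://localhost:8081",  # hypothetical Coordinator address
    "wikipedia",
    example_rules,
    author="alice",
    comment="keep all data",
)
print(request.get_method(), request.full_url)
```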
|
|
@ -0,0 +1,176 @@
|
||||||
|
---
|
||||||
|
id: service-status-api
|
||||||
|
title: Service status API
|
||||||
|
sidebar_label: Service status
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints to retrieve service (process) status and cluster information for Apache Druid.
|
||||||
|
|
||||||
|
## Common
|
||||||
|
|
||||||
|
All processes support the following endpoints.
|
||||||
|
|
||||||
|
### Process information
|
||||||
|
|
||||||
|
`GET /status`
|
||||||
|
|
||||||
|
Returns the Druid version, loaded extensions, memory used, total memory, and other useful information about the process.
|
||||||
|
|
||||||
|
`GET /status/health`
|
||||||
|
|
||||||
|
Always returns a boolean `true` value with a 200 OK response, useful for automated health checks.
|
||||||
|
|
||||||
|
`GET /status/properties`
|
||||||
|
|
||||||
|
Returns the current configuration properties of the process.
|
||||||
|
|
||||||
|
`GET /status/selfDiscovered/status`
|
||||||
|
|
||||||
|
Returns a JSON map of the form `{"selfDiscovered": true/false}`, indicating whether the node has received a confirmation
|
||||||
|
from the central node discovery mechanism (currently ZooKeeper) of the Druid cluster that the node has been added to the
|
||||||
|
cluster. It is recommended to not consider a Druid node "healthy" or "ready" in automated deployment/container
|
||||||
|
management systems until it returns `{"selfDiscovered": true}` from this endpoint. This is because a node may be
|
||||||
|
isolated from the rest of the cluster due to network issues and it doesn't make sense to consider nodes "healthy" in
|
||||||
|
this case. Also, when nodes such as Brokers use ZooKeeper segment discovery for building their view of the Druid cluster
|
||||||
|
(as opposed to HTTP segment discovery), they may be unusable until the ZooKeeper client is fully initialized and starts
|
||||||
|
to receive data from the ZooKeeper cluster. `{"selfDiscovered": true}` is a proxy event indicating that the ZooKeeper
|
||||||
|
client on the node has started to receive data from the ZooKeeper cluster and it's expected that all segments and other
|
||||||
|
nodes will be discovered by this node in a timely manner from this point.
|
||||||
|
|
||||||
|
`GET /status/selfDiscovered`
|
||||||
|
|
||||||
|
Similar to `/status/selfDiscovered/status`, but returns 200 OK response with empty body if the node has discovered itself
|
||||||
|
and 503 SERVICE UNAVAILABLE if the node hasn't discovered itself yet. This endpoint might be useful because some
|
||||||
|
monitoring checks such as AWS load balancer health checks are not able to look at the response body.
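The two self-discovery endpoints report the same fact in different ways: one in the response body, the other in the status code. A small sketch of a readiness check that accepts either form follows; the function name `is_self_discovered` is an assumption for this example, and fetching the response is left to whatever HTTP client you use.

```python
import json


def is_self_discovered(status_code: int, body: str = "") -> bool:
    """Interpret a response from either selfDiscovered endpoint."""
    if status_code != 200:
        # /status/selfDiscovered answers 503 until the node has discovered itself.
        return False
    if not body.strip():
        # /status/selfDiscovered answers 200 with an empty body once discovered.
        return True
    # /status/selfDiscovered/status answers with {"selfDiscovered": true/false}.
    return json.loads(body).get("selfDiscovered") is True


print(is_self_discovered(200, '{"selfDiscovered": true}'))
print(is_self_discovered(503))
```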
|
||||||
|
|
||||||
|
## Master server
|
||||||
|
|
||||||
|
### Coordinator
|
||||||
|
|
||||||
|
#### Leadership
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/leader`
|
||||||
|
|
||||||
|
Returns the current leader Coordinator of the cluster.
|
||||||
|
|
||||||
|
`GET /druid/coordinator/v1/isLeader`
|
||||||
|
|
||||||
|
Returns a JSON object with a `leader` field, either true or false, indicating if this server is the current leader
|
||||||
|
Coordinator of the cluster. In addition, returns HTTP 200 if the server is the current leader and HTTP 404 if not.
|
||||||
|
This is suitable for use as a load balancer status check if you only want the active leader to be considered in-service
|
||||||
|
at the load balancer.
|
||||||
|
|
||||||
|
<a name="coordinator-segment-loading"></a>
|
||||||
|
|
||||||
|
### Overlord
|
||||||
|
|
||||||
|
#### Leadership
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/leader`
|
||||||
|
|
||||||
|
Returns the current leader Overlord of the cluster. If you have multiple Overlords, just one is leading at any given time. The others are on standby.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/isLeader`
|
||||||
|
|
||||||
|
This returns a JSON object with field `leader`, either true or false. In addition, this call returns HTTP 200 if the
|
||||||
|
server is the current leader and HTTP 404 if not. This is suitable for use as a load balancer status check if you
|
||||||
|
only want the active leader to be considered in-service at the load balancer.
|
||||||
|
|
||||||
|
## Data server
|
||||||
|
|
||||||
|
### MiddleManager
|
||||||
|
|
||||||
|
`GET /druid/worker/v1/enabled`
|
||||||
|
|
||||||
|
Check whether a MiddleManager is in an enabled or disabled state. Returns JSON object keyed by the combined `druid.host`
|
||||||
|
and `druid.port` with the boolean state as the value.
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"localhost:8091":true}
|
||||||
|
```
|
||||||
|
|
||||||
|
`GET /druid/worker/v1/tasks`
|
||||||
|
|
||||||
|
Retrieve a list of active tasks being run on MiddleManager. Returns JSON list of taskid strings. Normal usage should
|
||||||
|
prefer to use the `/druid/indexer/v1/tasks` [Tasks API](./tasks-api.md) or one of its task state specific variants instead.
|
||||||
|
|
||||||
|
```json
|
||||||
|
["index_wikiticker_2019-02-11T02:20:15.316Z"]
|
||||||
|
```
|
||||||
|
|
||||||
|
`GET /druid/worker/v1/task/{taskid}/log`
|
||||||
|
|
||||||
|
Retrieve task log output stream by task id. Normal usage should prefer to use the `/druid/indexer/v1/task/{taskId}/log`
|
||||||
|
[Tasks API](./tasks-api.md) instead.
|
||||||
|
|
||||||
|
`POST /druid/worker/v1/disable`
|
||||||
|
|
||||||
|
Disable a MiddleManager, causing it to stop accepting new tasks but complete all existing tasks. Returns JSON object
|
||||||
|
keyed by the combined `druid.host` and `druid.port`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"localhost:8091":"disabled"}
|
||||||
|
```
|
||||||
|
|
||||||
|
`POST /druid/worker/v1/enable`
|
||||||
|
|
||||||
|
Enable a MiddleManager, allowing it to accept new tasks again if it was previously disabled. Returns JSON object
|
||||||
|
keyed by the combined `druid.host` and `druid.port`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"localhost:8091":"enabled"}
|
||||||
|
```
|
||||||
|
|
||||||
|
`POST /druid/worker/v1/task/{taskid}/shutdown`
|
||||||
|
|
||||||
|
Shutdown a running task by `taskid`. Normal usage should prefer to use the `/druid/indexer/v1/task/{taskId}/shutdown`
|
||||||
|
[Tasks API](./tasks-api.md) instead. Returns JSON:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{"task":"index_kafka_wikiticker_f7011f8ffba384b_fpeclode"}
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
## Historical
|
||||||
|
### Segment loading
|
||||||
|
|
||||||
|
`GET /druid/historical/v1/loadstatus`
|
||||||
|
|
||||||
|
Returns JSON of the form `{"cacheInitialized":<value>}`, where value is either `true` or `false` indicating if all
|
||||||
|
segments in the local cache have been loaded. This can be used to know when a Historical process is ready
|
||||||
|
to be queried after a restart.
|
||||||
|
|
||||||
|
`GET /druid/historical/v1/readiness`
|
||||||
|
|
||||||
|
Similar to `/druid/historical/v1/loadstatus`, but instead of returning JSON with a flag, responds with 200 OK if segments
|
||||||
|
in the local cache have been loaded, and 503 SERVICE UNAVAILABLE if they haven't.
|
||||||
|
|
||||||
|
|
||||||
|
## Load Status
|
||||||
|
|
||||||
|
`GET /druid/broker/v1/loadstatus`
|
||||||
|
|
||||||
|
Returns a flag indicating if the Broker knows about all segments in the cluster. This can be used to know when a Broker process is ready to be queried after a restart.
|
||||||
|
|
||||||
|
`GET /druid/broker/v1/readiness`
|
||||||
|
|
||||||
|
Similar to `/druid/broker/v1/loadstatus`, but instead of returning JSON, responds with 200 OK if the Broker is ready and 503 SERVICE UNAVAILABLE otherwise.
|
|
@ -123,7 +123,7 @@ print(response.text)
|
||||||
|
|
||||||
| Field | Description |
|
| Field | Description |
|
||||||
|---|---|
|
|---|---|
|
||||||
| `taskId` | Controller task ID. You can use Druid's standard [task APIs](api-reference.md#overlord) to interact with this controller task. |
|
| `taskId` | Controller task ID. You can use Druid's standard [Tasks API](./tasks-api.md) to interact with this controller task. |
|
||||||
| `state` | Initial state for the query, which is "RUNNING". |
|
| `state` | Initial state for the query, which is "RUNNING". |
|
||||||
|
|
||||||
## Get the status for a query task
|
## Get the status for a query task
|
||||||
|
|
|
@ -0,0 +1,111 @@
|
||||||
|
---
|
||||||
|
id: supervisor-api
|
||||||
|
title: Supervisor API
|
||||||
|
sidebar_label: Supervisors
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints to manage and monitor supervisors for Apache Druid.
|
||||||
|
|
||||||
|
## Supervisors
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor`
|
||||||
|
|
||||||
|
Returns a list of strings of the currently active supervisor ids.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor?full`
|
||||||
|
|
||||||
|
Returns a list of objects of the currently active supervisors.
|
||||||
|
|
||||||
|
|Field|Type|Description|
|
||||||
|
|---|---|---|
|
||||||
|
|`id`|String|supervisor unique identifier|
|
||||||
|
|`state`|String|basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. Check [Kafka Docs](../development/extensions-core/kafka-supervisor-operations.md) for details.|
|
||||||
|
|`detailedState`|String|supervisor specific state. See documentation of specific supervisor for details: [Kafka](../development/extensions-core/kafka-ingestion.md) or [Kinesis](../development/extensions-core/kinesis-ingestion.md)|
|
||||||
|
|`healthy`|Boolean|true or false indicator of overall supervisor health|
|
||||||
|
|`spec`|SupervisorSpec|JSON specification of supervisor|
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor?state=true`
|
||||||
|
|
||||||
|
Returns a list of objects of the currently active supervisors and their current state.
|
||||||
|
|
||||||
|
|Field|Type|Description|
|
||||||
|
|---|---|---|
|
||||||
|
|`id`|String|supervisor unique identifier|
|
||||||
|
|`state`|String|basic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. Check [Kafka Docs](../development/extensions-core/kafka-supervisor-operations.md) for details.|
|
||||||
|
|`detailedState`|String|supervisor specific state. See documentation of the specific supervisor for details: [Kafka](../development/extensions-core/kafka-ingestion.md) or [Kinesis](../development/extensions-core/kinesis-ingestion.md)|
|
||||||
|
|`healthy`|Boolean|true or false indicator of overall supervisor health|
|
||||||
|
|`suspended`|Boolean|true or false indicator of whether the supervisor is in suspended state|
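
As an illustration of how the fields in this table might be consumed, the sketch below filters a `?state=true` response for supervisors that need attention. The function name and the sample data are invented for the example; only the field names and state values come from the table above.

```python
UNHEALTHY_STATES = {"UNHEALTHY_SUPERVISOR", "UNHEALTHY_TASKS"}


def unhealthy_supervisors(supervisors):
    """Return ids of supervisors whose state or health flag signals a problem."""
    return [
        s["id"]
        for s in supervisors
        if s.get("state") in UNHEALTHY_STATES or s.get("healthy") is False
    ]


# Hypothetical response body, already parsed from JSON.
sample = [
    {"id": "clicks", "state": "RUNNING", "detailedState": "RUNNING",
     "healthy": True, "suspended": False},
    {"id": "orders", "state": "UNHEALTHY_TASKS", "detailedState": "UNHEALTHY_TASKS",
     "healthy": False, "suspended": False},
]
print(unhealthy_supervisors(sample))  # prints ['orders']
```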
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor/<supervisorId>`
|
||||||
|
|
||||||
|
Returns the current spec for the supervisor with the provided ID.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor/<supervisorId>/status`
|
||||||
|
|
||||||
|
Returns the current status of the supervisor with the provided ID.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor/history`
|
||||||
|
|
||||||
|
Returns an audit history of specs for all supervisors (current and past).
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/supervisor/<supervisorId>/history`
|
||||||
|
|
||||||
|
Returns an audit history of specs for the supervisor with the provided ID.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor`
|
||||||
|
|
||||||
|
Create a new supervisor or update an existing one.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/<supervisorId>/suspend`
|
||||||
|
|
||||||
|
Suspend the currently running supervisor with the provided ID. Responds with updated SupervisorSpec.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/suspendAll`
|
||||||
|
|
||||||
|
Suspend all supervisors at once.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/<supervisorId>/resume`
|
||||||
|
|
||||||
|
Resume indexing tasks for a supervisor. Responds with updated SupervisorSpec.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/resumeAll`
|
||||||
|
|
||||||
|
Resume all supervisors at once.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/<supervisorId>/reset`
|
||||||
|
|
||||||
|
Reset the specified supervisor.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/<supervisorId>/terminate`
|
||||||
|
|
||||||
|
Terminate the supervisor with the provided ID.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/terminateAll`
|
||||||
|
|
||||||
|
Terminate all supervisors at once.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/supervisor/<supervisorId>/shutdown`
|
||||||
|
|
||||||
|
> This API is deprecated and will be removed in future releases.
|
||||||
|
> Please use the equivalent `terminate` instead.
|
||||||
|
|
||||||
|
Shutdown a supervisor.
|
|
@ -0,0 +1,101 @@
|
||||||
|
---
|
||||||
|
id: tasks-api
|
||||||
|
title: Tasks API
|
||||||
|
sidebar_label: Tasks
|
||||||
|
---
|
||||||
|
|
||||||
|
<!--
|
||||||
|
~ Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
~ or more contributor license agreements. See the NOTICE file
|
||||||
|
~ distributed with this work for additional information
|
||||||
|
~ regarding copyright ownership. The ASF licenses this file
|
||||||
|
~ to you under the Apache License, Version 2.0 (the
|
||||||
|
~ "License"); you may not use this file except in compliance
|
||||||
|
~ with the License. You may obtain a copy of the License at
|
||||||
|
~
|
||||||
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
~
|
||||||
|
~ Unless required by applicable law or agreed to in writing,
|
||||||
|
~ software distributed under the License is distributed on an
|
||||||
|
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
~ KIND, either express or implied. See the License for the
|
||||||
|
~ specific language governing permissions and limitations
|
||||||
|
~ under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
This document describes the API endpoints for task retrieval, submission, and deletion for Apache Druid.
|
||||||
|
|
||||||
|
## Tasks
|
||||||
|
|
||||||
|
Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
|
||||||
|
as in `2016-06-27_2016-06-28`.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/tasks`
|
||||||
|
|
||||||
|
Retrieve list of tasks. Accepts query string parameters `state`, `datasource`, `createdTimeInterval`, `max`, and `type`.
|
||||||
|
|
||||||
|
|Query Parameter |Description |
|
||||||
|
|---|---|
|
||||||
|
|`state`|filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.|
|
||||||
|
| `datasource`| return tasks filtered by Druid datasource.|
|
||||||
|
| `createdTimeInterval`| return tasks created within the specified interval. |
|
||||||
|
| `max`| maximum number of `"complete"` tasks to return. Only applies when `state` is set to `"complete"`.|
|
||||||
|
| `type`| filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
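
The query parameters in this table can be combined in one request. The following sketch assembles the URL with Python's standard `urlencode`; the helper name `tasks_url`, its keyword argument names, and the Overlord address `http://localhost:8090` are assumptions for illustration.

```python
from urllib.parse import urlencode


def tasks_url(base_url, state=None, datasource=None,
              created_time_interval=None, max_tasks=None, task_type=None):
    """Build the task list URL, including only the filters that were given."""
    params = {
        "state": state,
        "datasource": datasource,
        # Intervals use '_' as the delimiter, e.g. 2016-06-27_2016-06-28.
        "createdTimeInterval": created_time_interval,
        # `max` only applies when state is "complete".
        "max": max_tasks,
        "type": task_type,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url}/druid/indexer/v1/tasks"
    return f"{url}?{query}" if query else url


# Hypothetical Overlord address; adjust for your cluster.
print(tasks_url("http://localhost:8090", state="complete",
                datasource="wikipedia", max_tasks=10))
```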
|
||||||
|
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/completeTasks`
|
||||||
|
|
||||||
|
Retrieve list of complete tasks. Equivalent to `/druid/indexer/v1/tasks?state=complete`.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/runningTasks`
|
||||||
|
|
||||||
|
Retrieve list of running tasks. Equivalent to `/druid/indexer/v1/tasks?state=running`.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/waitingTasks`
|
||||||
|
|
||||||
|
Retrieve list of waiting tasks. Equivalent to `/druid/indexer/v1/tasks?state=waiting`.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/pendingTasks`
|
||||||
|
|
||||||
|
Retrieve list of pending tasks. Equivalent to `/druid/indexer/v1/tasks?state=pending`.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/task/{taskId}`
|
||||||
|
|
||||||
|
Retrieve the 'payload' of a task.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/task/{taskId}/status`
|
||||||
|
|
||||||
|
Retrieve the status of a task.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/task/{taskId}/segments`
|
||||||
|
|
||||||
|
> This API is deprecated and will be removed in future releases.
|
||||||
|
|
||||||
|
Retrieve information about the segments of a task.
|
||||||
|
|
||||||
|
`GET /druid/indexer/v1/task/{taskId}/reports`
|
||||||
|
|
||||||
|
Retrieve a [task completion report](../ingestion/tasks.md#task-reports) for a task. Only works for completed tasks.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/task`
|
||||||
|
|
||||||
|
Endpoint for submitting tasks and supervisor specs to the Overlord. Returns the taskId of the submitted task.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/task/{taskId}/shutdown`
|
||||||
|
|
||||||
|
Shuts down a task.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/datasources/{dataSource}/shutdownAllTasks`
|
||||||
|
|
||||||
|
Shuts down all tasks for a dataSource.
|
||||||
|
|
||||||
|
`POST /druid/indexer/v1/taskStatus`
|
||||||
|
|
||||||
|
Retrieve a list of task status objects for the list of task ID strings in the request body.
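Because this endpoint takes its input in the request body rather than the URL, a short sketch may help. It builds (without sending) the `POST` with a JSON array of task IDs; the helper name `build_task_status_request` and the Overlord address are assumptions for this example.

```python
import json
import urllib.request


def build_task_status_request(base_url, task_ids):
    """Build (without sending) a POST whose body is a JSON list of task ids."""
    return urllib.request.Request(
        f"{base_url}/druid/indexer/v1/taskStatus",
        data=json.dumps(list(task_ids)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical Overlord address; adjust for your cluster.
request = build_task_status_request(
    "http://localhost:8090",
    ["index_wikiticker_2019-02-11T02:20:15.316Z"],
)
print(request.data.decode("utf-8"))
# To actually submit it: urllib.request.urlopen(request)
```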
|
||||||
|
|
||||||
|
`DELETE /druid/indexer/v1/pendingSegments/{dataSource}`
|
||||||
|
|
||||||
|
Manually clean up pending segments table in metadata storage for `datasource`. Returns a JSON object response with
|
||||||
|
`numDeleted`, the count of rows deleted from the pending segments table. This API is used by the
|
||||||
|
`druid.coordinator.kill.pendingSegments.on` [coordinator setting](../configuration/index.md#coordinator-operation)
|
||||||
|
which automates this operation to run periodically.
|
|
@ -999,7 +999,7 @@ These configuration options control Coordinator lookup management. See [dynamic
|
||||||
##### Automatic compaction dynamic configuration
|
##### Automatic compaction dynamic configuration
|
||||||
|
|
||||||
You can set or update [automatic compaction](../data-management/automatic-compaction.md) properties dynamically using the
|
You can set or update [automatic compaction](../data-management/automatic-compaction.md) properties dynamically using the
|
||||||
[Coordinator API](../api-reference/api-reference.md#automatic-compaction-configuration) without restarting Coordinators.
|
[Automatic compaction API](../api-reference/automatic-compaction-api.md) without restarting Coordinators.
|
||||||
|
|
||||||
For details about segment compaction, see [Segment size optimization](../operations/segment-optimization.md).
|
For details about segment compaction, see [Segment size optimization](../operations/segment-optimization.md).
|
||||||
|
|
||||||
|
|
|
@ -40,9 +40,9 @@ This topic guides you through setting up automatic compaction for your Druid clu
|
||||||
## Enable automatic compaction
|
## Enable automatic compaction
|
||||||
|
|
||||||
You can enable automatic compaction for a datasource using the web console or programmatically via an API.
|
You can enable automatic compaction for a datasource using the web console or programmatically via an API.
|
||||||
This process differs for manual compaction tasks, which can be submitted from the [Tasks view of the web console](../operations/web-console.md) or the [Tasks API](../api-reference/api-reference.md#tasks).
|
This process differs for manual compaction tasks, which can be submitted from the [Tasks view of the web console](../operations/web-console.md) or the [Tasks API](../api-reference/tasks-api.md).
|
||||||
|
|
||||||
### web console
|
### Web console
|
||||||
|
|
||||||
Use the web console to enable automatic compaction for a datasource as follows.
|
Use the web console to enable automatic compaction for a datasource as follows.
|
||||||
|
|
||||||
|
@ -59,10 +59,10 @@ To disable auto-compaction for a datasource, click **Delete** from the **Compact
|
||||||
|
|
||||||
### Compaction configuration API
|
### Compaction configuration API
|
||||||
|
|
||||||
Use the [Coordinator API](../api-reference/api-reference.md#automatic-compaction-status) to configure automatic compaction.
|
Use the [Automatic compaction API](../api-reference/automatic-compaction-api.md#automatic-compaction-status) to configure automatic compaction.
|
||||||
To enable auto-compaction for a datasource, create a JSON object with the desired auto-compaction settings.
|
To enable auto-compaction for a datasource, create a JSON object with the desired auto-compaction settings.
|
||||||
See [Configure automatic compaction](#configure-automatic-compaction) for the syntax of an auto-compaction spec.
|
See [Configure automatic compaction](#configure-automatic-compaction) for the syntax of an auto-compaction spec.
|
||||||
Send the JSON object as a payload in a [`POST` request](../api-reference/api-reference.md#automatic-compaction-configuration) to `/druid/coordinator/v1/config/compaction`.
|
Send the JSON object as a payload in a [`POST` request](../api-reference/automatic-compaction-api.md#automatic-compaction-configuration) to `/druid/coordinator/v1/config/compaction`.
|
||||||
The following example configures auto-compaction for the `wikipedia` datasource:
|
The following example configures auto-compaction for the `wikipedia` datasource:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
|
@ -76,7 +76,7 @@ curl --location --request POST 'http://localhost:8081/druid/coordinator/v1/confi
|
||||||
}'
|
}'
|
||||||
```
|
```
|
||||||
|
|
||||||
To disable auto-compaction for a datasource, send a [`DELETE` request](../api-reference/api-reference.md#automatic-compaction-configuration) to `/druid/coordinator/v1/config/compaction/{dataSource}`. Replace `{dataSource}` with the name of the datasource for which to disable auto-compaction. For example:
|
To disable auto-compaction for a datasource, send a [`DELETE` request](../api-reference/automatic-compaction-api.md#automatic-compaction-configuration) to `/druid/coordinator/v1/config/compaction/{dataSource}`. Replace `{dataSource}` with the name of the datasource for which to disable auto-compaction. For example:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
curl --location --request DELETE 'http://localhost:8081/druid/coordinator/v1/config/compaction/wikipedia'
|
curl --location --request DELETE 'http://localhost:8081/druid/coordinator/v1/config/compaction/wikipedia'
|
||||||
|
@ -152,7 +152,7 @@ After the Coordinator has initiated auto-compaction, you can view compaction sta
|
||||||
|
|
||||||
In the web console, the Datasources view displays auto-compaction statistics. The Tasks view shows the task information for compaction tasks that were triggered by the automatic compaction system.
|
In the web console, the Datasources view displays auto-compaction statistics. The Tasks view shows the task information for compaction tasks that were triggered by the automatic compaction system.
|
||||||
|
|
||||||
To get statistics by API, send a [`GET` request](../api-reference/api-reference.md#automatic-compaction-status) to `/druid/coordinator/v1/compaction/status`. To filter the results to a particular datasource, pass the datasource name as a query parameter to the request—for example, `/druid/coordinator/v1/compaction/status?dataSource=wikipedia`.
|
To get statistics by API, send a [`GET` request](../api-reference/automatic-compaction-api.md#automatic-compaction-status) to `/druid/coordinator/v1/compaction/status`. To filter the results to a particular datasource, pass the datasource name as a query parameter to the request—for example, `/druid/coordinator/v1/compaction/status?dataSource=wikipedia`.
|
||||||
|
|
||||||
## Examples
|
## Examples
|
||||||
|
|
||||||
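The status endpoint referenced in the hunk above takes the datasource name as an optional `dataSource` query parameter. The following is a minimal sketch of how a client might build that URL. The helper name `compaction_status_url` and the `coordinator_base` argument are hypothetical; the path and parameter name are taken from the documentation text, and `http://localhost:8081` is the Coordinator address used in the curl examples.

```python
from typing import Optional
from urllib.parse import urlencode


def compaction_status_url(coordinator_base: str, datasource: Optional[str] = None) -> str:
    """Build the URL of the auto-compaction status endpoint.

    coordinator_base is the Coordinator address (an assumption supplied by the caller).
    When datasource is given, it is passed as the dataSource query parameter.
    """
    url = coordinator_base.rstrip("/") + "/druid/coordinator/v1/compaction/status"
    if datasource is not None:
        # urlencode escapes characters that are not safe in a query string.
        url += "?" + urlencode({"dataSource": datasource})
    return url


print(compaction_status_url("http://localhost:8081", "wikipedia"))
# prints http://localhost:8081/druid/coordinator/v1/compaction/status?dataSource=wikipedia
```

Sending a `GET` request to the resulting URL is left to whichever HTTP client is already in use.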
|
|
|
@ -38,7 +38,7 @@ Deletion by time range happens in two steps:
|
||||||
you have a backup.
|
you have a backup.
|
||||||
|
|
||||||
For documentation on disabling segments using the Coordinator API, see the
|
For documentation on disabling segments using the Coordinator API, see the
|
||||||
[Coordinator API reference](../api-reference/api-reference.md#coordinator-datasources).
|
[Legacy metadata API reference](../api-reference/legacy-metadata-api.md#datasources).
|
||||||
|
|
||||||
A data deletion tutorial is available at [Tutorial: Deleting data](../tutorials/tutorial-delete-data.md).
|
A data deletion tutorial is available at [Tutorial: Deleting data](../tutorials/tutorial-delete-data.md).
|
||||||
|
|
||||||
|
|
|
@ -31,7 +31,7 @@ For basic tuning guidance for the Broker process, see [Basic cluster tuning](../
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the Broker, see [Broker API](../api-reference/api-reference.md#broker).
|
For a list of API endpoints supported by the Broker, see [Broker API](../api-reference/legacy-metadata-api.md#broker).
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
|
||||||
|
|
|
@ -31,7 +31,7 @@ For basic tuning guidance for the Coordinator process, see [Basic cluster tuning
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the Coordinator, see [Coordinator API](../api-reference/api-reference.md#coordinator).
|
For a list of API endpoints supported by the Coordinator, see [Service status API reference](../api-reference/service-status-api.md#coordinator).
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
|
||||||
|
@ -92,7 +92,7 @@ Once some segments are found, it issues a [compaction task](../ingestion/tasks.m
|
||||||
The maximum number of running compaction tasks is `min(sum of worker capacity * slotRatio, maxSlots)`.
|
The maximum number of running compaction tasks is `min(sum of worker capacity * slotRatio, maxSlots)`.
|
||||||
Note that even if `min(sum of worker capacity * slotRatio, maxSlots) = 0`, at least one compaction task is always submitted
|
Note that even if `min(sum of worker capacity * slotRatio, maxSlots) = 0`, at least one compaction task is always submitted
|
||||||
if the compaction is enabled for a dataSource.
|
if the compaction is enabled for a dataSource.
|
||||||
See [Automatic compaction configuration API](../api-reference/api-reference.md#automatic-compaction-configuration) and [Automatic compaction configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) to enable and configure automatic compaction.
|
See [Automatic compaction configuration API](../api-reference/automatic-compaction-api.md#automatic-compaction-configuration) and [Automatic compaction configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) to enable and configure automatic compaction.
|
||||||
|
|
||||||
Compaction tasks might fail due to the following reasons:
|
Compaction tasks might fail due to the following reasons:
|
||||||
|
|
||||||
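The slot limit described in the hunk above can be illustrated with a small calculation. This is a sketch, not the Coordinator's implementation: the function and parameter names are hypothetical, and rounding the product down to a whole number of slots is an assumption that the text does not state. It applies only to a datasource that has compaction enabled.

```python
def max_compaction_tasks(total_worker_capacity: int, slot_ratio: float, max_slots: int) -> int:
    """Return min(sum of worker capacity * slotRatio, maxSlots), but never less than 1.

    Assumption: the product is truncated to an integer number of task slots.
    """
    limit = min(int(total_worker_capacity * slot_ratio), max_slots)
    # Even when the formula yields 0, one compaction task is still submitted.
    return max(limit, 1)


print(max_compaction_tasks(20, 0.1, 100))  # prints 2
print(max_compaction_tasks(3, 0.1, 100))   # prints 1
```

The second call shows the floor of one task: three worker slots at a ratio of 0.1 would otherwise allow no compaction tasks at all.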
|
|
|
@ -31,7 +31,7 @@ For basic tuning guidance for the Historical process, see [Basic cluster tuning]
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the Historical, please see the [API reference](../api-reference/api-reference.md#historical).
|
For a list of API endpoints supported by the Historical, please see the [Service status API reference](../api-reference/service-status-api.md#historical).
|
||||||
|
|
||||||
### Running
|
### Running
|
||||||
|
|
||||||
|
|
|
@ -35,7 +35,7 @@ For Apache Druid Indexer Process Configuration, see [Indexer Configuration](../c
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
The Indexer process shares the same HTTP endpoints as the [MiddleManager](../api-reference/api-reference.md#middlemanager).
|
The Indexer process shares the same HTTP endpoints as the [MiddleManager](../api-reference/service-status-api.md#middlemanager).
|
||||||
|
|
||||||
### Running
|
### Running
|
||||||
|
|
||||||
|
|
|
@ -30,7 +30,7 @@ Indexing [tasks](../ingestion/tasks.md) are responsible for creating and [killin
|
||||||
The indexing service is composed of three main components: [Peons](../design/peons.md) that can run a single task, [MiddleManagers](../design/middlemanager.md) that manage Peons, and an [Overlord](../design/overlord.md) that manages task distribution to MiddleManagers.
|
The indexing service is composed of three main components: [Peons](../design/peons.md) that can run a single task, [MiddleManagers](../design/middlemanager.md) that manage Peons, and an [Overlord](../design/overlord.md) that manages task distribution to MiddleManagers.
|
||||||
Overlords and MiddleManagers may run on the same process or across multiple processes, while MiddleManagers and Peons always run on the same process.
|
Overlords and MiddleManagers may run on the same process or across multiple processes, while MiddleManagers and Peons always run on the same process.
|
||||||
|
|
||||||
Tasks are managed using API endpoints on the Overlord service. Please see [Overlord Task API](../api-reference/api-reference.md#tasks) for more information.
|
Tasks are managed using API endpoints on the Overlord service. Please see [Tasks API](../api-reference/tasks-api.md) for more information.
|
||||||
|
|
||||||
![Indexing Service](../assets/indexing_service.png "Indexing Service")
|
![Indexing Service](../assets/indexing_service.png "Indexing Service")
|
||||||
|
|
||||||
|
|
|
@ -31,7 +31,7 @@ For basic tuning guidance for the MiddleManager process, see [Basic cluster tuni
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the MiddleManager, please see the [API reference](../api-reference/api-reference.md#middlemanager).
|
For a list of API endpoints supported by the MiddleManager, please see the [Service status API reference](../api-reference/service-status-api.md#middlemanager).
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
|
||||||
|
|
|
@ -31,7 +31,7 @@ For basic tuning guidance for the Overlord process, see [Basic cluster tuning](.
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the Overlord, please see the [API reference](../api-reference/api-reference.md#overlord).
|
For a list of API endpoints supported by the Overlord, please see the [Service status API reference](../api-reference/service-status-api.md#overlord).
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
|
||||||
|
|
|
@ -31,8 +31,6 @@ For basic tuning guidance for MiddleManager tasks, see [Basic cluster tuning](..
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the Peon, please see the [Peon API reference](../api-reference/api-reference.md#peon).
|
|
||||||
|
|
||||||
Peons run a single task in a single JVM. MiddleManager is responsible for creating Peons for running tasks.
|
Peons run a single task in a single JVM. MiddleManager is responsible for creating Peons for running tasks.
|
||||||
Peons should rarely (if ever for testing purposes) be run on their own.
|
Peons should rarely (if ever for testing purposes) be run on their own.
|
||||||
|
|
||||||
|
|
|
@ -36,7 +36,7 @@ For basic tuning guidance for the Router process, see [Basic cluster tuning](../
|
||||||
|
|
||||||
### HTTP endpoints
|
### HTTP endpoints
|
||||||
|
|
||||||
For a list of API endpoints supported by the Router, see [Router API](../api-reference/api-reference.md#router).
|
For a list of API endpoints supported by the Router, see [Legacy metadata API reference](../api-reference/legacy-metadata-api.md#datasource-information).
|
||||||
|
|
||||||
### Running
|
### Running
|
||||||
|
|
||||||
|
|
|
@ -25,7 +25,7 @@ description: "Reference topic for running and maintaining Apache Kafka superviso
|
||||||
-->
|
-->
|
||||||
This topic contains operations reference information to run and maintain Apache Kafka supervisors for Apache Druid. It includes descriptions of how some supervisor APIs work within Kafka Indexing Service.
|
This topic contains operations reference information to run and maintain Apache Kafka supervisors for Apache Druid. It includes descriptions of how some supervisor APIs work within Kafka Indexing Service.
|
||||||
|
|
||||||
For all supervisor APIs, see [Supervisor APIs](../../api-reference/api-reference.md#supervisors).
|
For all supervisor APIs, see [Supervisor API reference](../../api-reference/supervisor-api.md).
|
||||||
|
|
||||||
## Getting Supervisor Status Report
|
## Getting Supervisor Status Report
|
||||||
|
|
||||||
|
|
|
@ -205,7 +205,7 @@ The `tuningConfig` is optional and default parameters will be used if no `tuning
|
||||||
| `indexSpecForIntermediatePersists`| | Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values. | no (default = same as `indexSpec`) |
|
| `indexSpecForIntermediatePersists`| | Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values. | no (default = same as `indexSpec`) |
|
||||||
| `reportParseExceptions` | Boolean | *DEPRECATED*. If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped. Setting `reportParseExceptions` to true will override existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to no more than 1. | no (default == false) |
|
| `reportParseExceptions` | Boolean | *DEPRECATED*. If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped. Setting `reportParseExceptions` to true will override existing configurations for `maxParseExceptions` and `maxSavedParseExceptions`, setting `maxParseExceptions` to 0 and limiting `maxSavedParseExceptions` to no more than 1. | no (default == false) |
|
||||||
| `handoffConditionTimeout` | Long | Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever. | no (default == 0) |
|
| `handoffConditionTimeout` | Long | Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever. | no (default == 0) |
|
||||||
| `resetOffsetAutomatically` | Boolean | Controls behavior when Druid needs to read Kafka messages that are no longer available (i.e. when `OffsetOutOfRangeException` is encountered).<br/><br/>If false, the exception will bubble up, which will cause your tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation; potentially using the [Reset Supervisor API](../../api-reference/api-reference.md#supervisors). This mode is useful for production, since it will make you aware of issues with ingestion.<br/><br/>If true, Druid will automatically reset to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if true, latest if false). Note that this can lead to data being _DROPPED_ (if `useEarliestOffset` is false) or _DUPLICATED_ (if `useEarliestOffset` is true) without your knowledge. Messages will be logged indicating that a reset has occurred, but ingestion will continue. This mode is useful for non-production situations, since it will make Druid attempt to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.<br/><br/>This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. | no (default == false) |
|
| `resetOffsetAutomatically` | Boolean | Controls behavior when Druid needs to read Kafka messages that are no longer available (i.e. when `OffsetOutOfRangeException` is encountered).<br/><br/>If false, the exception will bubble up, which will cause your tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation; potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md). This mode is useful for production, since it will make you aware of issues with ingestion.<br/><br/>If true, Druid will automatically reset to the earliest or latest offset available in Kafka, based on the value of the `useEarliestOffset` property (earliest if true, latest if false). Note that this can lead to data being _DROPPED_ (if `useEarliestOffset` is false) or _DUPLICATED_ (if `useEarliestOffset` is true) without your knowledge. Messages will be logged indicating that a reset has occurred, but ingestion will continue. This mode is useful for non-production situations, since it will make Druid attempt to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.<br/><br/>This feature behaves similarly to the Kafka `auto.offset.reset` consumer property. | no (default == false) |
|
||||||
| `workerThreads` | Integer | The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation. | no (default == min(10, taskCount)) |
|
| `workerThreads` | Integer | The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation. | no (default == min(10, taskCount)) |
|
||||||
| `chatAsync` | Boolean | If true, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If false, use synchronous communication in a thread pool of size `chatThreads`. | no (default == true) |
|
| `chatAsync` | Boolean | If true, use asynchronous communication with indexing tasks, and ignore the `chatThreads` parameter. If false, use synchronous communication in a thread pool of size `chatThreads`. | no (default == true) |
|
||||||
| `chatThreads` | Integer | The number of threads that will be used for communicating with indexing tasks. Ignored if `chatAsync` is `true` (the default). | no (default == min(10, taskCount * replicas)) |
|
| `chatThreads` | Integer | The number of threads that will be used for communicating with indexing tasks. Ignored if `chatAsync` is `true` (the default). | no (default == min(10, taskCount * replicas)) |
|
||||||
|
|
|
@ -284,7 +284,7 @@ The `tuningConfig` is optional. If no `tuningConfig` is specified, default param
|
||||||
|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values.| no (default = same as `indexSpec`)|
|
|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values.| no (default = same as `indexSpec`)|
|
||||||
|`reportParseExceptions`|Boolean|If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped.|no (default == false)|
|
|`reportParseExceptions`|Boolean|If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped.|no (default == false)|
|
||||||
|`handoffConditionTimeout`|Long| Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever.| no (default == 0)|
|
|`handoffConditionTimeout`|Long| Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever.| no (default == 0)|
|
||||||
|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kinesis messages that are no longer available.<br/><br/>If false, the exception bubbles up, causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/api-reference.md#supervisors). This mode is useful for production, since it highlights issues with ingestion.<br/><br/>If true, Druid automatically resets to the earliest or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if true, latest if false). Note that this can lead to data being *DROPPED* (if `useEarliestSequenceNumber` is false) or *DUPLICATED* (if `useEarliestSequenceNumber` is true) without your knowledge. Druid will log messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.|no (default == false)|
|
|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read Kinesis messages that are no longer available.<br/><br/>If false, the exception bubbles up, causing tasks to fail and ingestion to halt. If this occurs, manual intervention is required to correct the situation, potentially using the [Reset Supervisor API](../../api-reference/supervisor-api.md). This mode is useful for production, since it highlights issues with ingestion.<br/><br/>If true, Druid automatically resets to the earliest or latest sequence number available in Kinesis, based on the value of the `useEarliestSequenceNumber` property (earliest if true, latest if false). Note that this can lead to data being *DROPPED* (if `useEarliestSequenceNumber` is false) or *DUPLICATED* (if `useEarliestSequenceNumber` is true) without your knowledge. Druid will log messages indicating that a reset has occurred without interrupting ingestion. This mode is useful for non-production situations since it enables Druid to recover from problems automatically, even if they lead to quiet dropping or duplicating of data.|no (default == false)|
|
||||||
|`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If set to false, the indexing task will attempt to reset the current sequence number (or not), depending on the value of `resetOffsetAutomatically`.|no (default == false)|
|
|`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular Kinesis shard. If set to false, the indexing task will attempt to reset the current sequence number (or not), depending on the value of `resetOffsetAutomatically`.|no (default == false)|
|
||||||
|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|no (default == min(10, taskCount))|
|
|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|no (default == min(10, taskCount))|
|
||||||
|`chatAsync`|Boolean| If true, the supervisor uses asynchronous communication with indexing tasks and ignores the `chatThreads` parameter. If false, the supervisor uses synchronous communication in a thread pool of size `chatThreads`.| no (default == true)|
|
|`chatAsync`|Boolean| If true, the supervisor uses asynchronous communication with indexing tasks and ignores the `chatThreads` parameter. If false, the supervisor uses synchronous communication in a thread pool of size `chatThreads`.| no (default == true)|
|
||||||
|
@ -338,7 +338,7 @@ For Concise bitmaps:
|
||||||
## Operations
|
## Operations
|
||||||
|
|
||||||
This section describes how some supervisor APIs work in Kinesis Indexing Service.
|
This section describes how some supervisor APIs work in Kinesis Indexing Service.
|
||||||
For all supervisor APIs, check [Supervisor APIs](../../api-reference/api-reference.md#supervisors).
|
For all supervisor APIs, check [Supervisor API reference](../../api-reference/supervisor-api.md).
|
||||||
|
|
||||||
### AWS Authentication
|
### AWS Authentication
|
||||||
|
|
||||||
|
|
|
@ -57,15 +57,15 @@ Make sure to include the `druid-hdfs-storage` and all the hadoop configuration,
|
||||||
|
|
||||||
You can verify if segments created by a recent ingestion task are loaded onto historicals and available for querying using the following workflow.
|
You can verify if segments created by a recent ingestion task are loaded onto historicals and available for querying using the following workflow.
|
||||||
1. Submit your ingestion task.
|
1. Submit your ingestion task.
|
||||||
2. Repeatedly poll the [Overlord's tasks API](../api-reference/api-reference.md#tasks) (`/druid/indexer/v1/task/{taskId}/status`) until your task is shown to be successfully completed.
|
2. Repeatedly poll the [Overlord's tasks API](../api-reference/tasks-api.md) (`/druid/indexer/v1/task/{taskId}/status`) until your task is shown to be successfully completed.
|
||||||
3. Poll the [Segment Loading by Datasource API](../api-reference/api-reference.md#segment-loading-by-datasource) (`/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus`) with
|
3. Poll the [Segment Loading by Datasource API](../api-reference/legacy-metadata-api.md#segment-loading-by-datasource) (`/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus`) with
|
||||||
`forceMetadataRefresh=true` and `interval=<INTERVAL_OF_INGESTED_DATA>` once.
|
`forceMetadataRefresh=true` and `interval=<INTERVAL_OF_INGESTED_DATA>` once.
|
||||||
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms of the load on the metadata store but is necessary to make sure that we verify all the latest segments' load status)
|
(Note: `forceMetadataRefresh=true` refreshes Coordinator's metadata cache of all datasources. This can be a heavy operation in terms of the load on the metadata store but is necessary to make sure that we verify all the latest segments' load status)
|
||||||
If there are segments not yet loaded, continue to step 4, otherwise you can now query the data.
|
If there are segments not yet loaded, continue to step 4, otherwise you can now query the data.
|
||||||
4. Repeatedly poll the [Segment Loading by Datasource API](../api-reference/api-reference.md#segment-loading-by-datasource) (`/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus`) with
|
4. Repeatedly poll the [Segment Loading by Datasource API](../api-reference/legacy-metadata-api.md#segment-loading-by-datasource) (`/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus`) with
|
||||||
`forceMetadataRefresh=false` and `interval=<INTERVAL_OF_INGESTED_DATA>`.
|
`forceMetadataRefresh=false` and `interval=<INTERVAL_OF_INGESTED_DATA>`.
|
||||||
Continue polling until all segments are loaded. Once all segments are loaded you can now query the data.
|
Continue polling until all segments are loaded. Once all segments are loaded you can now query the data.
|
||||||
Note that this workflow only guarantees that the segments are available at the time of the [Segment Loading by Datasource API](../api-reference/api-reference.md#segment-loading-by-datasource) call. Segments can still become missing because of historical process failures or any other reasons afterward.
|
Note that this workflow only guarantees that the segments are available at the time of the [Segment Loading by Datasource API](../api-reference/legacy-metadata-api.md#segment-loading-by-datasource) call. Segments can still become missing because of historical process failures or any other reasons afterward.
|
||||||
|
|
||||||
## I don't see my Druid segments on my Historical processes
|
## I don't see my Druid segments on my Historical processes
|
||||||
|
|
||||||
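The four-step verification workflow in the hunk above can be sketched as a polling loop. This is an illustration under stated assumptions, not code from Druid: the two callables are hypothetical stand-ins for the HTTP calls to `/druid/indexer/v1/task/{taskId}/status` and `/druid/coordinator/v1/datasources/{dataSourceName}/loadstatus`, the task state strings `SUCCESS` and `FAILED` are assumed, and the response parsing is left to the caller.

```python
import time
from typing import Callable


def wait_until_queryable(
    get_task_state: Callable[[], str],
    all_segments_loaded: Callable[[bool], bool],
    poll_seconds: float = 5.0,
    max_polls: int = 100,
) -> bool:
    """Return True once the ingested segments are reported as loaded.

    get_task_state: hypothetical wrapper around the task status endpoint.
    all_segments_loaded: hypothetical wrapper around the loadstatus endpoint;
        its argument is the value to send as forceMetadataRefresh.
    """
    # Step 2: poll the task status until the task completes.
    for _ in range(max_polls):
        state = get_task_state()
        if state == "SUCCESS":
            break
        if state == "FAILED":
            return False
        time.sleep(poll_seconds)
    else:
        return False  # the task did not finish within the polling budget

    # Step 3: a single call with forceMetadataRefresh=true.
    if all_segments_loaded(True):
        return True

    # Step 4: keep polling with forceMetadataRefresh=false.
    for _ in range(max_polls):
        time.sleep(poll_seconds)
        if all_segments_loaded(False):
            return True
    return False
```

Passing the HTTP calls in as functions keeps the heavier `forceMetadataRefresh=true` request to exactly one call, as the workflow recommends, and lets the loop be exercised without a running cluster.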
|
|
|
@ -28,7 +28,7 @@ instance of a Druid [Overlord](../design/overlord.md). Please refer to our [Hado
|
||||||
comparisons between Hadoop-based, native batch (simple), and native batch (parallel) ingestion.
|
comparisons between Hadoop-based, native batch (simple), and native batch (parallel) ingestion.
|
||||||
|
|
||||||
To run a Hadoop-based ingestion task, write an ingestion spec as specified below. Then POST it to the
|
To run a Hadoop-based ingestion task, write an ingestion spec as specified below. Then POST it to the
|
||||||
[`/druid/indexer/v1/task`](../api-reference/api-reference.md#tasks) endpoint on the Overlord, or use the
|
[`/druid/indexer/v1/task`](../api-reference/tasks-api.md) endpoint on the Overlord, or use the
|
||||||
`bin/post-index-task` script included with Druid.
|
`bin/post-index-task` script included with Druid.
|
||||||
|
|
||||||
## Tutorial
|
## Tutorial
|
||||||
|
|
|
@ -69,7 +69,7 @@ runs for the duration of the job.
|
||||||
| **Method** | [Native batch](./native-batch.md) | [SQL](../multi-stage-query/index.md) | [Hadoop-based](hadoop.md) |
|
| **Method** | [Native batch](./native-batch.md) | [SQL](../multi-stage-query/index.md) | [Hadoop-based](hadoop.md) |
|
||||||
|---|-----|--------------|------------|
|
|---|-----|--------------|------------|
|
||||||
| **Controller task type** | `index_parallel` | `query_controller` | `index_hadoop` |
|
| **Controller task type** | `index_parallel` | `query_controller` | `index_hadoop` |
|
||||||
| **How you submit it** | Send an `index_parallel` spec to the [task API](../api-reference/api-reference.md#tasks). | Send an [INSERT](../multi-stage-query/concepts.md#insert) or [REPLACE](../multi-stage-query/concepts.md#replace) statement to the [SQL task API](../api-reference/sql-ingestion-api.md#submit-a-query). | Send an `index_hadoop` spec to the [task API](../api-reference/api-reference.md#tasks). |
|
| **How you submit it** | Send an `index_parallel` spec to the [Tasks API](../api-reference/tasks-api.md). | Send an [INSERT](../multi-stage-query/concepts.md#insert) or [REPLACE](../multi-stage-query/concepts.md#replace) statement to the [SQL task API](../api-reference/sql-ingestion-api.md#submit-a-query). | Send an `index_hadoop` spec to the [Tasks API](../api-reference/tasks-api.md). |
|
||||||
| **Parallelism** | Using subtasks, if [`maxNumConcurrentSubTasks`](native-batch.md#tuningconfig) is greater than 1. | Using `query_worker` subtasks. | Using YARN. |
|
| **Parallelism** | Using subtasks, if [`maxNumConcurrentSubTasks`](native-batch.md#tuningconfig) is greater than 1. | Using `query_worker` subtasks. | Using YARN. |
|
||||||
| **Fault tolerance** | Workers automatically relaunched upon failure. Controller task failure leads to job failure. | Controller or worker task failure leads to job failure. | YARN containers automatically relaunched upon failure. Controller task failure leads to job failure. |
|
| **Fault tolerance** | Workers automatically relaunched upon failure. Controller task failure leads to job failure. | Controller or worker task failure leads to job failure. | YARN containers automatically relaunched upon failure. Controller task failure leads to job failure. |
|
||||||
| **Can append?** | Yes. | Yes (INSERT). | No. |
|
| **Can append?** | Yes. | Yes (INSERT). | No. |
|
||||||
|
|
|
@ -41,7 +41,7 @@ For related information on batch indexing, see:
|
||||||
|
|
||||||
To run either kind of native batch indexing task you can:
|
To run either kind of native batch indexing task you can:
|
||||||
- Use the **Load Data** UI in the web console to define and submit an ingestion spec.
|
- Use the **Load Data** UI in the web console to define and submit an ingestion spec.
|
||||||
- Define an ingestion spec in JSON based upon the [examples](#parallel-indexing-example) and reference topics for batch indexing. Then POST the ingestion spec to the [Indexer API endpoint](../api-reference/api-reference.md#tasks),
|
- Define an ingestion spec in JSON based upon the [examples](#parallel-indexing-example) and reference topics for batch indexing. Then POST the ingestion spec to the [Tasks API endpoint](../api-reference/tasks-api.md),
|
||||||
`/druid/indexer/v1/task`, on the Overlord service. Alternatively, you can use the indexing script included with Druid at `bin/post-index-task`.
|
`/druid/indexer/v1/task`, on the Overlord service. Alternatively, you can use the indexing script included with Druid at `bin/post-index-task`.
|
||||||
|
|
||||||
## Parallel task indexing
|
## Parallel task indexing
|
||||||
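Submitting an ingestion spec, as described in the hunk above, amounts to a `POST` of the JSON spec to `/druid/indexer/v1/task` on the Overlord. The sketch below only builds the request with the Python standard library. The helper name `build_task_request`, the `overlord_base` argument, the JSON content type header, and the placeholder spec are assumptions for illustration; a real spec needs the full structure described in the batch indexing reference.

```python
import json
import urllib.request


def build_task_request(overlord_base: str, ingestion_spec: dict) -> urllib.request.Request:
    """Build a POST request that submits an ingestion spec to the Overlord.

    overlord_base is the Overlord address (an assumption supplied by the caller).
    """
    return urllib.request.Request(
        url=overlord_base.rstrip("/") + "/druid/indexer/v1/task",
        data=json.dumps(ingestion_spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder spec: only the task type is shown; this is not a complete ingestion spec.
request = build_task_request("http://overlord.example:8090", {"type": "index_parallel"})
print(request.get_method(), request.full_url)
# prints POST http://overlord.example:8090/druid/indexer/v1/task
```

Calling `urllib.request.urlopen(request)` would send the request; the `bin/post-index-task` script mentioned above remains the alternative for local use.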
|
|
|
@ -26,7 +26,7 @@ sidebar_label: Task reference
|
||||||
Tasks do all [ingestion](index.md)-related work in Druid.
|
Tasks do all [ingestion](index.md)-related work in Druid.
|
||||||
|
|
||||||
For batch ingestion, you will generally submit tasks directly to Druid using the
|
For batch ingestion, you will generally submit tasks directly to Druid using the
|
||||||
[Task APIs](../api-reference/api-reference.md#tasks). For streaming ingestion, tasks are generally submitted for you by a
|
[Tasks APIs](../api-reference/tasks-api.md). For streaming ingestion, tasks are generally submitted for you by a
|
||||||
supervisor.
|
supervisor.
|
||||||
|
|
||||||
## Task API
|
## Task API
|
||||||
|
@ -34,7 +34,7 @@ supervisor.
|
||||||
Task APIs are available in two main places:
|
Task APIs are available in two main places:
|
||||||
|
|
||||||
- The [Overlord](../design/overlord.md) process offers HTTP APIs to submit tasks, cancel tasks, check their status,
|
- The [Overlord](../design/overlord.md) process offers HTTP APIs to submit tasks, cancel tasks, check their status,
|
||||||
review logs and reports, and more. Refer to the [Tasks API reference page](../api-reference/api-reference.md#tasks) for a
|
review logs and reports, and more. Refer to the [Tasks API reference](../api-reference/tasks-api.md) for a
|
||||||
full list.
|
full list.
|
||||||
- Druid SQL includes a [`sys.tasks`](../querying/sql-metadata-tables.md#tasks-table) table that provides information about currently
|
- Druid SQL includes a [`sys.tasks`](../querying/sql-metadata-tables.md#tasks-table) table that provides information about currently
|
||||||
running tasks. This table is read-only, and has a limited (but useful!) subset of the full information available through
|
running tasks. This table is read-only, and has a limited (but useful!) subset of the full information available through
|
||||||
|
@ -406,7 +406,7 @@ The task then starts creating logs in a local directory of the middle manager (o
|
||||||
|
|
||||||
When the task completes - whether it succeeds or fails - the middle manager (or indexer) will push the task log file into the location specified in [`druid.indexer.logs`](../configuration/index.md#task-logging).
|
When the task completes - whether it succeeds or fails - the middle manager (or indexer) will push the task log file into the location specified in [`druid.indexer.logs`](../configuration/index.md#task-logging).
|
||||||
|
|
||||||
Task logs on the Druid web console are retrieved via an [API](../api-reference/api-reference.md#overlord) on the Overlord. It automatically detects where the log file is, either in the middleManager / indexer or in long-term storage, and passes it back.
|
Task logs on the Druid web console are retrieved via an [API](../api-reference/service-status-api.md#overlord) on the Overlord. It automatically detects where the log file is, either in the middleManager / indexer or in long-term storage, and passes it back.
|
||||||
|
|
||||||
If you don't see the log file in long-term storage, it means either:
|
If you don't see the log file in long-term storage, it means either:
|
||||||
|
|
||||||
|
|
|
@ -38,7 +38,7 @@ Retention rules are persistent: they remain in effect until you change them. Dru
|
||||||
|
|
||||||
## Set retention rules
|
## Set retention rules
|
||||||
|
|
||||||
You can use the Druid [web console](./web-console.md) or the [Coordinator API](../api-reference/api-reference.md#coordinator) to create and manage retention rules.
|
You can use the Druid [web console](./web-console.md) or the [Service status API reference](../api-reference/service-status-api.md#coordinator) to create and manage retention rules.
|
||||||
|
|
||||||
### Use the web console
|
### Use the web console
|
||||||
|
|
||||||
|
|
|
@ -146,257 +146,8 @@ The Coordinator periodically checks if any of the processes need to load/drop lo
|
||||||
|
|
||||||
Please note that only 2 simultaneous lookup configuration propagation requests can be concurrently handled by a single query serving process. This limit is applied to prevent lookup handling from consuming too many server HTTP connections.
|
Please note that only 2 simultaneous lookup configuration propagation requests can be concurrently handled by a single query serving process. This limit is applied to prevent lookup handling from consuming too many server HTTP connections.
|
||||||
|
|
||||||
## API for configuring lookups
|
## API
|
||||||
|
See [Lookups API](../api-reference/lookups-api.md) for reference on configuring lookups and lookup status.
|
||||||
### Bulk update
|
|
||||||
Lookups can be updated in bulk by posting a JSON object to `/druid/coordinator/v1/lookups/config`. The format of the json object is as follows:
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"<tierName>": {
|
|
||||||
"<lookupName>": {
|
|
||||||
"version": "<version>",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "<someExtractorFactoryType>",
|
|
||||||
"<someExtractorField>": "<someExtractorValue>"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Note that "version" is an arbitrary string assigned by the user, when making updates to existing lookup then user would need to specify a lexicographically higher version.
|
|
||||||
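The version rule in the removed note above depends on string ordering, which differs from numeric ordering. The following minimal sketch illustrates the difference using Python's built-in string comparison. Whether Druid compares versions in exactly this way is an assumption, and `is_valid_version_bump` is a hypothetical helper, not part of any Druid API.

```python
def is_valid_version_bump(current: str, proposed: str) -> bool:
    """Return True if `proposed` sorts strictly after `current` as a string."""
    # Python compares strings character by character (by code point),
    # which is a lexicographic ordering.
    return proposed > current


# "v1" sorts after "v0", so this bump is accepted by the check.
simple_bump = is_valid_version_bump("v0", "v1")

# "v10" sorts BEFORE "v9" because "1" < "9" at the second character.
unpadded_bump = is_valid_version_bump("v9", "v10")

# Zero-padding keeps string order aligned with numeric order.
padded_bump = is_valid_version_bump("v009", "v010")
```

Under this assumption, zero-padded or timestamp-style version strings avoid the `v9` to `v10` pitfall.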
|
|
||||||
For example, a config might look something like:
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"__default": {
|
|
||||||
"country_code": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"77483": "United States"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"site_id": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "cachedNamespace",
|
|
||||||
"extractionNamespace": {
|
|
||||||
"type": "jdbc",
|
|
||||||
"connectorConfig": {
|
|
||||||
"createTables": true,
|
|
||||||
"connectURI": "jdbc:mysql:\/\/localhost:3306\/druid",
|
|
||||||
"user": "druid",
|
|
||||||
"password": "diurd"
|
|
||||||
},
|
|
||||||
"table": "lookupTable",
|
|
||||||
"keyColumn": "country_id",
|
|
||||||
"valueColumn": "country_name",
|
|
||||||
"tsColumn": "timeColumn"
|
|
||||||
},
|
|
||||||
"firstCacheTimeout": 120000,
|
|
||||||
"injective": true
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"site_id_customer1": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"847632": "Internal Use Only"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"site_id_customer2": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"AHF77": "Home"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"realtime_customer1": {
|
|
||||||
"country_code": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"77483": "United States"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"site_id_customer1": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"847632": "Internal Use Only"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"realtime_customer2": {
|
|
||||||
"country_code": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"77483": "United States"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"site_id_customer2": {
|
|
||||||
"version": "v0",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"AHF77": "Home"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
All entries in the map will UPDATE existing entries. No entries will be deleted.
|
|
||||||
|
|
||||||
### Update lookup
|
|
||||||
|
|
||||||
A `POST` to a particular lookup extractor factory via `/druid/coordinator/v1/lookups/config/{tier}/{id}` creates or updates that specific extractor factory.
|
|
||||||
|
|
||||||
For example, a post to `/druid/coordinator/v1/lookups/config/realtime_customer1/site_id_customer1` might contain the following:
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"version": "v1",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"847632": "Internal Use Only"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
This will replace the `site_id_customer1` lookup in the `realtime_customer1` with the definition above.
|
|
||||||
|
|
||||||
Assign a unique version identifier each time you update a lookup extractor factory. Otherwise the call will fail.
|
|
||||||
|
|
||||||
### Get all lookups
|
|
||||||
|
|
||||||
A `GET` to `/druid/coordinator/v1/lookups/config/all` will return all known lookup specs for all tiers.
|
|
||||||
|
|
||||||
### Get lookup
|
|
||||||
|
|
||||||
A `GET` to a particular lookup extractor factory is accomplished via `/druid/coordinator/v1/lookups/config/{tier}/{id}`
|
|
||||||
|
|
||||||
Using the prior example, a `GET` to `/druid/coordinator/v1/lookups/config/realtime_customer2/site_id_customer2` should return
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"version": "v1",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"AHF77": "Home"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Delete lookup
|
|
||||||
|
|
||||||
A `DELETE` to `/druid/coordinator/v1/lookups/config/{tier}/{id}` will remove that lookup from the cluster. If it was last lookup in the tier, then tier is deleted as well.
|
|
||||||
|
|
||||||
### Delete tier
|
|
||||||
|
|
||||||
A `DELETE` to `/druid/coordinator/v1/lookups/config/{tier}` will remove that tier from the cluster.
|
|
||||||
|
|
||||||
### List tier names
|
|
||||||
|
|
||||||
A `GET` to `/druid/coordinator/v1/lookups/config` will return a list of known tier names in the dynamic configuration.
|
|
||||||
To discover a list of tiers currently active in the cluster in addition to ones known in the dynamic configuration, the parameter `discover=true` can be added as per `/druid/coordinator/v1/lookups/config?discover=true`.
|
|
||||||
|
|
||||||
### List lookup names
|
|
||||||
|
|
||||||
A `GET` to `/druid/coordinator/v1/lookups/config/{tier}` will return a list of known lookup names for that tier.
|
|
||||||
|
|
||||||
These end points can be used to get the propagation status of configured lookups to processes using lookups such as Historicals.
|
|
||||||
|
|
||||||
## API for lookup status
|
|
||||||
|
|
||||||
### List load status of all lookups
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/lookups/status` with optional query parameter `detailed`.
|
|
||||||
|
|
||||||
### List load status of lookups in a tier
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/lookups/status/{tier}` with optional query parameter `detailed`.
|
|
||||||
|
|
||||||
### List load status of single lookup
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/lookups/status/{tier}/{lookup}` with optional query parameter `detailed`.
|
|
||||||
|
|
||||||
### List lookup state of all processes
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/lookups/nodeStatus` with optional query parameter `discover` to discover tiers advertised by other Druid nodes, or by default, returning all configured lookup tiers. The default response will also include the lookups which are loaded, being loaded, or being dropped on each node, for each tier, including the complete lookup spec. Add the optional query parameter `detailed=false` to only include the 'version' of the lookup instead of the complete spec.
|
|
||||||
|
|
||||||
### List lookup state of processes in a tier
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/lookups/nodeStatus/{tier}`
|
|
||||||
|
|
||||||
### List lookup state of single process
|
|
||||||
|
|
||||||
`GET /druid/coordinator/v1/lookups/nodeStatus/{tier}/{host:port}`
|
|
||||||
|
|
||||||
## Internal API
|
|
||||||
|
|
||||||
The Peon, Router, Broker, and Historical processes all have the ability to consume lookup configuration.
|
|
||||||
There is an internal API these processes use to list/load/drop their lookups starting at `/druid/listen/v1/lookups`.
|
|
||||||
These follow the same convention for return values as the cluster wide dynamic configuration. Following endpoints
|
|
||||||
can be used for debugging purposes but not otherwise.
|
|
||||||
|
|
||||||
### Get lookups
|
|
||||||
|
|
||||||
A `GET` to the process at `/druid/listen/v1/lookups` will return a json map of all the lookups currently active on the process.
|
|
||||||
The return value will be a json map of the lookups to their extractor factories.
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"site_id_customer2": {
|
|
||||||
"version": "v1",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"AHF77": "Home"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Get lookup
|
|
||||||
|
|
||||||
A `GET` to the process at `/druid/listen/v1/lookups/some_lookup_name` will return the LookupExtractorFactory for the lookup identified by `some_lookup_name`.
|
|
||||||
The return value will be the json representation of the factory.
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"version": "v1",
|
|
||||||
"lookupExtractorFactory": {
|
|
||||||
"type": "map",
|
|
||||||
"map": {
|
|
||||||
"AHF77": "Home"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
|
|
|
@ -164,10 +164,33 @@
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"API reference":[
|
"API reference":[
|
||||||
|
"api-reference/api-reference",
|
||||||
|
{
|
||||||
|
"type": "subcategory",
|
||||||
|
"label": "HTTP APIs",
|
||||||
|
"ids": [
|
||||||
"api-reference/sql-api",
|
"api-reference/sql-api",
|
||||||
"api-reference/sql-ingestion-api",
|
"api-reference/sql-ingestion-api",
|
||||||
"api-reference/sql-jdbc",
|
"api-reference/json-querying-api",
|
||||||
"api-reference/api-reference"
|
"api-reference/tasks-api",
|
||||||
|
"api-reference/supervisor-api",
|
||||||
|
"api-reference/retention-rules-api",
|
||||||
|
"api-reference/data-management-api",
|
||||||
|
"api-reference/automatic-compaction-api",
|
||||||
|
"api-reference/lookups-api",
|
||||||
|
"api-reference/service-status-api",
|
||||||
|
"api-reference/dynamic-configuration-api",
|
||||||
|
"api-reference/legacy-metadata-api"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "subcategory",
|
||||||
|
"label": "Java APIs",
|
||||||
|
"ids": [
|
||||||
|
"api-reference/sql-jdbc"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
|
||||||
],
|
],
|
||||||
"Configuration": [
|
"Configuration": [
|
||||||
"configuration/index",
|
"configuration/index",
|
||||||
|
|