id | title | sidebar_label |
---|---|---|
data-management-api | Data management API | Data management |
This topic describes the data management API endpoints for Apache Druid. This includes information on how to mark segments as used or unused and delete them from Druid.
In this topic, http://ROUTER_IP:ROUTER_PORT is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use http://localhost:8888 for quickstart deployments.
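As a small illustration of the placeholder substitution, the following Python sketch joins a Router address with an endpoint path. The names `ROUTER_URL` and `coordinator_url` are hypothetical and chosen for this example only; the address is the quickstart value mentioned above.

```python
# Assumption: quickstart Router address; replace with your deployment's value.
ROUTER_URL = "http://localhost:8888"


def coordinator_url(path: str, router_url: str = ROUTER_URL) -> str:
    """Join the Router address and an API path without doubling the slash."""
    return router_url.rstrip("/") + "/" + path.lstrip("/")
```

For example, `coordinator_url("/druid/coordinator/v1/datasources/wikipedia_hour")` yields the full URL used in the requests later in this topic.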
:::info Avoid using indexing or kill tasks and these APIs at the same time for the same datasource and time chunk. :::
Segment management
You can mark segments as used by sending POST requests to the datasource, but the Coordinator may subsequently mark segments as unused if they meet any configured drop rules. Even if these API requests update segments to used, you still need to configure a load rule to load them onto Historical processes.
When you use these APIs concurrently with an indexing task or a kill task, the behavior is undefined. Druid terminates some segments and marks others as used. Furthermore, it is possible that all segments could be unused, yet an indexing task might still be able to read data from these segments and complete successfully.
Segment IDs
You must provide segment IDs when using many of the endpoints described in this topic. For information on segment IDs, see Segment identification. For information on finding segment IDs in the web console, see Segments.
Mark a single segment unused
Marks the state of a segment as unused, using the segment ID. This is a "soft delete" of the segment from Historicals. To undo this action, mark the segment used.
Note that this endpoint returns an HTTP 200 OK response code even if the segment ID or datasource doesn't exist.
URL
DELETE
/druid/coordinator/v1/datasources/:datasource/segments/:segmentId
Header
The following headers are required for this request:
Content-Type: application/json
Accept: application/json, text/plain
Responses
Successfully updated segment
Sample request
The following example updates the segment wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z from datasource wikipedia_hour as unused.
curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z" \
--header 'Content-Type: application/json' \
--header 'Accept: application/json, text/plain'
DELETE /druid/coordinator/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Content-Type: application/json
Accept: application/json, text/plain
Sample response
{
"segmentStateChanged": true
}
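The same request can be prepared from a script. The following Python sketch builds, but does not send, the DELETE request using only the standard library. The function name `mark_segment_unused_request` and the `ROUTER_URL` constant are hypothetical, and keeping `:` unescaped in the segment ID is an assumption that mirrors the curl example above.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: quickstart Router address; replace with your deployment's value.
ROUTER_URL = "http://localhost:8888"


def mark_segment_unused_request(datasource: str, segment_id: str) -> Request:
    """Build the DELETE request that marks a single segment as unused."""
    # Percent-encode path components; ":" is left as-is to match the curl sample.
    path = "/druid/coordinator/v1/datasources/{}/segments/{}".format(
        quote(datasource, safe=""), quote(segment_id, safe=":")
    )
    return Request(
        ROUTER_URL + path,
        method="DELETE",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/plain",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. Because the endpoint returns HTTP 200 OK even for unknown IDs, inspect `segmentStateChanged` in the response body rather than relying on the status code.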
Mark a single segment as used
Marks the state of a segment as used, using the segment ID.
URL
POST
/druid/coordinator/v1/datasources/:datasource/segments/:segmentId
Header
The following headers are required for this request:
Content-Type: application/json
Accept: application/json, text/plain
Responses
Successfully updated segments
Sample request
The following example updates the segment with ID wikipedia_hour_2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z_2023-08-10T04:12:03.860Z to used.
curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z_2023-08-10T04:12:03.860Z" \
--header 'Content-Type: application/json' \
--header 'Accept: application/json, text/plain'
POST /druid/coordinator/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z_2023-08-10T04:12:03.860Z HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Content-Type: application/json
Accept: application/json, text/plain
Sample response
{
"segmentStateChanged": true
}
Mark a group of segments unused
Marks the state of a group of segments as unused, using an array of segment IDs or an interval. Pass the array of segment IDs or interval as a JSON object in the request body.
For the interval, specify the start and end times as ISO 8601 strings to identify segments inclusive of the start time and exclusive of the end time. Optionally, specify an array of segment versions with interval. Druid updates only the segments completely contained within the specified interval that match the optional list of versions; partially overlapping segments are not affected.
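To make the containment rule concrete, the following Python sketch checks whether a segment's interval lies completely within a requested interval. It illustrates the rule as described here and is not Druid's implementation; the function names are hypothetical.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, normalizing a trailing Z to +00:00."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def is_fully_contained(segment_interval: str, request_interval: str) -> bool:
    """Return True if the segment interval lies entirely inside the request interval."""
    seg_start, seg_end = (parse_utc(t) for t in segment_interval.split("/"))
    req_start, req_end = (parse_utc(t) for t in request_interval.split("/"))
    # Both intervals are start-inclusive and end-exclusive, so equal bounds still count.
    return req_start <= seg_start and seg_end <= req_end
```

With a request interval of 2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z, a segment covering 04:00 to 05:00 is contained and would be updated, while a segment covering 04:00 to 06:00 only partially overlaps and would be left alone.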
URL
POST
/druid/coordinator/v1/datasources/:datasource/markUnused
Request body
The group of segments is sent as a JSON request payload that accepts the following properties:
Property | Description | Required | Example |
---|---|---|---|
interval | ISO 8601 segments interval. | Yes, if segmentIds is not specified. | "2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z" |
segmentIds | List of segment IDs. | Yes, if interval is not specified. | ["segmentId1", "segmentId2"] |
versions | List of segment versions. Must be provided with interval. | No. | ["2024-03-14T16:00:04.086Z", "2024-03-12T16:00:04.086Z"] |
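The constraints in the table can be checked on the client before sending a request. The following Python sketch builds the JSON body and enforces only the two rules stated above: at least one of interval or segmentIds is present, and versions appears only together with interval. The function name `mark_segments_payload` is hypothetical, and the server may apply further validation that this sketch does not model.

```python
import json
from typing import Optional, Sequence


def mark_segments_payload(
    interval: Optional[str] = None,
    segment_ids: Optional[Sequence[str]] = None,
    versions: Optional[Sequence[str]] = None,
) -> str:
    """Build the JSON request body for the markUnused or markUsed endpoint."""
    if interval is None and segment_ids is None:
        raise ValueError("Specify interval or segmentIds")
    if versions is not None and interval is None:
        raise ValueError("versions must be provided with interval")

    body = {}
    if interval is not None:
        body["interval"] = interval
    if segment_ids is not None:
        body["segmentIds"] = list(segment_ids)
    if versions is not None:
        body["versions"] = list(versions)
    return json.dumps(body)
```

The returned string can be passed to curl with `--data` or used as the body of an HTTP client request.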
Responses
Successfully updated segments
Invalid datasource name
Invalid request payload
Sample request
The following example marks two segments from the wikipedia_hour datasource unused based on their segment IDs.
curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour/markUnused" \
--header 'Content-Type: application/json' \
--data '{
"segmentIds": [
"wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
"wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
]
}'
POST /druid/coordinator/v1/datasources/wikipedia_hour/markUnused HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Content-Type: application/json
Content-Length: 230
{
"segmentIds": [
"wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
"wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
]
}
Sample response
{
"numChangedSegments": 2
}
Mark a group of segments used
Marks the state of a group of segments as used, using an array of segment IDs or an interval. Pass the array of segment IDs or interval as a JSON object in the request body.
For the interval, specify the start and end times as ISO 8601 strings to identify segments inclusive of the start time and exclusive of the end time. Optionally, specify an array of segment versions with interval. Druid updates only the segments completely contained within the specified interval that match the optional list of versions; partially overlapping segments are not affected.
URL
POST
/druid/coordinator/v1/datasources/:datasource/markUsed
Request body
The group of segments is sent as a JSON request payload that accepts the following properties:
Property | Description | Required | Example |
---|---|---|---|
interval | ISO 8601 segments interval. | Yes, if segmentIds is not specified. | "2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z" |
segmentIds | List of segment IDs. | Yes, if interval is not specified. | ["segmentId1", "segmentId2"] |
versions | List of segment versions. Must be provided with interval. | No. | ["2024-03-14T16:00:04.086Z", "2024-03-12T16:00:04.086Z"] |
Responses
Successfully updated segments
Invalid datasource name
Invalid request payload
Sample request
The following example marks two segments from the wikipedia_hour datasource used based on their segment IDs.
curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour/markUsed" \
--header 'Content-Type: application/json' \
--data '{
"segmentIds": [
"wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
"wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
]
}'
POST /druid/coordinator/v1/datasources/wikipedia_hour/markUsed HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Content-Type: application/json
Content-Length: 230
{
"segmentIds": [
"wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
"wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
]
}
Sample response
{
"numChangedSegments": 2
}
Mark all segments unused
Marks the state of all segments of a datasource as unused. This action performs a "soft delete" of the segments from Historicals.
Note that this endpoint returns an HTTP 200 OK response code even if the datasource doesn't exist.
URL
DELETE
/druid/coordinator/v1/datasources/:datasource
Responses
Successfully updated segments
Sample request
curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour"
DELETE /druid/coordinator/v1/datasources/wikipedia_hour HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Sample response
{
"numChangedSegments": 24
}
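Because the status code alone does not reveal whether the datasource exists, a client may want to inspect the response body. The following Python sketch reads numChangedSegments from a response. The function name `changed_segment_count` is hypothetical, and the sketch assumes the body has the shape shown in the sample response; treating a missing field as zero is also an assumption.

```python
import json


def changed_segment_count(status: int, body: str) -> int:
    """Return numChangedSegments from a response; 0 means nothing was updated."""
    if status != 200:
        raise RuntimeError(f"Unexpected HTTP status {status}")
    # Assumption: a missing field is treated the same as zero changed segments.
    return int(json.loads(body).get("numChangedSegments", 0))
```

A result of 0 can then be surfaced as a warning, for example to catch a misspelled datasource name.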
Mark all segments used
Marks the state of all unused segments of a datasource as used. The endpoint returns the number of changed segments.
Note that this endpoint returns an HTTP 200 OK response code even if the datasource doesn't exist.
URL
POST
/druid/coordinator/v1/datasources/:datasource
Header
The following headers are required for this request:
Content-Type: application/json
Accept: application/json, text/plain
Responses
Successfully updated segments
Sample request
The following example updates all unused segments of wikipedia_hour to used. wikipedia_hour contains one unused segment eligible to be marked as used.
curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour" \
--header 'Content-Type: application/json' \
--header 'Accept: application/json, text/plain'
POST /druid/coordinator/v1/datasources/wikipedia_hour HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Content-Type: application/json
Accept: application/json, text/plain
Sample response
{
"numChangedSegments": 1
}
Segment deletion
Permanently delete segments
The DELETE endpoint sends a kill task for a given interval and datasource. The interval value is an ISO 8601 string delimited by _. This request permanently deletes all metadata for unused segments and removes them from deep storage.
Note that this endpoint returns an HTTP 200 OK response code even if the datasource doesn't exist.
This endpoint supersedes the deprecated endpoint: DELETE /druid/coordinator/v1/datasources/:datasource?kill=true&interval=:interval
URL
DELETE
/druid/coordinator/v1/datasources/:datasource/intervals/:interval
Responses
Successfully sent kill task
Sample request
The following example sends a kill task to permanently delete segments in the datasource wikipedia_hour from the interval 2015-09-12 to 2015-09-13.
curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour/intervals/2015-09-12_2015-09-13"
DELETE /druid/coordinator/v1/datasources/wikipedia_hour/intervals/2015-09-12_2015-09-13 HTTP/1.1
Host: http://ROUTER_IP:ROUTER_PORT
Sample response
A successful request returns an HTTP 200 OK and an empty response body.
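The underscore-delimited interval in the path differs from the slash-delimited form used in request bodies elsewhere in this topic. The following Python sketch converts one to the other when building the path. The function name `kill_interval_path` is hypothetical, and the sketch does not validate the timestamps themselves.

```python
def kill_interval_path(datasource: str, interval: str) -> str:
    """Build the kill-task path from a slash-delimited ISO 8601 interval."""
    # Unpacking raises ValueError unless the interval has exactly one "/" separator.
    start, end = interval.split("/")
    return f"/druid/coordinator/v1/datasources/{datasource}/intervals/{start}_{end}"
```

For example, `kill_interval_path("wikipedia_hour", "2015-09-12/2015-09-13")` produces the path used in the sample request above.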