From 80a4f7c5b5e70c8014a3c5c4b8b113287b160b57 Mon Sep 17 00:00:00 2001
From: nishantmonu51
Date: Fri, 1 Aug 2014 16:43:39 +0530
Subject: [PATCH] doc for IngestSegmentFirehose

---
 docs/content/Firehose.md      | 32 +++++++++++++++++++++++++++++++-
 docs/content/Ingestion-FAQ.md |  5 +++++
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/docs/content/Firehose.md b/docs/content/Firehose.md
index c87d0b792bf..ebe15b3632a 100644
--- a/docs/content/Firehose.md
+++ b/docs/content/Firehose.md
@@ -36,10 +36,40 @@ See [Examples](Examples.html). This firehose connects directly to the twitter sp
 
 See [Examples](Examples.html). This firehose creates a stream of random numbers.
 
-#### RabbitMqFirehouse
+#### RabbitMqFirehose
 
 This firehose ingests events from a define rabbit-mq queue.
 
+#### IngestSegmentFirehose
+
+This firehose reads data from existing Druid segments.
+It can be used to ingest existing Druid segments using a new schema and change the name, dimensions, metrics, rollup, etc. of the segment.
+A sample ingest firehose spec is shown below:
+
+```json
+{
+  "type" : "ingestSegment",
+  "dataSource" : "wikipedia",
+  "interval" : "2013-01-01/2013-01-02",
+  "dimensions" : [],
+  "metrics" : []
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|The type of firehose. Must be `ingestSegment`.|yes|
+|dataSource|A String naming the data source to fetch rows from, very similar to a table in a relational database.|yes|
+|interval|A String representing an ISO-8601 interval. This defines the time range to fetch the data over.|yes|
+|dimensions|The list of dimensions to select. If left empty, all dimensions are selected.|no|
+|metrics|The list of metrics to select. If left empty, all metrics are selected.|no|
+|filter|A filter to apply to the rows read. See [Filters](Filters.html).|no|
+
+
+
+
+
+
 
 Parsing Data
 ------------
diff --git a/docs/content/Ingestion-FAQ.md b/docs/content/Ingestion-FAQ.md
index 30e5de81a53..59bb8fb7a93 100644
--- a/docs/content/Ingestion-FAQ.md
+++ b/docs/content/Ingestion-FAQ.md
@@ -37,6 +37,11 @@ You can check the coordinator console located at `:/cluste
 
 You can check `:/druid/v2/datasources/?interval=0/3000` for the dimensions and metrics that have been created for your datasource. Make sure that the name of the aggregators you use in your query match one of these metrics. Also make sure that the query interval you specify match a valid time range where data exists. Note: the broker endpoint will only return valid results on historical segments.
 
+## How can I reindex existing data in Druid with schema changes?
+
+You can use IngestSegmentFirehose with an index task to ingest existing Druid segments using a new schema and change the name, dimensions, metrics, rollup, etc. of the segment.
+See [Firehose](Firehose.html) for more details on IngestSegmentFirehose.
+
 ## More information
 
 Getting data into Druid can definitely be difficult for first time users. Please don't hesitate to ask questions in our IRC channel or on our [google groups page](https://groups.google.com/forum/#!forum/druid-development).