From 8eee259629684b7a88955f296e99934d99662125 Mon Sep 17 00:00:00 2001
From: David Lim
Date: Mon, 19 Dec 2016 10:41:47 -0700
Subject: [PATCH] add documentation on segments generated (#3785)

---
 .../extensions-core/kafka-ingestion.md | 44 ++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/docs/content/development/extensions-core/kafka-ingestion.md b/docs/content/development/extensions-core/kafka-ingestion.md
index c0dea426bce..84900617bf7 100644
--- a/docs/content/development/extensions-core/kafka-ingestion.md
+++ b/docs/content/development/extensions-core/kafka-ingestion.md
@@ -292,7 +292,7 @@ shuts down the currently running supervisor. When a supervisor is shut down in t
 managed tasks to stop reading and begin publishing their segments immediately. The call to the shutdown endpoint will
 return after all tasks have been signalled to stop but before the tasks finish publishing their segments.
 
-### Schema/Configuration Changes
+## Schema/Configuration Changes
 
 Schema and configuration changes are handled by submitting the new supervisor spec via the same
 `POST /druid/indexer/v1/supervisor` endpoint used to initially create the supervisor. The overlord will initiate a
@@ -300,3 +300,45 @@ graceful shutdown of the existing supervisor which will cause the tasks being ma
 and begin publishing their segments. A new supervisor will then be started which will create a new set of tasks that
 will start reading from the offsets where the previous now-publishing tasks left off, but using the updated schema. In
 this way, configuration changes can be applied without requiring any pause in ingestion.
+
+## Deployment Notes
+
+### On the Subject of Segments
+
+The Kafka indexing service may generate a large number of segments, which over time will cause query
+performance issues if not properly managed.
+One important characteristic to understand is that the Kafka indexing task will generate a Druid partition in each
+segment granularity interval for each partition in the Kafka topic. As an example, if you are ingesting realtime data
+and your segment granularity is 15 minutes with 10 partitions in the Kafka topic, you would generate a minimum of 40
+segments an hour. This is a limitation imposed by the Kafka architecture, which guarantees delivery order within a
+partition but not across partitions. Therefore, as a Kafka consumer, Druid must handle each partition separately in
+order to generate segments deterministically (and to be able to provide exactly-once ingestion semantics).
+
+Compounding this, if your `taskDuration` was also set to 15 minutes, you would actually generate 80 segments an hour,
+since any given 15-minute interval would be handled by two tasks. As an example of this behavior, let's say we started
+the supervisor at 9:05 with a 15-minute segment granularity. The first task would create a segment for 9:00-9:15 and a
+segment for 9:15-9:30 before stopping at 9:20. A second task would be created at 9:20 which would create another
+segment for 9:15-9:30 and a segment for 9:30-9:45 before stopping at 9:35. Hence, if `taskDuration` and
+`segmentGranularity` are the same duration, you will get two tasks generating a segment for each segment granularity
+interval.
+
+Understanding this behavior is the first step to managing the number of segments produced. Some recommendations for
+keeping the number of segments low are:
+
+  * Keep the number of Kafka partitions to the minimum required to sustain the required throughput for your event
+    streams.
+  * Increase segment granularity and task duration so that more events are written into the same segment. One
+    consideration here is that segments are only handed off to historical nodes after the task duration has elapsed.
+    Since workers tend to be configured with fewer query-serving resources than historical nodes, query performance
+    may suffer if tasks run excessively long without handing off segments.
+
+In many production installations which have been ingesting events for a long period of time, these suggestions alone
+will not be sufficient to keep the number of segments at an optimal level. It is recommended that scheduled re-indexing
+tasks be run to merge segments together into new segments of an ideal size (in the range of ~500-700 MB per segment).
+Currently, the recommended way of doing this is by running periodic Hadoop batch ingestion jobs and using a `dataSource`
+inputSpec to read from the segments generated by the Kafka indexing tasks. Details on how to do this can be found under
+['Updating Existing Data'](../../ingestion/update-existing-data.html). Note that the Merge Task and Append Task described
+[here](../../ingestion/tasks.html) will not work, as they require unsharded segments while Kafka indexing tasks always
+generate sharded segments.
+
+There is ongoing work to support automatic segment compaction of sharded segments as well as compaction not requiring
+Hadoop (see [here](https://github.com/druid-io/druid/pull/1998) and [here](https://github.com/druid-io/druid/pull/3611)
+for related PRs).
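As a side note on the segment-count arithmetic in the documentation above: the minimum of one segment per Kafka partition per granularity interval, and the doubling effect when `taskDuration` equals `segmentGranularity`, can be sketched with a small simulation. This is illustrative Python only, written for this note; the function and its parameters are hypothetical and not part of Druid or this patch.

```python
# Illustrative sketch (not Druid code): each indexing task is modeled as
# writing one segment per Kafka partition for every segment-granularity
# interval it overlaps, with tasks launched back-to-back from supervisor start.
def segments_generated(supervisor_start_min, task_duration_min,
                       granularity_min, horizon_min, kafka_partitions):
    """Return {interval_start_minute: segment_count} for all tasks that
    start before horizon_min."""
    counts = {}
    task_start = supervisor_start_min
    while task_start < horizon_min:
        task_end = task_start + task_duration_min
        # First granularity interval this task overlaps.
        interval = (task_start // granularity_min) * granularity_min
        while interval < task_end:
            counts[interval] = counts.get(interval, 0) + kafka_partitions
            interval += granularity_min
        task_start = task_end
    return counts

# Supervisor started at 9:05 (minute 545) with 15-minute taskDuration and
# segmentGranularity and 10 Kafka partitions: every interval spanned by two
# consecutive tasks gets 20 segments instead of 10.
counts = segments_generated(545, 15, 15, 605, 10)
print(counts)  # {540: 10, 555: 20, 570: 20, 585: 20, 600: 10}
```

The model reproduces the 9:05 example from the text: the first task (9:05-9:20) writes segments for 9:00-9:15 and 9:15-9:30, the second task (9:20-9:35) writes 9:15-9:30 again plus 9:30-9:45, and so on. Real task scheduling and handoff behavior is more nuanced than this sketch.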