---
layout: doc_page
---
Real-time Node
==============
For Real-time Node Configuration, see [Realtime Configuration](Realtime-Config.html).

For Real-time Ingestion, see [Realtime Ingestion](Realtime-ingestion.html).

Realtime nodes provide a realtime index. Data indexed via these nodes is immediately available for querying. Realtime nodes periodically build segments representing the data they’ve collected over some span of time and transfer these segments to [Historical](Historical.html) nodes. They use ZooKeeper to monitor the transfer and MySQL to store metadata about the transferred segment. Once transferred, segments are forgotten by the Realtime nodes.

### Running

```
io.druid.cli.Main server realtime
```

Segment Propagation
-------------------

The segment propagation diagram for real-time data ingestion can be seen below:

![Segment Propagation](../img/segmentPropagation.png "Segment Propagation")

You can read about the various components shown in this diagram under the Architecture section (see the menu on the left).

### Firehose

See [Firehose](Firehose.html).

### Plumber

See [Plumber](Plumber.html).

Extending the code
------------------

Realtime integration is intended to be extended in two ways:

1.  Connect to data streams from varied systems ([Firehose](https://github.com/druid-io/druid-api/blob/master/src/main/java/io/druid/data/input/FirehoseFactory.java))
2.  Adjust the publishing strategy to match your needs ([Plumber](https://github.com/metamx/druid/blob/master/server/src/main/java/io/druid/segment/realtime/plumber/PlumberSchool.java))

We expect the former to be very common, something that users of Druid will do fairly regularly. Most users will probably never have to deal with the latter form of customization. Indeed, we hope that all potential use cases can be packaged up as part of Druid proper without requiring proprietary customization.

Given those expectations, adding a firehose is straightforward and completely encapsulated inside the interface. Adding a plumber is more involved and requires an understanding of how the system works to get right. It’s not impossible, but individuals new to Druid are not expected to be able to do it immediately.
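
To give a feel for why a firehose stays encapsulated behind its interface, here is a minimal, self-contained sketch. It does not use Druid's classes: `SketchFirehose`, `SketchFirehoseFactory`, `ListFirehoseFactory`, and `FirehoseSketch` are hypothetical stand-ins invented for illustration, and rows are plain strings. The real interfaces are the ones linked above and may differ in names and signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical consumer: it sees only the interface, never the data source.
class FirehoseSketch {
    // Reads every row from the firehose, then commits and closes it.
    static List<String> drain(SketchFirehoseFactory factory) {
        List<String> out = new ArrayList<>();
        try (SketchFirehose firehose = factory.connect()) {
            while (firehose.hasMore()) {
                out.add(firehose.nextRow());
            }
            // Acknowledge everything read so far (a no-op in this sketch).
            firehose.commit().run();
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows =
            drain(new ListFirehoseFactory(Arrays.asList("event-1", "event-2")));
        System.out.println(rows);
    }
}

// Assumed shape of a firehose: a closeable stream of rows (illustration only).
interface SketchFirehose extends AutoCloseable {
    boolean hasMore();   // true while another row can be read
    String nextRow();    // returns the next row
    Runnable commit();   // action that acknowledges the rows read so far
    @Override
    void close();        // releases the underlying connection
}

// Assumed factory that hides how the upstream connection is made.
interface SketchFirehoseFactory {
    SketchFirehose connect();
}

// Example implementation backed by an in-memory list instead of a real stream.
final class ListFirehoseFactory implements SketchFirehoseFactory {
    private final List<String> rows;

    ListFirehoseFactory(List<String> rows) {
        this.rows = rows;
    }

    @Override
    public SketchFirehose connect() {
        final Iterator<String> it = rows.iterator();
        return new SketchFirehose() {
            @Override public boolean hasMore() { return it.hasNext(); }
            @Override public String nextRow() { return it.next(); }
            @Override public Runnable commit() { return () -> { }; }
            @Override public void close() { }
        };
    }
}
```

In this sketch, connecting to a different data stream means writing another factory; the consuming loop in `drain` does not change.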