---
layout: doc_page
---
Examples
========

The examples on this page are set up to give you a feel for what Druid does in practice. They are quick demos of Druid based on [CliRealtimeExample](https://github.com/metamx/druid/blob/master/services/src/main/java/io/druid/cli/CliRealtimeExample.java). While you wouldn’t run Druid this way in production, you should be able to see how ingestion works and the kinds of exploratory queries that are possible. Everything you do on your own machine here can be scaled out to tens of billions of events and terabytes of data per day in a production cluster while still supporting snappy, responsive exploratory queries.

Installing Standalone Druid
---------------------------

There are two options for installing standalone Druid: building from source, or downloading the Druid Standalone Kit (DSK).

### Building from source

Clone Druid and build it:

``` bash
git clone https://github.com/metamx/druid.git druid
cd druid
git fetch --tags
git checkout druid-0.6.9
./build.sh
```

### Downloading the DSK (Druid Standalone Kit)

[Download](http://static.druid.io/artifacts/releases/druid-services-0.6.9-bin.tar.gz) a stand-alone tarball and unpack it, replacing `0.X.X` in the commands below with the version you downloaded:

``` bash
tar -xzf druid-services-0.X.X-bin.tar.gz
cd druid-services-0.X.X
```

Twitter Example
---------------

For a full tutorial based on the Twitter example, see the [Twitter Tutorial](Twitter-Tutorial.html).

This example uses a Twitter feature that allows sampling of its stream. We sample the Twitter stream via our [TwitterSpritzerFirehoseFactory](https://github.com/metamx/druid/blob/master/examples/src/main/java/druid/examples/twitter/TwitterSpritzerFirehoseFactory.java) class and use it to simulate the kinds of data you might ingest into Druid. The client part of the example then shows the kinds of analytics explorations you can do during and after data loading.

### What you’ll learn
* See how large amounts of data are ingested into Druid in real time
* Learn how to run fast, interactive analytics queries on that real-time data

### What you need
* A build of standalone Druid with the Twitter example (see above)
* A Twitter username and password

### What you’ll do

See the [Twitter Tutorial](Twitter-Tutorial.html).

Rand Example
------------

This example uses `RandomFirehoseFactory`, which emits a stream of timestamped random numbers (`outColumn`, a positive double), each with an associated token (`target`). This provides a time series that requires no network access, which is useful for demonstration, characterization, and testing. The generated tuples can be thought of as asynchronously produced triples (timestamp, outColumn, target), where the timestamp varies with the speed of processing.
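
To make the shape of these triples concrete, here is a small illustrative sketch. It is not Druid's generator: the `emit_triples` function, the tab-separated layout, and the `token-N` values are all hypothetical stand-ins, chosen only to show one timestamp, one positive double, and one token per event.

```bash
#!/usr/bin/env bash
# Illustration only: mimics the *shape* of the (timestamp, outColumn, target)
# triples described above. emit_triples and the token-N values are hypothetical
# and are not taken from RandomFirehoseFactory.

emit_triples() {
  local count="$1"
  local i ts value target
  for ((i = 0; i < count; i++)); do
    # timestamp: wall-clock time at the moment the event is produced
    ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    # outColumn: a pseudo-random double, shifted slightly so it is always positive
    value="$(awk -v seed="$RANDOM" 'BEGIN { srand(seed); printf "%.6f", rand() + 0.000001 }')"
    # target: a placeholder token associated with the event
    target="token-$((RANDOM % 10))"
    printf '%s\t%s\t%s\n' "$ts" "$value" "$target"
  done
}

# Print three sample events, one per line.
emit_triples 3
```

Each output line carries the three fields in the order used above. Because the timestamp is taken when the line is produced, it depends on how quickly the loop runs, which loosely mirrors the timing behaviour described for the real firehose.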

In a terminal window, start the server (NOTE: if you are using the cloned GitHub repository, these scripts are in `./examples/bin`):

``` bash
./run_example_server.sh # type rand when prompted
```

In another terminal window:

``` bash
./run_example_client.sh # type rand when prompted
```

The result of the client query is in JSON format. The client makes a REST request using `curl`, which is usually installed by default on Linux, Unix, and OS X.
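
If you want to issue a similar request by hand, the following is a minimal sketch rather than the exact request the client script sends. It rests on assumptions not stated on this page: that the example server accepts queries at `http://localhost:8083/druid/v2/`, and that the data source is named `randSeq`. The `run_query` helper, the `outColumnSum` output name, and the interval are hypothetical. Check the script and query files shipped with your build for the real values.

```bash
#!/usr/bin/env bash
# Sketch only. Assumed, not confirmed by this page: the endpoint URL and the
# "randSeq" data source name. Adjust both to match your build.
DRUID_URL="${DRUID_URL:-http://localhost:8083/druid/v2/?pretty}"
QUERY_FILE="$(mktemp)"

# A timeseries query that counts rows and sums outColumn over a wide interval.
cat > "$QUERY_FILE" <<'EOF'
{
  "queryType": "timeseries",
  "dataSource": "randSeq",
  "granularity": "all",
  "aggregations": [
    { "type": "count", "name": "rows" },
    { "type": "doubleSum", "fieldName": "outColumn", "name": "outColumnSum" }
  ],
  "intervals": ["2013-01-01T00:00/2020-01-01T00:00"]
}
EOF

# Hypothetical helper: POST the query file as JSON and print the response.
run_query() {
  curl -s -X POST "$DRUID_URL" \
    -H 'Content-Type: application/json' \
    -d @"$QUERY_FILE"
}

# Uncomment once the example server is running:
# run_query
```

Keeping the query in a file makes it easy to edit the aggregations or interval and re-run the request while the server is still ingesting data.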