mirror of https://github.com/apache/druid.git

commit 946a9e502f (parent 063a068ab2)

    Replaced spaces with dashes
@@ -9,7 +9,7 @@ There are two choices for batch data ingestion to your Druid cluster, you can us
 Which should I use?
 -------------------
 
-The [Indexing service](Indexing service.html) is a node that can run as part of your Druid cluster and can accomplish a number of different types of indexing tasks. Even if all you care about is batch indexing, it provides for the encapsulation of things like the Database that is used for segment metadata and other things, so that your indexing tasks do not need to include such information. Long-term, the indexing service is going to be the preferred method of ingesting data.
+The [Indexing service](Indexing-service.html) is a node that can run as part of your Druid cluster and can accomplish a number of different types of indexing tasks. Even if all you care about is batch indexing, it provides for the encapsulation of things like the Database that is used for segment metadata and other things, so that your indexing tasks do not need to include such information. Long-term, the indexing service is going to be the preferred method of ingesting data.
 
 The `HadoopDruidIndexerMain` runs hadoop jobs in order to separate and index data segments. It takes advantage of Hadoop as a job scheduling and distributed job execution platform. It is a simple method if you already have Hadoop running and don’t want to spend the time configuring and deploying the [Indexing service](Indexing service.html) just yet.
 
@@ -3,7 +3,7 @@ layout: default
 ---
 # Booting a Single Node Cluster #
 
-[Loading Your Data](Loading Your Data.html) and [Querying Your Data](Querying Your Data.html) contain recipes to boot a small druid cluster on localhost. Here we will boot a small cluster on EC2. You can checkout the code, or download a tarball from [here](http://static.druid.io/artifacts/druid-services-0.5.51-SNAPSHOT-bin.tar.gz).
+[Loading Your Data](Loading-Your-Data.html) and [Querying Your Data](Querying-Your-Data.html) contain recipes to boot a small druid cluster on localhost. Here we will boot a small cluster on EC2. You can checkout the code, or download a tarball from [here](http://static.druid.io/artifacts/druid-services-0.5.51-SNAPSHOT-bin.tar.gz).
 
 The [ec2 run script](https://github.com/metamx/druid/blob/master/examples/bin/run_ec2.sh), run_ec2.sh, is located at 'examples/bin' if you have checked out the code, or at the root of the project if you've downloaded a tarball. The scripts rely on the [Amazon EC2 API Tools](http://aws.amazon.com/developertools/351), and you will need to set three environment variables:
 
@@ -91,7 +91,7 @@ These properties are for connecting with S3 and using it to pull down segments.
 
 ### JDBC connection
 
-These properties specify the jdbc connection and other configuration around the “segments table” database. The only processes that connect to the DB with these properties are the [Master](Master.html) and [Indexing service](Indexing service.html). This is tested on MySQL.
+These properties specify the jdbc connection and other configuration around the “segments table” database. The only processes that connect to the DB with these properties are the [Master](Master.html) and [Indexing service](Indexing-service.html). This is tested on MySQL.
 
 |Property|Description|Default|
 |--------|-----------|-------|
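
For orientation, these connection settings live in a node's runtime.properties. The fragment below is a hedged sketch; the property names follow Druid's configuration docs of this era but should be verified against the Configuration page, and the values are placeholders:

```
druid.database.connectURI=jdbc:mysql://localhost:3306/druid
druid.database.user=druid
druid.database.password=diurd
druid.database.segmentTable=prod_segments
```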
@@ -3,7 +3,7 @@ layout: default
 ---
 # Druid Personal Demo Cluster (DPDC)
 
-Note, there are currently some issues with the CloudFormation. We are working through them and will update the documentation here when things work properly. In the meantime, the simplest way to get your feet wet with a cluster setup is to run through the instructions at [housejester/druid-test-harness](https://github.com/housejester/druid-test-harness), though it is based on an older version. If you just want to get a feel for the types of data and queries that you can issue, check out [Realtime Examples](Realtime Examples.html)
+Note, there are currently some issues with the CloudFormation. We are working through them and will update the documentation here when things work properly. In the meantime, the simplest way to get your feet wet with a cluster setup is to run through the instructions at [housejester/druid-test-harness](https://github.com/housejester/druid-test-harness), though it is based on an older version. If you just want to get a feel for the types of data and queries that you can issue, check out [Realtime Examples](Realtime-Examples.html)
 
 ## Introduction
 To make it easy for you to get started with Druid, we created an AWS (Amazon Web Services) [CloudFormation](http://aws.amazon.com/cloudformation/) Template that allows you to create a small pre-configured Druid cluster using your own AWS account. The cluster contains a pre-loaded sample workload, the Wikipedia edit stream, and a basic query interface that gets you familiar with Druid capabilities like drill-downs and filters.
@@ -34,7 +34,7 @@ Clone Druid and build it:
 Twitter Example
 ---------------
 
-For a full tutorial based on the twitter example, check out this [Twitter Tutorial](Twitter Tutorial.html).
+For a full tutorial based on the twitter example, check out this [Twitter Tutorial](Twitter-Tutorial.html).
 
 This Example uses a feature of Twitter that allows for sampling of it’s stream. We sample the Twitter stream via our [TwitterSpritzerFirehoseFactory](https://github.com/metamx/druid/blob/master/examples/src/main/java/druid/examples/twitter/TwitterSpritzerFirehoseFactory.java) class and use it to simulate the kinds of data you might ingest into Druid. Then, with the client part, the sample shows what kinds of analytics explorations you can do during and after the data is loaded.
 
@@ -98,7 +98,7 @@ There are 9 main parts to a groupBy query:
 |granularity|Defines the granularity of the query. See [Granularities](Granularities.html)|yes|
 |filter|See [Filters](Filters.html)|no|
 |aggregations|See [Aggregations](Aggregations.html)|yes|
-|postAggregations|See [Post Aggregations](Post Aggregations.html)|no|
+|postAggregations|See [Post Aggregations](Post-Aggregations.html)|no|
 |intervals|A JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.|yes|
 |context|An additional JSON Object which can be used to specify certain flags.|no|
 
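
The query parts tabulated in this hunk assemble into a complete groupBy request. A minimal sketch in Python, posting to the broker endpoint used elsewhere in these docs; the datasource, dimension, and metric names ("wikipedia", "namespace", "count") are hypothetical placeholders:

```python
import json
import urllib.request

# A groupBy query built from the parts in the table above; postAggregations
# is omitted since it is optional. All names here are illustrative.
query = {
    "queryType": "groupBy",
    "dataSource": "wikipedia",
    "granularity": "day",
    "dimensions": ["namespace"],
    "filter": {"type": "selector", "dimension": "language", "value": "en"},
    "aggregations": [{"type": "longSum", "name": "edits", "fieldName": "count"}],
    "intervals": ["2013-08-01/2013-08-02"],
    "context": {"timeout": 60000},
}

# POST the JSON body to the broker, as the curl examples in these docs do.
req = urllib.request.Request(
    "http://localhost:8080/druid/v2/?pretty",
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))
```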
@@ -13,6 +13,9 @@ Some great folks have written their own libraries to interact with Druid
 #### Ruby
 \* [madvertise/ruby-druid](https://github.com/madvertise/ruby-druid) - A ruby client for Druid
 
+#### Python
+\* [metamx/pydruid](https://github.com/metamx/pydruid) - A python client for Druid
+
 #### Helper Libraries
 
 - [madvertise/druid-dumbo](https://github.com/madvertise/druid-dumbo) - Scripts to help generate batch configs for the ingestion of data into Druid
@@ -165,7 +165,7 @@ curl -X POST "http://localhost:8080/druid/v2/?pretty" \
 }
 } ]
 ```
-Now you're ready for [Querying Your Data](Querying Your Data.html)!
+Now you're ready for [Querying Your Data](Querying-Your-Data.html)!
 
 ## Loading Data with the HadoopDruidIndexer ##
 
@@ -367,4 +367,4 @@ Now its time to run the Hadoop [Batch-ingestion](Batch-ingestion.html) job, Hado
 java -Xmx256m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Ddruid.realtime.specFile=realtime.spec -classpath lib/* com.metamx.druid.indexer.HadoopDruidIndexerMain batchConfig.json
 ```
 
-You can now move on to [Querying Your Data](Querying Your Data.html)!
+You can now move on to [Querying Your Data](Querying-Your-Data.html)!
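
One practical note on the java command in this hunk: in most Unix shells a bare `lib/*` is glob-expanded by the shell before Java sees it, which usually breaks the invocation when the directory holds more than one jar. The classpath wildcard is therefore typically written quoted, as in `-classpath "lib/*"`, so that the JVM itself expands it to every jar in `lib`.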
@@ -15,7 +15,7 @@ Rules
 
 Segments are loaded and dropped from the cluster based on a set of rules. Rules indicate how segments should be assigned to different compute node tiers and how many replicants of a segment should exist in each tier. Rules may also indicate when segments should be dropped entirely from the cluster. The master loads a set of rules from the database. Rules may be specific to a certain datasource and/or a default set of rules can be configured. Rules are read in order and hence the ordering of rules is important. The master will cycle through all available segments and match each segment with the first rule that applies. Each segment may only match a single rule
 
-For more information on rules, see [Rule Configuration](Rule Configuration.html).
+For more information on rules, see [Rule Configuration](Rule-Configuration.html).
 
 Cleaning Up Segments
 --------------------
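
The first-match semantics described in that paragraph reduce to a simple ordered scan. A minimal sketch, where `applies_to` and the rule objects are hypothetical stand-ins rather than actual Druid classes:

```python
# First-match rule evaluation as described above: the master walks the rule
# list in order, and a segment is governed by the first rule that applies.
def first_matching_rule(segment, rules):
    for rule in rules:
        if rule.applies_to(segment):
            return rule  # later rules are never consulted for this segment
    return None  # no rule matched this segment
```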
@@ -44,4 +44,4 @@ The config table is used to store runtime configuration objects. We do not have
 Task-related Tables
 -------------------
 
-There are also a number of tables created and used by the [Indexing Service](Indexing Service.html) in the course of its work.
+There are also a number of tables created and used by the [Indexing Service](Indexing-Service.html) in the course of its work.
@@ -3,7 +3,7 @@ layout: default
 ---
 # Setup #
 
-Before we start querying druid, we're going to finish setting up a complete cluster on localhost. In [Loading Your Data](Loading Your Data.html) we setup a [Realtime](Realtime.html), [Compute](Compute.html) and [Master](Master.html) node. If you've already completed that tutorial, you need only follow the directions for 'Booting a Broker Node'.
+Before we start querying druid, we're going to finish setting up a complete cluster on localhost. In [Loading Your Data](Loading-Your-Data.html) we setup a [Realtime](Realtime.html), [Compute](Compute.html) and [Master](Master.html) node. If you've already completed that tutorial, you need only follow the directions for 'Booting a Broker Node'.
 
 ## Booting a Broker Node ##
 
@@ -98,7 +98,7 @@ com.metamx.druid.http.ComputeMain
 
 # Querying Your Data #
 
-Now that we have a complete cluster setup on localhost, we need to load data. To do so, refer to [Loading Your Data](Loading Your Data.html). Having done that, its time to query our data! For a complete specification of queries, see [Querying](Querying.html).
+Now that we have a complete cluster setup on localhost, we need to load data. To do so, refer to [Loading Your Data](Loading-Your-Data.html). Having done that, its time to query our data! For a complete specification of queries, see [Querying](Querying.html).
 
 ## Querying Different Nodes ##
 
@@ -363,4 +363,4 @@ Check out [Filters](Filters.html) for more.
 
 ## Learn More ##
 
-You can learn more about querying at [Querying](Querying.html)! Now check out [Booting a production cluster](Booting a production cluster.html)!
+You can learn more about querying at [Querying](Querying.html)! Now check out [Booting a production cluster](Booting-a-production-cluster.html)!
@@ -87,7 +87,7 @@ There are 7 main parts to a timeseries query:
 |granularity|Defines the granularity of the query. See [Granularities](Granularities.html)|yes|
 |filter|See [Filters](Filters.html)|no|
 |aggregations|See [Aggregations](Aggregations.html)|yes|
-|postAggregations|See [Post Aggregations](Post Aggregations.html)|no|
+|postAggregations|See [Post Aggregations](Post-Aggregations.html)|no|
 |intervals|A JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.|yes|
 |context|An additional JSON Object which can be used to specify certain flags.|no|
 
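
The timeseries parts above also map one-to-one onto client libraries such as the pydruid client listed under Libraries earlier. A hedged sketch against pydruid's later published API, which may differ from the 2013-era client; the datasource and field names are again placeholders:

```python
from pydruid.client import PyDruid
from pydruid.utils.aggregators import longsum
from pydruid.utils.filters import Dimension

# Connect to the broker endpoint used throughout these docs.
client = PyDruid("http://localhost:8080", "druid/v2")

# granularity, filter, aggregations, and intervals correspond directly to
# the query parts tabulated above; postAggregations and context are optional.
result = client.timeseries(
    datasource="wikipedia",          # hypothetical datasource
    granularity="hour",
    intervals="2013-08-01/2013-08-02",
    filter=Dimension("language") == "en",
    aggregations={"edits": longsum("count")},
)
print(result.result)
```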
@@ -357,9 +357,9 @@ Feel free to tweak other query parameters to answer other questions you may have
 Next Steps
 ----------
 
-What to know even more information about the Druid Cluster? Check out [Tutorial: The Druid Cluster](Tutorial: The Druid Cluster.html)
+What to know even more information about the Druid Cluster? Check out [Tutorial: The Druid Cluster](Tutorial:-The-Druid-Cluster.html)
 
-Druid is even more fun if you load your own data into it! To learn how to load your data, see [Loading Your Data](Loading Your Data.html).
+Druid is even more fun if you load your own data into it! To learn how to load your data, see [Loading Your Data](Loading-Your-Data.html).
 
 Additional Information
 ----------------------
@@ -19,7 +19,7 @@ tar -zxvf druid-services-*-bin.tar.gz
 cd druid-services-*
 ```
 
-You can also [Build From Source](Build From Source.html).
+You can also [Build From Source](Build-From-Source.html).
 
 ## External Dependencies ##
 
@@ -346,8 +346,8 @@ Feel free to tweak other query parameters to answer other questions you may have
 Next Steps
 ----------
 
-What to know even more information about the Druid Cluster? Check out [Tutorial: The Druid Cluster](Tutorial: The Druid Cluster.html)
-Druid is even more fun if you load your own data into it! To learn how to load your data, see [Loading Your Data](Loading Your Data.html).
+What to know even more information about the Druid Cluster? Check out [Tutorial: The Druid Cluster](Tutorial:-The-Druid-Cluster.html)
+Druid is even more fun if you load your own data into it! To learn how to load your data, see [Loading Your Data](Loading-Your-Data.html).
 
 Additional Information
 ----------------------
@@ -9,20 +9,20 @@ Contents
 ========================
 
 Getting Started
-\* [Tutorial: A First Look at Druid](Tutorial: A First Look at Druid.html)
-\* [Tutorial: The Druid Cluster](Tutorial: The Druid Cluster.html)
-\* [Loading Your Data](Loading Your Data.html)
-\* [Querying Your Data](Querying Your Data.html)
-\* [Booting a Production Cluster](Booting a Production Cluster.html)
+\* [Tutorial: A First Look at Druid](Tutorial:-A-First-Look-at-Druid.html)
+\* [Tutorial: The Druid Cluster](Tutorial:-The-Druid-Cluster.html)
+\* [Loading Your Data](Loading-Your-Data.html)
+\* [Querying Your Data](Querying-Your-Data.html)
+\* [Booting a Production Cluster](Booting-a-Production-Cluster.html)
 \* [Examples](Examples.html)
-\* [Cluster Setup](Cluster Setup.html)
+\* [Cluster Setup](Cluster-Setup.html)
 \* [Configuration](Configuration.html)
 --------------------------------------
 
 Data Ingestion
 \* [Realtime](Realtime.html)
-\* [Batch|Batch Ingestion](Batch|Batch Ingestion.html)
-\* [Indexing Service](Indexing Service.html)
+\* [Batch|Batch Ingestion](Batch|Batch-Ingestion.html)
+\* [Indexing Service](Indexing-Service.html)
 ----------------------------
 
 Querying
@@ -57,12 +57,12 @@ Architecture
 **\* ]
 **\* [MySQL](MySQL.html)
 **\* ]
-** [Concepts and Terminology](Concepts and Terminology.html)
+** [Concepts and Terminology](Concepts-and-Terminology.html)
 -------------------------------
 
 Development
 \* [Versioning](Versioning.html)
-\* [Build From Source](Build From Source.html)
+\* [Build From Source](Build-From-Source.html)
 \* [Libraries](Libraries.html)
 ------------------------
 