mirror of https://github.com/apache/druid.git
fixup docs to download from Apache mirror, fixup tarball name and path, change references from quickstart/* to quickstart/tutorial/* (#6570)
This commit is contained in:
parent
26d992840c
commit
23ad3d214c
@@ -58,22 +58,25 @@ First, download and unpack the release archive. It's best to do this on a single
since you will be editing the configurations and then copying the modified distribution out to all
of your servers.

[Download](https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/#{DRUIDVERSION}/apache-druid-#{DRUIDVERSION}-bin.tar.gz)
the #{DRUIDVERSION} release.

Extract Druid by running the following commands in your terminal:

```bash
curl -O http://static.druid.io/artifacts/releases/druid-#{DRUIDVERSION}-bin.tar.gz
tar -xzf druid-#{DRUIDVERSION}-bin.tar.gz
cd druid-#{DRUIDVERSION}
tar -xzf apache-druid-#{DRUIDVERSION}-bin.tar.gz
cd apache-druid-#{DRUIDVERSION}
```
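The hunk above replaces the old `static.druid.io` download with a manual link to the Apache mirror page. As a hedged sketch only (the `www.apache.org/dist` URL layout and the checksum file name are assumptions, not text from this commit), the release could also be fetched and verified non-interactively before extracting:

```bash
# Fetch the release tarball and its published SHA-512 digest from the Apache dist area (assumed URL).
curl -LO https://www.apache.org/dist/incubator/druid/#{DRUIDVERSION}/apache-druid-#{DRUIDVERSION}-bin.tar.gz
curl -LO https://www.apache.org/dist/incubator/druid/#{DRUIDVERSION}/apache-druid-#{DRUIDVERSION}-bin.tar.gz.sha512

# Compute the local digest and compare it by eye against the published one.
sha512sum apache-druid-#{DRUIDVERSION}-bin.tar.gz
cat apache-druid-#{DRUIDVERSION}-bin.tar.gz.sha512
```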

In this package, you'll find:
In the package, you should find:

* `LICENSE` - the license files.
* `bin/` - scripts related to the [single-machine quickstart](quickstart.html).
* `conf/*` - template configurations for a clustered setup.
* `extensions/*` - all Druid extensions.
* `hadoop-dependencies/*` - Druid Hadoop dependencies.
* `lib/*` - all included software packages for core Druid.
* `quickstart/*` - files related to the [single-machine quickstart](quickstart.html).
* `DISCLAIMER`, `LICENSE`, and `NOTICE` files
* `bin/*` - scripts related to the [single-machine quickstart](quickstart.html)
* `conf/*` - template configurations for a clustered setup
* `extensions/*` - core Druid extensions
* `hadoop-dependencies/*` - Druid Hadoop dependencies
* `lib/*` - libraries and dependencies for core Druid
* `quickstart/*` - files related to the [single-machine quickstart](quickstart.html)

We'll be editing the files in `conf/` in order to get things running.
@@ -284,7 +287,7 @@ server. If you have been editing the configurations on your local machine, you c
copy them:

```bash
rsync -az druid-#{DRUIDVERSION}/ COORDINATION_SERVER:druid-#{DRUIDVERSION}/
rsync -az apache-druid-#{DRUIDVERSION}/ COORDINATION_SERVER:apache-druid-#{DRUIDVERSION}/
```
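Since the same tree ends up on every node, a minimal sketch of pushing it to all machines in one pass may help; the hostnames below are placeholders, not names from this commit:

```bash
# Hypothetical host list; substitute your own coordination, data, and query servers.
for host in coordination1 data1 data2 query1; do
  rsync -az apache-druid-#{DRUIDVERSION}/ "$host":apache-druid-#{DRUIDVERSION}/
done
```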

Log on to your coordination server and install Zookeeper:
@@ -29,27 +29,29 @@ OSes](http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.htm

## Getting started

To install Druid, run the following commands in your terminal:
[Download](https://www.apache.org/dyn/closer.cgi?path=/incubator/druid/#{DRUIDVERSION}/apache-druid-#{DRUIDVERSION}-bin.tar.gz)
the #{DRUIDVERSION} release.

Extract Druid by running the following commands in your terminal:

```bash
curl -O http://static.druid.io/artifacts/releases/druid-#{DRUIDVERSION}-bin.tar.gz
tar -xzf druid-#{DRUIDVERSION}-bin.tar.gz
cd druid-#{DRUIDVERSION}
tar -xzf apache-druid-#{DRUIDVERSION}-bin.tar.gz
cd apache-druid-#{DRUIDVERSION}
```

In the package, you should find:

* `LICENSE` - the license files.
* `bin/` - scripts useful for this quickstart.
* `conf/*` - template configurations for a clustered setup.
* `extensions/*` - all Druid extensions.
* `hadoop-dependencies/*` - Druid Hadoop dependencies.
* `lib/*` - all included software packages for core Druid.
* `DISCLAIMER`, `LICENSE`, and `NOTICE` files
* `bin/*` - scripts useful for this quickstart
* `conf/*` - template configurations for a clustered setup
* `extensions/*` - core Druid extensions
* `hadoop-dependencies/*` - Druid Hadoop dependencies
* `lib/*` - libraries and dependencies for core Druid
* `quickstart/*` - configuration files, sample data, and other files for the quickstart tutorials

## Download Zookeeper

Druid currently has a dependency on [Apache ZooKeeper](http://zookeeper.apache.org/) for distributed coordination. You'll
Druid has a dependency on [Apache ZooKeeper](http://zookeeper.apache.org/) for distributed coordination. You'll
need to download and run Zookeeper.

In the package root, run the following commands:
@@ -60,11 +62,11 @@ tar -xzf zookeeper-3.4.11.tar.gz
```bash
mv zookeeper-3.4.11 zk
```
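The download command for ZooKeeper sits above this hunk and is not part of the diff; a hedged sketch of the step it implies, with the archive URL being an assumption rather than text from this commit:

```bash
# Fetch ZooKeeper 3.4.11 (assumed archive URL), unpack it, and rename it to zk as the tutorial expects.
curl -O https://archive.apache.org/dist/zookeeper/zookeeper-3.4.11/zookeeper-3.4.11.tar.gz
tar -xzf zookeeper-3.4.11.tar.gz
mv zookeeper-3.4.11 zk
```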

The startup scripts for the tutorial will expect the contents of the Zookeeper tarball to be located at `zk` under the druid-#{DRUIDVERSION} package root.
The startup scripts for the tutorial will expect the contents of the Zookeeper tarball to be located at `zk` under the apache-druid-#{DRUIDVERSION} package root.

## Start up Druid services

From the druid-#{DRUIDVERSION} package root, run the following command:
From the apache-druid-#{DRUIDVERSION} package root, run the following command:

```bash
bin/supervise -c quickstart/tutorial/conf/tutorial-cluster.conf
```
@@ -74,16 +76,16 @@ This will bring up instances of Zookeeper and the Druid services, all running on

```bash
bin/supervise -c quickstart/tutorial/conf/tutorial-cluster.conf
[Thu Jul 26 12:16:23 2018] Running command[zk], logging to[/stage/druid-#{DRUIDVERSION}/var/sv/zk.log]: bin/run-zk quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[coordinator], logging to[/stage/druid-#{DRUIDVERSION}/var/sv/coordinator.log]: bin/run-druid coordinator quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[broker], logging to[//stage/druid-#{DRUIDVERSION}/var/sv/broker.log]: bin/run-druid broker quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[historical], logging to[/stage/druid-#{DRUIDVERSION}/var/sv/historical.log]: bin/run-druid historical quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[overlord], logging to[/stage/druid-#{DRUIDVERSION}/var/sv/overlord.log]: bin/run-druid overlord quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[middleManager], logging to[/stage/druid-#{DRUIDVERSION}/var/sv/middleManager.log]: bin/run-druid middleManager quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[zk], logging to[/stage/apache-druid-#{DRUIDVERSION}/var/sv/zk.log]: bin/run-zk quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[coordinator], logging to[/stage/apache-druid-#{DRUIDVERSION}/var/sv/coordinator.log]: bin/run-druid coordinator quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[broker], logging to[//stage/apache-druid-#{DRUIDVERSION}/var/sv/broker.log]: bin/run-druid broker quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[historical], logging to[/stage/apache-druid-#{DRUIDVERSION}/var/sv/historical.log]: bin/run-druid historical quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[overlord], logging to[/stage/apache-druid-#{DRUIDVERSION}/var/sv/overlord.log]: bin/run-druid overlord quickstart/tutorial/conf
[Thu Jul 26 12:16:23 2018] Running command[middleManager], logging to[/stage/apache-druid-#{DRUIDVERSION}/var/sv/middleManager.log]: bin/run-druid middleManager quickstart/tutorial/conf

```

All persistent state such as the cluster metadata store and segments for the services will be kept in the `var` directory under the druid-#{DRUIDVERSION} package root. Logs for the services are located at `var/sv`.
All persistent state such as the cluster metadata store and segments for the services will be kept in the `var` directory under the apache-druid-#{DRUIDVERSION} package root. Logs for the services are located at `var/sv`.

Later on, if you'd like to stop the services, CTRL-C to exit the `bin/supervise` script, which will terminate the Druid processes.
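Because the text above pins service logs to `var/sv` and all persistent state to `var`, here is a small sketch of two follow-up commands one might run from the package root (the broker is just an example service):

```bash
# Follow one service's log while it starts up; logs live under var/sv per the text above.
tail -f var/sv/broker.log

# To reset the tutorial from scratch after stopping supervise, delete the state directory.
rm -rf var
```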
@@ -109,7 +111,7 @@ rm -rf /tmp/kafka-logs

For the following data loading tutorials, we have included a sample data file containing Wikipedia page edit events that occurred on 2015-09-12.

This sample data is located at `quickstart/wikiticker-2015-09-12-sampled.json.gz` from the Druid package root. The page edit events are stored as JSON objects in a text file.
This sample data is located at `quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz` from the Druid package root. The page edit events are stored as JSON objects in a text file.
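Since the file is gzipped JSON with one event per line, a quick way to inspect it without unpacking (path taken from the new location above):

```bash
# Print the first event from the compressed sample file.
gunzip -c quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz | head -n 1
```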

The sample data has the following columns, and an example event is shown below:
@@ -20,7 +20,7 @@ For this tutorial, we've provided a Dockerfile for a Hadoop 2.8.3 cluster, which

This Dockerfile and related files are located at `quickstart/tutorial/hadoop/docker`.

From the druid-#{DRUIDVERSION} package root, run the following commands to build a Docker image named "druid-hadoop-demo" with version tag "2.8.3":
From the apache-druid-#{DRUIDVERSION} package root, run the following commands to build a Docker image named "druid-hadoop-demo" with version tag "2.8.3":

```bash
cd quickstart/tutorial/hadoop/docker
```
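The build step itself falls outside this hunk; a hedged sketch of what it likely looks like, using only the image name and tag stated in the prose above (not text from this commit):

```bash
# Build the demo image from the Dockerfile in the current directory.
docker build -t druid-hadoop-demo:2.8.3 .
```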
@@ -88,10 +88,10 @@ docker exec -it druid-hadoop-demo bash

### Copy input data to the Hadoop container

From the druid-#{DRUIDVERSION} package root on the host, copy the `quickstart/wikiticker-2015-09-12-sampled.json.gz` sample data to the shared folder:
From the apache-druid-#{DRUIDVERSION} package root on the host, copy the `quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz` sample data to the shared folder:

```bash
cp quickstart/wikiticker-2015-09-12-sampled.json.gz /tmp/shared/wikiticker-2015-09-12-sampled.json.gz
cp quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz /tmp/shared/wikiticker-2015-09-12-sampled.json.gz
```
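A small host-side check that the copy landed in the shared folder (the host path comes from the command above):

```bash
# Confirm the sample file is present in the shared folder on the host.
ls -lh /tmp/shared/wikiticker-2015-09-12-sampled.json.gz
```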

### Setup HDFS directories
@@ -16,8 +16,8 @@ don't need to have loaded any data yet.

A data load is initiated by submitting an *ingestion task* spec to the Druid overlord. For this tutorial, we'll be loading the sample Wikipedia page edits data.

The Druid package includes the following sample native batch ingestion task spec at `quickstart/wikipedia-index.json`, shown here for convenience,
which has been configured to read the `quickstart/wikiticker-2015-09-12-sampled.json.gz` input file:
The Druid package includes the following sample native batch ingestion task spec at `quickstart/tutorial/wikipedia-index.json`, shown here for convenience,
which has been configured to read the `quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz` input file:

```json
{
```
@@ -71,7 +71,7 @@ which has been configured to read the `quickstart/wikiticker-2015-09-12-sampled.
  "type" : "index",
  "firehose" : {
    "type" : "local",
    "baseDir" : "quickstart/",
    "baseDir" : "quickstart/tutorial/",
    "filter" : "wikiticker-2015-09-12-sampled.json.gz"
  },
  "appendToExisting" : false
@@ -131,7 +131,7 @@ If you wish to go through any of the other ingestion tutorials, you will need to

Let's briefly discuss how we would've submitted the ingestion task without using the script. You do not need to run these commands.

To submit the task, POST it to Druid in a new terminal window from the druid-#{DRUIDVERSION} directory:
To submit the task, POST it to Druid in a new terminal window from the apache-druid-#{DRUIDVERSION} directory:

```bash
curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/wikipedia-index.json http://localhost:8090/druid/indexer/v1/task
```
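The POST above returns a small JSON body containing the task id; as a sketch that assumes Druid's standard overlord task API on the same port (the id placeholder is hypothetical), its progress can then be polled:

```bash
# Replace TASK_ID with the id returned by the submit call above.
curl http://localhost:8090/druid/indexer/v1/task/TASK_ID/status
```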
@@ -48,7 +48,7 @@ In the `rule #2` box at the bottom, click `Drop` and `Forever`.

This will cause the first 12 segments of `deletion-tutorial` to be dropped. However, these dropped segments are not removed from deep storage.

You can see that all 24 segments are still present in deep storage by listing the contents of `druid-#{DRUIDVERSION}/var/druid/segments/deletion-tutorial`:
You can see that all 24 segments are still present in deep storage by listing the contents of `apache-druid-#{DRUIDVERSION}/var/druid/segments/deletion-tutorial`:

```bash
$ ls -l1 var/druid/segments/deletion-tutorial/
```
@@ -132,7 +132,7 @@ $ ls -l1 var/druid/segments/deletion-tutorial/

Now that we have disabled some segments, we can submit a Kill Task, which will delete the disabled segments from metadata and deep storage.

A Kill Task spec has been provided at `quickstart/deletion-kill.json`. Submit this task to the Overlord with the following command:
A Kill Task spec has been provided at `quickstart/tutorial/deletion-kill.json`. Submit this task to the Overlord with the following command:

```bash
curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/deletion-kill.json http://localhost:8090/druid/indexer/v1/task
```
@@ -611,7 +611,7 @@ We've finished defining the ingestion spec, it should now look like the followin

## Submit the task and query the data

From the druid-#{DRUIDVERSION} package root, run the following command:
From the apache-druid-#{DRUIDVERSION} package root, run the following command:

```bash
bin/post-index-task --file quickstart/ingestion-tutorial-index.json
```
@@ -57,7 +57,7 @@ Let's launch a console producer for our topic and send some data!

In your Druid directory, run the following command:

```bash
cd quickstart
cd quickstart/tutorial
gunzip -k wikiticker-2015-09-12-sampled.json.gz
```
@@ -65,7 +65,7 @@ In your Kafka directory, run the following command, where {PATH_TO_DRUID} is rep

```bash
export KAFKA_OPTS="-Dfile.encoding=UTF-8"
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic wikipedia < {PATH_TO_DRUID}/quickstart/wikiticker-2015-09-12-sampled.json
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic wikipedia < {PATH_TO_DRUID}/quickstart/tutorial/wikiticker-2015-09-12-sampled.json
```

The previous command posted sample events to the *wikipedia* Kafka topic which were then ingested into Druid by the Kafka indexing service. You're now ready to run some queries!
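To sanity-check the producer side independently of Druid, a sketch using Kafka's stock console consumer (run from the same Kafka directory; the broker address matches the command above):

```bash
# Replay the first few events from the wikipedia topic to confirm they arrived.
./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic wikipedia --from-beginning --max-messages 5
```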
@@ -94,7 +94,7 @@ The SQL queries are submitted as JSON over HTTP.

### TopN query example

The tutorial package includes an example file that contains the SQL query shown above at `quickstart/wikipedia-top-pages-sql.json`. Let's submit that query to the Druid broker:
The tutorial package includes an example file that contains the SQL query shown above at `quickstart/tutorial/wikipedia-top-pages-sql.json`. Let's submit that query to the Druid broker:

```bash
curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/wikipedia-top-pages-sql.json http://localhost:8082/druid/v2/sql
```
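Since the queries are submitted as JSON over HTTP, an ad-hoc statement can also be sent inline rather than from a file; a sketch assuming Druid's standard SQL endpoint body shape and a datasource name taken from the tutorials above:

```bash
# Inline SQL against the broker's SQL endpoint; adjust the table name to your datasource.
curl -X 'POST' -H 'Content-Type:application/json' \
  -d '{"query":"SELECT COUNT(*) FROM wikipedia"}' \
  http://localhost:8082/druid/v2/sql
```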
@@ -15,7 +15,7 @@ It will also be helpful to have finished [Tutorial: Loading a file](../tutorials

For this tutorial, we'll be using the Wikipedia edits sample data, with an ingestion task spec that will create a separate segment for each hour in the input data.

The ingestion spec can be found at `quickstart/retention-index.json`. Let's submit that spec, which will create a datasource called `retention-tutorial`:
The ingestion spec can be found at `quickstart/tutorial/retention-index.json`. Let's submit that spec, which will create a datasource called `retention-tutorial`:

```bash
bin/post-index-task --file quickstart/tutorial/retention-index.json
```
@@ -95,7 +95,7 @@ We will see how these definitions are used after we load this data.

## Load the example data

From the druid-#{DRUIDVERSION} package root, run the following command:
From the apache-druid-#{DRUIDVERSION} package root, run the following command:

```bash
bin/post-index-task --file quickstart/tutorial/rollup-index.json
```
@@ -24,7 +24,7 @@ tar -xzf tranquility-distribution-0.8.2.tgz
```bash
mv tranquility-distribution-0.8.2 tranquility
```
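The download command for the Tranquility distribution is above this hunk and not shown; a hedged sketch of the implied step, with the URL being an assumption rather than text from this commit:

```bash
# Fetch the Tranquility 0.8.2 distribution (assumed URL), then unpack and rename it as the tutorial expects.
curl -O http://static.druid.io/tranquility/releases/tranquility-distribution-0.8.2.tgz
tar -xzf tranquility-distribution-0.8.2.tgz
mv tranquility-distribution-0.8.2 tranquility
```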

The startup scripts for the tutorial will expect the contents of the Tranquility tarball to be located at `tranquility` under the druid-#{DRUIDVERSION} package root.
The startup scripts for the tutorial will expect the contents of the Tranquility tarball to be located at `tranquility` under the apache-druid-#{DRUIDVERSION} package root.

## Enable Tranquility Server
@@ -34,7 +34,7 @@ The startup scripts for the tutorial will expect the contents of the Tranquility

As part of the output of *supervise* you should see something like:

```bash
Running command[tranquility-server], logging to[/stage/druid-#{DRUIDVERSION}/var/sv/tranquility-server.log]: tranquility/bin/tranquility server -configFile quickstart/tutorial/conf/tranquility/server.json -Ddruid.extensions.loadList=[]
Running command[tranquility-server], logging to[/stage/apache-druid-#{DRUIDVERSION}/var/sv/tranquility-server.log]: tranquility/bin/tranquility server -configFile quickstart/tutorial/conf/tranquility/server.json -Ddruid.extensions.loadList=[]
```

You can check the log file in `var/sv/tranquility-server.log` to confirm that the server is starting up properly.
@@ -44,8 +44,8 @@ You can check the log file in `var/sv/tranquility-server.log` to confirm that th

Let's send the sample Wikipedia edits data to Tranquility:

```bash
gunzip -k quickstart/wikiticker-2015-09-12-sampled.json.gz
curl -XPOST -H'Content-Type: application/json' --data-binary @quickstart/wikiticker-2015-09-12-sampled.json http://localhost:8200/v1/post/wikipedia
gunzip -k quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz
curl -XPOST -H'Content-Type: application/json' --data-binary @quickstart/tutorial/wikiticker-2015-09-12-sampled.json http://localhost:8200/v1/post/wikipedia
```

Which will print something like:
@@ -49,7 +49,7 @@
  "type" : "index",
  "firehose" : {
    "type" : "local",
    "baseDir" : "quickstart/",
    "baseDir" : "quickstart/tutorial/",
    "filter" : "wikiticker-2015-09-12-sampled.json.gz"
  },
  "appendToExisting" : false
@@ -49,7 +49,7 @@
  "type" : "index",
  "firehose" : {
    "type" : "local",
    "baseDir" : "quickstart/",
    "baseDir" : "quickstart/tutorial/",
    "filter" : "wikiticker-2015-09-12-sampled.json.gz"
  },
  "appendToExisting" : false
@@ -49,7 +49,7 @@
  "type" : "index",
  "firehose" : {
    "type" : "local",
    "baseDir" : "quickstart/",
    "baseDir" : "quickstart/tutorial/",
    "filter" : "wikiticker-2015-09-12-sampled.json.gz"
  },
  "appendToExisting" : false
@@ -49,7 +49,7 @@
  "type" : "index",
  "firehose" : {
    "type" : "local",
    "baseDir" : "quickstart/",
    "baseDir" : "quickstart/tutorial/",
    "filter" : "wikiticker-2015-09-12-sampled.json.gz"
  },
  "appendToExisting" : false