mirror of https://github.com/apache/druid.git
ingestion and tutorial doc update (#10202)
parent 266243ac75
commit d4bd6e5207
@@ -284,7 +284,7 @@ The following table shows how each ingestion method handles partitioning:

## Ingestion specs

No matter what ingestion method you use, data is loaded into Druid using either one-time [tasks](tasks.html) or
-ongoing "supervisors" (which run and supervised a set of tasks over time). In any case, part of the task or supervisor
+ongoing "supervisors" (which run and supervise a set of tasks over time). In any case, part of the task or supervisor
definition is an _ingestion spec_.

Ingestion specs consist of three main components:
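The three components are the `dataSchema` (what the ingested data looks like), the `ioConfig` (where the data comes from and how it is read), and the `tuningConfig` (how the ingestion job itself behaves). As a rough sketch only, with illustrative values and a native batch task type (the exact fields depend on the ingestion method), a spec has this shape:

```json
{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "wikipedia",
      "timestampSpec": { "column": "time", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["channel", "page", "user"] },
      "granularitySpec": { "segmentGranularity": "day", "queryGranularity": "none" }
    },
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": { "type": "local", "baseDir": "quickstart/tutorial/", "filter": "wikiticker-2015-09-12-sampled.json.gz" },
      "inputFormat": { "type": "json" }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "maxRowsPerSegment": 5000000
    }
  }
}
```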
@@ -261,7 +261,7 @@ The three `partitionsSpec` types have different characteristics.

The recommended use case for each partitionsSpec is:
- If your data has a uniformly distributed column which is frequently used in your queries,
consider using `single_dim` partitionsSpec to maximize the performance of most of your queries.
-- If your data doesn't a uniformly distributed column, but is expected to have a [high rollup ratio](./index.md#maximizing-rollup-ratio)
+- If your data doesn't have a uniformly distributed column, but is expected to have a [high rollup ratio](./index.md#maximizing-rollup-ratio)
when you roll up with some dimensions, consider using `hashed` partitionsSpec.
Doing so can reduce both the size of the datasource and query latency by improving data locality.
- If the above two scenarios are not the case or you don't need to roll up your datasource,
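To show where this choice is made, here is a rough sketch of a `hashed` partitionsSpec inside a native batch task's `tuningConfig`; the dimension list and shard count are placeholders, and `forceGuaranteedRollup` is enabled because non-dynamic partitioning requires it:

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "forceGuaranteedRollup": true,
    "partitionsSpec": {
      "type": "hashed",
      "numShards": 10,
      "partitionDimensions": ["channel"]
    }
  }
}
```

A `single_dim` spec is configured in the same place, using a single `partitionDimension` and a target rows-per-segment setting instead of a shard count.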
@@ -205,7 +205,7 @@ We've included a sample of Wikipedia edits from September 12, 2015 to get you started.

To load this data into Druid, you can submit an *ingestion task* pointing to the file. We've included
a task that loads the `wikiticker-2015-09-12-sampled.json.gz` file included in the archive.

-Let's submit the `wikipedia-index-hadoop-.json` task:
+Let's submit the `wikipedia-index-hadoop.json` task:

```bash
bin/post-index-task --file quickstart/tutorial/wikipedia-index-hadoop.json --url http://localhost:8081
```
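The `bin/post-index-task` script wraps Druid's task API, so you can also submit the spec yourself; a roughly equivalent call, assuming the Overlord answers on the same port used above, looks like this:

```bash
# Post the task spec directly to the Overlord's task endpoint and print the
# returned task ID; unlike post-index-task, this does not wait for completion.
curl -X POST -H 'Content-Type: application/json' \
  -d @quickstart/tutorial/wikipedia-index-hadoop.json \
  http://localhost:8081/druid/indexer/v1/task
```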