ingestion and tutorial doc update (#10202)

mans2singh 2020-07-21 20:52:23 -04:00 committed by GitHub
parent 266243ac75
commit d4bd6e5207
3 changed files with 3 additions and 3 deletions


@@ -284,7 +284,7 @@ The following table shows how each ingestion method handles partitioning:
 ## Ingestion specs
 No matter what ingestion method you use, data is loaded into Druid using either one-time [tasks](tasks.html) or
-ongoing "supervisors" (which run and supervised a set of tasks over time). In any case, part of the task or supervisor
+ongoing "supervisors" (which run and supervise a set of tasks over time). In any case, part of the task or supervisor
 definition is an _ingestion spec_.
 Ingestion specs consists of three main components:
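Context for the hunk above (an editorial note, not part of the diff): the three main components referred to are `dataSchema`, `ioConfig`, and `tuningConfig`. A minimal native-batch sketch, with illustrative values rather than anything taken from this commit:

```json
{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "wikipedia",
      "timestampSpec": { "column": "time", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["channel", "page"] },
      "granularitySpec": { "segmentGranularity": "day" }
    },
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": { "type": "local", "baseDir": "quickstart/tutorial", "filter": "*.json.gz" },
      "inputFormat": { "type": "json" }
    },
    "tuningConfig": { "type": "index_parallel" }
  }
}
```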


@@ -261,7 +261,7 @@ The three `partitionsSpec` types have different characteristics.
 The recommended use case for each partitionsSpec is:
 - If your data has a uniformly distributed column which is frequently used in your queries,
 consider using `single_dim` partitionsSpec to maximize the performance of most of your queries.
-- If your data doesn't a uniformly distributed column, but is expected to have a [high rollup ratio](./index.md#maximizing-rollup-ratio)
+- If your data doesn't have a uniformly distributed column, but is expected to have a [high rollup ratio](./index.md#maximizing-rollup-ratio)
 when you roll up with some dimensions, consider using `hashed` partitionsSpec.
 It could reduce the size of datasource and query latency by improving data locality.
 - If the above two scenarios are not the case or you don't need to roll up your datasource,
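Context for the hunk above (editorial note, not part of the diff): each of these `partitionsSpec` types is configured inside the task's `tuningConfig`. A sketch of a `single_dim` spec, with an assumed dimension name and segment size:

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "partitionsSpec": {
      "type": "single_dim",
      "partitionDimension": "channel",
      "targetRowsPerSegment": 5000000
    }
  }
}
```

A `hashed` spec instead names the dimensions to hash, along the lines of `{"type": "hashed", "partitionDimensions": ["channel"]}`.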


@@ -205,7 +205,7 @@ We've included a sample of Wikipedia edits from September 12, 2015 to get you st
 To load this data into Druid, you can submit an *ingestion task* pointing to the file. We've included
 a task that loads the `wikiticker-2015-09-12-sampled.json.gz` file included in the archive.
-Let's submit the `wikipedia-index-hadoop-.json` task:
+Let's submit the `wikipedia-index-hadoop.json` task:
 ```bash
 bin/post-index-task --file quickstart/tutorial/wikipedia-index-hadoop.json --url http://localhost:8081
 ```
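Context for the hunk above (editorial note, not part of the diff): `bin/post-index-task` is a convenience script that essentially POSTs the task file to Druid's task API and then polls for completion. A roughly equivalent direct submission, assuming the quickstart coordinator-overlord is listening on `localhost:8081`:

```bash
# Sketch: submit the ingestion task spec directly to the standard task endpoint.
curl -X POST \
  -H 'Content-Type: application/json' \
  -d @quickstart/tutorial/wikipedia-index-hadoop.json \
  http://localhost:8081/druid/indexer/v1/task
```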