Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Victoria Lim <lim.t.victoria@gmail.com>
* Various documentation updates.
1) Split out "data management" from "ingestion". Break it into thematic pages.
2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so
all conceptual content is in concepts.md and all syntax content is in reference.md.
Shorten the known issues page to the most interesting ones.
3) Add SQL-based ingestion to the ingestion method comparison page. Remove the
index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1.
4) Rename various mentions of "Druid console" to "web console".
5) Add additional information to ingestion/partitioning.md.
6) Remove a mention of Tranquility.
7) Remove a note about upgrading to Druid 0.10.1.
8) Remove no-longer-relevant task types from ingestion/tasks.md.
9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated.
10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some
places, but it isn't very useful compared to index_parallel, so it shouldn't take up space
in the sidebar.
11) Make all br tags self-closing.
12) Certain other cosmetic changes.
13) Update to node-sass 7.
* make travis use node12 for docs
Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
* Update data-formats.md
Per Suneet, "Since you're editing this file can you also fix the json on line 177 please - it's missing a comma after the }"
* Light text cleanup
* Removing discussion of sample data, since it's repeated in the data loading tutorial, and not immediately relevant here.
* Update index.md
* original quickstart full first pass
* original quickstart full first pass
* first pass all the way through
* straggler
* image touchups and finished old tutorial
* a bit of finishing up
* Review comments
* fixing links
* spell checking gymnastics
* Tutorials use new ingestion spec where possible
There are 2 main changes
* Use task type index_parallel instead of index
* Remove the use of parser + firehose in favor of inputFormat + inputSource
index_parallel is the preferred method starting in 0.17. Setting the job to
index_parallel with the default maxNumConcurrentSubTasks(1) is the equivalent
of an index task
Instead of using a parserSpec, dimensionSpec and timestampSpec have been
promoted to the dataSchema. The format is described in the ioConfig as the
inputFormat.
There are a few cases where the new format is not supported
* Hadoop must use firehoses instead of the inputSource and inputFormat
* There is no equivalent of a combining firehose as an inputSource
* A Combining firehose does not support index_parallel
* fix typo