Commit Graph

104 Commits

Author SHA1 Message Date
Katya Macedo da6158c166
[Docs] Improve the "Update existing data" tutorial (#16081)
* Modify update data tutorial

* Update after review

* Add append topic

* Update after review

* Add upsert to spelling
2024-03-14 16:31:33 -07:00
Katya Macedo 0f29ece6a9
[Docs] Refactor streaming ingestion section (#15591)
Merging the work so far. @ektravel , @vogievetsky if there are additional improvements, let's track them & make another pr.



* Refactor streaming ingestion docs

* Update property definition

* Update after review

* Update known issues

* Move kinesis and kafka topics to ingestion, add redirects

* Saving changes

* Saving

* Add input format text

* Update after review

* Minor text edit

* Update example syntax

* Revert back to colon

* Fix merge conflicts

* Fix broken links

* Fix spelling error
2024-02-12 13:52:42 -08:00
Charles Smith 2a42b11660
remove legacy Jupyter tutorial files (#15834)
* remove legacy files

* redirection for the jupyter tutorial page

* remove tutorial from sidebar

* remove redirection
2024-02-12 13:45:47 -08:00
Karan Kumar 5036af6fb3
Doc fixes for query from deep storage and MSQ (#15313)
Minor updates to the documentation.

    Added prerequisites.
    Removed a known issue in MSQ since its no longer valid.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-11-03 10:52:20 +05:30
Charles Smith 3860052de0
remove references to Jupyter notebooks within the Druid repo (#15143)
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
2023-11-01 13:17:06 -07:00
317brian 265c811963
docs: remove experimental note from query from deep storage docs (#15132) 2023-10-12 11:51:02 +05:30
317brian 263e106714
docs: remove experimental note from unnest docs (#15123)
* docs: remove experimental note from unnest docs

* remove flag needed to use unnest
2023-10-10 16:52:51 -07:00
317brian 2164dafb99
docs: update unnest to use crossjoin instead of comma (#15074) 2023-10-05 09:01:08 -07:00
Giulio Talarico 76e5048aab
fix supervisor spec api submission commands (#14877) 2023-08-23 14:38:09 +05:30
Benedict Jin 18f7cb6926
Fixed broken URL of python api tutorial (#14881) 2023-08-22 09:53:41 +05:30
Katya Macedo 5f74ef56f1
Clean up Kafka supervisor topic (#14651)
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-08-21 11:55:38 -07:00
317brian 6b4dda964d
Docusaurus2 upgrade for master (#14411)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-08-16 19:01:21 -07:00
Tejaswini Bandlamudi a45b25fa1d
Removes support for Hadoop 2 (#14763)
Removing Hadoop 2 support as discussed in https://lists.apache.org/list?dev@druid.apache.org:lte=1M:hadoop
2023-08-09 17:47:52 +05:30
Suneet Saldanha 590734b5eb
Update tutorial-kafka.md (#14749) 2023-08-04 10:56:33 -07:00
317brian 3b5b6c6a41
docs: query from deep storage (#14609)
* cold tier wip

* wip

* copyedits

* wip

* copyedits

* copyedits

* wip

* wip

* update rules page

* typo

* typo

* update sidebar

* moves durable storage info to its own page in operations

* update screenshots

* add apache license

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* add query from deep storage tutorial stub

* address some of the feedback

* revert screenshot update. handled in separate pr

* load rule update

* wip tutorial

* reformat deep storage endpoints

* rest of tutorial

* typo

* cleanup

* screenshot and sidebar for tutorial

* add license

* typos

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* rest of review comments

* clarify where results are stored

* update api reference for durablestorage context param

* Apply suggestions from code review

Co-authored-by: Karan Kumar <karankumar1100@gmail.com>

* comments

* incorporate #14720

* address rest of comments

* missed one

* Update docs/api-reference/sql-api.md

* Update docs/api-reference/sql-api.md

---------

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: demo-kratia <56242907+demo-kratia@users.noreply.github.com>
Co-authored-by: Karan Kumar <karankumar1100@gmail.com>
2023-08-04 11:10:08 +05:30
Nhi Pham a764ed7fde
Update Jupyter notebook tutorial instructions for ARM devices (#14459)
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-07-11 10:01:20 -07:00
Gian Merlino 63ee69b4e8
Claim full support for Java 17. (#14384)
* Claim full support for Java 17.

No production code has changed, except the startup scripts.

Changes:

1) Allow Java 17 without DRUID_SKIP_JAVA_CHECK.

2) Include the full list of opens and exports on both Java 11 and 17.

3) Document that Java 17 is both supported and preferred.

4) Switch some tests from Java 11 to 17 to get better coverage on the
   preferred version.

* Doc update.

* Update errorprone.

* Update docker_build_containers.sh.

* Update errorprone in licenses.yaml.

* Add some more run-javas.

* Additional run-javas.

* Update errorprone.

* Suppress new errorprone error.

* Add exports and opens in ForkingTaskRunner for Java 11+.

Test, doc changes.

* Additional errorprone updates.

* Update for errorprone.

* Restore old fomatting in LdapCredentialsValidator.

* Copy bin/ too.

* Fix Java 15, 17 build line in docker_build_containers.sh.

* Update busybox image.

* One more java command.

* Fix interpolation.

* IT commandline refinements.

* Switch to busybox 1.34.1-glibc.

* POM adjustments, build and test one IT on 17.

* Additional debugging.

* Fix silly thing.

* Adjust command line.

* Add exports and opens one more place.

* Additional harmonization of strong encapsulation parameters.
2023-07-07 12:52:35 -07:00
Victoria Lim 50b7e5d20e
docs: fix links (#14504) 2023-07-05 12:29:47 -07:00
317brian 2012a6bd8e
Docs: fix broken link to Python API jupyter notebook (#14332) 2023-05-31 08:12:27 +05:30
Katya Macedo 269137c682
Update Ingestion section (#14023)
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Victoria Lim <lim.t.victoria@gmail.com>
2023-05-19 09:42:27 -07:00
Abhishek Radhakrishnan 7400ed3c93
Fixup data deletion tutorial docs (#14283)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-05-17 17:05:35 -07:00
Victoria Lim 66d4ea014c
Docs: Tutorial for streaming ingestion using Kafka + Docker file to use with Jupyter tutorials (#13984) 2023-05-15 15:20:52 -07:00
317brian 6254658f61
docs: fix links (#14111) 2023-05-12 09:59:16 -07:00
Charles Smith 166cb6203b
Remove unnecessary python topic. Style changes to quickstart. (#13647)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-04-07 09:55:52 -07:00
Charles Smith 1c2744b31e
Fix querying sql (#14026)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-04-06 14:50:06 -07:00
317brian 7e572eef08
docs: sql unnest and cleanup unnest datasource (#13736)
Co-authored-by: Elliott Freis <elliottfreis@Elliott-Freis.earth.dynamic.blacklight.net>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Co-authored-by: Paul Rogers <paul-rogers@users.noreply.github.com>
Co-authored-by: Jill Osborne <jill.osborne@imply.io>
Co-authored-by: Anshu Makkar <83963638+anshu-makkar@users.noreply.github.com>
Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Elliott Freis <108356317+imply-elliott@users.noreply.github.com>
Co-authored-by: Nicholas Lippis <nick.lippis@imply.io>
Co-authored-by: Rohan Garg <7731512+rohangarg@users.noreply.github.com>
Co-authored-by: Karan Kumar <karankumar1100@gmail.com>
Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Co-authored-by: Clint Wylie <cwylie@apache.org>
Co-authored-by: Adarsh Sanjeev <adarshsanjeev@gmail.com>
Co-authored-by: Laksh Singla <lakshsingla@gmail.com>
2023-04-04 13:07:54 -07:00
Gian Merlino 4b1ffbc452
Various changes and fixes to UNNEST. (#13892)
* Various changes and fixes to UNNEST.

Native changes:

1) UnnestDataSource: Replace "column" and "outputName" with "virtualColumn".
   This enables pushing expressions into the datasource. This in turn
   allows us to do the next thing...

2) UnnestStorageAdapter: Logically apply query-level filters and virtual
   columns after the unnest operation. (Physically, filters are pulled up,
   when possible.) This is beneficial because it allows filters and
   virtual columns to reference the unnested column, and because it is
   consistent with how the join datasource works.

3) Various documentation updates, including declaring "unnest" as an
   experimental feature for now.

SQL changes:

1) Rename DruidUnnestRel (& Rule) to DruidUnnestRel (& Rule). The rel
   is simplified: it only handles the UNNEST part of a correlated join.
   Constant UNNESTs are handled with regular inline rels.

2) Rework DruidCorrelateUnnestRule to focus on pulling Projects from
   the left side up above the Correlate. New test testUnnestTwice verifies
   that this works even when two UNNESTs are stacked on the same table.

3) Include ProjectCorrelateTransposeRule from Calcite to encourage
   pushing mappings down below the left-hand side of the Correlate.

4) Add a new CorrelateFilterLTransposeRule and CorrelateFilterRTransposeRule
   to handle pulling Filters up above the Correlate. New tests
   testUnnestWithFiltersOutside and testUnnestTwiceWithFilters verify
   this behavior.

5) Require a context feature flag for SQL UNNEST, since it's undocumented.
   As part of this, also cleaned up how we handle feature flags in SQL.
   They're now hooked into EngineFeatures, which is useful because not
   all engines support all features.
2023-03-10 16:42:08 +05:30
Paul Rogers a580aca551
Python Druid API for use in notebooks (#13787)
Python Druid API for use in notebooks

Revises existing notebooks and readme to reference
the new API.

Notebook to explain the new API.

Split README into a console version and a notebook
version to work around lack of a nice display for
md files.

Update the REST API notebook to use simpler Requests calls

Converted the SQL tutorial to use the Python library

README file, converted to using properties

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-03-04 18:25:19 -08:00
317brian b4b354b658
docs: fix html nits (#13835) 2023-03-02 11:19:32 -08:00
Win Min Soe 70f9052f1d
docs: update correct config base on server spec (#13832)
Co-authored-by: Winn Minn <winn.minn@grabtaxi.com>
2023-02-23 08:50:47 -08:00
Abhishek Radhakrishnan 17a3cd0b68
Remove the additional backtick that's causing a SA issue. (#13838) 2023-02-23 09:01:08 +05:30
Katya Macedo 1595653e6f
docs: add a link for the Druid SQL tutorial (#13468)
* docs: add juptyer API tutorial for API and jupyter tutorial index (#3)

(cherry picked from commit aeb8d9e3390fa26d9c533dce0862295b80c58583)

* update prereqs and fix jupyterlab name

* Removing notebook since 13345 has it

13345 should be merged first

* update contributing instructions

* docs: link to the  Druid SQL tutorial

* Add link to partitioning

* fix merge conflict

* Saving

* Update docs/tutorials/tutorial-jupyter-index.md

* Remove partitioning

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
Co-authored-by: brian.le <brian.le@imply.io>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-02-22 09:36:13 -08:00
Guy ☀️ Moore 306997be87
Add Perl 5 to druid requirements (#13708)
Without perl 5 I was unable to start druid using the instructions in the quickstart guide. I'm not certain what versions it might require, but the one that I got working was perl 5

> This is perl 5, version 36, subversion 0 (v5.36.0) built for x86_64-linux-thread-multi
2023-02-13 13:34:49 -08:00
Victoria Lim 33efd5ab1d
docs: Refresh the update data tutorial (#13641)
Merging regardless of nit since topic is in better shape.

* refresh the update data tutorial

* Apply suggestions from code review

Co-authored-by: Jill Osborne <jill.osborne@imply.io>

---------

Co-authored-by: Jill Osborne <jill.osborne@imply.io>
2023-02-01 18:18:16 -08:00
Jill Osborne 356b0e37cf
Tutorial: Query view (#13565)
* Tutorial: Query view

* Removed duplicate file

* Update tutorial-sql-query-view.md

* Update tutorial-sql-query-view.md

* Update tutorial-sql-query-view.md

* Updated after review

* Update docs/tutorials/tutorial-sql-query-view.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update tutorial-sql-query-view.md

Update title

* Update sidebars.json

fix merge conflict w/ sidebar

* address spelling ci

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-01-27 14:29:43 -08:00
317brian 9021161c8c
doc: fix markdown spacing (#13683)
* doc: fix markdown spacing

* fix spacing
2023-01-25 16:22:49 -08:00
Vadim Ogievetsky f97bcc69d3
Docs: reword single server page (#13659)
* reword single server page

* fix typo

* Update docs/operations/single-server.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* spelling

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-01-11 21:12:52 -08:00
317brian 6bbf4266b2
docs: documentation for unnest datasource (#13479)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-01-06 11:41:11 -08:00
Kashif Faraz 0d97e658b2
Docs: Update quickstart instructions (#13611)
Changes:
- Remove specification of a Druid version in the quickstart, because the previous step
instructs downloading the latest version anyway.
- Mention usage of memory parameter in the quickstart
2022-12-22 11:51:08 +05:30
Vadim Ogievetsky 07597c687d
Docs: Remove large data file (#13595) 2022-12-19 13:14:22 +05:30
317brian d9c27d6102
docs: add index page and related stuff for jupyter tutorials (#13342) 2022-12-16 13:33:50 -08:00
Vadim Ogievetsky 2729e25295
Link to java docs (#13478)
* add link to page about selecting a JRE

* add link to script also

* simplify text
2022-12-14 11:45:23 -08:00
Rishabh Singh 4ebdfe226d
Druid automated quickstart (#13365)
* Druid automated quickstart

* remove conf/druid/single-server/quickstart/_common/historical/jvm.config

* Minor changes in python script

* Add lower bound memory for some services

* Additional runtime properties for services

* Update supervise script to accept command arguments, corresponding changes in druid-quickstart.py

* File end newline

* Limit the ability to start multiple instances of a service, documentation changes

* simplify script arguments

* restore changes in medium profile

* run-druid refactor

* compute and pass middle manager runtime properties to run-druid
supervise script changes to process java opts array
use argparse, leave free memory, logging

* Remove extra quotes from mm task javaopts array

* Update logic to compute minimum memory

* simplify run-druid

* remove debug options from run-druid

* resolve the config_path provided

* comment out service specific runtime properties which are computed in the code

* simplify run-druid

* clean up docs, naming changes

* Throw ValueError exception on illegal state

* update docs

* rename args, compute_only -> compute, run_zk -> zk

* update help documentation

* update help documentation

* move task memory computation into separate method

* Add validation checks

* remove print

* Add validations

* remove start-druid bash script, rename start-druid-main

* Include tasks in lower bound memory calculation

* Fix test

* 256m instead of 256g

* caffeine cache uses 5% of heap

* ensure min task count is 2, task count is monotonic

* update configs and documentation for runtime props in conf/druid/single-server/quickstart

* Update docs

* Specify memory argument for each profile in single-server.md

* Update middleManager runtime.properties

* Move quickstart configs to conf/druid/base, add bash launch script, support python2

* Update supervise script

* rename base config directory to auto

* rename python script, changes to pass repeated args to supervise

* remove exmaples/conf/druid/base dir

* add docs

* restore changes in conf dir

* update start-druid-auto

* remove hashref for commands in supervise script

* start-druid-main java_opts array is comma separated

* update entry point script name in python script

* Update help docs

* documentation changes

* docs changes

* update docs

* add support for running indexer

* update supported services list

* update help

* Update python.md

* remove dir

* update .spelling

* Remove dependency on psutil and pathlib

* update docs

* Update get_physical_memory method

* Update help docs

* update docs

* update method to get physical memory on python

* udpate spelling

* update .spelling

* minor change

* Minor change

* memory comptuation for indexer

* update start-druid

* Update python.md

* Update single-server.md

* Update python.md

* run python3 --version to check if python is installed

* Update supervise script

* start-druid: echo message if python not found

* update anchor text

* minor change

* Update condition in supervise script

* JVM not jvm in docs
2022-12-09 11:04:02 -08:00
317brian cc2e4a80ff
doc: add a basic JDBC tutorial (#13343)
* initial commit for jdbc tutorial

(cherry picked from commit 04c4adad71e5436b76c3425fe369df03aaaf0acb)

* add commentary

* address comments from charles

* add query context to example

* fix typo

* add links

* Apply suggestions from code review

Co-authored-by: Frank Chen <frankchen@apache.org>

* fix datatype

* address feedback

* add parameterize to spelling file. the past tense version was already there

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-11-30 16:25:35 -08:00
Jill Osborne b0db2a87d8
Update Kafka ingestion tutorial (#13261)
* Update Kafka ingestion tutorial

* Update tutorial-kafka.md

Updated location of sample data file

* Added sample data file

* Update tutorial-kafka.md

* Add sample data file

* Update tutorial-kafka.md

Updated sample file location in curl commands

* Update and reuploading sample data files

* Updated spelling file

* Delete .spelling

* Added spelling file

* Update docs/tutorials/tutorial-kafka.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/tutorials/tutorial-kafka.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Updated after review

* Update tutorial-kafka.md

* Updated

* Update tutorial-kafka.md

* Update tutorial-kafka.md

* Update tutorial-kafka.md

* Updated sample data file and command

* Add files via upload

* Delete kttm-nested-data.json.tgz

* Delete kttm-nested-data.json.tgz

* Add files via upload

* Update tutorial-kafka.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2022-11-11 14:47:54 -08:00
Vadim Ogievetsky edc444a4bc
fix quickstart (#13126) 2022-09-20 17:44:21 -07:00
Charles Smith b366a6c5a4
Add clarification around docker environment #8926 (#13084)
* Add clarification around docker environment #8926

* fix spelling

* Update docs/tutorials/docker.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* Update docs/tutorials/docker.md

Co-authored-by: Frank Chen <frankchen@apache.org>

* fix nano quickstart

Co-authored-by: Frank Chen <frankchen@apache.org>
2022-09-17 20:44:24 +08:00
Gian Merlino d4967c38f8
Various documentation updates. (#13107)
* Various documentation updates.

1) Split out "data management" from "ingestion". Break it into thematic pages.

2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so
   all conceptual content is in concepts.md and all syntax content is in reference.md.
   Shorten the known issues page to the most interesting ones.

3) Add SQL-based ingestion to the ingestion method comparison page. Remove the
   index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1.

4) Rename various mentions of "Druid console" to "web console".

5) Add additional information to ingestion/partitioning.md.

6) Remove a mention of Tranquility.

7) Remove a note about upgrading to Druid 0.10.1.

8) Remove no-longer-relevant task types from ingestion/tasks.md.

9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated.

10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some
    places, but it isn't very useful compared to index_parallel, so it shouldn't take up space
    in the sidebar.

11) Make all br tags self-closing.

12) Certain other cosmetic changes.

13) Update to node-sass 7.

* make travis use node12 for docs

Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
2022-09-16 21:58:11 -07:00
Vadim Ogievetsky 2493eb17bf
Doc fixes around msq (#13090)
* remove things that do not apply

* fix more things

* pin node to a working version

* fix

* fixes

* known issues tidy up

* revert auto formatting changes

* remove management-uis page which is 100% lies

* don't mention the Coordinator console (that no longer exits)

* goodies

* fix typo
2022-09-16 02:15:26 -07:00
Atul Mohan a8fd3a9077
Provide service specific log4j overrides in containerized deployments (#13020)
* Provide service specific log4j overrides

* Clarify comments

* Add docs
2022-09-14 11:47:11 +08:00