31 Commits

Author SHA1 Message Date
Clint Wylie 9b1779734b
fix website mvn build (#14458)
changes:
* fix website mvn build
* remove the i18n/en.json file add to gitignore
* add spellcheck to mvn test phase
2023-06-22 12:14:23 -07:00
Victoria Lim 66d4ea014c
Docs: Tutorial for streaming ingestion using Kafka + Docker file to use with Jupyter tutorials (#13984) 2023-05-15 15:20:52 -07:00
frankgrimes97 2f98675285
Tuple sketch SQL support (#13887)
This PR is a follow-up to #13819 so that the Tuple sketch functionality can be used in SQL for both ingestion using Multi-Stage Queries (MSQ) and also for analytic queries against Tuple sketch columns.
2023-03-28 18:47:12 +05:30
Victoria Lim ede9903ff4
pip install for Python Druid API (#13938)
Broken test appears unrelated to this PR

* make druidapi pip installable

* include druidapi in prerequisites

* add license to setup.py

* updates from Paul's review

* note about editable install

* Apply suggestions from code review

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* update install instructions

* found unrelated typos

* standardize install cmd with pip

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
2023-03-21 11:37:39 -07:00
Paul Rogers a580aca551
Python Druid API for use in notebooks (#13787)
Revises existing notebooks and readme to reference
the new API.

Notebook to explain the new API.

Split README into a console version and a notebook
version to work around lack of a nice display for
md files.

Update the REST API notebook to use simpler Requests calls

Converted the SQL tutorial to use the Python library

README file, converted to using properties

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-03-04 18:25:19 -08:00
Paul Rogers 5dadbdf4d0
Generate the IT docker-compose.yaml files (#13669)
Generate IT docker-compose.sh files

Generates test-specific docker-compose.sh files using a simple
Python template script.
2023-02-21 15:03:02 -08:00
317brian d9c27d6102
docs: add index page and related stuff for jupyter tutorials (#13342) 2022-12-16 13:33:50 -08:00
317brian 668d1fad6b
docs: notebook only for API tutorial (#13345)
* docs: notebook for API tutorial

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* address the other comments

* typo

* add commentary to outputs

* address feedback from will

* delete unnecessary comment

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-12-15 13:16:07 -08:00
Vadim Ogievetsky 2a039e7e6a
Add CTA and fix typo (#13009)
* Add CTA and fix typo

* resolve hostname better
2022-09-06 11:16:50 -07:00
Paul Rogers cfed036091
Add the new integration test framework (#12368)
This commit is a first draft of the revised integration test framework which provides:
- A new directory, integration-tests-ex that holds the new integration test structure. (For now, the existing integration-tests is left unchanged.)
- Maven module druid-it-tools to hold code placed into the Docker image.
- Maven module druid-it-image to build the Druid-only test image from the tarball produced in distribution. (Dependencies live in their "official" image.)
- Maven module druid-it-cases that holds the revised tests and the framework itself. The framework includes file-based test configuration, test-specific clients, test initialization and updated versions of some of the common test support classes.

The integration test setup is primarily a huge mass of details. This approach refactors many of those details: from how the image is built and configured to how the Docker Compose scripts are structured to test configuration. An extensive set of "readme" files explains those details. Rather than repeat that material here, please consult those files for explanations.
2022-08-24 17:03:23 +05:30
Gian Merlino ca4e64aea3
Frame processing and channels. (#12848)
* Frame processing and channels.

Follow-up to #12745. This patch adds three new concepts:

1) Frame channels are interfaces for doing nonblocking reads and writes
   of frames.

2) Frame processors are interfaces for doing nonblocking processing of
   frames received from input channels and sent to output channels.

3) Cluster-by keys, which can be used for sorting or partitioning.

The patch also adds SuperSorter, a user of these concepts, both to
illustrate how they are used, and also because it is going to be useful
in future work.

Central classes:

- ReadableFrameChannel. Implementations include
  BlockingQueueFrameChannel (in-memory channel that implements both interfaces),
  ReadableFileFrameChannel (file-based channel),
  ReadableByteChunksFrameChannel (byte-stream-based channel), and others.

- WritableFrameChannel. Implementations include BlockingQueueFrameChannel
  and WritableStreamFrameChannel (byte-stream-based channel).

- ClusterBy, a sorting or partitioning key.

- FrameProcessor, nonblocking processor of frames. Implementations include
  FrameChannelBatcher, FrameChannelMerger, and FrameChannelMuxer.

- FrameProcessorExecutor, an executor service that runs FrameProcessors.

- SuperSorter, a class that uses frame channels and processors to
  do parallel external merge sort of any amount of data (as long as there
  is enough disk space).

* Additional tests, fixes.

* Changes from review.

* Better implementation for ReadableInputStreamFrameChannel.

* Rename getFrameFileReference -> newFrameFileReference.

* Add InterruptedException to runIncrementally; add more tests.

* Cancellation adjustments.

* Review adjustments.

* Refactor BlockingQueueFrameChannel, rename doneReading and doneWriting to close.

* Additional changes from review.

* Additional changes.

* Fix test.

* Adjustments.

* Adjustments.
2022-08-04 21:29:04 -07:00
Gian Merlino ef6811ef88
Improved Java 17 support and Java runtime docs. (#12839)
* Improved Java 17 support and Java runtime docs.

1) Add a "Java runtime" doc page with information about supported
   Java versions, garbage collection, and strong encapsulation.

2) Update asm and equalsverifier to versions that support Java 17.

3) Add additional "--add-opens" lines to surefire configuration, so
   tests can pass successfully under Java 17.

4) Switch openjdk15 tests to openjdk17.

5) Update FrameFile to specifically mention Java runtime incompatibility
   as the cause of not being able to use Memory.map.

6) Update SegmentLoadDropHandler to log an error for Errors too, not
   just Exceptions. This is important because an IllegalAccessError is
   encountered when the correct "--add-opens" line is not provided,
   which would otherwise be silently ignored.

7) Update example configs to use druid.indexer.runner.javaOptsArray
   instead of druid.indexer.runner.javaOpts. (The latter is deprecated.)

* Adjustments.

* Use run-java in more places.

* Add run-java.

* Update .gitignore.

* Exclude hadoop-client-api.

Brought in when building on Java 17.

* Swap one more usage of java.

* Fix the run-java script.

* Fix flag.

* Include link to Temurin.

* Spelling.

* Update examples/bin/run-java

Co-authored-by: Xavier Léauté <xl+github@xvrl.net>
2022-08-03 23:16:05 -07:00
Paul Rogers ffcb996468
Cleanup changes pulled out of PR #12368 (#12672)
This commit contains the cleanup needed for the new integration test framework.

Changes:
- Fix log lines, misspellings, docs, etc.
- Allow the use of some of Druid's "JSON config" objects in tests
- Fix minor bug in `BaseNodeRoleWatcher`
2022-06-23 23:19:50 +05:30
Paul Rogers 34a3d45737
Refactor ResponseContext (#11828)
* Refactor ResponseContext

Fixes a number of issues in preparation for request trailers
and the query profile.

* Converts keys from an enum to classes for smaller code
* Wraps stored values in functions for easier capture for other uses
* Reworks the "header squeezer" to handle types other than arrays.
* Uses metadata for visibility, and ability to compress,
  to replace ad-hoc code.
* Cleans up JSON serialization for the response context.
* Other miscellaneous cleanup.

* Handle unknown keys in deserialization

Also, make "Visibility" into a boolean.

* Revised comment

* Renamed variable
2021-12-06 17:03:12 -08:00
Paul Rogers a66f10eea1
Code cleanup from query profile project (#11822)
* Code cleanup from query profile project

* Fix spelling errors
* Fix Javadoc formatting
* Abstract out repeated test code
* Reuse constants in place of some string literals
* Fix up some parameterized types
* Reduce warnings reported by Eclipse

* Reverted change due to lack of tests
2021-11-30 11:35:38 -08:00
Sandeep b1de56a3be
update Druid Chart README doc and removes unnecessary lock file (#11945)
* update Druid Chart README doc and removes unnecessary lock file
2021-11-22 21:34:26 +08:00
Chi Cao Minh 84c1c2505d
Web console basic end-to-end-test (#9595)
Load data and query (i.e., automate
https://druid.apache.org/docs/latest/tutorials/tutorial-batch.html) to
have some basic checks ensuring the web console is wired up to druid
correctly.

The new end-to-end tests (tutorial-batch.spec.ts) are added to
`web-console/e2e-tests`. Within that directory:
- `components` represent the various tabs of the web console. Currently,
  abstractions for `load data`, `ingestion`, `datasources`, and `query`
  are implemented.
- `components/load-data/data-connector` contains abstractions for the
  different data source options available to the data loader's `Connect`
  step. Currently, only the `Local file` data source connector is
  implemented.
- `components/load-data/config` contains abstractions for the different
  configuration options available for each step of the data loader flow.
  Currently, the `Configure Schema`, `Partition`, and `Publish` steps
  have initial implementation of their configuration options.
- `util` contains various helper methods for the tests and does not
  contain abstractions of the web console.

Changes to add the new tests to CI:
- `.travis.yml`: New "web console end-to-end tests" job
- `web-console/jest.*.js`: Refactor jest configurations to have
  different flavors for unit tests and for end-to-end tests. In
  particular, the latter adds a jest setup configuration to wait for the
  web console to be ready (`web-console/e2e-tests/util/setup.ts`).
- `web-console/package.json`: Refactor run scripts to add new script for
  running end-to-end tests.
- `web-console/script/druid`: Utility scripts for building, starting,
  and stopping druid.

Other changes:
- `pom.xml`: Refactor various settings to disable java static checks and to
  disable java tests into two new maven profiles. Since the same
  settings are used in several places (e.g., .travis.yml, Dockerfiles,
  etc.), having them in maven profiles makes it more maintainable.
- `web-console/src/console-application.tsx`: Fix typo ("the the").
2020-04-09 12:38:09 -07:00
Clint Wylie 010f70b371
autogenerate NOTICE.BINARY from NOTICE and licenses.yaml (#8306)
* migrate binary notice entries to live in licenses.yaml, use licenses.yaml and NOTICE to generate NOTICE.BINARY at distribution time

* +x

* move release scripts to distribution/bin, fixup notice script, trim dependencies for avro and kerberos in licenses.yaml

* add missing hdfs-storage dependencies

* revert to old syntax, fixes

* formatting

* update notices for recently updated dependencies
2019-08-21 12:46:27 -07:00
Roman Leventov 782863ed0f Fix some problems reported by PVS-Studio (#7738)
* Fix some problems reported by PVS-Studio

* Address comments
2019-05-29 11:20:45 -07:00
Clint Wylie 6a6c6d573d
Add plain text README.txt, use relative link from README.md to build.md (#7611)
* use relative link to build instructions from top level readme

* add textfile to readme

* formatting

* make README.BINARY plaintext, move LABELS.md to LABELS, README.txt to README

* exclude README.BINARY still

* remove jdk links/recommendations

* add script to use DRUIDVERSION in textfile README instead of latest, add links to recommended jdk to build.md

* license

* better readme template, links to latest if does not detect an apache release version

* fix
2019-05-09 21:29:26 -07:00
Jonathan Wei 5486c2abf8
Update LICENSE and NOTICE files (#7026)
* Update LICENSE and NOTICE files

* Update react-table version
2019-03-04 18:45:22 -08:00
QiuMM 765f46af5b git ignore dependency-reduced-pom.xml (#4711) 2017-08-23 10:10:50 -07:00
Bingkun Guo dffa89a5bf move distribution artifacts to distribution/target 2015-10-30 12:40:05 -05:00
Gian Merlino 022a577b7b .gitignore new distribution artifacts. 2015-10-29 12:44:18 -07:00
cheddar 8b480e55db Add docs from github wiki 2013-09-13 17:20:39 -05:00
cheddar 2361e0112a Make it all compile again... 2013-08-02 10:14:46 -07:00
Dhruv Parthasarathy 4e4a0a7953 removed DS_Store files 2013-06-25 14:58:12 -07:00
Eric Tschetter 0a0e2a6cc1 1) Try to fix the dependency issues for running the HadoopDruidIndexer locally. 2012-11-08 17:06:02 -08:00
Ian Brandt bd8d5bddd8 Ignored all Eclipse .settings folders. 2012-10-30 22:21:55 -07:00
Matt Croydon 9575326c00 Added twitter and rand example output files to .gitignore. 2012-10-24 11:54:24 -05:00
Eric Tschetter 9d41599967 Initial commit of OSS Druid Code 2012-10-24 03:39:51 -04:00