Much improved table functions
* Revises properties, definitions in the catalog
* Adds a "table function" abstraction to model such functions
* Specific functions for HTTP, inline, local and S3.
* Extended SQL types in the catalog
* Restructure external table definitions to use table functions
* EXTEND syntax for Druid's extern table function
* Support for array-valued table function parameters
* Support for array-valued SQL query parameters
* Much new documentation
Support both Indexer and MiddleManager (MM) in ITs
Support for the DRUID_INTEGRATION_TEST_INDEXER variable
Conditional client cluster configuration
Cleanup of OVERRIDE_ENV file handling
Enforce setting of test-specific env vars
Cleanup of unused bits
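
The enforcement of test-specific variables could look roughly like the sketch below. Only the variable name DRUID_INTEGRATION_TEST_INDEXER comes from this change; the helper name `require_env`, the accepted values, and the use of Python rather than the actual test scripts are assumptions made for illustration.

```python
import os

# Hypothetical set of accepted values; the change itself does not list them.
ALLOWED_INDEXERS = ("indexer", "middleManager")


def require_env(name, allowed, env=None):
    """Return a required variable's value, failing loudly if it is unset or unexpected."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} must be set to one of {allowed}")
    if value not in allowed:
        raise RuntimeError(f"{name}={value!r} is not one of {allowed}")
    return value
```

Failing before the cluster starts, instead of silently falling back to a default, is what makes the variable "enforced" in this sketch.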
* Semantic Implementations for ArrayListRAC
This adds implementations of semantic interfaces
to optimize (eliminate object creation) the
window processing on top of an ArrayListSegment.
Tests are also added to cover the interplay
between the semantic interfaces that are expected
for this use case.
* migrate UTs from Travis to GHA
* update permissions
* rename file
* set fetch depth to 1
* debug remote branches
* test with github.ref variable
* fetch github.base_ref for diff
* nit
* test git diff
* run tests
* test code coverage failure scenario
* nit
* nit
* revert code changes
* revert code changes
* Setup diff-test-coverage before tests
* build distribution module at end in packaging check
* nit
* remove redundant steps in static-checks workflow
* drop jdk8 unit tests from Travis
* Kinesis: More robust default fetch settings.
1) Default recordsPerFetch and recordBufferSize based on available memory
rather than using hardcoded numbers. For this, we need an estimate
of record size. Use 10 KB for regular records and 1 MB for aggregated
records. With 1 GB heaps, 2 processors per task, and nonaggregated
records, recordBufferSize comes out to the same as the old
default (10000), and recordsPerFetch comes out slightly lower (1250
instead of 4000).
2) Default maxRecordsPerPoll based on whether records are aggregated
or not (100 if not aggregated, 1 if aggregated). Prior default was 100.
3) Default fetchThreads based on processors divided by task count on
Indexers, rather than overall processor count.
4) Additionally clean up the serialized JSON a bit by adding various
JsonInclude annotations.
* Updates for tests.
* Additional important verification.
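
The arithmetic behind the memory-based Kinesis defaults can be sketched as follows. The record size estimates (10 KB and 1 MB) and the maxRecordsPerPoll values come from the description above. The heap fractions, the absolute cap, and the function names are assumptions, chosen only so that the quoted numbers (10000 and 1250) come out; they are not taken from the change itself.

```python
# Record size estimates stated in the change description.
REGULAR_RECORD_BYTES = 10_000        # 10 KB
AGGREGATED_RECORD_BYTES = 1_000_000  # 1 MB

# ASSUMPTIONS (not stated in the text): heap share per setting and an absolute cap.
BUFFER_HEAP_DIVISOR = 10       # 1/10 of the heap for the record buffer
FETCH_HEAP_DIVISOR = 20        # 1/20 of the heap for in-flight fetches
MAX_BUDGET_BYTES = 100_000_000


def _record_bytes(aggregated):
    return AGGREGATED_RECORD_BYTES if aggregated else REGULAR_RECORD_BYTES


def default_record_buffer_size(heap_bytes, aggregated):
    """Number of records that fit in the buffer's memory budget."""
    budget = min(MAX_BUDGET_BYTES, heap_bytes // BUFFER_HEAP_DIVISOR)
    return max(1, budget // _record_bytes(aggregated))


def default_records_per_fetch(heap_bytes, fetch_threads, aggregated):
    """Per-thread fetch size so that all threads together stay within budget."""
    budget = min(MAX_BUDGET_BYTES, heap_bytes // FETCH_HEAP_DIVISOR)
    return max(1, budget // _record_bytes(aggregated) // fetch_threads)


def default_max_records_per_poll(aggregated):
    """100 for regular records, 1 for aggregated records (as described above)."""
    return 1 if aggregated else 100
```

With a heap of 10^9 bytes, nonaggregated records, and 4 fetch threads (assuming two threads per processor for a 2-processor task), `default_record_buffer_size` gives 10000 and `default_records_per_fetch` gives 1250, matching the figures quoted above.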
* single-typed, "root"-only nested columns now mimic "regular" columns of those types
* incremental index can now use nested column indexer instead of string indexer for discovered columns
* reword single server page
* fix typo
* Update docs/operations/single-server.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* spelling
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Quote and escape table, key and column names.
* fix typo.
* More select statements.
* Derby lookup tests create quoted identifiers so they are compatible.
* Use StringUtils.replace() utility.
* quote the filter string.
* Squash double-quote usage into a single function.
* Add parameterized test with reserved identifiers.
* few changes.
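
As an illustration of the quoting approach, here is a minimal sketch that wraps identifiers in double quotes and escapes embedded quotes by doubling them, the standard SQL convention. The function names and the query shape are hypothetical; the actual change may use a different quote character or escaping rule per database.

```python
def quote_identifier(name, quote='"'):
    """Wrap an identifier in quotes, doubling any embedded quote character."""
    escaped = name.replace(quote, quote * 2)
    return f"{quote}{escaped}{quote}"


def select_value_sql(table, key_column, value_column):
    """Build a lookup query in which every identifier goes through one quoting function."""
    return "SELECT {v} FROM {t} WHERE {k} = ?".format(
        v=quote_identifier(value_column),
        t=quote_identifier(table),
        k=quote_identifier(key_column),
    )
```

Routing every identifier through a single function is what lets reserved words such as `table` or `key` be used as names: `select_value_sql("table", "key", "value")` yields `SELECT "value" FROM "table" WHERE "key" = ?`.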
* Addition of NaiveSortMaker and Default implementation
Add the NaiveSortMaker which makes a sorter
object and a default implementation of the
interface.
This also allows us to plan multiple different window
definitions on the same query.
* Validate response headers and fix exception logging
A class of QueryExceptions was throwing away its
causes, making it really hard to determine what went
wrong, specifically in the SQL planner. Fix that and
adjust tests to do more validation of response
headers as well.
We allow 404s and 307s to be returned even without
authorization validated, but other statuses get
converted to 403.
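
The status rule described above can be sketched as a small function. The names are hypothetical and this is not the actual filter code, only the decision it describes.

```python
# Statuses that may be returned even if no authorization check was recorded.
UNCHECKED_PASSTHROUGH = frozenset({307, 404})


def effective_status(status, authorization_checked):
    """Return the status to send, converting unchecked responses to 403."""
    if authorization_checked or status in UNCHECKED_PASSTHROUGH:
        return status
    return 403
```

For example, an unchecked 200 becomes 403, while an unchecked 404 is passed through unchanged.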
* New IT Framework - InputSource and InputFormat Tests
* Fixing checkstyle errors
* Updating InputSource setup
* Updating queries to use druid DB
* Making metadata setup queries idempotent
* Restore IntelliJ files
Changes:
- Remove specification of a Druid version in the quickstart, because the previous step
instructs downloading the latest version anyway.
- Mention usage of memory parameter in the quickstart
* perf: provide a custom utf8 specific buffered line iterator (benchmark)
Benchmark                         Mode  Cnt     Score     Error  Units
JsonLineReaderBenchmark.baseline  avgt   15  3459.871 ± 106.175  us/op
* perf: provide a custom utf8 specific buffered line iterator
Benchmark                         Mode  Cnt     Score     Error  Units
JsonLineReaderBenchmark.baseline  avgt   15  3022.053 ±  51.286  us/op
* perf: provide a custom utf8 specific buffered line iterator (more tests)
* perf: provide a custom utf8 specific buffered line iterator (pr feedback)
Ensure field visibility is as limited as possible
Null check for buffer in constructor
* perf: provide a custom utf8 specific buffered line iterator (pr feedback)
Remove additional 'finished' variable.
* perf: provide a custom utf8 specific buffered line iterator (more tests and bugfix)
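
The general idea of a UTF-8-specific buffered line iterator can be sketched as below: scan raw bytes for the newline byte and decode only complete lines. This is an illustrative Python sketch, not the implementation from this change; the function name and buffer size are assumptions.

```python
import io


def iter_utf8_lines(stream, buffer_size=8192):
    """Yield decoded lines (without the trailing newline) from a binary stream."""
    if stream is None:
        raise ValueError("stream must not be None")
    pending = b""
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:
            break
        pending += chunk
        start = 0
        while True:
            # 0x0A never occurs inside a multi-byte UTF-8 sequence,
            # so splitting on the raw byte is safe.
            end = pending.find(b"\n", start)
            if end < 0:
                break
            yield pending[start:end].decode("utf-8")
            start = end + 1
        # Keep the unfinished tail for the next read.
        pending = pending[start:]
    if pending:
        # Final line without a trailing newline.
        yield pending.decode("utf-8")
```

Because decoding happens per complete line, a multi-byte character that straddles two reads is never decoded in halves. The sketch does not strip `\r`, so Windows-style line endings would need extra handling.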
* Unify the handling of HTTP between SQL and Native
The SqlResource and QueryResource have been
using independent logic for things like error
handling and response context stuff. This
became abundantly clear and painful during a
change I was making for Window Functions, so
I unified them into using the same code for
walking the response and serializing it.
Things are still not perfectly unified (it would
be the absolute best if the SqlResource just
took SQL, planned it and then delegated the
query run entirely to the QueryResource), but
this refactor doesn't take that fully on.
The new code leverages async query processing
from our Jetty container. The different
interaction model with the Resource means that
a lot of tests had to be adjusted to align with
the async query model. The semantics of the
tests remain the same with one exception: the
SqlResource used to not log requests that failed
authorization checks; now it does.