12481 Commits

Author SHA1 Message Date
Tejaswini Bandlamudi 7103cb4b9d
Removes FiniteFirehoseFactory and its implementations (#12852)
The FiniteFirehoseFactory and InputRowParser classes were deprecated in 0.17.0 (#8823) in favor of InputSource & InputFormat. This PR removes the FiniteFirehoseFactory and all its implementations, along with classes used solely by them, such as Fetcher (used by PrefetchableTextFilesFirehoseFactory). It also refactors classes, including tests, that used FiniteFirehoseFactory to use InputSource instead.
Removing InputRowParser is not as trivial, because many classes that aren't deprecated depend on it (with no alternatives), such as EventReceiverFirehoseFactory. Hence FirehoseFactory, EventReceiverFirehoseFactory, and Firehose are marked deprecated.
2023-03-02 18:07:17 +05:30
Nicholas Lippis 1aae37f7d6
Fix expectedSingleiContainerOutput.yaml spelling (#13870) 2023-03-02 00:07:15 -08:00
Abhishek Radhakrishnan 775f89c75b
Include workaround for CycloneDX is causing POM build errors to web-console as well (#13874) 2023-03-02 00:06:49 -08:00
Clint Wylie 38ac71ee56
one version of mockito is more than enough (#13871) 2023-03-01 23:27:18 -08:00
Nicholas Lippis d32dc1b0c9
Remove K8sOverlordConfig.java (#13866) 2023-03-02 09:43:48 +05:30
Elliott Freis d046cee3e4
Workaround for CycloneDX is causing POM build errors (#13867)
Co-authored-by: Elliott Freis <elliottfreis@Elliott-Freis.earth.dynamic.blacklight.net>
2023-03-01 16:47:13 -08:00
Apoorv Gupta b26f1b4a5d
Update datasources.md: Fix Documentation. (#13865)
Fixed documentation to clarify that union queries can't be run over query datasources.
2023-03-01 20:29:15 +05:30
Laksh Singla ca68fd93a6
Generate tombstones when running MSQ's replace (#13706)
When running REPLACE queries, the segments which contain no data are dropped (marked as unused). This PR aims to generate tombstones in place of segments which contain no data to mark their deletion, as is the behavior with the native ingestion.

This will cause InsertCannotReplaceExistingSegmentFault to be removed since it was generated if the interval to be marked unused didn't fully overlap one of the existing segments to replace.
2023-03-01 12:01:30 +05:30
Clint Wylie 6cf754b0e0
move numeric null value coercion out of expression processing engine (#13809)
* move numeric null value coercion out of expression processing engine
* add ExprEval.valueOrDefault() to allow consumers to automatically coerce to default values
* rename Expr.buildVectorized to Expr.asVectorProcessor for more consistent naming with Function and ApplyFunction; javadocs for some stuff
2023-02-28 18:10:07 -08:00
Elliott Freis faf602108b
We never want to restore caches that aren't an exact match to the commit SHA (#13863)
Co-authored-by: Elliott Freis <elliottfreis@Elliott-Freis.earth.dynamic.blacklight.net>
2023-02-28 13:59:49 -08:00
AdheipSingh 22e516fd53
Update kubernetes.md (#13858) 2023-02-28 11:20:24 -08:00
Clint Wylie 1d8fff4096
sampler + type detection = bff (#13711)
* sampler + type detection = bff
* split logical and physical dimensions, tidy up
2023-02-28 04:14:30 -08:00
Vadim Ogievetsky 13721f5998
upgrade druid query toolkit (#13848) 2023-02-28 14:34:21 +05:30
Kashif Faraz 12f62e2c42
Clarify doc of ingest/handoff/time metric (#13856) 2023-02-28 10:37:47 +05:30
Gian Merlino aeb1187a7d
Fix NPE in KinesisSupervisor#setupRecordSupplier. (#13859)
* Fix NPE in KinesisSupervisor#setupRecordSupplier.

PR #13539 refactored record supplier creation and introduced a bug:
this method would throw NPE when recordsPerFetch was not provided
by the user. recordsPerFetch isn't needed in this context at all,
since the supervisor-side supplier doesn't fetch records. So this
patch sets it to zero.

* Remove unused imports.
2023-02-27 19:55:28 -08:00
Tejaswini Bandlamudi e2461c21c4
fix flaky BatchIndex IT failures. (#13855) 2023-02-27 17:23:14 -08:00
Gian Merlino 6f7f391762
Remove unused imports. (#13860)
Crept in during #13842. Possibly logical conflict with another PR.
2023-02-27 15:14:34 -08:00
Suneet Saldanha 31c7de1087
Make CompactionSearchPolicy injectable (#13842)
* Make CompactionSearchPolicy injectable

A small refactoring that makes the search policy for compaction injectable.

Future changes can introduce new search policies that can be configured and
injected so that operators can choose which search policy is best suited for
their cluster.

This will also allow us to de-couple the scheduling of compaction jobs from
the CompactSegments duty, allowing the co-ordinator to schedule compaction
jobs faster than the duty lifecycle.

This PR is made so that it is easy to review the future changes.

* fix tests
2023-02-27 07:57:03 -08:00
Abhishek Agarwal 48f4330100
Make leader redirection work when both plainText and TLS ports are set (#13847)
When both plainText and TLS ports are set in Druid, the redirection to a different leader node can fail. This is caused by how we compare a redirect path and the leader locations registered with a Druid node. While the registered location has both plainText and TLS ports set, the redirect path only has one port since it's a URI.
2023-02-26 21:23:29 +05:30
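The mismatch described in "Make leader redirection work when both plainText and TLS ports are set" can be illustrated with a minimal sketch. This is not Druid's code: `NodeLocation` and `matches_redirect` are hypothetical names, and the rule used here (pick the registered port that corresponds to the URI scheme) is an assumption about one reasonable way to compare a two-port location with a one-port URI.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class NodeLocation:
    """Hypothetical registered location carrying both ports."""
    host: str
    plaintext_port: Optional[int]
    tls_port: Optional[int]


def matches_redirect(location: NodeLocation, redirect_url: str) -> bool:
    """Return True if the redirect URL points at the registered location.

    A URI carries a single port, so compare it only against the
    registered port that belongs to the URI's scheme (assumption).
    """
    parts = urlsplit(redirect_url)
    if parts.hostname != location.host.lower():
        return False
    if parts.scheme == "https":
        expected = location.tls_port
    else:
        expected = location.plaintext_port
    return expected is not None and parts.port == expected


leader = NodeLocation("coordinator1", plaintext_port=8081, tls_port=8281)
print(matches_redirect(leader, "http://coordinator1:8081/druid/coordinator/v1/leader"))
print(matches_redirect(leader, "https://coordinator1:8281/druid/coordinator/v1/leader"))
```

Comparing the whole location for equality would fail in this situation, because the URI side can never carry both ports at once.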
Kashif Faraz 54da38b508
Add missing license for jakarta.activation against module druid-avro-extensions (#13845) 2023-02-26 17:06:23 +05:30
Karan Kumar 6bb5effa7b
Better logging for MSQ worker task (#13790)
* Adding more logs to MSQ worker implementation which makes it easier to debug.
2023-02-26 03:24:24 +05:30
Victoria Lim e46379ba7a
Docs: Update name of the metadata tables (#13734)
* Update name of the metadata tables

* emend spelling file

* fix spelling

---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-02-23 13:57:59 -08:00
tejasparbat d74d6824ec
update LDAP endpoint (#13839)
Current DOC at step https://druid.apache.org/docs/latest/operations/auth-ldap.html#add-an-ldap-user-to-druid-and-assign-a-role
Example request to add the LDAP user myuser to Druid:
curl -i -v  -H "Content-Type: application/json" -u internal -X POST http://localhost:8081/druid-ext/basic-security/authentication/db/ldap/users/myuser 
Example request to assign the myuser user to the queryRole role:
curl -i -v  -H "Content-Type: application/json" -u internal -X POST http://localhost:8081/druid-ext/basic-security/authentication/db/ldap/users/myuser/roles/queryRole

Expected:
Example request to add the LDAP user myuser to Druid:
curl -i -v  -H "Content-Type: application/json" -u internal -X POST http://localhost:8081/druid-ext/basic-security/authorization/db/ldapauth/users/myuser 
Example request to assign the myuser user to the queryRole role
curl -i -v  -H "Content-Type: application/json" -u internal -X POST http://localhost:8081/druid-ext/basic-security/authorization/db/ldapauth/users/myuser/roles/queryRole
2023-02-23 13:55:06 -08:00
Vadim Ogievetsky e4e6c7ed01
make completions smarter (#13830) 2023-02-23 10:17:10 -08:00
Win Min Soe 70f9052f1d
docs: update correct config base on server spec (#13832)
Co-authored-by: Winn Minn <winn.minn@grabtaxi.com>
2023-02-23 08:50:47 -08:00
Elliott Freis 471cb82af3
Use built-in java rather than spending time to install (#13834)
Co-authored-by: Elliott Freis <elliottfreis@Elliott-Freis.earth.dynamic.blacklight.net>
2023-02-23 08:50:04 -08:00
Jason Witkowski f7a5fcf30f
helm: Add serviceAccounts, rbac, and small fixes (#13747)
Update the suggested segment-cache path, allow for per-service serviceAccounts and finer-grained RBAC in the Druid helm chart, and add a default annotation to the historical statefulset.
2023-02-23 11:42:03 +05:30
Abhishek Radhakrishnan 786172947e
Skip tests when there are markdown only changes. (#13836) 2023-02-23 11:39:53 +05:30
hqx871 79f04e71a1
Hadoop based batch ingestion support range partition (#13303)
This PR implements range partitioning for Hadoop-based ingestion. For details about multi-dimension range partitioning, see #11848.
2023-02-23 11:38:03 +05:30
Abhishek Radhakrishnan 17a3cd0b68
Remove the additional backtick that's causing a SA issue. (#13838) 2023-02-23 09:01:08 +05:30
benkrug 66034dd8bc
Update default for finalize in query-context.md (#13763)
---------

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
2023-02-22 12:35:36 -08:00
Paul Rogers 914eebb4b7
Wire up the catalog resolver (#13788)
Introduces the catalog resolver interface
Wires the resolver up to the planner factory
Refactors planner factory
2023-02-22 11:42:32 -08:00
Katya Macedo 1595653e6f
docs: add a link for the Druid SQL tutorial (#13468)
* docs: add Jupyter API tutorial for API and Jupyter tutorial index (#3)

(cherry picked from commit aeb8d9e3390fa26d9c533dce0862295b80c58583)

* update prereqs and fix jupyterlab name

* Removing notebook since 13345 has it

13345 should be merged first

* update contributing instructions

* docs: link to the Druid SQL tutorial

* Add link to partitioning

* fix merge conflict

* Saving

* Update docs/tutorials/tutorial-jupyter-index.md

* Remove partitioning

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
Co-authored-by: brian.le <brian.le@imply.io>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-02-22 09:36:13 -08:00
Adarsh Sanjeev aceeac91d4
Fix MSQ IT test (#13808) 2023-02-22 08:14:46 -08:00
Abhishek Agarwal d2dbb8b2c0
Fix infinite checkpointing between tasks and overlord (#13825)
If the intermediate handoff period is less than the task duration and there is no new data in the input topic, the task will continuously checkpoint the same offsets again and again. This PR fixes that bug by resetting the checkpoint time even when the task receives the same end offset request again.
2023-02-22 19:25:59 +05:30
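The behavior described in "Fix infinite checkpointing between tasks and overlord" can be sketched as a small timer model. This is an illustration only, not the Druid implementation: the class `CheckpointTimer`, its methods, and the use of plain numeric timestamps are all hypothetical. The point it shows is that the next checkpoint time is pushed forward on every end-offset request, including one that repeats the previous offsets.

```python
class CheckpointTimer:
    """Hypothetical model of a task's intermediate-handoff timer."""

    def __init__(self, handoff_period: float, now: float) -> None:
        self.handoff_period = handoff_period
        self.next_checkpoint_at = now + handoff_period
        self.end_offsets = None

    def should_checkpoint(self, now: float) -> bool:
        # The task asks for a checkpoint once the handoff period has elapsed.
        return now >= self.next_checkpoint_at

    def on_set_end_offsets(self, offsets: dict, now: float) -> bool:
        """Handle an end-offset request; return True if offsets changed."""
        changed = offsets != self.end_offsets
        self.end_offsets = dict(offsets)
        # Reset the timer even when the offsets are unchanged. Without
        # this, an idle topic would leave the timer expired and the task
        # would request the same checkpoint immediately, over and over.
        self.next_checkpoint_at = now + self.handoff_period
        return changed


timer = CheckpointTimer(handoff_period=10, now=0)
timer.on_set_end_offsets({0: 100}, now=10)   # first checkpoint
timer.on_set_end_offsets({0: 100}, now=20)   # same offsets, no new data
print(timer.should_checkpoint(now=21))       # timer was reset at t=20
```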
Kashif Faraz 3a67a43c8a
Add method SegmentTimeline.addSegments (#13831) 2023-02-21 23:58:01 -08:00
317brian 07883e311e
doc: fix unnecessary link (#13785)
CI errors look unrelated to this change.
2023-02-21 17:34:46 -08:00
zachjsh 665dee43bf
Revert "Operator conversion deny list (#13766)" (#13829)
This reverts commit 38e620aa4c.
2023-02-21 15:14:49 -08:00
Abhishek Radhakrishnan 9e9976001c
Add ANSI_QUOTES property to DBI init in lookups. (#13826) 2023-02-21 15:13:22 -08:00
Abhishek Radhakrishnan 8595271b55
Fixup typos in integration-test README. (#13828) 2023-02-21 15:12:37 -08:00
Paul Rogers 5dadbdf4d0
Generate the IT docker-compose.yaml files (#13669)
Generate IT docker-compose.yaml files

Generates test-specific docker-compose.yaml files using a simple
Python template script.
2023-02-21 15:03:02 -08:00
benkrug c6b1576fc1
Update clean-metadata-store.md (#13131)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-02-21 12:53:54 -08:00
Clint Wylie 614205f3bc
fix some intellij inspections in druid-processing (#13823)
fix some intellij inspections in druid-processing
2023-02-21 09:02:02 +05:30
Lucas Capistrant 46eafa57e1
Improve client change counter management in HTTP Server View (#13010)
* Avoid calling resolveWaitingFutures if there are no changes made

* Avoid telling HTTP server view clients to reset their counter when the counter is valid
2023-02-20 17:32:27 +05:30
Tejaswini Bandlamudi e788f1ae6b
Add option to run standard & revised ITs manually on PRs (#13814)
Create the Docker image also when restoring the Maven dependencies cache fails, since the env.sh file is removed on a Maven rebuild.
Increase the Java heap size for the security IT, which was failing with an error.
2023-02-20 16:15:15 +05:30
Gian Merlino 882ae9f002
Speed up composite key joins on IndexedTable. (#13516)
* Speed up composite key joins on IndexedTable.

Prior to this patch, IndexedTable indexes are sorted IntList. This works
great when we have a single-column join key: we simply retrieve the list
and we know what rows match. However, when we have a composite key, we
need to merge the sorted lists. This is inefficient when one is very dense
and others are very sparse.

This patch switches from sorted IntList to IntSortedSet, and changes
to the following intersection algorithm:

1) Initialize the intersection set to the smallest matching set from the
   various parts of the composite key.

2) For each element in that smallest set, check other sets for that element.
   If any do *not* include it, then remove the element from the intersection
   set.

This way, complexity scales with the size of the smallest set, not the
largest one.

* RangeIntSet stuff.
2023-02-17 22:01:01 -08:00
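The two-step intersection described in "Speed up composite key joins on IndexedTable" can be sketched as follows. This is a minimal illustration using plain Python sets rather than the IntSortedSet type named in the commit; the function name `intersect_smallest_first` is hypothetical and the code is not taken from Druid.

```python
from typing import Iterable, Set


def intersect_smallest_first(row_sets: Iterable[Set[int]]) -> Set[int]:
    """Intersect row-number sets, one per part of a composite key."""
    candidates = list(row_sets)
    if not candidates:
        return set()

    # Step 1: start from the smallest matching set.
    smallest = min(candidates, key=len)
    others = [s for s in candidates if s is not smallest]
    result = set(smallest)

    # Step 2: drop every element that some other set does not contain.
    for row in smallest:
        if any(row not in other for other in others):
            result.discard(row)
    return result


dense = set(range(1_000_000))     # key part that matches almost every row
sparse = {3, 500, 2_000_000}      # key part that matches very few rows
print(sorted(intersect_smallest_first([dense, sparse])))
```

The loop runs once per element of the smallest set and does one membership check per other set, so the work is governed by the sparse side even when another set is very dense.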
Paul Rogers 85d36be085
Information schema now uses numeric column types (#13777)
Change to use SQL schemas to allow null numeric columns

* Updated docs
2023-02-17 14:39:31 -08:00
Clint Wylie 08b5951cc5
merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything (#13698)
* merge druid-core, extendedset, and druid-hll into druid-processing to simplify everything
* fix poms and license stuff
* mockito is evil
* allow reset of JvmUtils RuntimeInfo if tests used static injection to override
2023-02-17 14:27:41 -08:00
Abhishek Agarwal 8d03ace1b4
Use K3S instead of minikube for integration tests (#13782)
We are seeing failures on GHA while using minikube so switching to K3S instead.
2023-02-17 23:06:30 +05:30
Katya Macedo bc8b710b7e
Fix broken link (#13767) 2023-02-17 09:02:12 -08:00