druid

Commit Graph

Author	SHA1	Message	Date
Kashif Faraz	2d77e1a3c6	Add support for multi dimension range partitioning (#11848 ) This PR adds support for range partitioning on multiple dimensions. It extends on the concept and implementation of single dimension range partitioning. The new partition type added is range which corresponds to a set of Dimension Range Partition classes. single_dim is now treated as a range type partition with a single partition dimension. The start and end values of a DimensionRangeShardSpec are represented by StringTuples, where each String in the tuple is the value of a partition dimension.	2021-11-06 12:50:17 +05:30
Gian Merlino	1c12dd97dc	Add javadocs to StringUtils.fromUtf8. (#11881 ) They clarify that the methods advance the position of the buffer.	2021-11-05 15:27:24 -07:00
Gian Merlino	8971056763	Properly count segment references in tests. (#11870 )	2021-11-05 12:49:10 -07:00
Clint Wylie	907e4ca0c5	use correct DimensionSpec for column value selectors created from dictionary encoded column indexers (#11873) * use correct dimension spec for column value selectors of dictionary encoded column indexers	2021-11-05 01:51:15 -07:00
zachjsh	1d6df48145	Warn if cache size of lookup is beyond max size (#11863) Enhanced the ExtractionNamespace interface in lookups-cached-global core extension with the ability to set a maxHeapPercentage for the cache of the respective namespace. The reason for adding this functionality is to make it easier to detect when a lookup table grows to a size that the underlying service cannot handle, because it does not have enough memory. The default value of maxHeap for the interface is -1, which indicates that no maxHeapPercentage has been set. For the JdbcExtractionNamespace and UriExtractionNamespace implementations, the default value is null, which will cause the respective service that the lookup is loaded in to warn when its cache is beyond maxHeapPercentage of the service's configured max heap size. If a positive non-null value is set for the namespace's maxHeapPercentage config, this value will be honored for all services that the respective lookup is loaded onto, and consequently log warning messages when the cache of the respective lookup grows beyond this respective percentage of the service's configured max heap size. Warnings are logged every time that either Uri based or Jdbc based lookups are regenerated, if the maxHeapPercentage constraint is violated. No other implementations will log warnings at this time. No error is thrown when the size exceeds the maxHeapPercentage at this time, as doing so could break functionality for existing users. Previously the JdbcCacheGenerator generated its cache by materializing all rows of the underlying table in memory at once; this made it difficult to log warning messages in the case that the results from the JDBC query were very large and caused the service to run out of memory. To help with this, this PR makes it so that the JDBC query results are instead streamed through an iterator.	2021-11-03 21:32:22 -04:00
Abhishek Agarwal	652e1491e0	Update default values for tuning parameters in kinesis data loader (#11867 )	2021-11-02 23:51:28 +05:30
Karan Kumar	cf27366b35	Fixing typos in docker build scripts (#11866 )	2021-11-02 23:50:52 +05:30
andreacyc	88bbc8e9e1	Add info for compaction config dialog (#11847) * add-info-for-compation-config-dialog * correct the info * remove space typo * Revert "remove space typo" This reverts commit `28b28733ae`. * remove typo space * update snapshots for jest-test	2021-11-02 10:03:29 -07:00
Kashif Faraz	a22687ecbe	Add Broker config `druid.broker.segment.watchRealtimeNodes` (#11732 ) The new config is an extension of the concept of "watchedTiers" where the Broker can choose to add the info of only the specified tiers to its timeline. Similarly, with this config, Broker can choose to skip the realtime nodes and thus it would query only Historical processes for any given segment.	2021-11-02 12:38:42 +05:30
Katya Macedo	5e1dc843d1	Fix quickstart link (#11864 )	2021-11-02 13:27:53 +08:00
Nolan Emirot	cd6867844f	docs: update helm flag (#11721 ) In helm v3 the --name doesn't exist	2021-11-02 13:25:49 +08:00
Sandeep	52539de521	fixes data validation error using correct way to comment the license under templates (#11839 )	2021-11-02 09:32:47 +08:00
Maytas Monsereenusorn	ba2874ee1f	Support changing query granularity in Auto Compaction (#11856 ) * add queryGranularity * fix checkstyle * fix test	2021-11-01 15:18:44 -07:00
Clint Wylie	9bd2ccbb9b	SqlAggregationModuleTest now extends CalciteTestBase to ensure consistent string encoding (#11861 )	2021-11-01 15:11:40 -07:00
Will Xu	7af36fecff	Fix travis' link behind build badge (#11858 )	2021-11-01 07:26:30 -07:00
Karan Kumar	90640bb316	Support for hadoop 3 via maven profiles (#11794 ) Add support for hadoop 3 profiles . Most of the details are captured in #11791 . We use a combination of maven profiles and resource filtering to achieve this. Hadoop2 is supported by default and a new maven profile with the name hadoop3 is created. This will allow the user to choose the profile which is best suited for the use case.	2021-10-30 22:46:24 +05:30
Maytas Monsereenusorn	33d9d9bd74	Add rollup config to auto and manual compaction (#11850 ) * add rollup to auto and manual compaction * add unit tests * add unit tests * add IT * fix checkstyle	2021-10-29 10:22:25 -07:00
Jonathan Wei	a96aed021e	Fix indefinite WAITING batch task when lock is revoked (#11788 ) * Fix indefinite WAITING batch task when lock is revoked * Use revoked property on TaskLock * Update TimeChunkLockAcquireAction to return TaskLock for revoked locks	2021-10-27 17:49:15 -05:00
Liran Funaro	9ca8f1ec97	Remove IncrementalIndex template modifier (#11160 ) Co-authored-by: Liran Funaro <liran.funaro@verizonmedia.com>	2021-10-27 13:10:37 -07:00
Gian Merlino	fc95c92806	Remove OffheapIncrementalIndex and clarify aggregator thread-safety needs. (#11124) * Remove OffheapIncrementalIndex and clarify aggregator thread-safety needs. This patch does the following: - Removes OffheapIncrementalIndex. - Clarifies that Aggregators are required to be thread safe. - Clarifies that BufferAggregators and VectorAggregators are not required to be thread safe. - Removes thread safety code from some DataSketches aggregators that had it. (Not all of them did, and that's OK, because it wasn't necessary anyway.) - Makes enabling "useOffheap" with groupBy v1 an error. Rationale for removing the offheap incremental index: - It is only used in one rare scenario: groupBy v1 (which is non-default) in "useOffheap" mode (also non-default). So you have to go pretty deep into the wilderness to get this code to activate in production. It is never used during ingestion. - Its existence complicates developer efforts to reason about how aggregators get used, because the way it uses buffer aggregators is so different from how every other query engine uses them. - It doesn't have meaningful testing. By the way, I do believe that, given the way the offheap incremental index works, it actually didn't require buffer aggregators to be thread-safe. It synchronizes on "aggregate" and doesn't call "get" until it has stopped calling "aggregate". Nevertheless, this is a bother to think about, and for the above reasons I think it makes sense to remove the code anyway. * Remove things that are now unused. * Revert removal of getFloat, getLong, getDouble from BufferAggregator. * OAK-related warnings, suppressions. * Unused item suppressions.	2021-10-26 08:05:56 -07:00
Vadim Ogievetsky	8ea9309168	Web console: update typescript 4.4 for faster build speeds (#11725 ) * update typescript * do not show pagination when there is only one page * update snapshots * fix pagination	2021-10-25 21:53:38 -07:00
Đặng Minh Dũng	4baebb231b	add `prometheus-emitter` to distribution (#11812 ) * add `prometheus-emitter` to distribution Signed-off-by: Đặng Minh Dũng <dungdm93@live.com> * add `druid-momentsketch` to distribution Signed-off-by: Đặng Minh Dũng <dungdm93@live.com>	2021-10-25 21:16:17 -07:00
Jihoon Son	07a232d7b4	Bump netty4 to 4.1.68; suppress CVE-2021-37136 and CVE-2021-37137 for netty3 (#11844 ) * bump netty4 to 4.1.68 * suppress CVE-2021-37136 and CVE-2021-37137 for netty3 * license	2021-10-25 21:09:15 -07:00
Vadim Ogievetsky	f2106d7621	Web console: Add segment size in bytes column and hide it by default (#11797 ) * add segment size column * allow hidden default column * fix tests * update e2e tests	2021-10-25 13:24:44 -07:00
Sergio Ferragut	000a5551fa	docker mem reqs (#11827) * docker mem reqs * Update docs/tutorials/docker.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Sergio Ferragut <sergio.ferragut@imply.io>	2021-10-25 12:23:25 -07:00
Gian Merlino	8276c031c5	Add druid.sql.approxCountDistinct.function property. (#11181 ) * Add druid.sql.approxCountDistinct.function property. The new property allows admins to configure the implementation for APPROX_COUNT_DISTINCT and COUNT(DISTINCT expr) in approximate mode. The motivation for adding this setting is to enable site admins to switch the default HLL implementation to DataSketches. For example, an admin can set: druid.sql.approxCountDistinct.function = APPROX_COUNT_DISTINCT_DS_HLL * Fixes * Fix tests. * Remove erroneous cannotVectorize. * Remove unused import. * Remove unused test imports.	2021-10-25 12:16:21 -07:00
Lucas Capistrant	43383c73a8	refactor BalanceSegments#balanceServers to exit early if there is no work to be done (#11768 ) * remove useless call to balanceServers for move from decom servers when there are no decom servers * refactor approach to this PR but accomplish the same thing	2021-10-25 10:06:35 -05:00
Kashif Faraz	abac9e39ed	Revert permission changes to Supervisor and Task APIs (#11819 ) * Revert "Require Datasource WRITE authorization for Supervisor and Task access (#11718)" This reverts commit `f2d6100124`. * Revert "Require DATASOURCE WRITE access in SupervisorResourceFilter and TaskResourceFilter (#11680)" This reverts commit `6779c4652d`. * Fix docs for the reverted commits * Fix and restore deleted tests * Fix and restore SystemSchemaTest	2021-10-25 14:50:38 +05:30
Charles Smith	10c5fa93f1	remove dupe sentence (#11821 )	2021-10-25 14:48:20 +05:30
Vadim Ogievetsky	4354e43983	Use existing queryId if it exists (#11834 )	2021-10-23 19:02:39 -07:00
Gian Merlino	d4cace385f	SQL: Allow Scans to be used as outer queries. (#11831 ) * SQL: Allow Scans to be used as outer queries. This has been possible in the native query system for a while, but the capability hasn't yet propagated into the SQL layer. One example of where this is useful is a query like: SELECT * FROM (... LIMIT X) WHERE <filter> Because this expands the kinds of subquery structures the SQL layer will consider, it was also necessary to improve the cost calculations. These changes appear in PartialDruidQuery and DruidOuterQueryRel. The ideas are: - Attach per-column penalties to the output signature of each query, instead of to the initial projection that starts a query. This encourages moving projections into subqueries instead of leaving them on outer queries. - Only attach penalties to projections if there are actually expressions happening. So, now, projections that simply reorder or remove fields are free. - Attach a constant penalty to every outer query. This discourages creating them when they are not needed. The changes are generally beneficial to the test cases we have in CalciteQueryTest. Most plans are unchanged, or are changed in purely cosmetic ways. Two have changed for the better: - testUsingSubqueryWithLimit now returns a constant from the subquery, instead of returning every column. - testJoinOuterGroupByAndSubqueryHasLimit returns a minimal set of columns from the innermost subquery; two unnecessary columns are no longer there. * Fix various DS operator conversions. These were all implemented as direct conversions, which isn't appropriate because they do not actually map onto native functions. These are only usable as post-aggregations. * Test case adjustment.	2021-10-23 17:18:43 -07:00
Gian Merlino	98ecbb21cd	Remove CloseQuietly and migrate its usages to other methods. (#10247 ) * Remove CloseQuietly and migrate its usages to other methods. These other methods include: 1) New method CloseableUtils.closeAndWrapExceptions, which wraps IOExceptions in RuntimeExceptions for callers that just want to avoid dealing with checked exceptions. Most usages were migrated to this method, because it looks like they were mainly attempts to avoid declaring a throws clause, and perhaps were unintentionally suppressing IOExceptions. 2) New method CloseableUtils.closeInCatch, designed to properly close something in a catch block without losing exceptions. Some usages from catch blocks were migrated here, when it seemed that they were intended to avoid checked exception handling, and did not really intend to also suppress IOExceptions. 3) New method CloseableUtils.closeAndSuppressExceptions, which sends all exceptions to a "chomper" that consumes them. Nothing is thrown or returned. The behavior is slightly different: with this method, _all_ exceptions are suppressed, not just IOExceptions. Calls that seemed like they had good reason to suppress exceptions were migrated here. 4) Some calls were migrated to try-with-resources, in cases where it appeared that CloseQuietly was being used to avoid throwing an exception in a finally block. 🎵 You don't have to go home, but you can't stay here... 🎵 * Remove unused import. * Fix up various issues. * Adjustments to tests. * Fix null handling. * Additional test. * Adjustments from review. * Fixup style stuff. * Fix NPE caused by holder starting out null. * Fix spelling. * Chomp Throwables too.	2021-10-23 17:03:21 -07:00
Clint Wylie	44a7b09190	Revert "Missing Loader parameter in generate-binary-license and generate-binary-notice py scripts (#11815 )" (#11832 ) This reverts commit `a7ee646927`.	2021-10-23 08:34:26 -07:00
Gian Merlino	b7a4c79314	Null handling fixes for DS HLL and Theta sketches. (#11830 ) * Null handling fixes for DS HLL and Theta sketches. For HLL, this fixes an NPE when processing a null in a multi-value dimension. For both, empty strings are now properly treated as nulls (and ignored) in replace-with-default mode. Behavior in SQL-compatible mode is unchanged. * Fix expectation.	2021-10-22 19:09:00 -07:00
Gian Merlino	cb9bc15e95	Fix task report streaming in https setups. (#11739 ) * Fix task report streaming in https setups. * Trivial change to re-trigger ITs.	2021-10-22 19:07:29 -07:00
Clint Wylie	02b2057371	extract generic dictionary encoded column indexing and merging stuffs (#11829 ) * extract generic dictionary encoded column indexing and merging stuffs to pave the path towards supporting other types of dictionary encoded columns * spotbugs and inspections fixes * friendlier * javadoc * better name * adjust	2021-10-22 17:31:22 -07:00
Victoria Lim	43103632fb	Docs - add description on time origin (#11826 ) * add description on time origin * reorder parameter descriptions * add example of origin value	2021-10-22 14:57:13 -07:00
Clint Wylie	741b4ed516	add output type information to ExpressionPostAggregator (#11818) * add ColumnInspector argument to PostAggregator.getType to allow post-aggs to compute their output type based on input types * add test for coverage * simplify * Remove unused imports. Co-authored-by: Gian Merlino <gian@imply.io>	2021-10-22 13:52:51 -07:00
Arun Ramani	df4894afff	Fallback to /sys/fs root when looking for cgroups (#11810 ) ProcCgroupDiscoverer builds the cgroup directory by concatenating the proc mounts and proc cgroup paths together. This doesn't seem to work in Kubernetes if the execution context is within the container. Also this isn't consistent across all Linux OSes. The fix is to fallback to / as the root and it seems to work empirically.	2021-10-21 09:51:16 +05:30
Alexander Saydakov	8cf1cbc4a9	latest datasketches-java and datasketches-memory (#11773 ) * latest datasketches-java and datasketches-memory * updated versions of datasketches-java and datasketches-memory Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>	2021-10-19 23:42:30 -07:00
David Ferlay	a7ee646927	Missing Loader parameter in generate-binary-license and generate-binary-notice py scripts (#11815 )	2021-10-20 00:25:17 +05:30
Clint Wylie	187df58e30	better types (#11713 ) * better type system * needle in a haystack * ColumnCapabilities is a TypeSignature instead of having one, INFORMATION_SCHEMA support * fixup merge * more test * fixup * intern * fix * oops * oops again * ... * more test coverage * fix error message * adjust interning, more javadocs * oops * more docs more better	2021-10-19 01:47:25 -07:00
Sandeep	17459a84d3	Update link to helm chart quickstart guide (#11801 )	2021-10-19 14:10:40 +05:30
David Bar	7d4841471f	Optimize supervisor history retrieval for specific id (#11807 ) Optimization. Fetch from the metadata store only the relevant history items for the requested supervisor id.	2021-10-19 14:08:25 +05:30
TSFenwick	9c15f938fd	fix test issue where JettyTest would fail if JettyWithResponseFilterEnabledTest ran before it (#11803 ) this change ensures that JettyTest is setting the properties it needs in case some other test overwrites them this also changes up the ordering of the call for setProperties to call super's first in case super is setting the same property	2021-10-18 12:42:41 -07:00
Charles Smith	938c1493e5	edits to kafka inputFormat (#11796) * edits to kafka inputFormat * revise conflict resolution description * tweak for clarity * Update docs/ingestion/data-formats.md * style fixes Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2021-10-15 14:01:10 -07:00
Charles Smith	6089a168ea	Docs - update dynamic config provider topic (#11795) * update dynamic config provider * update topic * add examples for dynamic config provider: * Update docs/development/extensions-core/kafka-ingestion.md * Update docs/operations/dynamic-config-provider.md * Update kafka-ingestion.md Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2021-10-14 17:51:32 -07:00
Abhishek Agarwal	4f62905be0	Fix the travis build (#11799 )	2021-10-14 16:31:51 +05:30
Agustin Gonzalez	887cecf29e	Simplify ITHttpInputSourceTest to mitigate flakiness (#11751 ) * Increment retry count to add more time for tests to pass * Re-enable ITHttpInputSourceTest * Restore original count * This test is about input source, hash partitioning takes longer and not required thus changing to dynamic * Further simplify by removing sketches	2021-10-12 11:51:27 -05:00
andreacyc	adb2237628	Fix CVE-2021-3749 reported in security vulnerabilities job (#11786 ) * Fix CVE-2021-3749 reported in security vulnerabilities job * test why test fail * update axios * remove console log for testing	2021-10-08 23:02:58 -07:00
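
The multi-dimension range partitioning described in commit 2d77e1a3c6 above represents each shard's start and end as a tuple with one string per partition dimension. The sketch below illustrates that idea only; it is not Druid's code. The names `in_range` and `shard_for` are hypothetical, and the boundary semantics (start inclusive, end exclusive, `None` meaning unbounded, no null dimension values) are assumptions made for the illustration.

```python
from typing import Optional, Sequence, Tuple

# One string per partition dimension (hypothetical stand-in for a StringTuple).
Key = Tuple[str, ...]
# (start, end) of a shard; None means unbounded on that side (assumption).
Bounds = Tuple[Optional[Key], Optional[Key]]


def in_range(key: Key, start: Optional[Key], end: Optional[Key]) -> bool:
    """Return True if key lies in [start, end), comparing tuples lexicographically."""
    if start is not None and key < start:
        return False
    if end is not None and key >= end:
        return False
    return True


def shard_for(key: Key, shards: Sequence[Bounds]) -> int:
    """Return the index of the first shard whose range contains key."""
    for index, (start, end) in enumerate(shards):
        if in_range(key, start, end):
            return index
    raise ValueError(f"no shard covers {key!r}")


# Three shards partitioned on two dimensions, e.g. (country, city).
SHARDS = [
    (None, ("DE", "berlin")),
    (("DE", "berlin"), ("US", "austin")),
    (("US", "austin"), None),
]
```

Under these assumptions, a single-dimension partition is the special case where every tuple has length one, for example `in_range(("m",), ("a",), ("z",))`.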

11325 Commits

All Branches