druid

Commit Graph

Author	SHA1	Message	Date
Arun Ramani	15789137a3	Add cpu/cpuset cgroup and procfs data gathering (#11763 ) * cpu/cpuset cgroup and procfs data gathering * Renames and default values * Formatting * Trigger Build * Add cgroup monitors * Return 0 if no period * Update Co-authored-by: arunramani-imply <84351090+arunramani-imply@users.noreply.github.com>	2021-10-06 20:27:36 -07:00
Maytas Monsereenusorn	a04b08e45c	Add new config to filter internal Druid-related messages from Query API response (#11711 ) * add impl * add impl * add tests * add unit test * fix checkstyle * address comments * fix checkstyle * fix checkstyle * fix checkstyle * fix checkstyle * fix checkstyle * address comments * address comments * address comments * fix test * fix test * fix test * fix test * fix test * change config name * change config name * change config name * address comments * address comments * address comments * address comments * address comments * address comments * fix compile * fix compile * change config * add more tests * fix IT	2021-09-29 12:55:49 +07:00
Clint Wylie	fe1d8c206a	bump version to 0.23.0-SNAPSHOT (#11670 )	2021-09-08 15:56:04 -07:00
Atul Mohan	dcee99df78	Improve error message when buckets are null for cloud objects (#11644 ) * Add error message * Add test * Checkstyle	2021-09-07 17:31:17 -07:00
Sandeep	ac2b65e837	fixes possible data truncation (#11462 ) * fixes possible data truncation * fixes possible data truncation * add unit test case to catch the possible data truncation	2021-08-26 20:16:26 +08:00
Jihoon Son	2a658acad4	Put sleep in an extension (#11632 ) * Put sleep in an extension * dependency	2021-08-25 01:27:45 -07:00
Jihoon Son	78b4be467e	Add sleep function for testing (#11626 ) * Add sleep function for testing * sql function * javadoc	2021-08-24 14:30:31 +07:00
Yi Yuan	bf863343f8	delete some code (#11552 ) Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-08-16 10:40:40 -07:00
Parag Jain	c7b46671b3	option to use deep storage for storing shuffle data (#11507 ) Fixes #11297. Description Description and design in the proposal #11297 Key changed/added classes in this PR DataSegmentPusher ShuffleClient PartitionStat PartitionLocation *IntermediaryDataManager	2021-08-13 16:40:25 -04:00
frank chen	e40be0ae28	Add SQL functions to format numbers into human readable format (#10635 ) * add binary_byte_format/decimal_byte_format/decimal_format * clean code * fix doc * fix review comments * add spelling check rules * remove extra param * improve type handling and null handling * remove extra zeros * fix tests and add space between unit suffix and number as most size-format functions do * fix tests * add examples * change function names according to review comments * fix merge Signed-off-by: frank chen <frank.chen021@outlook.com> * no need to configure NullHandling explicitly for tests Signed-off-by: frank chen <frank.chen021@outlook.com> * fix tests in SQL-Compatible mode Signed-off-by: frank chen <frank.chen021@outlook.com> * Resolve review comments * Update SQL test case to check null handling * Fix intellij inspections * Add more examples * Fix example	2021-08-13 10:27:49 -07:00
Harini Rajendran	ccd362d228	Fix FileIteratingFirehoseTest to extend NullHandlingTest (#11581 )	2021-08-12 08:26:04 -07:00
Yi Yuan	23d7d71ea5	Add Environment Variable DynamicConfigProvider (#11377 ) * add_environment_variable_DynamicConfigProvider * fix code * code fixed * code fixed * add document * fix doc * fix doc * add more unit test * fix style * fix document * bug fixed * fix unit test * fix comment * fix test Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-08-04 20:26:58 -07:00
wx930910	578625b771	Replace TestInputRowHandler with mocking object (#11529 ) * Replace TestInputRowHandler with mocking object * Change EasyMock object to Mockito object. Make test logic concise * correct code format	2021-08-04 16:45:22 -07:00
Yi Yuan	aa7cb50f24	Add DynamicConfigProvider for Schema Registry (#11362 ) * add_DynamicConfigProvider_for_schema_registry * bug fixed * add document * fix document * fix spot bug * fix document * inject ObjectMapper * add DynamicConfigProviderUtils * add UT * bug fixed Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-08-03 13:24:52 -07:00
Agustin Gonzalez	a2da407b70	Add error msg to parallel task's TaskStatus (#11486 ) * Add error msg to parallel task's TaskStatus * Consolidate failure block * Add failure test * Make it fail * Add fail while stopped * Simplify hash task test using a runner that fails after so many runs (parameter) * Remove unthrown exception * Use runner names to identify phase * Added range partition kill test & fixed a timing bug with the custom runner * Forbidden api * Style * Unit test code cleanup * Added message to invalid state exception and improved readability of the phase error messages for the parallel task failure unit tests	2021-08-02 12:11:28 -07:00
Xavier Léauté	4bca7f014e	update error-prone to 2.8.0 with fix for crashing check (#11494 ) * error-prone 2.8.0 fixes https://github.com/google/error-prone/issues/2396 * fix for a few ignored return values * fix unknown args in sub-modules	2021-07-29 09:13:46 -07:00
Jihoon Son	8729b40893	Add the error message in taskStatus for task failures in overlord (#11419 ) * add error messages in taskStatus for task failures in overlord * unused imports * add helper message for logs to look up * fix tests * fix counting the same task failures more than once * same fix for HttpRemoteTaskRunner	2021-07-15 13:14:28 -07:00
Suneet Saldanha	49e8732e4f	Display errors for invalid timezones in TIME_FORMAT (#11423 ) Users sometimes make typos when picking timezones - like `America/Los Angeles` instead of `America/Los_Angeles` instead of defaulting to UTC, this change makes it so that an error is thrown instead notifying the user of their mistake.	2021-07-09 06:07:13 -07:00
Clint Wylie	63fcd77c38	support using mariadb connector with mysql extensions (#11402 ) * support using mariadb connector with mysql extensions * cleanup and more tests * fix test * javadocs, more tests, etc * style and more test * more test more better * missing pom * more pom	2021-07-08 12:25:37 -07:00
Clint Wylie	17efa6f556	add single input string expression dimension vector selector and better expression planning (#11213 ) * add single input string expression dimension vector selector and better expression planning * better * fixes * oops * rework how vector processor factories choose string processors, fix to be less aggressive about vectorizing * oops * javadocs, renaming * more javadocs * benchmarks * use string expression vector processor with vector size 1 instead of expr.eval * better logging * javadocs, surprising number of the the * more * simplify	2021-07-06 11:20:49 -07:00
frank chen	906a704c55	Eliminate ambiguities of KB/MB/GB in the doc (#11333 ) * GB ---> GiB * suppress spelling check * MB --> MiB, KB --> KiB * Use IEC binary prefix * Add reference link * Fix doc style	2021-06-30 13:42:45 -07:00
Clint Wylie	df9b57aa1a	bitwise aggregators, better null handling options for expression agg (#11280 ) * bitwise aggregators, better nulls for expression agg * correct behavior * rework deserialize, better names * fix json, share mask	2021-06-25 16:51:16 -07:00
Xavier Léauté	712f2a5d00	upgrade error-prone to 2.7.1 and support checks with Java 11+ (#11363 ) * upgrade error-prone to 2.7.1 and support checks with Java 11+ - upgrade error-prone to 2.7.1 - support running error-prone with Java 11 and above using -Xplugin instead of custom compiler - add compiler arguments to ignore warnings/errors in Java 15/16 - introduce strictCompile property to enable strict profiles since we now need multiple strict profiles for Java 8 - properly exclude all generated source files from error-prone - fix druid-processing overriding annotation processors from parent pom - fix druid-core disabling most non-default checks - align plugin and annotation errorprone versions - fix / suppress additional issues found by error-prone: * fix bug in SeekableStreamSupervisor initializing ArrayList size with the taskGroupdId * fix missing @Override annotations - remove outdated compiler plugin in benchmarks - remove deleted ParameterPackage error-prone rule - re-enable checks on benchmark module as well * fix IntelliJ inspections * disable LongFloatConversion due to bug in error-prone with JDK 8 * add comment about InsecureCrypto	2021-06-16 12:55:34 -07:00
Clint Wylie	bfbd7ec432	fix a bugs related to SQL type inference return type nullability (#11327 ) * fix a bunch of type inference nullability bugs * fixes * style * fix test * fix concat	2021-06-15 12:26:59 -07:00
Clint Wylie	920aa414ca	enrich expression cache key information to support expressions which depend on external state (#11358 ) * enrich expression cache key information to support expressions which depend on external state such as lookups * cache rules everything around me * low carb * rename	2021-06-14 17:26:43 -07:00
dependabot[bot]	167044f715	Bump fastutil from 8.2.3 to 8.5.4 (#11347 ) * Bump fastutil from 8.2.3 to 8.5.4 Bumps [fastutil](https://github.com/vigna/fastutil) from 8.2.3 to 8.5.4. - [Release notes](https://github.com/vigna/fastutil/releases) - [Changelog](https://github.com/vigna/fastutil/blob/master/CHANGES) - [Commits](https://github.com/vigna/fastutil/compare/8.2.3...8.5.4) --- updated-dependencies: - dependency-name: it.unimi.dsi:fastutil dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * update licenses.yaml * update maven dependency list for -core and -extra libraries to pass maven dependency checks Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xavier Léauté <xvrl@apache.org>	2021-06-10 07:43:18 -07:00
Maytas Monsereenusorn	e5633d7842	Fix bug: 502 bad gateway thrown when we edit/delete any auto compaction config created 0.21.0 or before (#11311 ) * fix bug * add test * fix IT * fix checkstyle * address comments	2021-05-27 16:34:32 -07:00
Clint Wylie	2bfcee5824	Fix issue with empty array converting to string expression instead of string array (#11270 )	2021-05-22 09:31:28 +08:00
Clint Wylie	6d08a7051e	fix bug with aggregator expressions on realtime index with string columns always producing 0 values (#11185 ) * fix bug with aggregator expressions on realtime index with string columns always producing 0 values * more test * rework some stuff * javadocs	2021-05-17 11:59:13 -07:00
Clint Wylie	3649c608d2	array handling improvements (#11233 ) * fix jdbc array handling, split handling for some array and multi value operator, split and add more tests * formatting	2021-05-13 18:50:32 -07:00
Clint Wylie	790262e5d0	add estimated byte size limit enforcement for heap based expression aggregator (#11236 )	2021-05-12 01:21:50 -07:00
Maytas Monsereenusorn	3455352241	Add feature to automatically remove compaction configurations for inactive datasources (#11232 ) * add auto cleanup * add auto cleanup * add auto cleanup * add tests * add tests * use retryutils * use retryutils * use retryutils * address comments	2021-05-11 18:49:18 -07:00
Maytas Monsereenusorn	3a660bc6ee	Make sure updating coordinator config is protected against race condition (#11144 ) * Make sure changing coordinator config is protected against concurrent updates * Make sure updating coordinator config is protected against race condition * add retry * fix checkstyle * add tests * add tests * add more tests * add tests * fix * fix checkstyle	2021-05-10 13:58:08 -07:00
Jihoon Son	2df42143ae	Fix idempotence of segment allocation and task report apis in native batch ingestion (#11189 ) * Fix idempotence of segment allocation and task report apis in native batch ingestion * better error and javadoc * checkstyle and dependency * fix tests and add more tests * task config instead of context; add doc * unused import and dependency * typo in doc * fix unintended changes * fix wrong import * remove unnecessary error handling * add task context back * default task context * fix test and doc * address comments * unused imports	2021-05-07 14:29:48 -07:00
Clint Wylie	554f1ffeee	ARRAY_AGG sql aggregator function (#11157 ) * ARRAY_AGG sql aggregator function * add javadoc * spelling * review stuff, return null instead of empty when nil input * review stuff * Update sql.md * use type inference for finalize, refactor some things	2021-05-03 22:17:10 -07:00
Gian Merlino	ad028de538	InDimFilter: Fix NPE involving certain Set types. (#11169 ) * InDimFilter: Fix NPE involving certain Set types. Normally, InDimFilters that come from JSON have HashSets for "values". However, programmatically-generated filters (like the ones from #11068) may use other set types. Some set types, like TreeSets with natural ordering, will throw NPE on "contains(null)", which causes the InDimFilter's ValueMatcher to throw NPE if it encounters a null value. This patch adds code to detect if the values set can support contains(null), and if not, wrap that in a null-checking lambda. Also included: - Remove unneeded NullHandling.needsEmptyToNull method. - Update IndexedTableJoinable to generate a TreeSet that does not require lambda-wrapping. (This particular TreeSet is how I noticed the bug in the first place.) * Test fixes. * Improve test coverage	2021-04-28 14:13:42 -07:00
Clint Wylie	57ff1f9cdb	expression aggregator (#11104 ) * add experimental expression aggregator * add test * fix lgtm * fix test * adjust test * use not null constant * array_set_concat docs * add equals and hashcode and tostring * fix it * spelling * do multi-value magic for expression agg, more javadocs, tests * formatting * fix inspection * more better * nullable	2021-04-22 18:30:16 -07:00
Maytas Monsereenusorn	6d2b5cdd7e	Add feature to automatically remove audit logs based on retention period (#11084 ) * add docs * add impl * fix checkstyle * fix test * add test * fix checkstyle * fix checkstyle * fix test * Address comments * Address comments * fix spelling * fix docs	2021-04-20 17:10:43 -07:00
Maytas Monsereenusorn	f968400170	Introduce a new configuration that skip storing audit payload if payload size exceed limit and skip storing null fields for audit payload (#11078 ) * Add config to skip storing audit payload if exceed limit * fix checkstyle * change config name * skip null fields for audit payload * fix checkstyle * address comments * fix guice * fix test * add tests * address comments * address comments * address comments * fix checkstyle * address comments * fix test * fix test * address comments * Address comments Co-authored-by: Jihoon Son <jihoonson@apache.org> Co-authored-by: Jihoon Son <jihoonson@apache.org>	2021-04-13 20:18:28 -07:00
chenyuzhi459	b8423a38df	add round test (#11088 ) * add round test * code style * handle null val for round function * handle null val for round function * support null for round * fix compatiblity * fix test * fix test * code style * optimize format	2021-04-13 11:36:32 -07:00
Lucas Capistrant	8264203cee	Allow client to configure batch ingestion task to wait to complete until segments are confirmed to be available by other (#10676 ) * Add ability to wait for segment availability for batch jobs * IT updates * fix queries in legacy hadoop IT * Fix broken indexing integration tests * address an lgtm flag * spell checker still flagging for hadoop doc. adding under that file header too * fix compaction IT * Updates to wait for availability method * improve unit testing for patch * fix bad indentation * refactor waitForSegmentAvailability * Fixes based off of review comments * cleanup to get compile after merging with master * fix failing test after previous logic update * add back code that must have gotten deleted during conflict resolution * update some logging code * fixes to get compilation working after merge with master * reset interrupt flag in catch block after code review pointed it out * small changes following self-review * fixup some issues brought on by merge with master * small changes after review * cleanup a little bit after merge with master * Fix potential resource leak in AbstractBatchIndexTask * syntax fix * Add a Compcation TuningConfig type * add docs stipulating the lack of support by Compaction tasks for the new config * Fixup compilation errors after merge with master * Remove erreneous newline	2021-04-08 21:03:00 -07:00
Clint Wylie	338886fd5f	vector group by support for string expressions (#11010 ) * vector group by support for string expressions * fix test * comments, javadoc	2021-04-08 19:23:39 -07:00
Xavier Léauté	15bdd6bc2f	Fix unit tests and GC settings for Java 15 (#11074 ) * JavaScript script engine support was removed in JDK 15: skip those tests for JDKs without it * Fix flaky HTTP client tests with Java 15 * Switch from CMS to G1GC in integration tests, since CMS is no longer available in JDK 15	2021-04-08 10:33:37 -07:00
Jihoon Son	cfcebc40f6	Allow list for JDBC connection properties to address CVE-2021-26919 (#11047 ) * Allow list for JDBC connection properties to address CVE-2021-26919 * fix tests for java 11	2021-04-01 17:30:47 -07:00
Jihoon Son	43ea184b74	Add explicit EOF and use assert instead of exception (#11041 )	2021-03-31 09:41:57 -07:00
Gian Merlino	bf20f9e979	DruidInputSource: Fix issues in column projection, timestamp handling. (#10267 ) * DruidInputSource: Fix issues in column projection, timestamp handling. DruidInputSource, DruidSegmentReader changes: 1) Remove "dimensions" and "metrics". They are not necessary, because we can compute which columns we need to read based on what is going to be used by the timestamp, transform, dimensions, and metrics. 2) Start using ColumnsFilter (see below) to decide which columns we need to read. 3) Actually respect the "timestampSpec". Previously, it was ignored, and the timestamp of the returned InputRows was set to the `__time` column of the input datasource. (1) and (2) together fix a bug in which the DruidInputSource would not properly read columns that are used as inputs to a transformSpec. (3) fixes a bug where the timestampSpec would be ignored if you attempted to set the column to something other than `__time`. (1) and (3) are breaking changes. Web console changes: 1) Remove "Dimensions" and "Metrics" from the Druid input source. 2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for compatibility with the new behavior. Other changes: 1) Add ColumnsFilter, a new class that allows input readers to determine which columns they need to read. Currently, it's only used by the DruidInputSource, but it could be used by other columnar input sources in the future. 2) Add a ColumnsFilter to InputRowSchema. 3) Remove the metric names from InputRowSchema (they were unused). 4) Add InputRowSchemas.fromDataSchema method that computes the proper ColumnsFilter for given timestamp, dimensions, transform, and metrics. 5) Add "getRequiredColumns" method to TransformSpec to support the above. * Various fixups. * Uncomment incorrectly commented lines. * Move TransformSpecTest to the proper module. * Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting. * Fix. * Fix build. * Checkstyle. * Misc fixes. * Fix test. * Move config. * Fix imports. * Fixup. * Fix ShuffleResourceTest. * Add import. * Smarter exclusions. * Fixes based on tests. Also, add TIME_COLUMN constant in the web console. * Adjustments for tests. * Reorder test data. * Update docs. * Update docs to say Druid 0.22.0 instead of 0.21.0. * Fix test. * Fix ITAutoCompactionTest. * Changes from review & from merging.	2021-03-25 10:32:21 -07:00
Jihoon Son	a041933017	Allow overlapping intervals for the compaction task (#10912 ) * Allow overlapping intervals for the compaction task * unused import * line indentation Co-authored-by: Maytas Monsereenusorn <maytasm@apache.org>	2021-03-23 11:21:54 -07:00
Xavier Léauté	1061faa6ba	prefer string concatenation over String.format in performance sensitive code (#10997 ) String.format relies on regex parsing, which makes these calls expensive at higher request volumes.	2021-03-16 22:06:26 -07:00
Clint Wylie	4cd4a22f87	expression filter support for vectorized query engines (#10613 ) * expression filter support for vectorized query engines * remove unused codes * more tests * refactor, more tests * suppress * more * more * more * oops, i was wrong * comment * remove decorate, object dimension selector, more javadocs * style	2021-03-16 11:46:50 -07:00
Abhishek Agarwal	c66951a59e	Add flag in SQL to disable left base filter optimization for joins (#10947 ) * Add flag to disable left base filter * code coverage * Draft * Review comments * code coverage * add docs * Add old tests	2021-03-09 13:07:34 -08:00

1 2 3 4 5 ...

375 Commits