druid

Commit Graph

Author	SHA1	Message	Date
AmatyaAvadhanula	fbd1a07e7e	Fix kinesis IT flakiness (#12821 )	2022-08-03 17:16:16 +05:30
Peter Marshall	0a4ed3ba61	Readme - link fix to build guide (#12849 )	2022-08-03 19:32:37 +08:00
Karan Kumar	3290b49754	Log4j bump to 2.18 due to [LOG4J2-3419] (#12847 ) * Log4j bump to 2.18 due to [LOG4J2-3419] * Fixing license issues	2022-08-02 23:25:40 -07:00
Gian Merlino	2912a36a20	Use nonzero default value of maxQueuedBytes. (#12840 ) * Use nonzero default value of maxQueuedBytes. The purpose of this parameter is to prevent the Broker from running out of memory. The prior default is unlimited; this patch changes it to a relatively conservative 25MB. This may be too low for larger clusters. The risk is that throughput can decrease for queries with large resultsets or large amounts of intermediate data. However, I think this is better than the risk of the prior default, which is that these queries can cause the Broker to go OOM. * Alter calculation.	2022-08-02 17:57:27 -07:00
Gian Merlino	0ca37c20a6	Python 3 support for post-index-task. (#12841 ) * Python 3 support for post-index-task. Useful when running on macOS or any other system that doesn't have Python 2. * Encode JSON returned by read_task_file. * Adjust. * Skip needless loads. * Add a decode. * Additional decodes needed.	2022-08-02 17:53:34 -07:00
Clint Wylie	6981b1cc12	fix bugs with nested column jsonpath parser (#12831 )	2022-08-02 11:38:25 -07:00
Rohan Garg	eabce8a159	Fix flakiness in query-retry ITs (#12818 )	2022-08-02 17:20:16 +05:30
Tejaswini Bandlamudi	cceb2e849e	Perform lazy initialization of parquet extensions module (#12827 ) Historicals and middle managers crash with an `UnknownHostException` on trying to load `druid-parquet-extensions` with an ephemeral Hadoop cluster. This happens because the `fs.defaultFS` URI value cannot be resolved at start up time as the hadoop cluster may not exist at startup time. This commit fixes the error by performing initialization of the filesystem in `ParquetInputFormat.createReader()` whenever a new reader is requested.	2022-08-02 13:41:12 +05:30
Clint Wylie	6046a392b6	add DictionaryEncodedStringValueIndex implementation to NestedFieldLiteralColumnIndexSupplier (#12837 )	2022-08-01 21:40:35 -07:00
Rohan Garg	7ae6cc6e60	Fix string first/last aggregator comparator (#12773 )	2022-08-01 20:54:15 +05:30
317brian	553ff47616	fix: fix broken link to Class TTest (#12836 )	2022-07-31 10:18:14 +08:00
Clint Wylie	d96a9c1e6f	add missing selectors for explicit null columns (#12834 )	2022-07-29 19:08:58 -07:00
Clint Wylie	189e8b9d18	add NumericRangeIndex interface and BoundFilter support (#12830 ) add NumericRangeIndex interface and BoundFilter support changes: * NumericRangeIndex interface, like LexicographicalRangeIndex but for numbers * BoundFilter now uses NumericRangeIndex if comparator is numeric and there is no extractionFn * NestedFieldLiteralColumnIndexSupplier.java now supports supplying NumericRangeIndex for single typed numeric nested literal columns * better faster stronger and (ever so slightly) more understandable * more tests, fix bug * fix style	2022-07-29 18:58:49 -07:00
Paul Rogers	d52abe7b38	Today is that day - Single pass through Calcite planner (#12636 ) * Druid planner now makes only one pass through Calcite planner Resolves the issue that required two parse/plan cycles: one for validate, another for plan. Creates a clone of the Calcite planner and validator to resolve the conflict that prevented the merger.	2022-07-29 18:53:21 -07:00
Charles Smith	efbb58e90e	docs: remove maxRowsPerSegment where appropriate (#12071 ) * remove maxRowsPerSegment where appropriate * fix tutorial, accept suggestions * Update docs/design/coordinator.md * additional tutorial file * fix initial index spec * accept comments * Update docs/tutorials/tutorial-compaction.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * add back comment on maxrows per segment * Update docs/tutorials/tutorial-compaction.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * rm duplicate entry * Update native-batch-simple-task.md remove ref to `maxrowspersegment` * Update native-batch.md remove ref to `maxrowspersegment` * final tenticles * Apply suggestions from code review Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-07-28 16:52:13 +05:30
Maytas Monsereenusorn	24c345cdf0	Allow dictionary encoded column to use a more generic index interface (#12826 )	2022-07-27 15:23:00 -07:00
Paul Rogers	a8b155e9c6	Fixes for the Avatica JDBC driver (#12709 ) * Fixes for the Avatica JDBC driver Correctly implement regular and prepared statements Correctly implement result sets Fix race condition with contexts Clarify when parameters are used Prepare for single-pass through the planner * Addressed review comments * Addressed review comment	2022-07-27 15:22:40 -07:00
Atul Mohan	93a9a4b1c5	Add retention for file request logs (#12559 ) * Add retention for file request logs * Spelling	2022-07-27 08:17:02 -07:00
Rohan Garg	bf0886a8ab	Fix hash calcuation in RendezvousHasher (#12817 )	2022-07-27 12:16:27 +05:30
Jacques Arnoux	6b0b1d7af3	replaces hard-coded probe delays with helm values (#12805 )	2022-07-26 14:04:06 +05:30
Laksh Singla	2e616e633a	Determine type of `__time` column by RowSignature in case of External Datasource (#12770 ) Some queries like `REPLACE INTO ... SELECT TIME_PARSE("__time") AS __time FROM ...` fail at the Calcite layer because any column with name `__time` is considered to be of type `SqlTypeName.TIMESTAMP`. Changes: - Modify `RowSignatures.toRelDataType()` so that the type of `__time` column is determined by the RowSignature's type.	2022-07-26 12:09:40 +05:30
Charles Smith	d7d4314367	remove ref to plywood repo (#12809 )	2022-07-26 10:12:13 +08:00
PJ Fanning	188b5b0027	Upgrade to jetty 9.4.48.v20220622 due to CVEs (#12801 ) * Upgrade to jetty 9.4.48.v20220622 due to CVEs * Update licenses.yaml	2022-07-26 10:11:48 +08:00
Tejaswini Bandlamudi	5772dfd155	Peons should not report SysMonitor stats since MiddleManager reports them. (#12802 ) Sysmonitor stats (mem, fs, disk, net, cpu, swap, sys, tcp) are reported by all Druid processes, including Peons that are ephemeral in nature. Since Peons always run on the same host as the MiddleManager that spawned them and is unlikely to change, the SyMonitor metrics emitted by Peon are merely duplicates. This is often not a problem except when machines are super-beefy. Imagine a 64-core machine and 32 workers running on this machine. now you will have each Peon reporting metrics for each core. that's an increase of (32 * 64)x in the number of metrics. This leads to a metric explosion. This PR updates MetricsModule to check node role running while registering SysMonitor and not to load any existing SysMonitor$Stats.	2022-07-23 13:32:16 +05:30
Victoria Lim	6394ecfd21	update figure and reference (#12813 )	2022-07-22 15:54:25 -07:00
Maytas Monsereenusorn	5417aa2055	Fix: ParseException swallow cause Exception (#12810 ) * add impl * add impl * fix checkstyle	2022-07-22 13:46:28 -07:00
Kashif Faraz	6c96d09680	Suppress some false alarm CVEs (#12812 ) This commit suppresses the following CVEs: - CVE-2021-43138: false alarm for async-http-client - CVE-2021-34538: applicable to Hive server - CVE-2020-25638: requires hibernate update, which causes Hadoop ingestion failure - CVE-2021-27568: false alarm for accessors-smart which is a dependency of json-smart (already suppressed)	2022-07-22 22:27:31 +05:30
Kashif Faraz	9e5f0109fd	Fix CVE-2022-2048 (jetty) and CVE-2022-31159 (aws-java-sdk-s3) (#12807 ) Changes: - Upgrade aws sdk version from `1.12.37` to `1.12.264` - Upgrade jetty version from `9.4.41.v20210516` to `9.4.47.v20220610`	2022-07-21 13:08:18 +05:30
Katya Macedo	a2be685824	Remove the time bit, fix headings (#12808 ) * Remove the time bit, fix headings * Adopt review suggestions * Edits * Update smoosh file description * Adopt review suggestions * Update spelling	2022-07-20 15:37:57 -07:00
Maytas Monsereenusorn	3bf1e699ff	GREATEST/LEAST function is incorrectly specifying that it cannot return null (#12804 )	2022-07-20 14:41:24 +05:30
Katya Macedo	809bf161ce	Add a note about setting the value of maxNumConcurrentSubTasks (#12772 ) * Add clarification for combining input source * Update inputFormat note * Update maxNumConcurrentSubTasks note * Fix broken link * Update docs/ingestion/native-batch-input-source.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-07-19 15:34:21 -07:00
Adarsh Sanjeev	f3272a25f9	Add check for sqlOuterLimit to ingest queries (#12799 ) * Add check for sqlOuterLimit to ingest queries * Fix checkstyle * Add comment	2022-07-19 09:02:43 -07:00
Tejaswini Bandlamudi	cc1ff56ca5	Unregisters `RealtimeMetricsMonitor`, `TaskRealtimeMetricsMonitor` on Indexers after task completion (#12743 ) Few indexing tasks register RealtimeMetricsMonitor or TaskRealtimeMetricsMonitor with the process’s MonitorScheduler when they start. These monitors never unregister themselves (they always return true, they'd need to return false to unregister). Each of these monitors emits a set of metrics once every druid.monitoring.emissionPeriod. As a result, after executing several tasks for a while, Indexer emits metrics of these tasks even after they're long gone. Proposed Solution Since one should be able to obtain the last round of ingestion metrics after the task unregisters the monitor, introducing lastRoundMetricsToBePushed variable to keep track of the same and overriding the AbstractMonitor.monitor method in RealtimeMetricsMonitor, TaskRealtimeMetricsMonitor to implement the new logic.	2022-07-18 14:34:18 +05:30
Atul Mohan	75045970cd	S3 Ingestion from non-default endpoints (#11798 ) * Add endpoint support for s3inputsource * Changes to tests * Fix docs * Fix config * Fix inspections * Fix spelling * Remove password from toString	2022-07-15 11:03:34 -07:00
Jianhuan Liu	d4403c15aa	Upgrade prometheus version, add more labels to PrometheusEmitter (#12769 ) Changes: - Upgrade prometheus to version 0.16.0 - Add optional labels `druid_service` and `host_name` to `PrometheusEmitter`	2022-07-15 14:43:12 +05:30
Vadim Ogievetsky	f2a7970a6c	reindex flow should take order from Druid (#12790 )	2022-07-14 20:03:33 -07:00
Clint Wylie	1e0542626b	add nested column query benchmarks (#12786 )	2022-07-14 18:16:30 -07:00
Paul Rogers	ee15c238cc	Clone Calcite planner to access validator (#12708 ) Done in preparation for the "single-pass" planner.	2022-07-14 18:10:33 -07:00
Yuanli Han	50f1f5840d	show json and add search box (#12784 )	2022-07-14 17:01:30 -07:00
Yuanli Han	82315779ff	fix segment timeline bar chart (#12782 )	2022-07-14 16:58:24 -07:00
Vadim Ogievetsky	14e5b8325c	make tick formatting more robust (#12788 )	2022-07-14 16:56:53 -07:00
Clint Wylie	e25ba00470	fix bug in ObjectFlatteners.toMap which caused null values in avro-stream/avro-ocf/parquet/orc to be converted to {} instead of null in web-console sampler UI (#12785 ) * fix bug in ObjectFlatteners.toMap which caused null values in avro-stream/avro-ocf/parquet/orc to be converted to {} instead of null * fix parquet test that expected wrong behavior, my bad heh	2022-07-14 16:52:01 -07:00
Clint Wylie	05b2e967ed	druid nested data column type (#12753 ) * add new druid nested data column type * fixes and such * fixes * adjustments, more tests * self review * oops * fix and test * more better * style	2022-07-14 12:07:23 -07:00
Frank Chen	a544aff761	Document missed simple granularities (#12768 ) * Document missed simple granularities * Update docs/querying/granularities.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/querying/granularities.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-07-14 14:02:28 +08:00
zachjsh	c0380e7b0a	* fix duplicate dimension (#12778 )	2022-07-14 10:39:03 +05:30
Victoria Lim	d8f8c56f94	Docs: Index page with all SQL functions (#12771 ) * list of all functions * add function names to spelling file	2022-07-14 09:59:55 +08:00
Clint Wylie	8c33508eaf	run web-console e2e tests for java changes too (#12776 ) * run web-console e2e tests for java changes too, fix travis stages for web e2e and docs jobs * run the script test on script changes	2022-07-13 16:12:57 -07:00
Vadim Ogievetsky	c1c2104bd6	fix ordering in e2e test (#12775 )	2022-07-13 15:08:00 -07:00
Abhishek Agarwal	2ab20c9fc9	Surface more information about task status in tests (#12759 ) I see some test runs failing because task status is not as expected. It will be helpful to know what error the task has.	2022-07-13 14:53:53 +05:30
TSFenwick	8c02880d5f	Emit metrics for distribution of number of rows per segment (#12730 ) * initial commit of bucket dimensions for metrics return counts of segments that have rowcount in a bucket size for a datasource return average value of rowcount per segment in a datasource added unit test naming could use a lot of work buckets right now are not finalized added javadocs altered metrics.md * fix checkstyle issues * addressed review comments add monitor test move added functionality to new monitor update docs * address comments renamed monitor handle tombstones better update docs added javadocs * Add support for tombstones in the segment distribution * undo changes to tombstone segmentizer factory * fix accidental whitespacing changes * address comments regarding metrics documentation and rename variable to be more accurate * fix tests * fix checkstyle issues * fix broken test * undo removal of timeout	2022-07-12 07:04:42 -07:00


11945 Commits