druid

Commit Graph

Author	SHA1	Message	Date
Benedict Jin	2f64414ade	Add "REVERSE" / "REPEAT" / "RIGHT" / "LEFT" functions (#7334 ) * Add "REVERSE" / "REPEAT" / "RIGHT" / "LEFT" functions * Fix ImportOrder * Use RuntimeException instead of OutOfMemoryError according to "Effective Java" * Simplify * Patch suggestions	2019-04-10 11:46:29 +08:00
Clint Wylie	89bb43f382	'core' ORC extension (#7138 ) * orc extension reworked to use apache orc map-reduce lib, moved to core extensions, support for flattenSpec, tests, docs * change binary handling to be compatible with avro and parquet, Rows.objectToStrings now converts byte[] to base64, change date handling * better docs and tests * fix it * formatting * doc fix * fix it * exclude redundant dependencies * use latest orc-mapreduce, add hadoop jobProperties recommendations to docs * doc fix * review stuff and fix binaryAsString * cache for root level fields * more better	2019-04-09 09:03:26 -07:00
Gian Merlino	8c104a115c	SQL: Add STRING_FORMAT function. (#7327 )	2019-04-03 17:09:54 -04:00
Jonathan Wei	2bf6fc353a	Update scan benchmark for time ordering (#7385 )	2019-03-30 11:36:25 -07:00
Justin Borromeo	ad7862c58a	Time Ordering On Scans (#7133 ) * Moved Scan Builder to Druids class and started on Scan Benchmark setup * Need to form queries * It runs. * Stuff for time-ordered scan query * Move ScanResultValue timestamp comparator to a separate class for testing * Licensing stuff * Change benchmark * Remove todos * Added TimestampComparator tests * Change number of benchmark iterations * Added time ordering to the scan benchmark * Changed benchmark params * More param changes * Benchmark param change * Made Jon's changes and removed TODOs * Broke some long lines into two lines * nit * Decrease segment size for less memory usage * Wrote tests for heapsort scan result values and fixed bug where iterator wasn't returning elements in correct order * Wrote more tests for scan result value sort * Committing a param change to kick teamcity * Fixed codestyle and forbidden API errors * . * Improved conciseness * nit * Created an error message for when someone tries to time order a result set > threshold limit * Set to spaces over tabs * Fixing tests WIP * Fixed failing calcite tests * Kicking travis with change to benchmark param * added all query types to scan benchmark * Fixed benchmark queries * Renamed sort function * Added javadoc on ScanResultValueTimestampComparator * Unused import * Added more javadoc * improved doc * Removed unused import to satisfy PMD check * Small changes * Changes based on Gian's comments * Fixed failing test due to null resultFormat * Added config and get # of segments * Set up time ordering strategy decision tree * Refactor and pQueue works * Cleanup * Ordering is correct on n-way merge -> still need to batch events into ScanResultValues * WIP * Sequence stuff is so dirty :( * Fixed bug introduced by replacing deque with list * Wrote docs * Multi-historical setup works * WIP * Change so batching only occurs on broker for time-ordered scans Restricted batching to broker for time-ordered queries and adjusted tests Formatting Cleanup * Fixed mistakes in merge * Fixed failing tests * Reset config * Wrote tests and added Javadoc * Nit-change on javadoc * Checkstyle fix * Improved test and appeased TeamCity * Sorry, checkstyle * Applied Jon's recommended changes * Checkstyle fix * Optimization * Fixed tests * Updated error message * Added error message for UOE * Renaming * Finish rename * Smarter limiting for pQueue method * Optimized n-way merge strategy * Rename segment limit -> segment partitions limit * Added a bit of docs * More comments * Fix checkstyle and test * Nit comment * Fixed failing tests -> allow usage of all types of segment spec * Fixed failing tests -> allow usage of all types of segment spec * Revert "Fixed failing tests -> allow usage of all types of segment spec" This reverts commit `ec470288c7`. * Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge" This reverts commit `57033f36df`, reversing changes made to `8f01d8dd16`. * Check type of segment spec before using for time ordering * Fix bug in numRowsScanned * Fix bug messing up count of rows * Fix docs and flipped boolean in ScanQueryLimitRowIterator * Refactor n-way merge * Added test for n-way merge * Refixed regression * Checkstyle and doc update * Modified sequence limit to accept longs and added test for long limits * doc fix * Implemented Clint's recommendations	2019-03-28 14:37:09 -07:00
Roman Leventov	bca40dcdaf	Fix some IntelliJ inspections (#7273 ) Prepare TeamCity for IntelliJ 2018.3.1 upgrade. Mostly removed redundant exceptions declarations in `throws` clauses.	2019-03-25 21:11:01 -03:00
Gian Merlino	4ca5fe0f60	SQL: Add PARSE_LONG function. (#7326 ) * SQL: Add PARSE_LONG function. * Fix test.	2019-03-22 15:40:10 -07:00
Roman Leventov	dfd27e00c0	Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality (#7185 ) * Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality * Fix DruidCoordinatorTest; Renamed DruidCoordinator.getReplicationStatus() to computeUnderReplicationCountsPerDataSourcePerTier() * More Javadocs, typos, refactor DruidCoordinatorRuntimeParams.createAvailableSegmentsSet() * Style * typo * Disable StaticPseudoFunctionalStyleMethod inspection because of too much false positives * Fixes	2019-03-19 18:22:56 -03:00
Jihoon Son	892d1d35d6	Deprecate NoneShardSpec and drop support for automatic segment merge (#6883 ) * Deprecate noneShardSpec * clean up noneShardSpec constructor * revert unnecessary change * Deprecate mergeTask * add more doc * remove convert from indexMerger * Remove mergeTask * remove HadoopDruidConverterConfig * fix build * fix build * fix teamcity * fix teamcity * fix ServerModule * fix compilation * fix compilation	2019-03-15 23:29:25 -07:00
Furkan KAMACI	7ada1c49f9	Prohibit Throwables.propagate() (#7121 ) * Throw caught exception. * Throw caught exceptions. * Related checkstyle rule is added to prevent further bugs. * RuntimeException() is used instead of Throwables.propagate(). * Missing import is added. * Throwables are propagated if possible. * Throwables are propagated if possible. * Throwables are propagated if possible. * Throwables are propagated if possible. * * Checkstyle definition is improved. * Throwables.propagate() usages are removed. * Checkstyle pattern is changed for only scanning "Throwables.propagate(" instead of checking lookbehind. * Throwable is kept before firing a Runtime Exception. * Fix unused assignments.	2019-03-14 18:28:33 -03:00
Clint Wylie	3895914aa2	consolidate CompressionUtils.java since now in the same jar (#6908 )	2019-03-13 11:02:44 -04:00
Clint Wylie	4d3987c1dd	lifecycle stage refactor to ensure proper start and stop ordering of servers and announcements (#7234 ) * lifecycle stage refactor to ensure proper ordering of servers and announcements * move DerivativeDataSourceManager to Lifecycle.Stage.NORMAL	2019-03-12 07:09:03 -07:00
Xue Yu	65118277a3	support sin cos etc trigonometric function in sql (#7182 ) * support triangle function in sql * feedback address	2019-03-04 19:18:22 -08:00
Roman Leventov	10c9f6d708	Fix and document concurrency of EventReceiverFirehose and TimedShutoffFirehose; Refine concurrency specification of Firehose (#7038 ) #### `EventReceiverFirehoseFactory` Fixed several concurrency bugs in `EventReceiverFirehoseFactory`: - Race condition over putting an entry into `producerSequences` in `checkProducerSequence()`. - `Stopwatch` used to measure time across threads, but it's a non-thread-safe class. - Use `System.nanoTime()` instead of `System.currentTimeMillis()` because the latter is [not suitable](https://stackoverflow.com/a/351571/648955) for measuring time intervals. - `close()` was not synchronized but could be called from multiple threads concurrently. Removed unnecessary `readLock` (protecting `hasMore()` and `nextRow()` which are always called from a single thread). Removed unnecessary `volatile` modifiers. Documented threading model and concurrent control flow of `EventReceiverFirehose` instances. Important: please read the updated Javadoc for `EventReceiverFirehose.addAll()`. It allows events from different requests (batches) to be interleaved in the buffer. Is this OK? #### `TimedShutoffFirehoseFactory` - Fixed a race condition that was possible because `close()` was not properly synchronized. Documented threading model and concurrent control flow of `TimedShutoffFirehose` instances. #### `Firehose` Refined concurrency contract of `Firehose` based on `EventReceiverFirehose` implementation. Importantly, now it states that `close()` doesn't affect `hasMore()` and `nextRow()` and could be called concurrently with them. In other words, specified that `close()` is for "row supply" side rather than "row consume" side. However, I didn't check that other `Firehose` implementations adhere to this contract. <hr> This issue is the result of reviewing `EventReceiverFirehose` and `TimedShutoffFirehose` using [this checklist](https://medium.com/@leventov/code-review-checklist-java-concurrency-49398c326154).	2019-03-04 18:50:03 -03:00
Jihoon Son	9a62157a06	Make MapBasedRow immutable (#7130 ) * Make MapBasedRow immutable * add null check	2019-02-28 16:07:14 -08:00
Jihoon Son	cacdc83cad	Improve error message for integer overflow in compaction task (#7131 ) * improve error message for integer overflow in compaction task * fix build	2019-02-28 11:07:37 +08:00
Mirko Jotic	f6a8e030cc	Select query failing if milliseconds used as time for indexing (#6937 ) * [#1332] Fix - select failing if millis used for idx. * Formatting correction. * Address comment: throw original exception. * Using constant values in tests - Try converting to Integer and then multiply by 1000L to achieve millis. - If not successful try converting to Long or rethrow original exception. * DateTime#of has to support "2011-01-01T00:00:00" - in addition to seconds and millisecs, this method currently supports even a date string. * Handle only millisec timestamps and ISO8601 strings	2019-02-25 14:36:01 -08:00
Jihoon Son	9a066558a4	Fix exception when the scheme is missing in endpointUrl for S3 (#7129 ) * Fix exception when the scheme is missing in endpointUrl for S3 * add null check	2019-02-25 11:10:35 -08:00
Jihoon Son	4e2b085201	Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911 ) * Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file * delete descriptor.file when killing segments * fix test * Add doc for ha * improve warning	2019-02-20 15:10:29 -08:00
Justin Borromeo	871b9d2f4c	[Benchmarking] Call blackhole#consume() on collections instead of iterating through each element (#7002 ) * Replaced iteration with blackhole#consume(the collection) * Added javadoc on Sequence#toList()	2019-02-20 08:48:06 -08:00
Clint Wylie	cadb6c5280	Missing Overlord and MiddleManager api docs (#7042 ) * document middle manager api * re-arrange * correction * document more missing overlord api calls, minor re-arrange of some code i was referencing * fix it * this will fix it * fixup * link to other docs	2019-02-19 10:52:05 -08:00
Jihoon Son	c9f21bc782	Fix filterSegments for TimeBoundary and DataSourceMetadata queries (#7023 ) * Fix filterSegments for TimeBoundary and DataSourceMetadata queries * add javadoc * fix build	2019-02-08 10:03:02 -08:00
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
Roman Leventov	0e926e8652	Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898 ) * Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool * Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java * Remove unnecessary comment * Checkstyle * Fix getFromExtensions() * Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization * Fix UriCacheGeneratorTest * Workaround issue with MaterializedViewQueryQueryToolChest * Strengthen Appenderator's contract regarding concurrency	2019-02-04 09:18:12 -08:00
Furkan KAMACI	61f165c23f	Try-with-resources should be used since the new syntax is more readable. (#6944 ) * Try-with-resources should be used since the new syntax is more readable. * Fixed checkstyle error.	2019-02-03 10:42:28 +08:00
Egor Riashin	2803fda8b7	Added an allocation rate metric #6604 (#6710 ) Addressing #6604	2019-01-29 20:16:35 +07:00
Benedict Jin	72a571fbf7	For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913 ) * * Add few methods about base64 into StringUtils * Use `java.util.Base64` instead of others * Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis * Rename encodeBase64String & decodeBase64String * Update druid-forbidden-apis	2019-01-25 17:32:29 -08:00
Roman Leventov	8eae26fd4e	Introduce SegmentId class (#6370 ) * Introduce SegmentId class * tmp * Fix SelectQueryRunnerTest * Fix indentation * Fixes * Remove Comparators.inverse() tests * Refinements * Fix tests * Fix more tests * Remove duplicate DataSegmentTest, fixes #6064 * SegmentDescriptor doc * Fix SQLMetadataStorageUpdaterJobHandler * Fix DataSegment deserialization for ignoring id * Add comments * More comments * Address more comments * Fix compilation * Restore segment2 in SystemSchemaTest according to a comment * Fix style * fix testServerSegmentsTable * Fix compilation * Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes * Fix SystemSchemaTest * Fix style * Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment * Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164 * Fix compilation	2019-01-21 11:11:10 -08:00
Clint Wylie	8ba33b2505	add 'init' lifecycle stage for finer control over startup and shutdown (#6864 ) * add Lifecycle.Stage.INIT, put log shutter downer in init stage, tests, rad startup banner * log cleanup * log changes * add task-master lifecycle to module lifecycle to gracefully stop task-master stuff * fix it the right way * remove announce spam * unused import * one more log * updated comments * wrap leadership lifecycle stop to prevent exceptions from wrecking rest of task master stop * add precondition check	2019-01-21 09:01:36 -08:00
Jonathan Wei	8537a771b0	Some fixes and tests for spaces/non-ASCII chars in datasource names (#6761 ) * Fixes and tests for spaces/non-ASCII datasource names * Some unit test fixes * Fix ITRealtimeIndexTaskTest * Checkstyle * TeamCity * PR comments	2019-01-15 08:33:31 -08:00
Charles Allen	5d2947cd52	Use Guava Compatible immediate executor service (#6815 ) * Use multi-guava version friendly direct executor implementation * Don't use a singleton * Fix strict compilation complaints * Copy Guava's DirectExecutor * Fix javadoc * Imports are the devil	2019-01-11 10:42:19 -08:00
Jihoon Son	a7391b396b	Copy splitToList from Guava (#6800 ) * Copy splitToList from Guava * add comment * fix commit tag	2019-01-03 18:42:13 +08:00
Benedict Jin	e8ddd9942d	Fix wrong counter getFailedSendingTimeCounter method (#6793 ) * Fix wrong counter getFailedSendingTimeCounter method * Add testcases * Add getTimeSumAndCount for testcases	2019-01-02 23:50:54 +08:00
David Glasser	9bbd992885	Update two Javadocs to cite druid.generic.useDefaultValueForNull (#6760 ) See #4349.	2018-12-20 09:39:37 -08:00
Clint Wylie	9505074530	fix log typo (#6755 ) * fix log typo, add DataSegmentUtils.getIdentifiersString util method * fix indecisive oops	2018-12-18 15:10:25 -08:00
Gian Merlino	04e7c7fbdc	FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637 ) * FilteredRequestLogger: Fix start/stop, invalid delegate behavior. Fixes two bugs: 1) FilteredRequestLogger did not start/stop the delegate. 2) FilteredRequestLogger would ignore an invalid delegate type, and instead silently substitute the "noop" logger. This was due to a larger problem with RequestLoggerProvider setup in general; the fix here is to remove "defaultImpl" from the RequestLoggerProvider interface, and instead have JsonConfigurator be responsible for creating the default implementations. It is stricter about things than the old system was, and is only willing to make a noop logger if it doesn't see any request logger configs. Otherwise, it'll raise a provision error. * Remove unneeded annotations.	2018-12-14 16:55:44 +08:00
Gian Merlino	b7709e1245	FileUtils: Sync directory entry too on writeAtomically. (#6677 ) * FileUtils: Sync directory entry too on writeAtomically. See the fsync(2) man page for why this is important: https://linux.die.net/man/2/fsync This also plumbs CompressionUtils's "zip" function through writeAtomically, so the code for handling atomic local filesystem writes is all done in the same place. * Remove unused import. * Avoid FileOutputStream. * Allow non-atomic writes to overwrite. * Add some comments. And no need to flush an unbuffered stream.	2018-12-08 17:12:59 +01:00
Mingming Qiu	607339003b	Add TaskCountStatsMonitor to monitor task count stats (#6657 ) * Add TaskCountStatsMonitor to monitor task count stats * address comments * add file header * tweak test	2018-12-04 13:37:17 -08:00
Clint Wylie	a1c9d0add2	autosize processing buffers based on direct memory sizing by default (#6588 ) * autosize processing buffers based on direct memory sizing * remove oops, more test * max 1gb autosize buffers, test, start of docs * fix oops * revert accidental change * print buffer size in exception * change the things	2018-12-03 18:40:02 -07:00
Roman Leventov	ec38df7575	Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606 ) * Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns * Fix style * Fix HttpEmitterTest.timeoutEmptyQueue() * Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests * Clarify code	2018-12-01 01:12:56 +01:00
Jihoon Son	219f0965dc	Remove duplicate DataSegmentTest (#6669 )	2018-11-27 15:13:39 -08:00
Clint Wylie	8f8a569aa2	faster flattening for non-existent paths (#6654 ) * faster flattening for non-existent properties to circumvent upstream json-path issue * fix json provider * revert to using null instead of undefined	2018-11-27 14:14:11 -08:00
Roman Leventov	887c645675	Find duplicate lines with checkstyle; enable some duplicate inspections in IntelliJ (#6558 ) Not putting this to 0.13 milestone because the found bugs are not critical (one is a harmless DI config duplicate, and another is in a benchmark). Change in `DumpSegment` is just an indentation change.	2018-11-26 16:55:42 +01:00
seoeun	22a5bf97a2	Fix issue that tasks tables in metadata storage are not cleared (#6592 ) * tasks tables in metadata storage are not cleared * address comments. remove tasklogs and revert obsolete changes * address comments. change comment and update doc. * address comments. update doc more detailed * address comments. remove redundant log and update doc more detailed. * address comments. update document	2018-11-22 11:50:31 +08:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Mingming Qiu	93b0d58571	optimize input row parsers (#6590 ) * optimize input row parsers * address comments	2018-11-16 11:48:32 +08:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
Roman Leventov	0b70c36eb0	Fix bugs in ExprEval (#6617 )	2018-11-14 15:20:52 -08:00
Gian Merlino	154b6fbcef	SQL: Add "POSITION" function. (#6596 ) Also add a "fromIndex" argument to the strpos expression function. There are some -1 and +1 adjustment terms due to the fact that the strpos expression behaves like Java indexOf (0-indexed), but the POSITION SQL function is 1-indexed.	2018-11-13 13:39:00 -08:00

59 Commits