druid

Commit Graph

Author	SHA1	Message	Date
Himanshu	14aec7fcec	add config to optionally disable all compression in intermediate segment persists while ingestion (#7919 ) * disable all compression in intermediate segment persists while ingestion * more changes and build fix * by default retain existing indexingSpec for intermediate persisted segments * document indexSpecForIntermediatePersists index tuning config * fix build issues * update serde tests	2019-07-10 12:22:24 -07:00
Chi Cao Minh	1166bbcb75	Remove static imports from tests (#8036 ) Make static imports forbidden in tests and remove all occurrences to be consistent with the non-test code. Also, various changes to files affected by above: - Reformat to adhere to druid style guide - Fix various IntelliJ warnings - Fix various SonarLint warnings (e.g., the expected/actual args to Assert.assertEquals() were flipped)	2019-07-06 09:33:12 -07:00
Clint Wylie	42a7b8849a	remove FirehoseV2 and realtime node extensions (#8020 ) * remove firehosev2 and realtime node extensions * revert intellij stuff * rat exclusion	2019-07-04 15:40:22 -07:00
Khwunchai Jaengsawang	fb56f8d53c	Rename io.druid to org.druid in tdigestsketch extension (#7996 ) * Rename io.druid to org.druid in tdigestsketch extension * Fix typo Signed-off-by: Khwunchai Jaengsawang <khwunchai.j@ku.th> * Add packaging check for community extensions Signed-off-by: Khwunchai Jaengsawang <khwunchai.j@ku.th>	2019-07-01 12:46:35 -07:00
Fokko Driesprong	fba91b77d0	InterruptedException is never thrown (#7963 ) And therefore the block will always be false	2019-06-27 22:21:36 +05:30
Xue Yu	b9c6a26c0e	Use ComplexMetrics.registerSerde() across the codebase (#7925 ) * refactor complexmetric registerserde * fix error * feedback address	2019-06-25 11:39:04 +03:00
Clint Wylie	71997c16a2	switch links from druid.io to druid.apache.org (#7914 ) * switch links from druid.io to druid.apache.org * fix it	2019-06-18 09:06:27 -07:00
SandishKumarHN	01881e3a98	Use only com.google.errorprone.annotations.concurrent.GuardedBy, not javax.annotations.concurrent.GuardedBy (#7889 )	2019-06-17 15:58:51 +02:00
Clint Wylie	3fbb0a5e00	Supervisor list api with states and health (#7839 ) * allow optionally listing all supervisors with their state and health * docs * add state to full * clean * casing * format * spelling	2019-06-07 16:26:33 -07:00
Surekha	ea752ef562	Optimize overshadowed segments computation (#7595 ) * Move the overshadowed segment computation to SQLMetadataSegmentManager's poll * rename method in MetadataSegmentManager * Fix tests * PR comments * PR comments * PR comments * fix indentation * fix tests * fix test * add test for SegmentWithOvershadowedStatus serde format * PR comments * PR comments * fix test * remove snapshot updates outside poll * PR comments * PR comments * PR comments * removed unused import	2019-06-07 19:15:54 +02:00
Justin Borromeo	8032c4add8	Add errors and state to stream supervisor status API endpoint (#7428 ) * Add state and error tracking for seekable stream supervisors * Fixed nits in docs * Made inner class static and updated spec test with jackson inject * Review changes * Remove redundant config param in supervisor * Style * Applied some of Jon's recommendations * Add transience field * write test * implement code review changes except for reconsidering logic of markRunFinishedAndEvaluateHealth() * remove transience reporting and fix SeekableStreamSupervisorStateManager impl * move call to stateManager.markRunFinished() from RunNotice to runInternal() for tests * remove stateHistory because it wasn't adding much value, some fixes, and add more tests * fix tests * code review changes and add HTTP health check status * fix test failure * refactor to split into a generic SupervisorStateManager and a specific SeekableStreamSupervisorStateManager * fixup after merge * code review changes - add additional docs * cleanup KafkaIndexTaskTest * add additional documentation for Kinesis indexing * remove unused throws class	2019-05-31 17:16:01 -07:00
Jihoon Son	7abfbb066a	Bump up snapshot version to 0.16.0 (#7802 )	2019-05-30 17:17:33 -07:00
Roman Leventov	782863ed0f	Fix some problems reported by PVS-Studio (#7738 ) * Fix some problems reported by PVS-Studio * Address comments	2019-05-29 11:20:45 -07:00
Xavier Léauté	484afe0afb	exclude hadoop system dependency ambari depends on hadoop-annotation. Under linux, it sometimes activates a maven profile that includes an explicit dependency on ${java.home}/../lib/tools.jar This change excludes this dependency, since this jar is no longer included with Java 9. It also seems seems unlikely that we would require this dependency at runtime, since it is only enabled on some platforms.	2019-05-27 18:40:19 -07:00
awelsh93	6964ac23a2	Adding influxdb emitter as a contrib extension (#7717 ) * Adding influxdb emitter as a contrib extension * addressing code review comments	2019-05-23 11:11:48 -07:00
Merlin Lee	26fad7e06a	Add checkstyle for "Local variable names shouldn't start with capital" (#7681 ) * Add checkstyle for "Local variable names shouldn't start with capital" * Adjust some local variables to constants * Replace StringUtils.LINE_SEPARATOR with System.lineSeparator()	2019-05-23 18:40:28 +02:00
Clint Wylie	ffc2397bcd	fix AggregatorFactory.finalizeComputation implementations to be ok with null inputs (#7731 ) * AggregatorFactory finalizeComputation is nullable with nullable input, make implementations honor this * fixes	2019-05-22 21:13:09 -07:00
Gian Merlino	b6941551ae	Upgrade various build and doc links to https. (#7722 ) * Upgrade various build and doc links to https. Where it wasn't possible to upgrade build-time dependencies to https, I kept http in place but used hardcoded checksums or GPG keys to ensure that artifacts fetched over http are verified properly. * Switch to https://apache.org.	2019-05-21 11:30:14 -07:00
Merlin Lee	5f08b0b474	Add checkstyle for "Prohibit @author tags in Javadoc" (#7682 ) * Add checkstyle for "Prohibit @author tags in Javadoc" * Add "Do not use author tags/information in the code" back to CONTRIBUTING.md	2019-05-20 00:09:51 -07:00
Jonathan Wei	d99f77a01b	Add option to use YARN RM as fallback for JobHistory failure (#7673 ) * Add option to use YARN RM as fallback for job status * PR comments	2019-05-16 13:59:10 -07:00
Denis Boulas	0b739a5787	Added more mappings for new metrics. (#7665 ) Signed-off-by: Denis Boulas <dene14@gmail.com>	2019-05-15 15:04:46 -07:00
Fokko Driesprong	2aa9613bed	Bump Checkstyle to 8.20 (#7651 ) * Bump Checkstyle to 8.20 Moderate severity vulnerability that affects: com.puppycrawl.tools:checkstyle Checkstyle prior to 8.18 loads external DTDs by default, which can potentially lead to denial of service attacks or the leaking of confidential information. Affected versions: < 8.18 * Oops, missed one * Oops, missed a few	2019-05-14 11:53:37 -07:00
Samarth Jain	b542bb9f34	TDigest backed sketch aggregators (#7331 ) * First set of changes for tDigest histogram * Add license * Address code review comments * Add a doc page for new T-Digest sketch aggregators. Minor code cleanup and comments. * Remove synchronization from BufferAggregators. Address code review comments * Fix typo	2019-05-09 17:22:55 -07:00
Xavier Léauté	f7bfe8f269	Update mocking libraries for Java 11 support (#7596 ) * update easymock / powermock for to 4.0.2 / 2.0.2 for JDK11 support * update tests to use new easymock interfaces * fix tests failing due to easymock fixes * remove dependency on jmockit * fix race condition in ResourcePoolTest	2019-05-06 12:28:56 -07:00
Eyal Yurman	f02251ab2d	Contributing Moving-Average Query to open source. (#6430 ) * Contributing Moving-Average Query to open source. * Fix failing code inspections. * See if explicit types will invoke the correct comparison function. * Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter. * Update styling and headers for complience. * Refresh code with latest master changes: * Remove NullDimensionSelector. * Apply changes of RequestLogger. * Apply changes of TimelineServerView. * Small checkstyle fix. * Checkstyle fixes. * Fixing rat errors; Teamcity errors. * Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved. * Implements some of the review fixes. * Contributing Moving-Average Query to open source. * Fix failing code inspections. * See if explicit types will invoke the correct comparison function. * Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter. * Update styling and headers for complience. * Refresh code with latest master changes: * Remove NullDimensionSelector. * Apply changes of RequestLogger. * Apply changes of TimelineServerView. * Small checkstyle fix. * Checkstyle fixes. * Fixing rat errors; Teamcity errors. * Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved. * Implements some of the review fixes. * More fixes for review. * More fixes from review. * MapBasedRow is Unmodifiable. Create new rows instead of modifying existing ones. * Remove more changes related to datasketches support. * Refactor BaseAverager startFrom field and add a comment. * fakeEvents field: Refactor initialization and add comment. * Rename parameters (tiny change). * Fix variable name typo in test (JAN_4). * Fix styling of non camelCase fields. * Fix Preconditions.checkArgument for cycleSize. * Add more documentation to RowBucketIterable and other classes. * key/value comment on in MovingAverageIterable. * Fix anonymous makeColumnValueSelector returning null. * Replace IdentityYieldingAccumolator with Yielders.each(). * * internalNext() should return null instead of throwing exception. * Remove unused variables/prarameters. * Harden MovingAverageIterableTest (Switch anyOf to exact match). * Change internalNext() from recursion to iteration; Simplify next() and hasNext(). * Remove unused imports. * Address review comments. * Rename fakeEvents to emptyEvents. * Remove redundant parameter key from computeMovingAverage. * Check yielder as well in RowBucketIterable#hasNext() * Fix javadoc.	2019-04-26 17:07:48 -07:00
Clint Wylie	89bb43f382	'core' ORC extension (#7138 ) * orc extension reworked to use apache orc map-reduce lib, moved to core extensions, support for flattenSpec, tests, docs * change binary handling to be compatible with avro and parquet, Rows.objectToStrings now converts byte[] to base64, change date handling * better docs and tests * fix it * formatting * doc fix * fix it * exclude redundant dependencies * use latest orc-mapreduce, add hadoop jobProperties recommendations to docs * doc fix * review stuff and fix binaryAsString * cache for root level fields * more better	2019-04-09 09:03:26 -07:00
Clint Wylie	a99f0ff450	prefix no-op aggs with "Noop" (#6960 )	2019-04-02 15:05:07 -07:00
Charles Allen	eeb3dbe79d	Move GCP to a core extension (#6953 ) * Move GCP to a core extension * Don't provide druid-core >.< * Keep AWS and GCP modules separate * Move AWSModule to its own module * Add aws ec2 extension and more modules in more places * Fix bad imports * Fix test jackson module * Include AWS and GCP core in server * Add simple empty method comment * Update version to 15 * One more 0.13.0-->0.15.0 change * Fix multi-binding problem * Grep for s3-extensions and update docs * Update extensions.md	2019-03-27 09:00:43 -07:00
Roman Leventov	bca40dcdaf	Fix some IntelliJ inspections (#7273 ) Prepare TeamCity for IntelliJ 2018.3.1 upgrade. Mostly removed redundant exceptions declarations in `throws` clauses.	2019-03-25 21:11:01 -03:00
Jihoon Son	0c5dcf5586	Fix exclusivity for start offset in kinesis indexing service & check exclusivity properly in IndexerSQLMetadataStorageCoordinator (#7291 ) * Fix exclusivity for start offset in kinesis indexing service * some adjustment * Fix SeekableStreamDataSourceMetadata * Add missing javadocs * Add missing comments and unit test * fix SeekableStreamStartSequenceNumbers.plus and add comments * remove extra exclusivePartitions in KafkaIOConfig and fix downgrade issue * Add javadocs * fix compilation * fix test * remove unused variable	2019-03-21 13:12:22 -07:00
Roman Leventov	dfd27e00c0	Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality (#7185 ) * Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality * Fix DruidCoordinatorTest; Renamed DruidCoordinator.getReplicationStatus() to computeUnderReplicationCountsPerDataSourcePerTier() * More Javadocs, typos, refactor DruidCoordinatorRuntimeParams.createAvailableSegmentsSet() * Style * typo * Disable StaticPseudoFunctionalStyleMethod inspection because of too much false positives * Fixes	2019-03-19 18:22:56 -03:00
Jihoon Son	892d1d35d6	Deprecate NoneShardSpec and drop support for automatic segment merge (#6883 ) * Deprecate noneShardSpec * clean up noneShardSpec constructor * revert unnecessary change * Deprecate mergeTask * add more doc * remove convert from indexMerger * Remove mergeTask * remove HadoopDruidConverterConfig * fix build * fix build * fix teamcity * fix teamcity * fix ServerModule * fix compilation * fix compilation	2019-03-15 23:29:25 -07:00
Furkan KAMACI	7ada1c49f9	Prohibit Throwables.propagate() (#7121 ) * Throw caught exception. * Throw caught exceptions. * Related checkstyle rule is added to prevent further bugs. * RuntimeException() is used instead of Throwables.propagate(). * Missing import is added. * Throwables are propogated if possible. * Throwables are propogated if possible. * Throwables are propogated if possible. * Throwables are propogated if possible. * * Checkstyle definition is improved. * Throwables.propagate() usages are removed. * Checkstyle pattern is changed for only scanning "Throwables.propagate(" instead of checking lookbehind. * Throwable is kept before firing a Runtime Exception. * Fix unused assignments.	2019-03-14 18:28:33 -03:00
Clint Wylie	3895914aa2	consolidate CompressionUtils.java since now in the same jar (#6908 )	2019-03-13 11:02:44 -04:00
Clint Wylie	4d3987c1dd	lifecycle stage refactor to ensure proper start and stop ordering of servers and announcements (#7234 ) * lifecycle stage refactor to ensure proper ordering of servers and announcements * move DerivativeDataSourceManager to Lifecycle.Stage.NORMAL	2019-03-12 07:09:03 -07:00
Jonathan Wei	0b4f771062	Exclude hadoop-lzo from thrift-extensions build (#7151 )	2019-02-27 19:57:53 -08:00
Jihoon Son	4e2b085201	Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911 ) * Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file * delete descriptor.file when killing segments * fix test * Add doc for ha * improve warning	2019-02-20 15:10:29 -08:00
Furkan KAMACI	9a521526c7	Since notify might not wake up the right thread, notifyAll should be used instead. (#6931 ) * Since notify might not wake up the right thread, notifyAll should be used instead. * Comment is added about why notifyAll() is not used.	2019-02-20 09:02:58 -08:00
Fangyuan Deng	7d1e8f353e	bugfix: when building materialized-view, if taskCount>1, may cause concurrentModificationException (#6690 ) * bugfix: when building materialized-view, if taskCount >1, may cause ConcurrentModificationException * remove entry after iteration instead of using ConcurrentMap, and add unit test * small change * modify unit test for coverage * remove unused method	2019-02-19 13:10:55 -08:00
Surekha	80a2ef7be4	Support kafka transactional topics (#5404 ) (#6496 ) * Support kafka transactional topics * update kafka to version 2.0.0 * Remove the skipOffsetGaps option since it's not used anymore * Adjust kafka consumer to use transactional semantics * Update tests * Remove unused import from test * Fix compilation * Invoke transaction api to fix a unit test * temporary modification of travis.yml for debugging * another attempt to get travis tasklogs * update kafka to 2.0.1 at all places * Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes * Add deprecated in docs for kafka-eight and kafka-simple extensions * Remove skipOffsetGaps and code changes for transaction support * Fix indentation * remove skipOffsetGaps from kinesis * Add transaction api to KafkaRecordSupplierTest * Fix indent * Fix test * update kafka version to 2.1.0	2019-02-18 11:50:08 -08:00
Jonathan Wei	1f29940811	Fix momentsketch build issues (#7074 ) * Fix momentsketch build issues * Remove unused section in pom * Fix test * Remove unused method * Checkstyle	2019-02-13 21:32:43 -08:00
Edward Gan	90c1a54b86	Moments Sketch custom aggregator (#6581 ) * Moments Sketch Integration with Druid * updates, add documentation, fix warnings * nits * disallowed base64 * update to druid 0.14	2019-02-13 14:03:47 -08:00
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
Roman Leventov	0e926e8652	Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898 ) * Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool * Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java * Remove unnecessary comment * Checkstyle * Fix getFromExtensions() * Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization * Fix UriCacheGeneratorTest * Workaround issue with MaterializedViewQueryQueryToolChest * Strengthen Appenderator's contract regarding concurrency	2019-02-04 09:18:12 -08:00
David Glasser	9eaf8f5304	google-storage: retry GoogleTaskLogs inserts (#6918 ) This is an extension of PR #5750 by @drcrallen which added retry to a variety of GCS operations, but not to GoogleTaskLogs, which we have found to occasionally fail in our cluster. Also fixes a typo in a variable name and removes an unused private method parameter. Fixes #6912.	2019-01-31 01:21:35 -08:00
Clint Wylie	a6d81c0d16	Adds bloom filter aggregator to 'druid-bloom-filters' extension (#6397 ) * blooming aggs * partially address review * fix docs * minor test refactor after rebase * use copied bloomkfilter * add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests * add methods to BloomKFilter to get number of set bits, use in comparator, fixes * more docs * fix * fix style * simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets * oof, more fixes * more sane docs example * fix it * do the right thing in the right place * formatting * fix * avoid conflict * typo fixes, faster comparator, docs for comparator behavior * unused imports * use buffer comparator instead of deserializing * striped readwrite lock for buffer agg, null handling comparator, other review changes * style fixes * style * remove sync for now * oops * consistency * inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception * CardinalityBufferAggregator inspect selectors instead of selectorPluses * fix style * refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments * adjustment * fix teamcity error? * rename nil aggs to empty, change empty agg constructor signature, add comments * use stringutils base64 stuff to be chill with master * add aggregate combiner, comment	2019-01-29 20:05:17 +07:00
Benedict Jin	72a571fbf7	For performance reasons, use `java.util.Base64` instead of Base64 in Apache Commons Codec and Guava (#6913 ) * * Add few methods about base64 into StringUtils * Use `java.util.Base64` instead of others * Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis * Rename encodeBase64String & decodeBase64String * Update druid-forbidden-apis	2019-01-25 17:32:29 -08:00
Roman Leventov	8eae26fd4e	Introduce SegmentId class (#6370 ) * Introduce SegmentId class * tmp * Fix SelectQueryRunnerTest * Fix indentation * Fixes * Remove Comparators.inverse() tests * Refinements * Fix tests * Fix more tests * Remove duplicate DataSegmentTest, fixes #6064 * SegmentDescriptor doc * Fix SQLMetadataStorageUpdaterJobHandler * Fix DataSegment deserialization for ignoring id * Add comments * More comments * Address more comments * Fix compilation * Restore segment2 in SystemSchemaTest according to a comment * Fix style * fix testServerSegmentsTable * Fix compilation * Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes * Fix SystemSchemaTest * Fix style * Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment * Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164 * Fix compilation	2019-01-21 11:11:10 -08:00
Jonathan Wei	b18d681551	Use kafka_2.12-0.10.2.2 (#6846 )	2019-01-10 20:52:55 -08:00
pzhdfy	ea3f426171	bugfix: Materialized view not support post-aggregator (#6689 ) * bugfix: Materialized view not support post post-aggregator * add unit test	2019-01-10 14:25:09 -08:00
David Glasser	c08f391605	statsd-emitter: support constant DogStatsD tags (#6791 ) PR #6605 added support to the statsd emitter for DogStatsD tags. This commit lets you specify "constant tags" in the config file which are included with every event. This is helpful if you are running in an environment where you cannot configure your datadog-agent with tags like "cluster name" --- eg, a Kubernetes cluster with a datadog-agent on each node and different Druid deployments in different namespaces but sharing the same datadog-agent daemonset. Also fix the name of an existing boolean getter to start with 'is'.	2019-01-04 15:35:37 +08:00
David Lim	3443f9a008	add missing contrib extensions to distribution packaging, fix influx package name, fix checkstyle plugin config to support Maven < 3.3.9 (#6717 )	2018-12-10 17:56:32 -08:00
Mingming Qiu	e8dd3716b8	add close method in Cache interface (#6540 ) * add close method in Cache interface * address comments * address comments and fix travis-ci * use try-finally	2018-12-06 17:28:41 +08:00
Mingming Qiu	c5405bb592	emit maxLag/avgLag in KafkaSupervisor (#6587 ) * emit maxLag/totalLag/avgLag in KafkaSupervisor * modify ingest/kafka/totalLag to ingest/kafka/lag for backwards compatibility	2018-11-28 02:11:14 -08:00
Roman Leventov	87b96fb1fd	Add checkstyle rules about imports and empty lines between members (#6543 ) * Add checkstyle rules about imports and empty lines between members * Add suppressions * Update Eclipse import order * Add empty line * Fix StatsDEmitter	2018-11-20 12:42:15 +01:00
Deiwin Sarjas	e0d1dc5846	Support DogStatsD style tags in statsd-emitter (#6605 ) * Replace StatsD client library The [Datadog package][1] is a StatsD compatible drop-in replacement for the client library, but it seems to be [better maintained][2] and has support for Datadog DogStatsD specific features, which will be made use of in a subsequent commit. The `count`, `time`, and `gauge` methods are actually exactly compatible with the previous library and the modifications shouldn't be required, but EasyMock seems to have a hard time dealing with the variable arguments added by the DogStatsD library and causes tests to fail if no arguments are provided for the last String vararg. Passing an empty array fixes the test failures. [1]: https://github.com/DataDog/java-dogstatsd-client [2]: https://github.com/tim-group/java-statsd-client/issues/37#issuecomment-248698856 * Retain dimension key information for StatsD metrics This doesn't change behavior, but allows separating dimensions from the metric name in subsequent commits. There is a possible order change for values from `dimsBuilder.build().values()`, but from the tests it looks like it doesn't affect actual behavior and the order of user dimensions is also retained. * Support DogStatsD style tags in statsd-emitter Datadog [doesn't support name-encoded dimensions and uses a concept of _tags_ instead.][1] This change allows Datadog users to send the metrics without having to encode the various dimensions in the metric names. This enables building graphs and monitors with and without aggregation across various dimensions from the same data. As tests in this commit verify, the behavior remains the same for users who don't enable the `druid.emitter.statsd.dogstatsd` configuration flag. [1]: https://www.datadoghq.com/blog/the-power-of-tagged-metrics/#tags-decouple-collection-and-reporting * Disable convertRange behavior for DogStatsD users DogStatsD, unlike regular StatsD, supports floating-point values, so this behavior is unnecessary. It would be possible to still support `convertRange`, even with `dogstatsd` enabled, but that would mean that people using the default mapping would have some of the gauges unnecessarily converted. `time` is in milliseconds and doesn't support floating-point values.	2018-11-19 09:47:57 -08:00
Mingming Qiu	93b0d58571	optimize input row parsers (#6590 ) * optimize input row parsers * address comments	2018-11-16 11:48:32 +08:00
Jihoon Son	d738ce4d2a	Enforce logging when killing a task (#6621 ) * Enforce logging when killing a task * fix test * address comment * address comment	2018-11-16 10:01:56 +08:00
Roman Leventov	8f3fe9cd02	Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607 ) * Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies * Fix bug * Replace checkstyle regexp with IntelliJ inspection	2018-11-15 13:21:34 -08:00
David Lim	afb239b17a	add missing license headers, in particular to MD files; clean up RAT … (#6563 ) * add missing license headers, in particular to MD files; clean up RAT exclusions * revert inadvertent doc changes * docs * cr changes * fix modified druid-production.svg	2018-11-13 09:38:37 -08:00
Roman Leventov	54351a5c75	Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490 ) * Fix various bugs; Enable more IntelliJ inspections and update error-prone * Fix NPE * Fix inspections * Remove unused imports	2018-11-06 14:38:08 -08:00
Clint Wylie	1224d8b746	overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360 ) * move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql * fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays * remove leftover print * convert micro timestamp to millis * checkstyle * add ignore for .parquet and .parq to rat exclude * fix legit test failure from avro flattern behavior change * fix rebase * add exclusions to pom to cut down on redundant jars * refactor tests, add support for unwrapping lists for parquet-avro, review comments * more comment * fix oops * tweak parquet-avro list handling * more docs * fix style * grr styles	2018-11-05 21:33:42 -08:00
Roman Leventov	2cdce2e2a6	Add RequestLogEventBuilderFactory (#6477 ) This PR allows to control the fields in `RequestLogEvent`, emitted in `EmittingRequestLogger`. In our case, we want to get rid of the `intervals` fields of the query objects that are a part of `DefaultRequestLogEvent`. They are enormous (thousands of segments) and not useful. Related to #5522, FYI @a2l007.	2018-10-31 22:24:37 +01:00
QiuMM	676f5e6d7f	Prohibit some guava collection APIs and use JDK collection APIs directly (#6511 ) * Prohibit some guava collection APIs and use JDK APIs directly * reset files that changed by accident * sort codestyle/druid-forbidden-apis.txt alphabetically	2018-10-29 13:02:43 +01:00
Roman Leventov	84ac18dc1b	Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461 ) * Catch some incorrect method parameter or call argument formatting patterns with checkstyle * Fix DiscoveryModule * Inline parameters_and_arguments.txt * Fix a bug in PolyBind * Fix formatting	2018-10-23 07:17:38 -03:00
Roman Leventov	789c9a1dc7	Prohibit using Object\|Long\|Float\|DoubleColumnSelector in instanceof statements (#6470 ) * Prohibit using Object\|Long\|Float\|DoubleColumnSelector in instanceof statements * Doc fixes	2018-10-15 15:41:43 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
Charles Allen	0f4f5f2877	Cleanup jackson dependency exclusions (#6438 ) * Remove pulling in jackson from fasterxml in a lot of places * Remove codehaus extensions	2018-10-12 17:25:39 -06:00
David Lim	20ab213ba6	change project versions to 0.13.0-incubating-SNAPSHOT (#6453 )	2018-10-11 19:28:01 -07:00
Clint Wylie	1fa045862a	exclude all redundant and unecessary hive-exec depends in orc-extensions pom.xml (#6441 )	2018-10-10 08:24:28 -07:00
Roman Leventov	09126c021a	Remove Aggregator.clone() methods (#6437 ) * Remove Aggregator.clone() methods * Remove CardinalityAggregator.name	2018-10-10 10:07:56 -03:00
Charles Allen	1c4f787ed4	Upgrade Netty to 4.1.x (#6417 ) * Update netty to 4.1.30.Final * Fix compile time problems with new netty * Remove netty-all from rocketmq extension	2018-10-05 12:30:00 -07:00
QiuMM	0b8085aff7	Prohibit jackson ObjectMapper#reader methods which are deprecated (#6386 ) * Prohibit jackson ObjectMapper#reader methods which are deprecated * address comments	2018-10-03 17:55:20 -03:00
Roman Leventov	3ae563263a	Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957 ) This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR. Some of the changes: - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names. - Generified `ComplexMetricExtractor`	2018-10-02 14:50:22 -03:00
Shiv Toolsidass	a56ffe6ab2	Added backpressure metric to docs and defaultMetricDimensions (#6405 ) * Added backpressure metric to docs and defaultMetricDimensions.json * Reworded description for backpressure metric in docs	2018-09-29 17:57:29 -07:00
Jihoon Son	f09e718c68	Implement MapVirtualColumn.makeDimensionSelector properly (#6396 ) * Implement MapVirtualColumn.makeDimensionSelector properly * address comments	2018-09-29 14:13:05 -07:00
Clint Wylie	0f2dfe6fe8	fix issue with parsing 'date' columns with druid-orc-extensions (#6380 ) * fix issue with parsing date columns with druid-orc-extensions * stuff * fix forbidden api	2018-09-26 13:18:15 -07:00
Jonathan Wei	364bf9d1f9	Fix non org.apache.druid files and add package name checkstyle rule (#6367 ) * Fix non org.apache.druid files and add package name checkstyle rule * PR comment	2018-09-21 17:58:19 -07:00
Slim Bouguerra	028354eea8	Adding licenses and enable apache-rat-plugin. (#6215 ) * Adding licenses and enable apache-rat-plugi. Change-Id: I4685a2d9f1e147855dba69329b286f2d5bee3c18 * restore the copywrite of demo_table and add it to the list of allowed ones Change-Id: I2a9efde6f4b984bc1ac90483e90d98e71f818a14 * revirew comments Change-Id: I0256c930b7f9a5bb09b44b5e7a149e6ec48cb0ca * more fixup Change-Id: I1355e8a2549e76cd44487abec142be79bec59de2 * align Change-Id: I70bc47ecb577bdf6b91639dd91b6f5642aa6b02f	2018-09-18 08:39:26 -07:00
Roman Leventov	0c4bd2b57b	Prohibit some Random usage patterns (#6226 ) * Prohibit Random usage patterns * Fix FlattenJSONBenchmarkUtil	2018-09-14 13:35:51 -07:00
QiuMM	85391e9fb3	fix opentsdb emitter always be running and fail sending tags whose value contains colon (#6251 ) * fix opentsdb emitter always be running * check if emitter started * add more details about consumeDelay in doc * fix possible thread unsafe * fix fail sending tags whose value contain colon	2018-09-14 12:14:15 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
QiuMM	84810f6358	correct metric name in emitter configuration files (#6290 )	2018-09-04 14:23:04 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Benedict Jin	3647d4c94a	Make time-related variables more readable (#6158 ) * Make time-related variables more readable * Patch some improvements from the code reviewer * Remove unnecessary boxing of Long type variables	2018-08-21 15:29:40 -07:00
Jihoon Son	2bfe1b6a5a	Fix NPE for taskGroupId when rolling update (#6168 ) * Fix NPE for taskGroupId * missing changes * fix wrong annotation * fix potential race * keep baseSequenceName * make deprecated old param	2018-08-17 10:15:45 -07:00
kaijianding	78fc5b246c	trigger clean dir when unzip fails with IOException (#6178 )	2018-08-17 10:15:10 -07:00
Jihoon Son	ecee3e0a24	Further optimize memory for Travis jobs (#6150 ) * Further optimize memory for Travis jobs * fix build * sudo false	2018-08-10 22:03:36 -07:00
chengchengpei	c3b7704b50	validate non-empty String baseDataSource in MaterializedViewSupervisorSpec, added tests (#6075 )	2018-08-07 17:23:10 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
chengchengpei	914058f288	validate baseDataSource non-empty string in DerivativeDataSourceMetad… (#6072 ) * validate baseDataSource non-empty string in DerivativeDataSourceMetadata; added a test * added another test for validating null baseDataSource * restored the arguments check order	2018-08-03 13:38:18 -07:00
Roman Leventov	0754d78a2e	Prohibit Lists.newArrayList() with a single argument (#6068 ) * Prohibit Lists.newArrayList() with a single argument * Test fixes * Add Javadoc to Node constructor	2018-07-31 20:09:10 -07:00
Gian Merlino	3aa7017975	Remove some unnecessary task storage internal APIs. (#6058 ) * Remove some unnecessary task storage internal APIs. - Remove MetadataStorageActionHandler's getInactiveStatusesSince and getActiveEntriesWithStatus. - Remove TaskStorage's getCreatedDateTimeAndDataSource. - Remove TaskStorageQueryAdapter's getCreatedTime, and getCreatedDateAndDataSource. - Migrated all callers to getActiveTaskInfo and getCompletedTaskInfo. This has one side effect: since getActiveTaskInfo (new) warns and continues when it sees unreadable tasks, but getActiveEntriesWithStatus threw an exception when it encountered those, it means that after this patch bad tasks will be ignored when syncing from metadata storage rather than causing an exception to be thrown. IMO, this is an improvement, since the most likely reason for bad tasks is either: - A new version introduced an additional validation, and a pre-existing task doesn't pass it. - You are rolling back from a newer version to an older version. In both cases, I believe you would want to skip tasks that can't be deserialized, rather than blocking overlord startup. * Remove unused import. * Fix formatting. * Fix formatting.	2018-07-30 18:35:06 -07:00
Benedict Jin	331a0afb98	Remove redundant type parameters and enforce some other style and inspection rules (#5980 ) * Various changes about druid-services module * Patch improvements from reviewer * Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile * Fix ArraysAsListWithZeroOrOneArgument * Fix conflict * Fix ToArrayCallWithZeroLengthArrayArgument * Fix AliEqualsAvoidNull * Remove blank line * Remove unused import clauses * Fix code style in TopNQueryRunnerTest * Fix conflict * Don't use Collections.singletonList when converting the type of array type * Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase * Roll back the latest commit * Add java.io.File#toURL() into druid-forbidden-apis * Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord * Add a new regexp element into stylecode xml file * Fix style error for new regexp * Set the level of ArraysAsListWithZeroOrOneArgument as WARNING * Fix style error for new regexp * Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile * Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR * Add toArray(new Object[0]) regexp into checkstyle config file & fix them * Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it * Add a comment for string equals regexp in checkstyle config * Fix code format * Add RedundantTypeArguments as ERROR level inspection * Fix cannot resolve symbol datasource	2018-07-27 16:56:49 -05:00
Jihoon Son	c48aa74a30	Fix NPE while handling CheckpointNotice in KafkaSupervisor (#5996 ) * Fix NPE while handling CheckpointNotice * fix code style * Fix test * fix test * add a log for creating a new taskGroup * fix backward compatibility in KafkaIOConfig	2018-07-13 17:14:57 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Gian Merlino	948e73da77	Extend various test timeouts. (#5978 ) False failures on Travis due to spurious timeout (in turn due to noisy neighbors) is a bigger problem than legitimate failures taking too long to time out. So it makes sense to extend timeouts.	2018-07-10 13:02:14 -07:00
zhangxinyu	d857345b7d	add method getRequiredColumns for DimFilter (#5872 ) * add method getRequiredColumns for DimFilter * deal with the NullPointerException when DimFilter is null	2018-06-27 15:45:46 -07:00
Surekha	8619adb5b9	Improve task retrieval APIs on Overlord (#5801 ) * Add the new tasks api in overlordResource It takes 4 optional query params * state(pending/running/waiting/compelte) * dataSource * interval (applies to completed tasks) * maxCompletedTasks (applies to completed tasks) If all params are null, the api returns all the tasks * Add the state to each task returned by tasks endpoint * divide active tasks into waiting, pending or running * Add more unit tests * Add UNKNOWN state to TaskState * Fix the authorization calls * WIP: PR comments Added new class to capture task info for caching Other refactoring * Refactoring : move TaskStatus class to druid-api so it can be accessed within server And other related classes like TaskState and TaskStatusPlus are in api * Remove unused class and apis accessing it * Add a separate cache for recently completed tasks This is to mainly capture the task type from payload * Ignore a test * Add a RuntimeTaskState to encompass all states a task can be in * Revert "Add a RuntimeTaskState to encompass all states a task can be in" This reverts commit `2a527a0731`. * Fix wrong api call * Fix and unignore tests * Remove waiting,pending state from TaskState * Add RunnerTaskState * Missed the annotation runnerStatusCode * Fix the creationTime * Fix the createdTime and queueInsertionTime for running/active tasks * Clean up tests * Add javadocs * Potentially fix the teamcity build * Address PR comments Get rid of TaskInfoBuilder Make TaskInfoMapper static nested class Other changes fix import in MaterializedViewSupervisor after merge * Address PR comments on * Replace global cache with local map * combine multiple queries into one * Removed unused code * Fix unit tests Fix a bug in securedTaskStatusPlus * Remove getRecentlyFinishedTaskStatuses method Change TaskInfoMapper signature to add generic type * Address PR comments * Passed datasource as argument to be used in sql query * Other minor fixes * Address PR comments Some minor changes, rename method, spacing changes Add early auth check if datasource is not null * Fix test case * Add max limit to getRecentlyFinishedTaskInfo in HeapMemoryTaskStorage * Add TaskLocation to Anytask object * Address PR comments * Fix a bug in test case causing ClassCastException	2018-06-19 11:34:59 -07:00

1 2 3 4 5 ...

326 Commits