druid

Commit Graph

Author	SHA1	Message	Date
Goh Wei Xiang	f68a0693f3	Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397 ) * Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe. * - Use of computeIfAbsent over putIfAbsent - Replace Maps.newXXXMap() with normal instantiation - Documentations on when is thread-safe required. - Use Builders for On/OffheapIncrementalIndex * - Optimization on computeIfAbsent - Constant EMPTY DimensionsSpec - Improvement on IncrementalIndexSchema.Builder - Remove setting of default values - Use var args for metrics - Correction on On/OffheapIncrementalIndex Builders - Combine On/OffheapIncrementalIndex Builders * - Removing unused imports. * - Helper method for testing with IncrementalIndex.Builder * - Correction on javadoc. * Style fix	2017-06-16 16:04:19 -05:00
Gian Merlino	17ef785618	Speed up sketch tests by merging fewer indexes. (#4413 ) The tests go from 5 minutes to about 10 seconds. 1000 maxRowCount is still low enough to get a few merges, so we're still exercising that functionality.	2017-06-15 14:47:55 -05:00
Roman Leventov	976492c186	Make PolyBind to fail if property value is not found (fixes #4369 ) (#4374 ) * Make PolyBind to fail if property value is not found * Fix test * Add onHeap option in NamespaceExtractionModule * Add PolyBind.createChoiceWithDefaultNoScope() * Fix NPE * Fix * Configure MetadataStorageProvider option for MySQL, PostgreSQL and SQLServer * Deprecate PolyBind.createChoiceWithDefault form with unused defaultKey * Fix NPE	2017-06-13 09:45:43 -07:00
Roman Leventov	c121845102	Avoid using Guava in DataSegmentPushers because of incompatibilities (#4391 ) * Avoid using Guava in DataSegmentPushers because of Hadoop incompatibilities * Clarify comments	2017-06-12 09:58:34 -07:00
Roman Leventov	5285eb961b	Update dependencies (#4313 ) * Update dependencies * Downgrade curator * Rollback aws-java-sdk dependency to 1.10.77 * Revert exclusions in integration-tests * Depend only on aws-java-sdk-ec2 instead of umbrella aws-java-sdk (fixes #4382)	2017-06-09 14:32:07 -07:00
Niketh Sabbineni	2cd91b64d0	Uncompress streams without having to download to tmp first (#4364 ) * Uncompress streams without having to download to tmp first * Remove unused file	2017-06-08 18:08:38 -07:00
Gian Merlino	1f2afccdf8	Expressions: Add ExprMacros. (#4365 ) * Expressions: Add ExprMacros, which have the same syntax as functions, but can convert themselves to any kind of Expr at parse-time. ExprMacroTable is an extension point for adding new ExprMacros. Anything that might need to parse expressions needs an ExprMacroTable, which can be injected through Guice. * Address code review comments.	2017-06-08 09:32:10 -04:00
Roman Leventov	63a897c278	Enable most IntelliJ 'Probable bugs' inspections (#4353 ) * Enable most IntelliJ 'Probable bugs' inspections * Fix in RemoteTestNG * Fix IndexSpec's equals() and hashCode() to include longEncoding * Fix inspection errors * Extract global isntance of natural().nullsFirst(); address comments * Fix * Use noinspection comments instead of SuppressWarnings on method for IntelliJ-specific inspections * Prohibit Ordering.natural().nullsFirst() using Checkstyle	2017-06-07 09:54:25 -07:00
Roman Leventov	31d33b333e	Make using implicit system Charset an error (#4326 ) * Make using implicit system charset an error * Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String() * Use English locale in StringUtils.safeFormat() * Restore comment	2017-06-05 23:57:25 -07:00
David Lim	13ecf90923	Report Kafka lag information in supervisor status report (#4314 ) * refactor lag reporting and report lag at status endpoint * refactor offset reporting logic to fetch offsets periodically vs. at request time * remove JavaCompatUtils * code review changes * code review changes	2017-06-05 13:26:25 -07:00
Slim	a2584d214a	Delagate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116 ) * Adding s3a schema and s3a implem to hdfs storage module. * use 2.7.3 * use segment pusher to make loadspec * move getStorageDir and makeLoad spec under DataSegmentPusher * fix uts * fix comment part1 * move to hadoop 2.8 * inject deep storage properties * set version to 2.7.3 * fix build issue about static class * fix comments * fix default hadoop default coordinate * fix create filesytem * downgrade aws sdk * bump the version	2017-06-04 00:55:09 -06:00
Roman Leventov	ebabe14fbe	Rename ExtractionNamespaceCacheFactory to CachePopulator (the last part of #3667 ) (#4303 ) * Renamed ExtractionNamespaceCacheFactory to CachePopulator, and related classes * Rename CachePopulator to CacheGenerator	2017-06-03 10:09:44 +09:00
Jihoon Son	da32e1ae53	Reducing testing time for KafkaIndexTaskTest and KafkaSupervisorTest (#4352 )	2017-06-03 00:53:07 +09:00
Jihoon Son	f876246af7	Rename FiniteAppenderatorDriver to AppenderatorDriver (#4356 )	2017-06-03 00:48:44 +09:00
kaijianding	0efd18247b	explicitly unmap hydrant files when abandonSegment to recycle mmap memory (#4341 ) * fix TestKafkaExtractionCluster fail due to port already used * explicitly unmap hydrant files when abandonSegment to recyle mmap memory * address the comments * apply to AppenderatorImpl	2017-06-01 18:15:30 -05:00
Jihoon Son	1150bf7a2c	Refactoring Appenderator Driver (#4292 ) * Refactoring Appenderator 1) Added publishExecutor and handoffExecutor for background publishing and handing segments off 2) Change add() to not move segments out in it * Address comments 1) Remove publishTimeout for KafkaIndexTask 2) Simplifying registerHandoff() 3) Add increamental handoff test * Remove unused variable * Add persist() to Appenderator and more tests for AppenderatorDriver * Remove unused imports * Fix strict build * Address comments	2017-06-02 07:09:11 +09:00
Kenji Noguchi	3400f601db	Protobuf extension (#4039 ) * move ProtoBufInputRowParser from processing module to protobuf extensions * Ported PR #3509 * add DynamicMessage * fix local test stuff that slipped in * add license header * removed redundant type name * removed commented code * fix code style * rename ProtoBuf -> Protobuf * pom.xml: shade protobuf classes, handle .desc resource file as binary file * clean up error messages * pick first message type from descriptor if not specified * fix protoMessageType null check. add test case * move protobuf-extension from contrib to core * document: add new configuration keys, and descriptions * update document. add examples * move protobuf-extension from contrib to core (2nd try) * touch * include protobuf extensions in the distribution * fix whitespace * include protobuf example in the distribution * example: create new pb obj everytime * document: use properly quoted json * fix whitespace * bump parent version to 0.10.1-SNAPSHOT * ignore Override check * touch	2017-05-30 13:11:58 -07:00
Jihoon Son	7889891bd3	Fix integration tests (#4337 ) * Fix integration tests 1) Use the same version of kafka 2) Change ServiceEmitter from LazySingleton to ManageLifecycle * Revert unnecessary change	2017-05-28 08:48:39 -07:00
Gian Merlino	fe42db98ac	URIExtractionNamespace: Avoid problems due to canonicalization of lookup fields. (#4307 ) Disables canonicalization for simpleJson, where expect field names to be unique anyway. Keeps canonicalization enabled for customJson, but avoids sharing the table with the global ObjectMapper.	2017-05-24 17:41:04 -07:00
Jonathan Wei	d49e53e6c2	Timeout and maxScatterGatherBytes handling for queries run by Druid SQL (#4305 ) * Timeout and maxScatterGatherBytes handling for queries run by Druid SQL * Address PR comments * Fix contexts in CalciteQueryTest * Fix contexts in QuantileSqlAggregatorTest	2017-05-23 16:57:51 +09:00
Roman Leventov	7479cbde68	Make CacheScheduler a singleton (#4293 )	2017-05-18 15:46:02 -07:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetcheableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00
Himanshu	daa8ef8658	Optional long-polling based segment announcement via HTTP instead of Zookeeper (#3902 ) * Optional long-polling based segment announcement via HTTP instead of Zookeeper * address review comments * make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed * update segment callback initialization to be called only after first segment list fetch has been succeeded from all servers * address review comments * remove size check not required anymore as only segment servers announce themselves and not all peon processes * annouce segment server on historical only after cached segments are loaded * fix checkstyle errors	2017-05-17 16:31:58 -05:00
Roman Leventov	d400f23791	Monomorphic processing of TopN queries with simple double aggregators over historical segments (part of #3798 ) (#4079 ) * Monomorphic processing of topN queries with simple double aggregators and historical segments * Add CalledFromHotLoop annocations to specialized methods in SimpleDoubleBufferAggregator * Fix a bug in Historical1SimpleDoubleAggPooledTopNScannerPrototype * Fix a bug in SpecializationService * In SpecializationService, emit maxSpecializations warning only once * Make GenericIndexed.theBuffer final * Address comments * Newline * Reapply `439c906` (Make GenericIndexed.theBuffer final) * Remove extra PooledTopNAlgorithm.capabilities field * Improve CachingIndexed.inspectRuntimeShape() * Fix CompressedVSizeIntsIndexedSupplier.inspectRuntimeShape() * Don't override inspectRuntimeShape() in subclasses of CompressedVSizeIndexedInts * Annotate methods in specializations of DimensionSelector and FloatColumnSelector with @CalledFromHotLoop * Make ValueMatcher to implement HotLoopCallee * Doc fix * Fix inspectRuntimeShape() impl in ExpressionSelectors * INFO logging of specialization events * Remove modificator * Fix OrFilter * Fix AndFilter * Refactor PooledTopNAlgorithm.scanAndAggregate() * Small refactoring * Add 'nothing to inspect' messages in empty HotLoopCallee.inspectRuntimeShape() implementations * Don't care about runtime shape in tests * Fix accessor bugs in Historical1SimpleDoubleAggPooledTopNScannerPrototype and HistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopNScannerPrototype, cover them with tests * Doc wording * Address comments * Remove MagicAccessorBridge and ensure Offset subclasses are public * Attach error message to element	2017-05-16 16:19:55 -07:00
Roman Leventov	b7a52286e8	Make @Override annotation obligatory (#4274 ) * Make MissingOverride an error * Make travis stript to fail fast * Add missing Override annotations * Comment	2017-05-16 13:30:30 -05:00
David Lim	8333043b7b	add skipOffsetGaps flag (#4256 )	2017-05-16 12:19:28 -06:00
Benedict Jin	e823085866	Improve `collection` related things that reusing a immutable object instead of creating a new object (#4135 )	2017-05-17 01:38:51 +09:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Fokko Driesprong	5ca67644e7	Remove slf4j as dependencies (#4233 ) From the kafka-schema-registry-client in the avro extension slf4j will be packaged into the distribution. We don't want this as it will conflict and throw a slf4j multiple bindings warning. This will cause slf4j to fall back to no-operation (NOP) binding.	2017-05-12 15:59:14 +09:00
Roman Leventov	1ebfa22955	Update Error prone configuration; Fix bugs (#4252 ) * Make Errorprone the default compiler * Address comments * Make Error Prone's ClassCanBeStatic rule a error * Preconditions allow only %s pattern * Fix DruidCoordinatorBalancerTester * Try to give the compiler more memory * Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now * Don't show compiler warnings * Try different travis script * Fix travis.yml * Make Error Prone optional again * For error-prone compiler * Increase compiler's maxmem * Don't run Error Prone for benchmarks because of OOM * Skip install step in Travis * Remove MetricHolder.writeToChannel() * In travis.yml, check compilation before tests, because it may fail faster	2017-05-12 15:55:17 +09:00
Roman Leventov	e09e892477	Refactor QueryRunner to accept QueryPlus: Query + QueryMetrics (part of #3798 ) (#4184 ) * Add QueryPlus. Add QueryRunner.run(QueryPlus, Map) method with default implementation, to replace QueryRunner.run(Query, Map). * Fix GroupByMergingQueryRunnerV2 * Fix QueryResourceTest * Expand the comment to Query.run(walker, context) * Remove legacy version of BySegmentSkippingQueryRunner.doRun() * Add LegacyApiQueryRunnerTest and be more specific about legacy API removal plans in Druid 0.11 in Javadocs	2017-05-10 12:25:00 -07:00
Parag Jain	1fd177039d	fix auto reset - pause task instead of putting thread to sleep (#4244 )	2017-05-08 15:08:25 -07:00
Parag Jain	eb8e1b0a97	Prevent interrupted exception from polluting log during supervisor shutdown (#4253 ) * Prevent interrupted exception from polluting log during supervisor shutdown * do nothing in case of InterruptedException	2017-05-08 15:05:25 -07:00
Parag Jain	4502c207af	fix injection bug and documentation (#4243 )	2017-05-03 15:07:43 -05:00
Parag Jain	f9a61ea2ba	Kafka lag emitter - Kafka Indexing Service (#4194 ) * Kafka lag emitter * enforce minimum emit period to a minute * fixed comment	2017-05-02 17:30:07 -06:00
Roman Leventov	0bc18e7906	Make UpdateCounter proof to update count overflow (#4138 ) * Make UpdateCounter proof to update count overflow. * Fix	2017-05-01 09:59:49 -07:00
Bas van Schaik	54463941b9	Fix two alerts from lgtm.com: comparing two boxed primitive values using (#4212 ) the == or != operator compares object identity, which may not be intended Details: `013566ade9/files/extensions-core/datasketches/src/main/java/io/druid/query/aggregation/datasketches/theta/SketchEstimatePostAggregator.java (V144)` `013566ade9/files/extensions-core/datasketches/src/main/java/io/druid/query/aggregation/datasketches/theta/SketchMergeAggregatorFactory.java (V164)`	2017-04-26 14:56:25 -07:00
Akash Dwivedi	a2419654ea	Allow hadoop configurations using runtime properties. (#4189 )	2017-04-26 00:05:27 +05:30
Gian Merlino	3b92220015	Reduce log spam from Avro decoders. (#4205 ) These objects get constructed semi-frequently (any time a parser is deserialized) and so info logs are spammy. They'll still appear in task logs at least once, since they're part of the task definition and will get logged due to that.	2017-04-25 23:59:59 +05:30
Benedict Jin	de815da942	Some code refactor for better performance of `Avro-Extension` (#4092 ) * 1. Collections.singletonList instand of Arrays.asList; 2. close FSDataInputStream/ByteBufferInputStream for releasing resource; 3. convert com.google.common.base.Function into java.util.function.Function; 4. others code refactor * Put each param on its own line for code style * Revert GenericRecordAsMap back about `Function`	2017-04-25 12:46:32 +09:00
satishbhor	d51097c809	Fix lz4 library incompatibility in kafka-indexing-service extension (#4115 ) * Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension #4115	2017-04-25 12:23:51 +09:00
Gian Merlino	2ca7b00346	Update versions to 0.10.1-SNAPSHOT. (#4191 )	2017-04-20 18:12:28 -07:00
Jerry Chung	0bcfd9354c	Fix S3 deep storage push and s3 insert-segment-to-db (#4174 ) * Fix S3 deep storage push and s3 insert-segment-to-db * Less verbose checks in S3DataSegmentFinder	2017-04-14 19:42:10 -07:00
Gian Merlino	b2954d5fea	Better groupBy error messages and docs around resource limits. (#4162 ) * Better groupBy error messages and docs around resource limits. * Fix BufferGrouper test from datasketches. * Further clarify.	2017-04-13 10:38:53 -07:00
Roman Leventov	15f3a94474	Copy closer into Druid codebase (fixes #3652 ) (#4153 )	2017-04-10 09:38:45 +09:00
Parag Jain	7e0d4c9555	secure supervisor endpoints (#3985 )	2017-04-05 16:42:32 -07:00
Roman Leventov	73d9b31664	GenericIndexed minor bug fixes, optimizations and refactoring (#3951 ) * Minor bug fixes in GenericIndexed; Refactor and optimize GenericIndexed; Remove some unnecessary ByteBuffer duplications in some deserialization paths; Add ZeroCopyByteArrayOutputStream * Fixes * Move GenericIndexedWriter.writeLongValueToOutputStream() and writeIntValueToOutputStream() to SerializerUtils * Move constructors * Add GenericIndexedBenchmark * Comments * Typo * Note in Javadoc that IntermediateLongSupplierSerializer, LongColumnSerializer and LongMetricColumnSerializer are thread-unsafe * Use primitive collections in IntermediateLongSupplierSerializer instead of BiMap * Optimize TableLongEncodingWriter * Add checks to SerializerUtils methods * Don't restrict byte order in SerializerUtils.writeLongToOutputStream() and writeIntToOutputStream() * Update GenericIndexedBenchmark * SerializerUtils.writeIntToOutputStream() and writeLongToOutputStream() separate for big-endian and native-endian * Add GenericIndexedBenchmark.indexOf() * More checks in methods in SerializerUtils * Use helperBuffer.arrayOffset() * Optimizations in SerializerUtils	2017-03-27 14:17:31 -05:00
Benedict Jin	23f77ebd20	Explain Avro's unnecessary EOFException (#4098 ) (#4100 ) * Explain Avro's unnecessary EOFException (#4098) * add jira link into log message	2017-03-24 10:45:45 -05:00
Gian Merlino	4b9f975f50	Rename SketchAggregationWithSimpleDataTest. (#4105 ) Tests that don't end in "Test" won't get run automatically by Maven.	2017-03-23 14:20:50 -07:00
Akash Dwivedi	ff7f90b02d	relocate method in BufferAggregator. (#4071 ) * relocate method in BufferAggregator. * Unused import. * Detailed javadoc. * using Int2ObjectMap. * batch relocate. * Revert batch relocate. * Unused import. * code comments. * code comment.	2017-03-23 13:07:59 -07:00

1 2 3 4

196 Commits