druid

Commit Graph

Author	SHA1	Message	Date
Roman Leventov	31d33b333e	Make using implicit system Charset an error (#4326 ) * Make using implicit system charset an error * Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String() * Use English locale in StringUtils.safeFormat() * Restore comment	2017-06-05 23:57:25 -07:00
David Lim	13ecf90923	Report Kafka lag information in supervisor status report (#4314 ) * refactor lag reporting and report lag at status endpoint * refactor offset reporting logic to fetch offsets periodically vs. at request time * remove JavaCompatUtils * code review changes * code review changes	2017-06-05 13:26:25 -07:00
Slim	a2584d214a	Delagate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116 ) * Adding s3a schema and s3a implem to hdfs storage module. * use 2.7.3 * use segment pusher to make loadspec * move getStorageDir and makeLoad spec under DataSegmentPusher * fix uts * fix comment part1 * move to hadoop 2.8 * inject deep storage properties * set version to 2.7.3 * fix build issue about static class * fix comments * fix default hadoop default coordinate * fix create filesytem * downgrade aws sdk * bump the version	2017-06-04 00:55:09 -06:00
Roman Leventov	ebabe14fbe	Rename ExtractionNamespaceCacheFactory to CachePopulator (the last part of #3667 ) (#4303 ) * Renamed ExtractionNamespaceCacheFactory to CachePopulator, and related classes * Rename CachePopulator to CacheGenerator	2017-06-03 10:09:44 +09:00
Jihoon Son	da32e1ae53	Reducing testing time for KafkaIndexTaskTest and KafkaSupervisorTest (#4352 )	2017-06-03 00:53:07 +09:00
Jihoon Son	f876246af7	Rename FiniteAppenderatorDriver to AppenderatorDriver (#4356 )	2017-06-03 00:48:44 +09:00
kaijianding	0efd18247b	explicitly unmap hydrant files when abandonSegment to recycle mmap memory (#4341 ) * fix TestKafkaExtractionCluster fail due to port already used * explicitly unmap hydrant files when abandonSegment to recyle mmap memory * address the comments * apply to AppenderatorImpl	2017-06-01 18:15:30 -05:00
Jihoon Son	1150bf7a2c	Refactoring Appenderator Driver (#4292 ) * Refactoring Appenderator 1) Added publishExecutor and handoffExecutor for background publishing and handing segments off 2) Change add() to not move segments out in it * Address comments 1) Remove publishTimeout for KafkaIndexTask 2) Simplifying registerHandoff() 3) Add increamental handoff test * Remove unused variable * Add persist() to Appenderator and more tests for AppenderatorDriver * Remove unused imports * Fix strict build * Address comments	2017-06-02 07:09:11 +09:00
Kenji Noguchi	3400f601db	Protobuf extension (#4039 ) * move ProtoBufInputRowParser from processing module to protobuf extensions * Ported PR #3509 * add DynamicMessage * fix local test stuff that slipped in * add license header * removed redundant type name * removed commented code * fix code style * rename ProtoBuf -> Protobuf * pom.xml: shade protobuf classes, handle .desc resource file as binary file * clean up error messages * pick first message type from descriptor if not specified * fix protoMessageType null check. add test case * move protobuf-extension from contrib to core * document: add new configuration keys, and descriptions * update document. add examples * move protobuf-extension from contrib to core (2nd try) * touch * include protobuf extensions in the distribution * fix whitespace * include protobuf example in the distribution * example: create new pb obj everytime * document: use properly quoted json * fix whitespace * bump parent version to 0.10.1-SNAPSHOT * ignore Override check * touch	2017-05-30 13:11:58 -07:00
Jihoon Son	7889891bd3	Fix integration tests (#4337 ) * Fix integration tests 1) Use the same version of kafka 2) Change ServiceEmitter from LazySingleton to ManageLifecycle * Revert unnecessary change	2017-05-28 08:48:39 -07:00
Gian Merlino	fe42db98ac	URIExtractionNamespace: Avoid problems due to canonicalization of lookup fields. (#4307 ) Disables canonicalization for simpleJson, where expect field names to be unique anyway. Keeps canonicalization enabled for customJson, but avoids sharing the table with the global ObjectMapper.	2017-05-24 17:41:04 -07:00
Jonathan Wei	d49e53e6c2	Timeout and maxScatterGatherBytes handling for queries run by Druid SQL (#4305 ) * Timeout and maxScatterGatherBytes handling for queries run by Druid SQL * Address PR comments * Fix contexts in CalciteQueryTest * Fix contexts in QuantileSqlAggregatorTest	2017-05-23 16:57:51 +09:00
Roman Leventov	7479cbde68	Make CacheScheduler a singleton (#4293 )	2017-05-18 15:46:02 -07:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetcheableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00
Himanshu	daa8ef8658	Optional long-polling based segment announcement via HTTP instead of Zookeeper (#3902 ) * Optional long-polling based segment announcement via HTTP instead of Zookeeper * address review comments * make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed * update segment callback initialization to be called only after first segment list fetch has been succeeded from all servers * address review comments * remove size check not required anymore as only segment servers announce themselves and not all peon processes * annouce segment server on historical only after cached segments are loaded * fix checkstyle errors	2017-05-17 16:31:58 -05:00
Roman Leventov	d400f23791	Monomorphic processing of TopN queries with simple double aggregators over historical segments (part of #3798 ) (#4079 ) * Monomorphic processing of topN queries with simple double aggregators and historical segments * Add CalledFromHotLoop annocations to specialized methods in SimpleDoubleBufferAggregator * Fix a bug in Historical1SimpleDoubleAggPooledTopNScannerPrototype * Fix a bug in SpecializationService * In SpecializationService, emit maxSpecializations warning only once * Make GenericIndexed.theBuffer final * Address comments * Newline * Reapply `439c906` (Make GenericIndexed.theBuffer final) * Remove extra PooledTopNAlgorithm.capabilities field * Improve CachingIndexed.inspectRuntimeShape() * Fix CompressedVSizeIntsIndexedSupplier.inspectRuntimeShape() * Don't override inspectRuntimeShape() in subclasses of CompressedVSizeIndexedInts * Annotate methods in specializations of DimensionSelector and FloatColumnSelector with @CalledFromHotLoop * Make ValueMatcher to implement HotLoopCallee * Doc fix * Fix inspectRuntimeShape() impl in ExpressionSelectors * INFO logging of specialization events * Remove modificator * Fix OrFilter * Fix AndFilter * Refactor PooledTopNAlgorithm.scanAndAggregate() * Small refactoring * Add 'nothing to inspect' messages in empty HotLoopCallee.inspectRuntimeShape() implementations * Don't care about runtime shape in tests * Fix accessor bugs in Historical1SimpleDoubleAggPooledTopNScannerPrototype and HistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopNScannerPrototype, cover them with tests * Doc wording * Address comments * Remove MagicAccessorBridge and ensure Offset subclasses are public * Attach error message to element	2017-05-16 16:19:55 -07:00
Roman Leventov	b7a52286e8	Make @Override annotation obligatory (#4274 ) * Make MissingOverride an error * Make travis stript to fail fast * Add missing Override annotations * Comment	2017-05-16 13:30:30 -05:00
David Lim	8333043b7b	add skipOffsetGaps flag (#4256 )	2017-05-16 12:19:28 -06:00
Benedict Jin	e823085866	Improve `collection` related things that reusing a immutable object instead of creating a new object (#4135 )	2017-05-17 01:38:51 +09:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Fokko Driesprong	5ca67644e7	Remove slf4j as dependencies (#4233 ) From the kafka-schema-registry-client in the avro extension slf4j will be packaged into the distribution. We don't want this as it will conflict and throw a slf4j multiple bindings warning. This will cause slf4j to fall back to no-operation (NOP) binding.	2017-05-12 15:59:14 +09:00
Roman Leventov	1ebfa22955	Update Error prone configuration; Fix bugs (#4252 ) * Make Errorprone the default compiler * Address comments * Make Error Prone's ClassCanBeStatic rule a error * Preconditions allow only %s pattern * Fix DruidCoordinatorBalancerTester * Try to give the compiler more memory * Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now * Don't show compiler warnings * Try different travis script * Fix travis.yml * Make Error Prone optional again * For error-prone compiler * Increase compiler's maxmem * Don't run Error Prone for benchmarks because of OOM * Skip install step in Travis * Remove MetricHolder.writeToChannel() * In travis.yml, check compilation before tests, because it may fail faster	2017-05-12 15:55:17 +09:00
Roman Leventov	e09e892477	Refactor QueryRunner to accept QueryPlus: Query + QueryMetrics (part of #3798 ) (#4184 ) * Add QueryPlus. Add QueryRunner.run(QueryPlus, Map) method with default implementation, to replace QueryRunner.run(Query, Map). * Fix GroupByMergingQueryRunnerV2 * Fix QueryResourceTest * Expand the comment to Query.run(walker, context) * Remove legacy version of BySegmentSkippingQueryRunner.doRun() * Add LegacyApiQueryRunnerTest and be more specific about legacy API removal plans in Druid 0.11 in Javadocs	2017-05-10 12:25:00 -07:00
Parag Jain	1fd177039d	fix auto reset - pause task instead of putting thread to sleep (#4244 )	2017-05-08 15:08:25 -07:00
Parag Jain	eb8e1b0a97	Prevent interrupted exception from polluting log during supervisor shutdown (#4253 ) * Prevent interrupted exception from polluting log during supervisor shutdown * do nothing in case of InterruptedException	2017-05-08 15:05:25 -07:00
Parag Jain	4502c207af	fix injection bug and documentation (#4243 )	2017-05-03 15:07:43 -05:00
Parag Jain	f9a61ea2ba	Kafka lag emitter - Kafka Indexing Service (#4194 ) * Kafka lag emitter * enforce minimum emit period to a minute * fixed comment	2017-05-02 17:30:07 -06:00
Roman Leventov	0bc18e7906	Make UpdateCounter proof to update count overflow (#4138 ) * Make UpdateCounter proof to update count overflow. * Fix	2017-05-01 09:59:49 -07:00
Bas van Schaik	54463941b9	Fix two alerts from lgtm.com: comparing two boxed primitive values using (#4212 ) the == or != operator compares object identity, which may not be intended Details: `013566ade9/files/extensions-core/datasketches/src/main/java/io/druid/query/aggregation/datasketches/theta/SketchEstimatePostAggregator.java (V144)` `013566ade9/files/extensions-core/datasketches/src/main/java/io/druid/query/aggregation/datasketches/theta/SketchMergeAggregatorFactory.java (V164)`	2017-04-26 14:56:25 -07:00
Akash Dwivedi	a2419654ea	Allow hadoop configurations using runtime properties. (#4189 )	2017-04-26 00:05:27 +05:30
Gian Merlino	3b92220015	Reduce log spam from Avro decoders. (#4205 ) These objects get constructed semi-frequently (any time a parser is deserialized) and so info logs are spammy. They'll still appear in task logs at least once, since they're part of the task definition and will get logged due to that.	2017-04-25 23:59:59 +05:30
Benedict Jin	de815da942	Some code refactor for better performance of `Avro-Extension` (#4092 ) * 1. Collections.singletonList instand of Arrays.asList; 2. close FSDataInputStream/ByteBufferInputStream for releasing resource; 3. convert com.google.common.base.Function into java.util.function.Function; 4. others code refactor * Put each param on its own line for code style * Revert GenericRecordAsMap back about `Function`	2017-04-25 12:46:32 +09:00
satishbhor	d51097c809	Fix lz4 library incompatibility in kafka-indexing-service extension (#4115 ) * Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension #4115	2017-04-25 12:23:51 +09:00
Gian Merlino	2ca7b00346	Update versions to 0.10.1-SNAPSHOT. (#4191 )	2017-04-20 18:12:28 -07:00
Jerry Chung	0bcfd9354c	Fix S3 deep storage push and s3 insert-segment-to-db (#4174 ) * Fix S3 deep storage push and s3 insert-segment-to-db * Less verbose checks in S3DataSegmentFinder	2017-04-14 19:42:10 -07:00
Gian Merlino	b2954d5fea	Better groupBy error messages and docs around resource limits. (#4162 ) * Better groupBy error messages and docs around resource limits. * Fix BufferGrouper test from datasketches. * Further clarify.	2017-04-13 10:38:53 -07:00
Roman Leventov	15f3a94474	Copy closer into Druid codebase (fixes #3652 ) (#4153 )	2017-04-10 09:38:45 +09:00
Parag Jain	7e0d4c9555	secure supervisor endpoints (#3985 )	2017-04-05 16:42:32 -07:00
Roman Leventov	73d9b31664	GenericIndexed minor bug fixes, optimizations and refactoring (#3951 ) * Minor bug fixes in GenericIndexed; Refactor and optimize GenericIndexed; Remove some unnecessary ByteBuffer duplications in some deserialization paths; Add ZeroCopyByteArrayOutputStream * Fixes * Move GenericIndexedWriter.writeLongValueToOutputStream() and writeIntValueToOutputStream() to SerializerUtils * Move constructors * Add GenericIndexedBenchmark * Comments * Typo * Note in Javadoc that IntermediateLongSupplierSerializer, LongColumnSerializer and LongMetricColumnSerializer are thread-unsafe * Use primitive collections in IntermediateLongSupplierSerializer instead of BiMap * Optimize TableLongEncodingWriter * Add checks to SerializerUtils methods * Don't restrict byte order in SerializerUtils.writeLongToOutputStream() and writeIntToOutputStream() * Update GenericIndexedBenchmark * SerializerUtils.writeIntToOutputStream() and writeLongToOutputStream() separate for big-endian and native-endian * Add GenericIndexedBenchmark.indexOf() * More checks in methods in SerializerUtils * Use helperBuffer.arrayOffset() * Optimizations in SerializerUtils	2017-03-27 14:17:31 -05:00
Benedict Jin	23f77ebd20	Explain Avro's unnecessary EOFException (#4098 ) (#4100 ) * Explain Avro's unnecessary EOFException (#4098) * add jira link into log message	2017-03-24 10:45:45 -05:00
Gian Merlino	4b9f975f50	Rename SketchAggregationWithSimpleDataTest. (#4105 ) Tests that don't end in "Test" won't get run automatically by Maven.	2017-03-23 14:20:50 -07:00
Akash Dwivedi	ff7f90b02d	relocate method in BufferAggregator. (#4071 ) * relocate method in BufferAggregator. * Unused import. * Detailed javadoc. * using Int2ObjectMap. * batch relocate. * Revert batch relocate. * Unused import. * code comments. * code comment.	2017-03-23 13:07:59 -07:00
Roman Leventov	84fe91ba0b	Monomorphic processing of TopN queries with 1 and 2 aggregators (key part of #3798 ) (#3889 ) * Monomorphic processing: add HotLoopCallee, CalledFromHotLoop, RuntimeShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing. * Use Execs.singleThreaded() * RuntimeShapeInspector to support nullable fields * Make CalledFromHotLoop annotation Inherited * Remove unnecessary conversion of array of ColumnSelectorPluses to list and back to array in CardinalityAggregatorFactory * Close InputStream in SpecializationService * Formatting * Test specialized PooledTopNScanners * Set flags in PooledTopNAlgorithm directly * Fix tests, dependent on CountAggragatorFactory toString() form * Fix * Revert CountAggregatorFactory changes * Implement inspectRuntimeShape() for LongWrappingDimensionSelector and FloatWrappingDimensionSelector * Remove duplicate RoaringBitmap dependency in the extendedset pom.xml * Fix * Treat ByteBuffers specially in StringRuntimeShape * Doc fix * Annotate BufferAggregator.init() with CalledFromHotLoop * Make triggerSpecializationIterationsThreshold an int * Remove SpecializationService.PerPrototypeClassState.of() * Add comments * Limit the amount of specializations that SpecializationService could make * Add default implementation for BufferAggregator.inspectRuntimeShape(), for compatibility with extensions * Use more efficient ConcurrentMap's idioms in SpecializationService	2017-03-17 14:44:36 -05:00
Charles Allen	805d85afda	Allow compilation as Java8 source and target (#3328 ) * Allow compilation as Java8 source and target for everything except API * Remove conditions in tests which assume that we may run with Java 7 * Update easymock to 3.4 * Make Animal Sniffer to check Java 1.8 usage; remove redundant druid-caffeine-cache configuration * Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity() * Remove java7 special for druid-api	2017-03-14 22:23:47 -06:00
Gian Merlino	3216134f8c	SQL: Make row extractions extensible and add one for lookups. (#3991 ) This is a reopening of #3989, since that PR was merged to master prematurely and accidentally.	2017-03-13 21:56:16 -07:00
Nishant Bangarwa	adbe89e7d6	Fix race in KafkaIndexTaskTest (#4031 ) task.pause(0) can return early before the task is actually paused. Exception for failure - java.lang.AssertionError: expected:<PAUSED> but was:<READING> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at io.druid.indexing.kafka.KafkaIndexTaskTest.testRunWithOffsetOutOfRangeExceptionAndPause(KafkaIndexTaskTest.java:1229) To reproduce add Thread.sleep(10000) in beginning of KafkaIndexTask.possiblypause method.	2017-03-09 07:34:46 -08:00
Gian Merlino	4ca5270e88	Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. (#4004 ) * Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. Includes two fixes: - groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results. - Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y". Also includes doc and test fixes: - groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again. - chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate. * Remove unused import. * Restore buffer size.	2017-03-06 12:27:02 -06:00
Akash Dwivedi	bebf9f34c7	HdfsDataSegmentPusher bug fix (#4003 ) * Fix for HdfsDataSegmentPusher. * Add missing loadspec in actual descriptor file. Tests to check actual content of descriptor file.	2017-03-06 00:53:44 -08:00
Gian Merlino	df623ebfe3	Fix a couple bugs due to calling Period.getMillis(). (#4006 )	2017-03-05 18:44:20 +05:30
Roman Leventov	81a5f9851f	TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir (#3990 ) * In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir * Use FileUtils.forceMkdir() across the codebase and remove some unused code * Fix test * Fix PullDependencies.run() * Unused import	2017-03-02 14:05:12 -08:00
Gian Merlino	e63eefd7ff	Revert "SQL: Make row extractions extensible and add one for lookups. (#3989 )" The PR was merged to master accidentally. This reverts commit `23927a3c96`.	2017-03-01 17:06:12 -08:00
Gian Merlino	23927a3c96	SQL: Make row extractions extensible and add one for lookups. (#3989 ) * SQL: Make row extractions extensible and add one for lookups. * Fix QuantileSqlAggregatorTest.	2017-03-01 17:03:43 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
Akash Dwivedi	91344cbe57	Enable GenericIndexed V2 for built-in(druid-io managed) complex columns. (#3987 ) * Enable GenericIndexed V2 for complex columns. * SerializerBuilder to use GenericColumnSerializer.	2017-02-28 22:06:54 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-grunalrity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Gian Merlino	f21641f0dc	Fix over-optimistic log message. (#3963 ) "Wrote task log" could be logged before the output stream is flushed and closed, which could generate an error and not actually write the log.	2017-02-22 15:02:53 -08:00
Parag Jain	edb032b96d	add datasource in intermediate segment path (#3961 )	2017-02-22 16:31:00 -06:00
Gian Merlino	985203b634	Finalize fields in postaggs (#3957 ) * initial commits for finalizeFieldAccess #2433 * fix some bugs to run a query * change name of method Queries.verifyAggregations to Queries.prepareAggregations * add Uts * fix Ut failures * rebased to master * address comments and add a Ut for arithmetic post aggregators * rebased to the master * address the comment of injection within arithmetic post aggregator * address comments and introduce decorate() in the PostAggregator interface. * Address comments. 1. Implements getComparator in FinalizingFieldAccessPostAggregator and add Uts for it 2. Some minor changes like renaming a method name. * Fix a code style mismatch. * Rebased to the master	2017-02-21 16:32:14 -08:00
Gian Merlino	16ef513c7d	SQL: Add context and contextual functions to planner. (#3919 ) * SQL: Add context and contextual functions to planner. Added support for context parameters specified as JDBC connection properties or a JSON object for SQL-over-JSON-over-HTTP. Also added features that depend on context functionality: - Added CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP functions. - Added support for time zones other than UTC via a "timeZone" context. - Pass down query context to Druid queries too. Also some bug fixes: - Fix DATE handling, it was largely done incorrectly before. - Fix CAST(__time TO DATE) which should do a floor-to-day. - Fix non-equality comparisons to FLOOR(__time TO X). - Fix maxQueryCount property. * Pass down context to nested queries too.	2017-02-15 14:09:14 -08:00
Gian Merlino	78b0d134ae	Require Java 8 and include some Java 8 dependencies. (#3914 ) * Require Java 8 and include some Java 8 dependencies. - Upgrade Jetty to 9.3.16.v20170120. - Upgrade DataSketches to 0.8.4. - Bundle caffeine-cache by default. - Still target Java 7 when compiling base Druid classes. * Update cluster, quickstart docs. * Remove oraclejdk7 from travis.yml.	2017-02-14 12:51:51 -08:00
Akash Dwivedi	8854ce018e	File.deleteOnExit() (#3923 ) * Less use of File.deleteOnExit() * removed deleteOnExit from most of the tests/benchmarks/iopeon * Made IOpeon closable * Formatting. * Revert DeterminePartitionsJobTest, remove cleanup method from IOPeon	2017-02-13 15:12:14 -08:00
Parag Jain	1f263fe50b	alert when resetting offsets (#3931 ) * alert when resetting offsets * add more data to alerts	2017-02-13 13:49:24 -08:00
michaelschiff	c1eee9bbf3	modified "end" column to `end` (#3903 ) * modified "end" column to `end`. "end" is interpretted as a string rather than dereferencing the column value * SQLMetadataConnector.getQuoteString defines the string that should be used to quote string fields * positional arguments for String.format * for Connectors that use " need to include the \ escape as well	2017-02-13 12:36:27 -08:00
Jihoon Son	991e2852da	Add PostAggregators to generator cache keys for top-n queries (#3899 ) * Add PostAggregators to generator cache keys for top-n queries * Add tests for strings * Remove debug comments * Add type keys and list sizes to cache key * Make post aggregators used for sort are considered for cache key generation * Use assertArrayEquals() * Improve findPostAggregatorsForSort() * Address comments * fix test failure * address comments	2017-02-13 12:23:44 -08:00
Parag Jain	8e31a465ad	report hand off count finite appenderator driver (#3925 )	2017-02-13 10:41:24 -08:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
Parag Jain	1aabb45a09	auto reset option for Kafka Indexing service (#3842 ) * auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers * review comments * review comments * reverted last change * review comments * review comments * fix typo	2017-02-02 14:57:45 -06:00
Nishant Bangarwa	a457cded28	Druid Extension to enable Authentication using Kerberos. (#3853 ) * Add extension for supporting kerberos security - This PR adds an extension for supporting druid authentication via Kerberos. - Working on the docs. * Add docs * review comments * more review comments * Block all paths by default * more review comments - use proper Oid * Allow extensions to override httpclient for integration tests * Add kerberos lock to prevent multithreaded issues. * review comment - remove enabled flag and fix router injection * Add Cookie Handling and more detailed docs * review comment - rename DruidKerberosConfig -> AuthKerberosConfig * review comments * fix travis failure on jdk7	2017-02-02 14:55:21 -06:00
Charles Allen	a73f1c9c70	Make s3 work better (#3898 )	2017-02-02 10:04:30 -08:00
Jonathan Wei	e6b95e80aa	Remove deprecated Aggregator/AggregatorFactory methods (#3894 )	2017-02-01 14:43:18 -08:00
Gian Merlino	ac84a3e011	SQL: Add resolution parameter, fix filtering bug with APPROX_QUANTILE (#3868 ) * SQL: Add resolution parameter to quantile agg, rename to APPROX_QUANTILE. * Fix bug with re-use of filtered approximate histogram aggregators. Also add APPROX_QUANTILE tests for filtering and running on complex columns. Includes some slight refactoring to allow tests to make DruidTables that include complex columns. * Remove unused import	2017-01-25 18:39:26 -08:00
Parag Jain	b3dae0efc3	catch all errors (#3844 )	2017-01-24 18:01:30 -07:00
Gian Merlino	d51f5e058d	SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852 ) * SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. Switched from CalciteConnection to Planner, bringing benefits: - CalciteConnection's JDBC interface no longer sits between the SQL server (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use Druid Sequence objects directly, reducing overhead in the query return path. - Implemented our own Planner-based Avatica Meta, letting us control connection timeouts and connection / statement limits. The previous CalciteConnection-based implementation didn't have any limits or timeouts. - The Planner interface lets us override the operator table, opening up SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT in core, and a QUANTILE aggregator in the druid-histogram extension. Also: - Added INFORMATION_SCHEMA metadata schema. - Added tests for Unicode literals and escapes. * Verify statement is actually open before closing it. * More detailed INFORMATION_SCHEMA docs.	2017-01-19 16:32:20 -08:00
Akash Dwivedi	e550d48772	Using fully qualified hdfs path. (#3705 ) * Using fully qualified hdfs path. * Review changes. * Remove unused imports. * Variable name change.	2017-01-17 14:40:22 -06:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Roman Leventov	49d71e9b38	Fix the build after #3697 (#3807 )	2016-12-26 17:06:48 -06:00
Roman Leventov	33800122ad	Don't return leaked Objects back to StupidPool, because this is dangerous. Reuse Cleaners in StupidPool. Make StupidPools named. Add StupidPool.leakedObjectCount(). Minor fixes (#3631 )	2016-12-26 00:35:35 -06:00
Roman Leventov	76cb06a8d8	Lookup cache refactoring (the main part of #3667 ) (#3697 ) * Lookup cache refactoring (the main part of druid-io/druid#3667) * Use PowerMock's static methods in NamespaceLookupExtractorFactoryTest * Fix KafkaLookupExtractorFactoryTest * Use VisibleForTesting annotation instead of Javadoc comment * Create a NamespaceExtractionCacheManager separately for each test in NamespaceExtractionCacheManagersTest * Rename CacheScheduler.NoCache.ENTRY_DISPOSED to ENTRY_CLOSED * Reduce visibility of NamespaceExtractionCacheManager.cacheCount() and monitor() implementations, and don't run NamespaceExtractionCacheManagerExecutorsTest with off-heap cache (it didn't before) * In NamespaceLookupExtractorFactory, use safer idiom to check if CacheState is NoCache or VersionedCache * More logging in CacheHandler constructor and close(), VersionedCache.close() * PR comments addressed * Make CacheScheduler.EntryImpl AutoCloseable, avoid 'dispose' verb in comments, logging and naming in CacheScheduler in favor of 'close' * More Javadoc comments to CacheScheduler * Fix NPE * Remove logging in OnHeapNamespaceExtractionCacheManager.expungeCollectedCaches() * Make NamespaceExtractionCacheManagersTest.testRacyCreation() to have similar load to what it be before the refactoring * Unwrap NamespaceExtractionCacheManager.scheduledExecutorService from unneeded MoreExecutors.listeningDecorator() and specify that this is ScheduledThreadPoolExecutor, which ensures happens-before between periodic runs of the tasks * More comments on MapDbCacheDisposer.disposed * Replace concat with Long.toString() * Comment on why NamespaceExtractionCacheManager.scheduledExecutorService() returns ScheduledThreadPoolExecutor * Place logging statements in VersionedCache.close() and CacheHandler.close() after actual closing logic, because logging may fail * Make JDBCExtractionNamespaceCacheFactory and StaticMapExtractionNamespaceCacheFactory to try to close newly created VersionedCache if population has failed, as it is done already in URIExtractionNamespaceCacheFactory * Don't close the whole CacheScheduler.Entry, if the cache update task failed * Replace AtomicLong updateCounter and firstRunLatch with Phaser-based UpdateCounter in CacheScheduler.EntryImpl	2016-12-23 18:04:27 -08:00
Himanshu	4ca3b7f1e4	overlord helpers framework and tasklog auto cleanup (#3677 ) * overlord helpers framework and tasklog auto cleanup * review comment changes * further review comments addressed	2016-12-21 15:18:55 -08:00
Gian Merlino	6440ddcbca	Fix #3795 (Java 7 compatibility). (#3796 ) * Fix #3795 (Java 7 compatibility). Also introduce Animal Sniffer checks during build, which would have caught the original problems. * Add Animal Sniffer on caffeine-cache for JDK8.	2016-12-21 10:19:13 -08:00
David Lim	0b9dff0bc1	fix worker thread pool exhaustion bug (#3760 ) * fix worker thread pool exhaustion bug * code review changes * code review changes	2016-12-09 15:23:11 -08:00
David Lim	7f087cdd3b	allow Kafka consumer group.id to be overridden by config (#3765 )	2016-12-08 15:53:13 -08:00
Charles Allen	27ab23ef44	Don't update segment metadata if archive doesn't move anything (#3476 ) * Don't update segment metadata if archive doesn't move anything * Fix restore task to handle potential null values * Don't try to update empty metadata * Address review comments * Move to druid-io java-util	2016-12-01 07:49:28 -08:00
Parag Jain	7ee6bb7410	option to reset offset automatically in case of OffsetOutOfRangeException (#3678 ) * option to reset offset automatically in case of OffsetOutOfRangeException if the next offset is less than the earliest available offset for that partition * review comments * refactoring * refactor * review comments	2016-11-21 16:29:46 -06:00
Roman Leventov	7b56cec3b9	Fix resource leaks (#3702 )	2016-11-18 21:21:36 +05:30
Gian Merlino	7e80d1045a	Exercise v2 engine in the groupBy aggregator and multi-value dimension tests. (#3698 ) This also involved some other test changes: - Added a factory.mergeRunners step to AggregationTestHelper's groupBy chain, since the v2 engine does merging there. - Changed test byteBuffer pools from on-heap to off-heap to work around https://github.com/DataSketches/sketches-core/pull/116 for datasketches tests.	2016-11-16 20:02:25 -08:00
Gian Merlino	bcd20441be	Make buildV9Directly the default. (#3688 )	2016-11-14 09:29:32 -08:00
Roman Leventov	988d97b09c	Unwrap exceptions from RuntimeException in URIExtractionNamespaceCacheFactory.populateCache() (part of #3667 ) (#3668 ) * Unwrap exceptions from RuntimeException in URIExtractionNamespaceCacheFactory.populateCache() * Fix tests	2016-11-11 17:25:41 -08:00
Himanshu	ddc078926b	consolidate different theta sketch representations into SketchHolder (#3671 )	2016-11-11 10:20:41 -08:00
Himanshu	b76b3f8d85	reset-cluster command to clean up druid state stored on metadata and deep storage (#3670 )	2016-11-09 11:07:01 -06:00
Nicolas Colomer	37ecffb648	Add support for Confluent Schema Registry in the avro extension (#3529 )	2016-11-08 16:10:45 -06:00
Gian Merlino	657e4512d2	Checkstyle checks for AvoidStaticImport, UnusedImports. (#3660 ) Excludes tests from AvoidStaticImport, since those are used often there and I didn't want to make this changeset too large. Production code use was minimal and I switched those to non-static imports.	2016-11-05 11:34:36 -07:00
Roman Leventov	22b57ddd60	Make ExtractionNamespaceCacheFactory to populate cache directly instead of returning callable (#3651 ) * Rename ExtractionNamespaceCacheFactory.getCachePopulator() to populateCache() and make it to populate cache itself instead of returning a Callable which populates cache, because this "callback style" is not actually needed. ExtractionNamespaceCacheFactory isn't a "factory" so it should be renamed, but renaming right in this commit would tear the git history for files, because ExtractionNamespaceCacheFactory implementations have too many changed lines. Going to rename ExtractionNamespaceCacheFactory to something like "CachePopulator" in one of subsequent PRs. This commit is a part of a bigger refactoring of the lookup cache subsystem. * Remove unused line and imports	2016-11-04 13:33:16 -07:00
Gian Merlino	4203580290	URIExtractionNamespace: Treat null values in lookup maps as missing entries. (#3512 ) * URIExtractionNamespace: Treat null values in lookup maps as missing entries. This is useful when many logical lookups are derived from the same base JSON file, and some lookups' values may be unknown sometimes. * Add test, logging message, and address other comments. * Update docs.	2016-11-03 13:53:04 -07:00
Himanshu	2362effd8c	use FileSystem.rename(from,to,Rename.NONE) so that tmp dirs from replicating tasks are not moved to the segment directory created by first task (#3650 )	2016-11-02 15:58:55 -07:00
Roman Leventov	36a1543222	Lookup cache bug fixes (#3609 ) * Return better lastVersion from JDBCExtractionNamespaceCacheFactory's cache populator callable * Return the lastVersion if URI lookup last modified date is not later than the last cached, from URIExtractionNamespaceCacheFactory's cache populator callable * Fix a race condition in NamespaceExtractionCacheManager.cancelFuture() * Don't delete cache from NamespaceExtractionCacheManager if the ExtractionNamespaceCacheFactory returned the same version as the last; Better exception treatment in the scheduled cache updater runnable in NamespaceExtractionCacheManager (in particular, don't consume Errors); throw AssertionError in StaticMapExtractionNamespaceCacheFactory if the lastVersion != null) * In NamespaceExtractionCacheManager, put NamespaceImplData.latestVersion update in the same synchronized() block with swapAndClearCache(id, cacheId); Turn getPostRunnable which returns a callback into a simple updateNamespace() method * In StaticMapExtractionNamespaceCacheFactory.getCachePopulator(), check the input directly, not inside a callback * In URIExtractionNamespaceCacheFactory, allow URI last modified time to go backwards * Better logging in NamespaceExtractionCacheManager * Add comment on lastVersion nullability in URIExtractionNamespaceCacheFactory	2016-11-02 09:40:19 -07:00
Himanshu	eb70a12e43	fix cleanup of tmp dir in HdfsDataSegmentPusher (#3636 )	2016-11-01 12:45:38 -05:00
Gian Merlino	89d9c61894	Deprecate Aggregator.getName and AggregatorFactory.getAggregatorStartValue. (#3572 )	2016-10-31 15:24:30 -07:00
Himanshu	23a8e22836	fix SketchMergeAggregatorFactory.finalizeResults, comparator and more UTs for timeseries, topN (#3613 )	2016-10-28 15:48:33 -07:00

238 Commits