druid

Commit Graph

Author	SHA1	Message	Date
Charles Allen	805d85afda	Allow compilation as Java8 source and target (#3328 ) * Allow compilation as Java8 source and target for everything except API * Remove conditions in tests which assume that we may run with Java 7 * Update easymock to 3.4 * Make Animal Sniffer to check Java 1.8 usage; remove redundant druid-caffeine-cache configuration * Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity() * Remove java7 special for druid-api	2017-03-14 22:23:47 -06:00
Gian Merlino	3216134f8c	SQL: Make row extractions extensible and add one for lookups. (#3991 ) This is a reopening of #3989, since that PR was merged to master prematurely and accidentally.	2017-03-13 21:56:16 -07:00
Nishant Bangarwa	adbe89e7d6	Fix race in KafkaIndexTaskTest (#4031 ) task.pause(0) can return early before the task is actually paused. Exception for failure - java.lang.AssertionError: expected:<PAUSED> but was:<READING> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at io.druid.indexing.kafka.KafkaIndexTaskTest.testRunWithOffsetOutOfRangeExceptionAndPause(KafkaIndexTaskTest.java:1229) To reproduce add Thread.sleep(10000) in beginning of KafkaIndexTask.possiblypause method.	2017-03-09 07:34:46 -08:00
Gian Merlino	4ca5270e88	Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. (#4004 ) * Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. Includes two fixes: - groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results. - Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y". Also includes doc and test fixes: - groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again. - chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate. * Remove unused import. * Restore buffer size.	2017-03-06 12:27:02 -06:00
Akash Dwivedi	bebf9f34c7	HdfsDataSegmentPusher bug fix (#4003 ) * Fix for HdfsDataSegmentPusher. * Add missing loadspec in actual descriptor file. Tests to check actual content of descriptor file.	2017-03-06 00:53:44 -08:00
Gian Merlino	df623ebfe3	Fix a couple bugs due to calling Period.getMillis(). (#4006 )	2017-03-05 18:44:20 +05:30
Roman Leventov	81a5f9851f	TmpFileIOPeons to create files under the merging output directory, instead of java.io.tmpdir (#3990 ) * In IndexMerger and IndexMergerV9, create temporary files under the output directory/tmpPeonFiles, instead of java.io.tmpdir * Use FileUtils.forceMkdir() across the codebase and remove some unused code * Fix test * Fix PullDependencies.run() * Unused import	2017-03-02 14:05:12 -08:00
Gian Merlino	e63eefd7ff	Revert "SQL: Make row extractions extensible and add one for lookups. (#3989 )" The PR was merged to master accidentally. This reverts commit `23927a3c96`.	2017-03-01 17:06:12 -08:00
Gian Merlino	23927a3c96	SQL: Make row extractions extensible and add one for lookups. (#3989 ) * SQL: Make row extractions extensible and add one for lookups. * Fix QuantileSqlAggregatorTest.	2017-03-01 17:03:43 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
Akash Dwivedi	91344cbe57	Enable GenericIndexed V2 for built-in(druid-io managed) complex columns. (#3987 ) * Enable GenericIndexed V2 for complex columns. * SerializerBuilder to use GenericColumnSerializer.	2017-02-28 22:06:54 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-granularity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Gian Merlino	f21641f0dc	Fix over-optimistic log message. (#3963 ) "Wrote task log" could be logged before the output stream is flushed and closed, which could generate an error and not actually write the log.	2017-02-22 15:02:53 -08:00
Parag Jain	edb032b96d	add datasource in intermediate segment path (#3961 )	2017-02-22 16:31:00 -06:00
Gian Merlino	985203b634	Finalize fields in postaggs (#3957 ) * initial commits for finalizeFieldAccess #2433 * fix some bugs to run a query * change name of method Queries.verifyAggregations to Queries.prepareAggregations * add Uts * fix Ut failures * rebased to master * address comments and add a Ut for arithmetic post aggregators * rebased to the master * address the comment of injection within arithmetic post aggregator * address comments and introduce decorate() in the PostAggregator interface. * Address comments. 1. Implements getComparator in FinalizingFieldAccessPostAggregator and add Uts for it 2. Some minor changes like renaming a method name. * Fix a code style mismatch. * Rebased to the master	2017-02-21 16:32:14 -08:00
Gian Merlino	16ef513c7d	SQL: Add context and contextual functions to planner. (#3919 ) * SQL: Add context and contextual functions to planner. Added support for context parameters specified as JDBC connection properties or a JSON object for SQL-over-JSON-over-HTTP. Also added features that depend on context functionality: - Added CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP functions. - Added support for time zones other than UTC via a "timeZone" context. - Pass down query context to Druid queries too. Also some bug fixes: - Fix DATE handling, it was largely done incorrectly before. - Fix CAST(__time TO DATE) which should do a floor-to-day. - Fix non-equality comparisons to FLOOR(__time TO X). - Fix maxQueryCount property. * Pass down context to nested queries too.	2017-02-15 14:09:14 -08:00
Gian Merlino	78b0d134ae	Require Java 8 and include some Java 8 dependencies. (#3914 ) * Require Java 8 and include some Java 8 dependencies. - Upgrade Jetty to 9.3.16.v20170120. - Upgrade DataSketches to 0.8.4. - Bundle caffeine-cache by default. - Still target Java 7 when compiling base Druid classes. * Update cluster, quickstart docs. * Remove oraclejdk7 from travis.yml.	2017-02-14 12:51:51 -08:00
Akash Dwivedi	8854ce018e	File.deleteOnExit() (#3923 ) * Less use of File.deleteOnExit() * removed deleteOnExit from most of the tests/benchmarks/iopeon * Made IOpeon closable * Formatting. * Revert DeterminePartitionsJobTest, remove cleanup method from IOPeon	2017-02-13 15:12:14 -08:00
Parag Jain	1f263fe50b	alert when resetting offsets (#3931 ) * alert when resetting offsets * add more data to alerts	2017-02-13 13:49:24 -08:00
michaelschiff	c1eee9bbf3	modified "end" column to `end` (#3903 ) * modified "end" column to `end`. "end" is interpreted as a string rather than dereferencing the column value * SQLMetadataConnector.getQuoteString defines the string that should be used to quote string fields * positional arguments for String.format * for Connectors that use " need to include the \ escape as well	2017-02-13 12:36:27 -08:00
Jihoon Son	991e2852da	Add PostAggregators to generator cache keys for top-n queries (#3899 ) * Add PostAggregators to generator cache keys for top-n queries * Add tests for strings * Remove debug comments * Add type keys and list sizes to cache key * Make post aggregators used for sort are considered for cache key generation * Use assertArrayEquals() * Improve findPostAggregatorsForSort() * Address comments * fix test failure * address comments	2017-02-13 12:23:44 -08:00
Parag Jain	8e31a465ad	report hand off count finite appenderator driver (#3925 )	2017-02-13 10:41:24 -08:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
Parag Jain	1aabb45a09	auto reset option for Kafka Indexing service (#3842 ) * auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers * review comments * review comments * reverted last change * review comments * review comments * fix typo	2017-02-02 14:57:45 -06:00
Nishant Bangarwa	a457cded28	Druid Extension to enable Authentication using Kerberos. (#3853 ) * Add extension for supporting kerberos security - This PR adds an extension for supporting druid authentication via Kerberos. - Working on the docs. * Add docs * review comments * more review comments * Block all paths by default * more review comments - use proper Oid * Allow extensions to override httpclient for integration tests * Add kerberos lock to prevent multithreaded issues. * review comment - remove enabled flag and fix router injection * Add Cookie Handling and more detailed docs * review comment - rename DruidKerberosConfig -> AuthKerberosConfig * review comments * fix travis failure on jdk7	2017-02-02 14:55:21 -06:00
Charles Allen	a73f1c9c70	Make s3 work better (#3898 )	2017-02-02 10:04:30 -08:00
Jonathan Wei	e6b95e80aa	Remove deprecated Aggregator/AggregatorFactory methods (#3894 )	2017-02-01 14:43:18 -08:00
Gian Merlino	ac84a3e011	SQL: Add resolution parameter, fix filtering bug with APPROX_QUANTILE (#3868 ) * SQL: Add resolution parameter to quantile agg, rename to APPROX_QUANTILE. * Fix bug with re-use of filtered approximate histogram aggregators. Also add APPROX_QUANTILE tests for filtering and running on complex columns. Includes some slight refactoring to allow tests to make DruidTables that include complex columns. * Remove unused import	2017-01-25 18:39:26 -08:00
Parag Jain	b3dae0efc3	catch all errors (#3844 )	2017-01-24 18:01:30 -07:00
Gian Merlino	d51f5e058d	SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852 ) * SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. Switched from CalciteConnection to Planner, bringing benefits: - CalciteConnection's JDBC interface no longer sits between the SQL server (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use Druid Sequence objects directly, reducing overhead in the query return path. - Implemented our own Planner-based Avatica Meta, letting us control connection timeouts and connection / statement limits. The previous CalciteConnection-based implementation didn't have any limits or timeouts. - The Planner interface lets us override the operator table, opening up SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT in core, and a QUANTILE aggregator in the druid-histogram extension. Also: - Added INFORMATION_SCHEMA metadata schema. - Added tests for Unicode literals and escapes. * Verify statement is actually open before closing it. * More detailed INFORMATION_SCHEMA docs.	2017-01-19 16:32:20 -08:00
Akash Dwivedi	e550d48772	Using fully qualified hdfs path. (#3705 ) * Using fully qualified hdfs path. * Review changes. * Remove unused imports. * Variable name change.	2017-01-17 14:40:22 -06:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Roman Leventov	49d71e9b38	Fix the build after #3697 (#3807 )	2016-12-26 17:06:48 -06:00
Roman Leventov	33800122ad	Don't return leaked Objects back to StupidPool, because this is dangerous. Reuse Cleaners in StupidPool. Make StupidPools named. Add StupidPool.leakedObjectCount(). Minor fixes (#3631 )	2016-12-26 00:35:35 -06:00
Roman Leventov	76cb06a8d8	Lookup cache refactoring (the main part of #3667 ) (#3697 ) * Lookup cache refactoring (the main part of druid-io/druid#3667) * Use PowerMock's static methods in NamespaceLookupExtractorFactoryTest * Fix KafkaLookupExtractorFactoryTest * Use VisibleForTesting annotation instead of Javadoc comment * Create a NamespaceExtractionCacheManager separately for each test in NamespaceExtractionCacheManagersTest * Rename CacheScheduler.NoCache.ENTRY_DISPOSED to ENTRY_CLOSED * Reduce visibility of NamespaceExtractionCacheManager.cacheCount() and monitor() implementations, and don't run NamespaceExtractionCacheManagerExecutorsTest with off-heap cache (it didn't before) * In NamespaceLookupExtractorFactory, use safer idiom to check if CacheState is NoCache or VersionedCache * More logging in CacheHandler constructor and close(), VersionedCache.close() * PR comments addressed * Make CacheScheduler.EntryImpl AutoCloseable, avoid 'dispose' verb in comments, logging and naming in CacheScheduler in favor of 'close' * More Javadoc comments to CacheScheduler * Fix NPE * Remove logging in OnHeapNamespaceExtractionCacheManager.expungeCollectedCaches() * Make NamespaceExtractionCacheManagersTest.testRacyCreation() to have similar load to what it be before the refactoring * Unwrap NamespaceExtractionCacheManager.scheduledExecutorService from unneeded MoreExecutors.listeningDecorator() and specify that this is ScheduledThreadPoolExecutor, which ensures happens-before between periodic runs of the tasks * More comments on MapDbCacheDisposer.disposed * Replace concat with Long.toString() * Comment on why NamespaceExtractionCacheManager.scheduledExecutorService() returns ScheduledThreadPoolExecutor * Place logging statements in VersionedCache.close() and CacheHandler.close() after actual closing logic, because logging may fail * Make JDBCExtractionNamespaceCacheFactory and StaticMapExtractionNamespaceCacheFactory to try to close newly created VersionedCache if population has failed, as it is done already in URIExtractionNamespaceCacheFactory * Don't close the whole CacheScheduler.Entry, if the cache update task failed * Replace AtomicLong updateCounter and firstRunLatch with Phaser-based UpdateCounter in CacheScheduler.EntryImpl	2016-12-23 18:04:27 -08:00
Himanshu	4ca3b7f1e4	overlord helpers framework and tasklog auto cleanup (#3677 ) * overlord helpers framework and tasklog auto cleanup * review comment changes * further review comments addressed	2016-12-21 15:18:55 -08:00
Gian Merlino	6440ddcbca	Fix #3795 (Java 7 compatibility). (#3796 ) * Fix #3795 (Java 7 compatibility). Also introduce Animal Sniffer checks during build, which would have caught the original problems. * Add Animal Sniffer on caffeine-cache for JDK8.	2016-12-21 10:19:13 -08:00
David Lim	0b9dff0bc1	fix worker thread pool exhaustion bug (#3760 ) * fix worker thread pool exhaustion bug * code review changes * code review changes	2016-12-09 15:23:11 -08:00
David Lim	7f087cdd3b	allow Kafka consumer group.id to be overridden by config (#3765 )	2016-12-08 15:53:13 -08:00
Charles Allen	27ab23ef44	Don't update segment metadata if archive doesn't move anything (#3476 ) * Don't update segment metadata if archive doesn't move anything * Fix restore task to handle potential null values * Don't try to update empty metadata * Address review comments * Move to druid-io java-util	2016-12-01 07:49:28 -08:00
Parag Jain	7ee6bb7410	option to reset offset automatically in case of OffsetOutOfRangeException (#3678 ) * option to reset offset automatically in case of OffsetOutOfRangeException if the next offset is less than the earliest available offset for that partition * review comments * refactoring * refactor * review comments	2016-11-21 16:29:46 -06:00
Roman Leventov	7b56cec3b9	Fix resource leaks (#3702 )	2016-11-18 21:21:36 +05:30
Gian Merlino	7e80d1045a	Exercise v2 engine in the groupBy aggregator and multi-value dimension tests. (#3698 ) This also involved some other test changes: - Added a factory.mergeRunners step to AggregationTestHelper's groupBy chain, since the v2 engine does merging there. - Changed test byteBuffer pools from on-heap to off-heap to work around https://github.com/DataSketches/sketches-core/pull/116 for datasketches tests.	2016-11-16 20:02:25 -08:00
Gian Merlino	bcd20441be	Make buildV9Directly the default. (#3688 )	2016-11-14 09:29:32 -08:00
Roman Leventov	988d97b09c	Unwrap exceptions from RuntimeException in URIExtractionNamespaceCacheFactory.populateCache() (part of #3667 ) (#3668 ) * Unwrap exceptions from RuntimeException in URIExtractionNamespaceCacheFactory.populateCache() * Fix tests	2016-11-11 17:25:41 -08:00
Himanshu	ddc078926b	consolidate different theta sketch representations into SketchHolder (#3671 )	2016-11-11 10:20:41 -08:00
Himanshu	b76b3f8d85	reset-cluster command to clean up druid state stored on metadata and deep storage (#3670 )	2016-11-09 11:07:01 -06:00
Nicolas Colomer	37ecffb648	Add support for Confluent Schema Registry in the avro extension (#3529 )	2016-11-08 16:10:45 -06:00
Gian Merlino	657e4512d2	Checkstyle checks for AvoidStaticImport, UnusedImports. (#3660 ) Excludes tests from AvoidStaticImport, since those are used often there and I didn't want to make this changeset too large. Production code use was minimal and I switched those to non-static imports.	2016-11-05 11:34:36 -07:00

145 Commits