druid

Commit Graph

Author	SHA1	Message	Date
kaijianding	a029b33499	fix cache populate incorrect content when numBackgroundThreads>1 (#3943 ) * fix cache populate incorrect content when numBackgroundThreads>1 * simplify code by using Futures.allAsList and use CountDownLatch in UT * fix test code style and assert countDownLatch.await()	2017-02-17 20:47:19 +05:30
Gian Merlino	78b0d134ae	Require Java 8 and include some Java 8 dependencies. (#3914 ) * Require Java 8 and include some Java 8 dependencies. - Upgrade Jetty to 9.3.16.v20170120. - Upgrade DataSketches to 0.8.4. - Bundle caffeine-cache by default. - Still target Java 7 when compiling base Druid classes. * Update cluster, quickstart docs. * Remove oraclejdk7 from travis.yml.	2017-02-14 12:51:51 -08:00
Jakub Kukul	28d85702ad	Fix rolling of request log files. (#3916 ) * Use common date format for request log files. * Remove code duplication in creating logging FileWriter.	2017-02-14 09:33:43 -08:00
Akash Dwivedi	8854ce018e	File.deleteOnExit() (#3923 ) * Less use of File.deleteOnExit() * removed deleteOnExit from most of the tests/benchmarks/iopeon * Made IOpeon closable * Formatting. * Revert DeterminePartitionsJobTest, remove cleanup method from IOPeon	2017-02-13 15:12:14 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Himanshu	8cf7ad1e3a	druid.coordinator.asOverlord.enabled flag at coordinator to make it an overlord too (#3711 )	2017-02-13 15:03:59 -08:00
michaelschiff	c1eee9bbf3	modified "end" column to `end` (#3903 ) * modified "end" column to `end`. "end" is interpretted as a string rather than dereferencing the column value * SQLMetadataConnector.getQuoteString defines the string that should be used to quote string fields * positional arguments for String.format * for Connectors that use " need to include the \ escape as well	2017-02-13 12:36:27 -08:00
Jihoon Son	991e2852da	Add PostAggregators to generator cache keys for top-n queries (#3899 ) * Add PostAggregators to generator cache keys for top-n queries * Add tests for strings * Remove debug comments * Add type keys and list sizes to cache key * Make post aggregators used for sort are considered for cache key generation * Use assertArrayEquals() * Improve findPostAggregatorsForSort() * Address comments * fix test failure * address comments	2017-02-13 12:23:44 -08:00
Parag Jain	8e31a465ad	report hand off count finite appenderator driver (#3925 )	2017-02-13 10:41:24 -08:00
Jonathan Wei	ca2b04f0fd	Add long/float ColumnSelectorStrategy implementations (#3838 ) * Add long/float ColumnSelectorStrategy implementations * Address PR comments * Add String strategy with internal dictionary to V2 groupby, remove dict from numeric wrapping selectors, more tests * PR comments * Use BaseSingleValueDimensionSelector for long/float wrapping * remove unused import * Address PR comments * PR comments * PR comments * More PR comments * Fix failing calcite histogram subquery tests * ScanQuery test and comment about isInputRaw * Add outputType to extractionDimensionSpec, tweak SQL tests * Fix limit spec optimization for numerics * Add cardinality sanity checks to TopN * Fix import from merge * Add tests for filtered dimension spec outputType * Address PR comments * Allow filtered dimspecs on numerics * More comments	2017-02-08 20:39:29 -08:00
Himanshu	e08cd0066b	verify no duplicate aggregator names in DataSchema (#3917 )	2017-02-08 16:12:07 -08:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
Roman Leventov	ca9f0e2b27	Don't override finalize() and reduce locking in LoadBalancingPool and ReferenceCountedResourceHandler (#3874 ) * Specialize LoadBalancingPool as MemcacheClientPool, reduce locking and don't override Object.finalize() * Remove locking and don't override Object.finalize() in ReferenceCountingResourceHolder * Add leak counts in ReferenceCountingResourceHolder and MemcacheClientPool. Add tests for ReferenceCountingResourceHolder and MemcacheClientPool * Fix a race condition in ReferenceCountingResourceHolder.increment()	2017-02-06 17:14:46 -08:00
Parag Jain	8a13a85765	Introduce SegmentizerFactory (#3901 ) * Introduce SegmentizerFactory - that knows how to deserialize specific type of segment - Default implementation is MMappedQueryableSegmentizerFactory which creates QueryableIndexSegment - Unit test for the default behavior * review comments	2017-02-06 10:05:12 -08:00
DaimonPl	93b71e265e	Extract HLL related code to separate module (#3900 )	2017-02-03 09:45:11 -08:00
Parag Jain	1aabb45a09	auto reset option for Kafka Indexing service (#3842 ) * auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers * review comments * review comments * reverted last change * review comments * review comments * fix typo	2017-02-02 14:57:45 -06:00
Niketh Sabbineni	2b8d3c102b	Remove throttling on drop segments (#3736 ) * Remove throttling on drop * Throttle loadqueuepeon segment change requests to ZK * Make initial delay configurable, add docs, shutdown gracefully * Make loadqueuepeon repeat delay configurable	2017-01-20 10:02:19 -08:00
Roman Leventov	af93a8d189	Sequences refactorings and removed unused code (part of #3798 ) (#3693 ) * Removing unused code from io.druid.java.util.common.guava package; fix #3563 (more consistent and paranoiac resource handing in Sequences subsystem); Add Sequences.wrap() for DRY in MetricsEmittingQueryRunner, CPUTimeMetricQueryRunner and SpecificSegmentQueryRunner; Catch MissingSegmentsException in SpecificSegmentQueryRunner's yielder.next() method (follow up on #3617) * Make Sequences.withEffect() execute the effect if the wrapped sequence throws exception from close() * Fix strange code in MetricsEmittingQueryRunner * Add comment on why YieldingSequenceBase is used in Sequences.withEffect() * Use Closer in OrderedMergeSequence and MergeSequence to close multiple yielders	2017-01-19 20:07:43 -08:00
Gian Merlino	d51f5e058d	SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852 ) * SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. Switched from CalciteConnection to Planner, bringing benefits: - CalciteConnection's JDBC interface no longer sits between the SQL server (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use Druid Sequence objects directly, reducing overhead in the query return path. - Implemented our own Planner-based Avatica Meta, letting us control connection timeouts and connection / statement limits. The previous CalciteConnection-based implementation didn't have any limits or timeouts. - The Planner interface lets us override the operator table, opening up SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT in core, and a QUANTILE aggregator in the druid-histogram extension. Also: - Added INFORMATION_SCHEMA metadata schema. - Added tests for Unicode literals and escapes. * Verify statement is actually open before closing it. * More detailed INFORMATION_SCHEMA docs.	2017-01-19 16:32:20 -08:00
kaijianding	33ae9dd485	streaming version of select query (#3307 ) * streaming version of select query * use columns instead of dimensions and metrics;prepare for valueVector;remove granularity * respect query limit within historical * use constant * fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner * add some test for scan query * add scan query document * fix merge conflicts * add compactedList resultFormat, this format is better for json ser/der * respect query timeout * respect query limit on broker * use static consts and remove unused code	2017-01-19 16:09:53 -06:00
David Lim	ff52581bd3	IndexTask improvements (#3611 ) * index task improvements * code review changes * add null check	2017-01-18 14:24:37 -08:00
Himanshu	7004f5d499	make ITKafkaTest less non-deterministic (#3856 )	2017-01-17 16:52:51 -06:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Gian Merlino	ce0049d8ff	DruidCoordinator: Leave the ServerInventoryView running when we lose leadership. (#3830 )	2017-01-09 12:36:05 +05:30
Niketh Sabbineni	2aa8385fc0	Include loadqueuepeon size to compute disknormalized cost (#3826 )	2017-01-06 14:30:23 -08:00
Niketh Sabbineni	a5f82e8acf	Randomly choose a server when multiple best servers are available (#3822 ) * Randomly choose a server when multiple best servers are available * Use one pass instead of two * Fix code style issues	2017-01-05 23:34:05 -06:00
Yuusaku Taniguchi	02519d5b64	Exhibitor Support (#3664 ) * allow JsonConfigTesterBase to treat the fields of collections * [Feature] Exhibitor Support (#3664) This patch provides the integration of Druid & Netflix Exhibitor. Druid currently use Apache Curator as ZooKeeper client. Curator can be integrated with Exhibitor to achieve a live/updating list of the ZooKeeper ensemble. This patch enables Druid to use this features.	2017-01-02 09:15:36 -08:00
Roman Leventov	33800122ad	Don't return leaked Objects back to StupidPool, because this is dangerous. Reuse Cleaners in StupidPool. Make StupidPools named. Add StupidPool.leakedObjectCount(). Minor fixes (#3631 )	2016-12-26 00:35:35 -06:00
Gian Merlino	6440ddcbca	Fix #3795 (Java 7 compatibility). (#3796 ) * Fix #3795 (Java 7 compatibility). Also introduce Animal Sniffer checks during build, which would have caught the original problems. * Add Animal Sniffer on caffeine-cache for JDK8.	2016-12-21 10:19:13 -08:00
Himanshu	fd14997b1d	fix MetadataStorage binding so that it is always Noop except for coordinator iff derby is configured (#3789 )	2016-12-19 16:24:09 -08:00
Nishant	35160e5595	Add metrics for Query Count statistics (#3470 ) * Add metrics for Query Count statistics This PR adds a new metrics monitor “QueryCountStatsMonitor” which emits three new metrics - 1) query/success/count - number of successful queries 2) query/failed/count - number of failed queries 3) query/interrupted/count - number of interrupted/timedout queries fix bindings * make fields final * fix imports * AsyncQueryForwardingServlet implement QueryStatsProvider * remove unused import	2016-12-19 09:47:58 -08:00
Gian Merlino	dd63f54325	Built-in SQL. (#3682 )	2016-12-16 17:15:59 -08:00
Nishant	8cfcb95fbc	Add Filtered and Composing request loggers (#3469 ) * Add Filtered and Composing request loggers Add Filtered and Composite Request loggers - enables users to filter request logs for slow queries. fix test * review comments * review comment * remove unused import	2016-12-16 11:18:32 -08:00
Jihoon Son	5e39578eee	Enable parallel test (#3774 ) * Enable parallel test * Remove unnecessary NotThreadSafe annocation * Randomize the start port when finding available ports * Fix test failure * Change to handle all negatives	2016-12-14 21:05:56 -08:00
John Zhang	48b22e261a	support atomic writes for local deep storage (#3521 ) * Use atomic writes for local deep storage * fix pr issues * use defaultObjMapper for test * move tmp pushes to a intermediate dir * minor refactor	2016-12-13 10:03:22 -08:00
kaijianding	4be3eb0ce7	report message gap, source gap and sink count in RealtimePlumber (#3744 ) * report message gap, source gap and sink count in RealtimePlumber * report message gap, sink count in Appenderator * add ingest/events/sourceGap in metrics.md * remove source gap	2016-12-13 11:23:02 -06:00
Navis Ryu	f794246ec1	Trimming out outside of given interval (#2798 ) * Trimming out outside of given interval (Fix for #2659) * addressed comments	2016-12-07 18:05:50 -08:00
Gian Merlino	943982b7b0	Configurable HTTP compression. (#3759 ) * Configurable HTTP compression. * Call real-time nodes real-time processes in docs.	2016-12-07 17:40:39 -08:00
Roman Leventov	70e83bea6d	Fix PathChildrenCache's ExecutorService leak (#3726 ) * Fix PathChildrenCache's executorService leak in Announcer, CuratorInventoryManager and RemoteTaskRunner * Use a single ExecutorService for all workerStatusPathChildrenCaches in RemoteTaskRunner	2016-12-07 13:00:10 -08:00
Himanshu	06d0ef9c6c	allow and load extensions with absolute paths in druid.extensions.loadList (#3747 )	2016-12-06 17:40:23 -08:00
kaijianding	f995b1426f	retry loadSegment with all locations (#3681 )	2016-12-06 12:00:59 -08:00
Himanshu	5440a06b2d	make sure CliCoordinator initializes and starts DerbyMetadataStorage first if configured (#3700 ) * make sure CliCoordinator initializes and starts DerbyMetadataStorage first if configured * Revert "make sure CliCoordinator initializes and starts DerbyMetadataStorage first if configured" This reverts commit 54f5644054626d4a9e2448bb4bd5e6ce9a9fca1d. * make sure CliCoordinator initializes and starts DerbyMetadataStorage first if configured	2016-12-06 10:22:04 -08:00
Niketh Sabbineni	d904c79081	Normalized Cost Balancer (#3632 ) * Normalized Cost Balancer * Adding documentation and renaming to use diskNormalizedCostBalancer * Remove balancer from the strings * Update docs and include random cost balancer * Fix checkstyle issues	2016-12-05 17:18:20 -08:00
Navis Ryu	c74d267f50	Support virtual column for select query (#2511 ) * Support virtual column for select query * Addressed comments	2016-12-05 15:14:35 -08:00
Gian Merlino	a8069f2441	Retry on dataSource metadata CAS failures where retrying might help. (#3728 ) This retries when the start condition is met but SELECT -> INSERT/UPDATE fails, which indicates a race. If the start condition isn't met, there won't be any retrying.	2016-12-01 11:50:15 -07:00
Roman Leventov	c070b4a816	Fix concurrency defects, remove unnecessary volatiles (#3701 )	2016-11-22 16:42:28 -08:00
Gian Merlino	bcd20441be	Make buildV9Directly the default. (#3688 )	2016-11-14 09:29:32 -08:00
Roman Leventov	fbbb55f867	Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics (#3679 ) * Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics, not only query metrics * Remove unused imports * Use empty string instead of "testing-version" as a version placeholder	2016-11-11 17:17:27 -06:00
Himanshu	b76b3f8d85	reset-cluster command to clean up druid state stored on metadata and deep storage (#3670 )	2016-11-09 11:07:01 -06:00
Gian Merlino	657e4512d2	Checkstyle checks for AvoidStaticImport, UnusedImports. (#3660 ) Excludes tests from AvoidStaticImport, since those are used often there and I didn't want to make this changeset too large. Production code use was minimal and I switched those to non-static imports.	2016-11-05 11:34:36 -07:00
Himanshu	0e269ce72a	max row limit is necessary is connector is setup for streaming (#3635 )	2016-11-01 09:39:55 -07:00
Himanshu	32c5494e97	eagerly allocate the intermediate computation buffers (#3628 )	2016-10-31 15:24:07 -07:00
Gian Merlino	9f5c895d5f	Fix imports in DefaultOfflineAppenderatorFactoryTest. (#3624 )	2016-10-31 12:23:19 -07:00
Slim	f6995bc908	offline appenderator factory. (#3483 ) * adding default offline appenderator * adding test * fix comments * fix comments	2016-10-31 10:05:58 -07:00
Himanshu	641469fc38	manage overshadowing efficiently at coordinator (#3584 ) * manage overshadowing efficiently at coordinator * take readlock in VersionedIntervalTimeline.isOvershadowed()	2016-10-24 22:49:08 +05:30
Akash Dwivedi	4b3bd8bd63	Migrating java-util from Metamarkets. (#3585 ) * Migrating java-util from Metamarkets. * checkstyle and updated license on java-util files. * Removed unused imports from whole project. * cherry pick metamx/java-util@826021f. * Copyright changes on java-util pom, address review comments.	2016-10-21 14:57:07 -07:00
Roman Leventov	5dc95389f7	Add Checkstyle framework (#3551 ) * Add Checkstyle framework * Avoid star import * Need braces for control flow statements * Redundant imports * Add NewLineAtEndOfFile check	2016-10-13 13:37:47 -07:00
Gian Merlino	ddc856214d	When inserting segments, mark unused if already overshadowed. (#3499 ) This is useful for the insert-segment-to-db tool, which would otherwise potentially insert a lot of overshadowed segments as "used", causing load and drop churn in the cluster.	2016-10-10 18:10:18 -07:00
Himanshu	7e6824501c	fix: QueryResource thread name includes whole inner query string for nested query (#3549 ) * print exception details from QueryInterruptedException * in QueryResource.java, set thread name to include dataSource names and not whole query string e.g. from QueryDataSource	2016-10-06 18:30:52 -07:00
praveev	43cdc675c7	Add support for timezone in segment granularity (#3528 ) * Add support for timezone in segment granularity * CR feedback. Handle null timezone during equals check. * Include timezone in docs. Add timezone for ArbitraryGranularitySpec.	2016-10-03 08:15:42 -07:00
Gian Merlino	40f2fe7893	Bump versions to 0.9.3-SNAPSHOT (#3524 )	2016-09-29 13:53:32 -07:00
John Zhang	78b06a7d7e	make global http client worker threads configurable (#3514 )	2016-09-28 23:18:51 -07:00
Xiaoyao	91e6ab4fcf	LRU cache guarantee to keep size under limit (#3510 ) * LRU cache guarantee to keep size under limit * address comments * fix failed tests in jdk7	2016-09-27 17:13:06 -07:00
Parag Jain	56b0586097	secure BrokerQueryResource endpoints (#3506 )	2016-09-26 11:27:24 -07:00
David Lim	ca9114b41b	add supervisor reset API (#3484 ) * add supervisor reset API * CR doc changes and kill running tasks / clear offsets from supervisor	2016-09-22 17:51:06 -07:00
Navis Ryu	49c0fe0e8b	Show candidate hosts for the given query (#2282 ) * Show candidate hosts for the given query * Added test cases & minor changes to address comments * Changed path-param to query-pram for intervals/numCandidates	2016-09-22 08:32:38 -07:00
Keuntae Park	54ec4dd584	Support renaming of outputName for cached select and search query results (#3395 ) * support renaming of outputName for cached select and search queries * rebase and resolve conflicts * rollback CacheStrategy interface change * updated based on review comments	2016-09-20 08:19:14 -07:00
Charles Allen	95e08b38ea	[QTL] Reduced Locking Lookups (#3071 ) * Lockless lookups * Fix compile problem * Make stack trace throw instead * Remove non-germane change * * Add better naming to cache keys. Makes logging nicer * Fix #3459 * Move start/stop lock to non-interruptable for readability purposes	2016-09-16 11:54:23 -07:00
Gian Merlino	7a2a4bc6de	JavaScript: Disable now affects worker selection and router strategy too. (#3458 )	2016-09-13 16:37:42 -07:00
Jonathan Wei	df766b2bbd	Add dimension handling interface for ingestion and segment creation (#3217 ) * Add dimension handling interface for ingestion and segment creation * update javadocs for DimensionHandler/DimensionIndexer * Move IndexIO row validation into DimensionHandler * Fix null column skipping in mergerV9 * Add deprecation note for 'numeric_dims' filename pattern in IndexIO v8->v9 conversion * Fix java7 test failure	2016-09-12 12:54:02 -07:00
Gleb Smirnov	8bee07e81e	Respect server-side sorting of tasks in coordinator console (#3404 )	2016-08-28 16:38:29 -07:00
jaehong choi	2e0f253c32	introducing lists of existing columns in the fields of select queries' output (#2491 ) * introducing lists of existing columns in the fields of select queries' output * rebase master * address the comment. add test code for select query caching * change the cache code in SelectQueryQueryToolChest to 0x16	2016-08-25 21:37:53 +05:30
kaijianding	eafafce1aa	fix old usage of dimension as string instead of dimensionSchema in DataSchema (#3365 )	2016-08-16 09:58:04 -07:00
David Lim	ed924bf214	allow registrants to opt out of announcing themselves when registering as a chat handler (#3360 )	2016-08-16 10:51:28 +05:30
rajk-tetration	362b9266f8	Adding filters for TimeBoundary on backend (#3168 ) * Adding filters for TimeBoundary on backend Signed-off-by: Balachandar Kesavan <raj.ksvn@gmail.com> * updating TimeBoundaryQuery constructor in QueryHostFinderTest * add filter helpers * update filterSegments + test * Conditional filterSegment depending on whether a filter exists * Style changes * Trigger rebuild * Adding documentation for timeboundaryquery filtering * added filter serialization to timeboundaryquery cache * code style changes	2016-08-15 10:25:24 -07:00
Jonathan Wei	890e3bdd3f	More informative query unit test names (#3342 )	2016-08-09 22:24:48 -07:00
Gian Merlino	21bce96c4c	More useful query errors. (#3335 ) Follow-up to #1773, which meant to add more useful query errors but did not actually do so. Since that patch, any error other than interrupt/cancel/timeout was reported as `{"error":"Unknown exception"}`. With this patch, the error fields are: - error, one of the specific strings "Query interrupted", "Query timeout", "Query cancelled", or "Unknown exception" (same behavior as before). - errorMessage, the message of the topmost non-QueryInterruptedException in the causality chain. - errorClass, the class of the topmost non-QueryInterruptedException in the causality chain. - host, the host that failed the query.	2016-08-09 16:14:52 +08:00
Navis Ryu	39351fb8d2	Mask properties from logging (#3332 ) * Mask properties from logging * mask "password" by default	2016-08-08 21:36:10 +05:30
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020 ) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
Jonathan Wei	a6105cbb86	Add numeric StringComparator (#3270 ) * Add numeric StringComparator * Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query * Address PR comments, add multithreaded BoundDimFilter test * Add comment on strlen tie handling * Add timeseries interval filter benchmark * Adjust docs * Use jackson for StringComparator, address PR comments * Add new TopNMetricSpec and SearchSortSpec with tests (WIP) * More TopNMetricSpec and SearchSortSpec tests * Fix NewSearchSortSpec serde * Update docs for new DimensionTopNMetricSpec * Delete NumericDimensionTopNMetricSpec * Delete old SearchSortSpec * Rename NewSearchSortSpec to SearchSortSpec * Add TopN numeric comparator benchmark, address PR comments * Refactor OrderByColumnSpec * Add null checks to NumericComparator and String->BigDecimal conversion function * Add more OrderByColumnSpec serde tests	2016-07-29 15:44:16 -07:00
Charles Allen	d04af6aee4	Add `slf4j` requst logger (#3146 ) * Add `slf4j` requst logger * Address comments * Fix conflicts with master * Fix removed map value	2016-07-29 15:15:41 -07:00
Charles Allen	546e4f79b0	Add size of pending deletes to historical metrics (#3295 ) * Add size of pending deletes to historical metrics	2016-07-27 11:30:47 -07:00
Charles Allen	b1e3fe77f5	More logging around how the coordinator balancer is happening (#3279 ) * More logging around how the coordinator balancer is happening * Address comments * Address code review comments and add actual logging	2016-07-27 13:24:32 +05:30
Gian Merlino	2f275497b6	Fix caching of extension classloaders. (#3289 )	2016-07-26 15:19:15 -07:00
Gian Merlino	8030f1cb67	Be more respectful of maxRowsInMemory. (#3284 ) - Appenderator: Respect maxRowsInMemory across all sinks. - KafkaIndexTask: Respect maxRowsInMemory across all partitions.	2016-07-26 15:02:35 -06:00
Charles Allen	188a4bc89a	Revert "Optionally intern ServerInventoryView inventory objects. (#3238 )" (#3286 ) This reverts commit `a931debf79`. Fixes #3283 The core issue here is that realtime nodes announce their size as 0, so a coordinator which interns the realtime version of the data segment will not be able to see the new sized announcement when handoff occurs. This is caused by the `eauals` method on a `DataSegment` only evaluating the identifier. the `eauals` method should be correct for object equivalence, and things which need to check equivalence of some sub-portion of the object should do so explicitly.	2016-07-26 11:47:34 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Gian Merlino	b316cde554	Appenderator tests for disjoint query intervals. (#3281 )	2016-07-23 19:48:15 -07:00
Charles Allen	c58bbfa0c6	Intern DataSegments in SQLMetadataSegmentManager (#3267 ) * Helps with heap pressure on coordinator	2016-07-21 16:46:08 -07:00
Parag Jain	fd798d32bc	fix testSecuredGetServer ut (#3262 )	2016-07-20 10:20:13 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248 ) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Himanshu	3f82108d15	optionally enable coordinator auto kill tasks on all dataSources via dynamic config (#3250 )	2016-07-17 18:47:52 -07:00
Gian Merlino	6cd1f5375b	Better harmonized dimensions for query metrics. (#3245 ) All query metrics now start with toolChest.makeMetricBuilder, and all of those now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id" moved to common code, since all query metrics added it anyway. In particular this will add query-type specific dimensions like "threshold" and "numDimensions" to servlet-originated metrics like query/time.	2016-07-14 11:55:51 -07:00
Hyukjin Kwon	55e7a52475	Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215 )	2016-07-14 09:19:17 -07:00
Nishant	a1715c8cda	fix-3237 (#3244 ) DruidBroker use FilteredServerInventoryView instead of ServerInventoryView	2016-07-13 22:30:35 -07:00
Charles Allen	a931debf79	Optionally intern ServerInventoryView inventory objects. (#3238 )	2016-07-14 08:49:26 +05:30
Charles Allen	5d9fd0a713	Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming (#3043 ) * Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming * Missed query from #2859 * Make inReadOnlyTransaction part of SQLMetadataConnector	2016-07-06 16:55:27 -07:00
Himanshu	e1313e4b90	add log msg when event recvr firehose buffer is full (#3209 )	2016-07-01 17:35:30 -05:00
Xavier Léauté	485e381387	remove datasource from hadoop output path (#3196 ) fixes #2083, follow-up to #1702	2016-06-29 08:53:45 -07:00
Gian Merlino	4c9aeb7353	Revert "update druid console version (#3189 )" (#3203 ) This reverts commit `496b801bc3`.	2016-06-29 08:29:57 -07:00
Xavier Léauté	496b801bc3	update druid console version (#3189 )	2016-06-27 18:02:40 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	8a08398977	Add segment pruning based on secondary partition dimension (#2982 ) * add get dimension rangeset to filters * add get domain to ShardSpec and added chunk filter in caching clustered client * add null check and modified not filter, started with unit test * add filter test with caching * refactor and some comments * extract filtershard to helper function * fixup * minor changes * update javadoc	2016-06-24 14:52:19 -07:00
Charles Allen	15f833a861	Make extension classloader caching keyed on directory (#3165 ) * Make extension classloaders keyed by extension directory * Fixes #3163 * Add in same-directory-name unit test	2016-06-23 17:13:19 -07:00
michaelschiff	66d8ad36d7	adds new coordinator metrics 'segment/unavailable/count' and (#3176 ) 'segment/underReplicated/count' (#3173)	2016-06-23 14:53:15 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
Nishant	0d427923c0	fix caching for search results (#3119 ) * fix caching for search results properly read count when reading from cache. * fix NPE during merging search count and add test * Update cache key to invalidate prev results	2016-06-09 17:49:47 -07:00
Himanshu	ab4209c82a	killDataSourceWhitelist in CoordinatorDynamicConfig accepts comma separated list of strings in addition to json array of strings so that coordinator console can do the updates correctly (#3095 )	2016-06-07 15:39:41 -07:00
Keuntae Park	e6b32c24ae	bug fix for getNewNodes() in ListenerDiscoverer (#3092 )	2016-06-07 16:32:42 +05:30
Gian Merlino	2db5f49f35	Fix JavaScriptConfig. (#3062 )	2016-06-02 23:59:00 -07:00
Charles Allen	bbc5509078	Limit number of jetty threads that can be used by lookups (#3068 )	2016-06-02 22:33:12 -07:00
Slim	545cdd63ab	ensure the cleaning of overshadowed unloaded segments (#3048 ) * ensure the cleaning of overshadowed unloaded segments * add testing plus comments	2016-06-02 09:03:58 -05:00
Gian Merlino	874a0a4bdd	MetadataResource: Fix handling of includeDisabled. (#3042 )	2016-06-01 11:56:37 -07:00
John Wang	e662efa79f	segment interface refactor for proposal 2965 (#2990 )	2016-05-26 20:36:41 -07:00
John Wang	a004f1e1c5	appenderator plumber work from gian's branch (#2913 )	2016-05-26 14:46:32 -07:00
Charles Allen	847501a939	Add better messages around LookupCoordinatorManager failures (#3027 ) * Add better messages around LookupCoordinatorManager failures * Catches #3026 * A few more little tests * Add more forceful shutdown	2016-05-26 14:32:35 -05:00
jaehong choi	e2653a8cf4	handle a NPE in LookupCoordinatorManager.start() (#3026 )	2016-05-26 09:55:33 -07:00
David Lim	3ef24c03b3	Validate X-Druid-Task-Id header in request/response and support retrying on outdated TaskLocation information, add KafkaIndexTaskClient unit tests (#3006 ) * validate X-Druid-Task-Id header in request and add header to response * modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant	2016-05-25 22:05:18 -07:00
Nishant	0ac1b27d53	Allow manually setting of shutoffTime for EventReceiverFirehose (#2803 ) * Allow dynamically setting of shutoffTime for EventReceiverFirehose Allow dynamically setting shutoffTime for EventReceiverFirehose review comments and tests * shut down exec on close	2016-05-24 07:24:00 -07:00
Gian Merlino	970614875b	Fix race where results from an IncrementalIndexSegment could be cached. (#2983 )	2016-05-18 13:57:50 +05:30
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955 ) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
Xavier Léauté	e79284da59	new interval based cost function (#2972 ) * new interval based cost function Addresses issues with balancing of segments in the existing cost function - `gapPenalty` led to clusters of segments ~30 days apart - `recencyPenalty` caused imbalance among recent segments - size-based cost could be skewed by compression New cost function is purely based on segment intervals: - assumes each time-slice of a partition is a constant cost - cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C) - cost decays exponentially based on distance between time-slices * comments and formatting * add more comments to explain the calculation	2016-05-17 09:56:00 -07:00
Parag Jain	681ffdb417	try to make DruidCoordinatorTest deterministic (#2967 )	2016-05-13 14:43:28 -07:00
Nishant	a9b721a01b	Allow user to set cost balancer threads more than or equal to the number of cores. (#2964 ) * Allow user to set cost balancer threads more than the number of cores. Allow user to set cost balancer threads more than the number of cores. * modify test	2016-05-13 13:27:42 -05:00
Charles Allen	81cab8a7bb	Make lookups more idempotent on update requests. (#2954 ) * No longer fails if an update fails but it shouldn't have replaced it	2016-05-11 11:22:35 -07:00
Jonathan Wei	f2510cf125	Remove DataSchema equals() and hashcode()	2016-05-10 16:09:28 -07:00
Charles Allen	6332bd70f4	Add smile provider (#2951 )	2016-05-10 16:03:39 -07:00
Charles Allen	0c04650e69	Lookup Announcer eager starting (#2944 )	2016-05-10 12:21:47 +05:30
Charles Allen	454bb034f1	Nicer toString on ListneingAnnouncerConfig (#2936 ) * Helps with debugging	2016-05-10 12:21:06 +05:30
David Lim	2cfd337378	Merge pull request #2933 from dclim/SQLMetadataSupervisorManagerTest-fix add uuid to primary key for supervisor table	2016-05-09 10:41:32 -06:00
Nishant	a2dd57cf65	Optimize CostBalancerStrategy (#2910 ) * Optimize CostBalancerStrategy Ignore benchmark test in normal run fix test review comments fix compilation fix test * review comments * review comment	2016-05-05 08:29:08 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
binlijin	841be5c61f	periodically emit metric segment/scan/pending (#2854 )	2016-05-02 22:38:13 -07:00
Himanshu	6c5bf91f9a	publish metrics numJettyConns to see how number of active jetty connections change over time (#2839 ) this can be compared with numer of active queries to see if requests are waiting in jetty queue	2016-05-02 14:08:25 -07:00
Charles Allen	6b957aa072	[QTL] Make URI Exctraction Namespace take more sane arguments (#2738 ) * Make URI Exctraction Namespace take more sane arguments * Fixes https://github.com/druid-io/druid/issues/2669 * Update docs * Rename error message * Undo overzealous deletion of docs * Explain caching mechanism a bit more in docs	2016-05-02 12:54:34 -07:00
David Lim	5f0a9ccc57	fix ClassCastException in FiniteAppenderatorDriver (#2896 )	2016-04-28 18:39:24 -07:00
Parag Jain	0d745ee120	Basic authorization support in Druid (#2424 ) - Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424	2016-04-28 16:50:28 -07:00
Nishant	c29cb7d711	add pending task based resource management strategy (#2086 )	2016-04-27 10:40:53 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	c74391e54c	JavaScript: Ability to disable. (#2853 ) Fixes #2852.	2016-04-21 09:43:15 -05:00
Xavier Léauté	5938d9085b	Stream segments from database (#2859 ) * Avoids fetching all segment records into heap by JDBC driver * Set connection to read-only to help database optimize queries * Update JDBC drivers (MySQL has fixes for streaming results)	2016-04-21 05:40:07 +08:00
Xavier Léauté	fc91120b54	Merge pull request #2857 from metamx/upgrade-zk upgrade zookeeper client dependency to 3.4.8	2016-04-20 10:36:07 +05:30
Nishant	dbf63f738f	Add ability to filter segments for specific dataSources on broker without creating tiers (#2848 ) * Add back FilteredServerView removed in `a32906c7fd` to reduce memory usage using watched tiers. * Add functionality to specify "druid.broker.segment.watchedDataSources"	2016-04-19 10:10:06 -07:00
Gian Merlino	08c784fbf6	KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844 ) segment creation deterministic. This means that each segment will contain data from just one Kafka partition. So, users will probably not want to have a super high number of Kafka partitions... Fixes #2703.	2016-04-18 22:29:52 -07:00
Jisoo Kim	7b65ca7889	refactor ClientQuerySegmentWalker (#2837 ) * refactor ClientQuerySegmentWalker * add header to FluentQueryRunnerBuilder * refactor QueryRunnerTestHelper	2016-04-18 14:00:47 -07:00
binlijin	c1e690288c	Improve some log (#2807 )	2016-04-15 09:34:26 -07:00
Nishant	632b21472b	fix test failure (#2818 ) formatting changes	2016-04-14 21:40:19 -07:00
Fangjin Yang	886ee4e30d	Merge pull request #2821 from metamx/review-comments-2784 handle review comments for PR 2784	2016-04-12 10:20:43 -07:00
Fangjin Yang	b486eff6b7	Merge pull request #2805 from metamx/query-time-start request log should reflect time the query was received	2016-04-12 09:44:42 -07:00
Nishant	deb6ecf919	handle review comments for PR 2784 https://github.com/druid-io/druid/pull/2784#discussion_r59062021	2016-04-12 21:52:00 +05:30
Himanshu Gupta	aa6a230c90	remove DruidSQL.g4, its failing with newer version of ANTLR, will bring it back and fix if needed later	2016-04-08 11:46:21 -05:00
Xavier Léauté	d4d1d615c1	request log should reflect time the query was received, as opposed to processed	2016-04-07 12:39:34 -07:00
Nishant	edd74f2b67	Allow Lite DataSegment Announcements separate config for each skipping dimensions, metrics and loadSpec Add test fix test comment Add docs	2016-04-07 18:24:12 +05:30
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Xavier Léauté	728da75224	remove unused code	2016-04-01 13:10:35 -07:00
Fangjin Yang	9cb197adec	Merge pull request #2722 from himanshug/fix_hadoop_jar_upload config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-28 14:49:03 -07:00
Fangjin Yang	7a84c267f7	Merge pull request #2743 from metamx/fixlookuptest Fix LookupCoordinatorManager and Test for alerts	2016-03-28 14:41:37 -07:00
Parag Jain	89a8277ae2	Merge pull request #2712 from guobingkun/make_runnables_pluggable make Coordinator IndexingService helpers pluggable	2016-03-28 12:18:18 -05:00
Charles Allen	05151bc325	Fix LookupCoordinatorManagerTest for alerts * Also fixes bad alerting on missing nodes	2016-03-28 09:41:47 -07:00
Fangjin Yang	3c4691aa5a	Merge pull request #2741 from gianm/examples-wiki Downgrade geoip2, exclude com.google.http-client.	2016-03-25 23:08:38 -07:00
Xavier Léauté	01f3221a62	Merge pull request #2665 from jisookim0513/remove-druid-server-serialization remove serialization of DruidServer	2016-03-25 15:54:05 -07:00
Gian Merlino	977e867ad8	Downgrade geoip2, exclude com.google.http-client. Reverts "Update com.maxmind.geoip2 to 2.6.0" and exclude the google http client from com.maxmind.geoip2. This should satisfy the original need from #2646 (wanting to run Druid along with an upgraded com.google.http-client) while preventing Jackson conflicts pointed out in #2717. Fixes #2717. This reverts commit `21b7572533`.	2016-03-25 14:43:22 -07:00
jisookim	0d3c5a3b6c	remove serialization of Druid Server and add tests for ServersResource	2016-03-25 12:27:27 -07:00
Bingkun Guo	0872448ff0	make Coordinator IndexingService helpers pluggable Fixes #2682 IndexingService helpers are added according to the settings in runtime.properties. Rather than having all the config.isXXX checks there, it makes sense to have a pluggable approach for allowing the dynamic configuration to bring in implementations for helpers without having to have hard-coded sets of available helpers. Plus, it will also make it possible for extensions to plug helpers in. With https://github.com/druid-io/druid-api/pull/76, we could conditionally bind a helper to Coordinator's runlist. The condition is driven by the value set in the runtime.properties.	2016-03-25 11:48:54 -05:00
Himanshu Gupta	e78a469fb7	UTs for ExtensionsConfig	2016-03-25 10:51:28 -05:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Gian Merlino	713062053c	Filters: Add filter.toFilter method, use that instead of the instanceof chain in Filters. I believe that the instanceof chain in Filters exists because in the past, Filter and DimFilter were in different packages (DimFilter was in druid-client and Filter was in druid-processing). And since druid-client didn't depend on druid-processing, DimFilter couldn't have a toFilter method. But now it can.	2016-03-23 17:03:49 -07:00
kilida	f25b2ed6f8	Duplicate statement in ReservoirSegmentSamplerTest.java	2016-03-22 22:14:36 -04:00
Fangjin Yang	826b371259	Merge pull request #2697 from guobingkun/remove_duplicate_version_converter remove duplicated DruidCoordinatorVersionConverter	2016-03-22 15:48:09 -07:00
Bingkun Guo	a6e9ff48ec	Merge pull request #2688 from pjain1/props_cli do not inject properties directly in module	2016-03-22 15:27:19 -05:00
Bingkun Guo	3778adf1f4	remove duplicated DruidCoordinatorVersionConverter	2016-03-22 14:45:52 -05:00
Parag Jain	7b93195dc6	do not inject properties directly in module	2016-03-22 14:30:10 -05:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Xavier Léauté	25967d0ed8	fix servlet startup sequence, fixes #2681	2016-03-18 15:06:15 -07:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Charles Allen	45c413af7e	Merge pull request #2674 from metamx/fix-broadcast-lockup separate HTTP client pool for cancellation requests	2016-03-17 15:23:42 -07:00
Xavier Léauté	1718a7224b	separate HTTP pool for cancellation requests * reduces contention between queries and cancellation requests * more aggressive timeouts for cancellation requests	2016-03-17 12:11:18 -07:00
Charles Allen	c716af5b04	Merge pull request #2678 from metamx/fixImports Fix some google related imports	2016-03-17 11:53:16 -07:00
Charles Allen	a52c6d3bee	Fix some google related imports	2016-03-17 11:03:29 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Parag Jain	948b19a088	do not silently ingnore rows	2016-03-16 09:30:19 -05:00
Fangjin Yang	ec949d76e3	Merge pull request #2655 from navis/hint-coordinator-client Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-14 20:57:40 -07:00
Jonathan Wei	5ec5ac92c6	Merge pull request #2382 from himanshug/broker_segment_tier_selection at broker, if configured, only add segments from specific tiers to the timeline	2016-03-14 16:53:06 -07:00
navis.ryu	83e1d5d7bf	Add hint message for missing `druid.selectors.coordinator.serviceName`	2016-03-15 08:39:07 +09:00
Fangjin Yang	06813b510a	Merge pull request #2571 from himanshug/gp_by_avoid_sort avoid sort while doing groupBy merging when possible	2016-03-14 14:46:51 -07:00
Nishant	773d6fe86c	Merge pull request #2646 from atomx/update-maxmind Update com.maxmind.geoip2 to 2.6.0	2016-03-14 11:20:48 -07:00
Himanshu	d51a0a0cf4	Merge pull request #2220 from gianm/appenderator-kafka Appenderators, DataSource metadata, KafkaIndexTask	2016-03-14 13:14:36 -05:00
rasahner	2861e854f0	Merge pull request #2540 from pjain1/remove_kill Remove extra parameter from deleteDataSourceSpecificInterval endpoint and correct exception message for invalid interval	2016-03-14 11:16:23 -05:00
Erik Dubbelboer	21b7572533	Update com.maxmind.geoip2 to 2.6.0 com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old). When trying to include other libraries in Druid that require an up to date version of com.google.http-client this causes a problem.	2016-03-12 09:44:00 +00:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Gian Merlino	3d2214377d	Appenderatoring. Appenderators are a way of getting more control over the ingestion process than a Plumber allows. The idea is that existing Plumbers could be implemented using Appenderators, but you could also implement things that Plumbers can't do. FiniteAppenderatorDrivers help simplify indexing a finite stream of data. Also: - Sink: Ability to consider itself "finished" vs "still writable". - Sink: Ability to return the number of rows contained within the sink.	2016-03-10 17:41:50 -08:00
Gian Merlino	92c828f904	Make SegmentHandoffNotifier Closeable.	2016-03-10 16:50:37 -08:00
Gian Merlino	ad5ffdf483	Nix Committers.supplierOf; Suppliers.ofInstance is good enough.	2016-03-10 16:50:37 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Himanshu Gupta	02dfd5cd80	update IncrementalIndex to support unsorted facts map that can be used in groupBy merging to improve performance	2016-03-10 16:11:48 -06:00
Gian Merlino	708bc674fa	Make specifying query context booleans more consistent. Before, some needed to be strings and some needed to be real booleans. Now they can all be either one.	2016-03-08 19:38:26 -08:00
Fangjin Yang	9c2420a1bc	Merge pull request #2599 from himanshug/datasource_isolation make coordinator db polling for list of segments more robust	2016-03-08 12:43:49 -08:00
Fangjin Yang	e7018f524f	Merge pull request #2598 from himanshug/handoff_timeout optional ability to configure handoff wait timeout on realtime tasks	2016-03-08 12:43:36 -08:00
Slim Bouguerra	c72438ead0	override metric name	2016-03-08 10:58:12 -06:00
Himanshu Gupta	1288784bde	in coordinator db polling for available segments, ignore corrupted entries in segments table so that coordinator continues to load new segments even if there are few corrupted segment entries	2016-03-07 15:13:10 -06:00
Himanshu Gupta	0402636598	configurable handoffConditionTimeout in realtime tasks for segment handoff wait	2016-03-05 10:14:54 -06:00
Fangjin Yang	d06c1c5c85	Merge pull request #2583 from guobingkun/fix_multiple_specs_2 update querySegmentSpec when passing query to getQueryRunner	2016-03-02 18:05:34 -08:00
David Lim	9e74772d6b	Merge pull request #2574 from gianm/allostuff Make first few allocatePendingSegment retries quiet.	2016-03-02 16:16:53 -07:00
Bingkun Guo	cfe2dbf1eb	Merge pull request #2580 from gianm/rtc-basePersist RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 16:56:49 -06:00
Bingkun Guo	4a58462fc7	update querySegmentSpec when passing query to getQueryRunner After finding the FireChief for a specific partition, Druid will need to find the specific queryRunner for each segment being queried by passing the query to FireChief. Currently Druid is passing the original query that contains all the segments need to be queried, it's possible that fireChief.getQueryRunner(query) returns more than 1 queryRunner because query.getIntervals() is not specific to a single segment. In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.	2016-03-02 16:44:56 -06:00
Gian Merlino	e65e6a49a5	RealtimeTuningConfig: Use different default basePersistDirectory per instance.	2016-03-02 13:57:53 -08:00
Gian Merlino	004028b887	Make first few allocatePendingSegment retries quiet. Some light retrying can happen during normal operation (SELECT -> INSERT races) and the ensuing log messages would be scary for users.	2016-03-02 13:40:29 -08:00
Fangjin Yang	612e327426	Merge pull request #2581 from gianm/fix-deadlock CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource.	2016-03-02 11:37:49 -08:00
Gian Merlino	7557eb2800	CliPeon: Fix deadlock on startup by eagerly creating ExecutorLifecycle, ChatHandlerResource. See stack traces here, from current master: https://gist.github.com/gianm/bd9a66c826995f97fc8f 1. The thread "qtp925672150-62" holds the lock on InternalInjectorCreator.class, used by Scopes.SINGLETON, and wants the lock on "handlers" in Lifecycle.addMaybeStartHandler called by DiscoveryModule.getServiceAnnouncer. 2. The main thread holds the lock on "handlers" in Lifecycle.addMaybeStartHandler, which it took because it's trying to add the ExecutorLifecycle to the lifecycle. main is trying to get the InternalInjectorCreator.class lock because it's running ExecutorLifecycle.start, which does some Jackson deserialization, and Jackson needs that lock in order to inject stuff into the Task it's deserializing. This patch eagerly instantiates ChatHandlerResource (which I believe is what's trying to create the ServiceAnnouncer in the qtp925672150-62 jetty thread) and the ExecutorLifecycle.	2016-03-02 10:53:42 -08:00
Gian Merlino	102fc92120	SQLMetadataConnector: Fix overzealous retries that were preventing EntryExistsException from making it out.	2016-03-01 17:20:33 -08:00
Fangjin Yang	9340cae985	Merge pull request #2457 from bjozet/docs/fixes Default value for maxRowsInMemory	2016-03-01 07:43:26 -08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Bingkun Guo	4edcb1b861	Refactor FireChief + UTs for RealtimeManagerTest Add tests that verify whether RealtimeManager is querying the correct FireChief for a specific partition make FireChief static and package private, add latches in the UT	2016-02-29 14:41:10 -06:00
Eric Tschetter	68631d89e9	Allow realtime nodes to have multiple shards of the same datasource	2016-02-29 12:30:25 -06:00
Parag Jain	6b3c96c63a	better exception for invalid interval	2016-02-26 10:02:38 -06:00
Fangjin Yang	29d29ba98d	Merge pull request #2263 from jon-wei/flex_dims3 Allow IncrementalIndex to store Long/Float dimensions	2016-02-25 17:23:02 -08:00
Gian Merlino	b331fb4a83	Fix parsing of druid.indexer.server.maxChatRequests.	2016-02-25 14:47:15 -08:00
Parag Jain	b82b487f20	remove extra kill parameter	2016-02-24 17:16:18 -06:00
jon-wei	c17ce02467	Allow IncrementalIndex to store Long/Float dimensions	2016-02-24 13:51:57 -08:00
Himanshu Gupta	a3b37e9225	In persistAndMerge, increase the scope of try-catch block so that any exception while persisting hydrants is caught and consequently that sink is abandoned or the task will forever wait for handoff to happen.	2016-02-23 22:22:33 -06:00
Nishant	6c9e1a28ad	Merge pull request #2519 from gianm/unparseable-handling Better handling of ParseExceptions.	2016-02-24 04:46:29 +05:30
Fangjin Yang	93540c0631	Merge pull request #2503 from gianm/jetty-qos Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports.	2016-02-23 10:35:53 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Fangjin Yang	0c984f9e32	Merge pull request #2109 from himanshug/segments_in_delta_ingestion idempotent batch delta ingestion	2016-02-22 14:00:45 -08:00
Fangjin Yang	3bdd757024	Merge pull request #1773 from b-slim/log_details Adding downstream source when throwing QueryInterruptedException	2016-02-22 10:16:07 -08:00
Himanshu Gupta	21b0b8a07d	new coordinator endpoint to get list of used segment given a dataSource and list of intervals	2016-02-21 23:17:58 -06:00
Slim Bouguerra	77925cc061	adding downstream source of QueryInterruptedException	2016-02-20 13:05:14 -06:00
Gian Merlino	23c993c9e7	Add druid.indexer.server.maxChatRequests for QoS; deprecate separate ports. - Add druid.indexer.server.maxChatRequests, which sets up a QoSFilter on the main Jetty server. - Deprecate druid.indexer.runner.separateIngestionEndpoint - Deprecate druid.indexer.server.chathandler.*	2016-02-19 13:36:09 -08:00
Gian Merlino	243ac5399b	Harmonize realtime indexing loop across the task and standalone nodes. - Both now catch ParseExceptions on plumber.add (see https://groups.google.com/d/topic/druid-user/wmiRDvx2RvM/discussion) - Standalone now treats IndexSizeExceededException as fatal (previously only the task did)	2016-02-19 07:34:15 -08:00
Gian Merlino	e0c049c0b0	Make startup properties logging optional. Off by default, but enabled in the example config files. See also #2452.	2016-02-12 14:12:16 -08:00
Fangjin Yang	1430bc2c88	Merge pull request #2276 from harshjain2/feature-2021 Fix for issue 2021.	2016-02-10 17:04:45 -08:00
Gian Merlino	fa92b77f5a	Harmonize znode writing code in RTR and Worker. - Throw most exceptions rather than suppressing them, which should help detect problems. Continue suppressing exceptions that make sense to suppress. - Handle payload length checks consistently, and improve error message. - Remove unused WorkerCuratorCoordinator.announceTaskAnnouncement method. - Max znode length should be int, not long. - Add tests.	2016-02-10 14:52:00 -08:00
Harsh Jain	a3eb863c8e	Fix for issue 2021	2016-02-10 22:19:12 +05:30
Himanshu Gupta	d1cb17d3f7	at broker - only add segments from specific tiers to the timeline	2016-02-09 22:33:22 -06:00
Himanshu Gupta	b40c342cd1	make Global stupid pool cache size configurable	2016-02-05 14:18:06 -06:00
Parag Jain	9002548eeb	increase test time out and general clean up	2016-02-03 13:26:37 -06:00
Charles Allen	5111fd52f2	Add check for log4j-core in Log4jShutterDownerModule	2016-02-02 15:56:48 -08:00
Himanshu	dc89cdd0f9	Merge pull request #2336 from himanshug/fix_2331 limit size of X-Druid-Response-Context header to 7K	2016-02-02 12:06:59 -06:00
navis.ryu	c03918f89a	AsyncQueryForwardingServletTest#testDeleteBroadcast sometimes fails by port conflict	2016-01-29 19:28:58 +09:00
Himanshu Gupta	f6b4dbd697	bug fix and unit tests for DruidCoordinatorSegmentKiller	2016-01-28 14:10:17 -06:00
Himanshu Gupta	ab3edfa8fc	moving DruidCoordinatorSegmentKiller class out of DruidCoordinator	2016-01-28 14:03:56 -06:00
Nishant	3880f54b87	Merge pull request #2332 from himanshug/configurable_partial make populateUncoveredIntervals a configuration in query context	2016-01-28 10:34:35 +05:30
Himanshu Gupta	a7bde8f4da	limit size of X-Druid-Response-Context header to 7K due to https://github.com/druid-io/druid/issues/2331	2016-01-27 15:18:08 -06:00
Xavier Léauté	5a3642bb93	Merge pull request #2247 from metamx/pedanticBuild Enable strict building in travis	2016-01-27 10:27:03 -08:00
Charles Allen	508734c8b0	Long constant reformatting in tests `l` --> `L`	2016-01-27 08:59:19 -08:00
Nishant	fd6bf3fe22	Use interval comparator instead of bucketMonthComparator fix when two segments have same interval review comments	2016-01-27 17:35:43 +05:30
Himanshu Gupta	3719b6e3c8	make populateUncoveredIntervals a configuration in query context	2016-01-26 15:13:45 -06:00
Harsh Jain	41730b96d4	Fix for issue 2021	2016-01-25 02:48:22 +05:30
Himanshu	7a6109f0ca	Merge pull request #2321 from gianm/pending-index Replace two-column index on pendingSegments table with one-column index.	2016-01-22 13:38:15 -06:00
Gian Merlino	0bd9bff075	Replace two-column index on pendingSegments table with one-column index. Fixes #2319.	2016-01-22 10:50:21 -08:00
Himank Chaudhary	1a5d4e714c	Adding custom mapper for json processing exception to return bad request instead of 500	2016-01-22 09:48:52 -08:00
Fangjin Yang	04d3054353	Merge pull request #2303 from CHOIJAEHONG1/localfirehouse-basedir-npe Throw an IAE when baseDir is null in LocalFireHose	2016-01-21 07:58:52 -08:00
Nishant	dcb7830330	Merge pull request #984 from drcrallen/thread-priority-rebase Use thread priorities. (aka set `nice` values for background-like tasks)	2016-01-21 15:02:34 +05:30
Charles Allen	2e1d6aaf3d	Use thread priorities. (aka set `nice` values for background-like tasks) * Defaults the thread priority to java.util.Thread.NORM_PRIORITY in io.druid.indexing.common.task.AbstractTask * Each exec service has its own Task Factory which is assigned a priority for spawned task. Therefore each priority class has a unique exec service * Added priority to tasks as taskPriority in the task context. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers * Add options to ForkingTaskRunner * Add "-XX:+UseThreadPriorities" default option * Add "-XX:ThreadPriorityPolicy=42" default option * AbstractTask - Removed unneded @JsonIgnore on priority * Added priority to RealtimePlumber executors. All sub-executors (non query runners) get Thread.MIN_PRIORITY * Add persistThreadPriority and mergeThreadPriority to realtime tuning config	2016-01-20 14:00:31 -08:00
Nishant	61aca6f9cc	remove wrong checks sink never have null hydrants and hydrants never have null adapters	2016-01-20 23:43:53 +05:30
Jaehong Choi	7132428bba	throw IAE when baseDir is null in LocalFireHose	2016-01-21 01:27:32 +09:00
Nishant	59ea186af7	fix reference counting for segments	2016-01-20 17:24:21 +05:30
jon-wei	747343e621	Preserve dimension order across indexes during ingestion	2016-01-19 13:34:11 -08:00
Jonathan Wei	df2906a91c	Merge pull request #2290 from gianm/index-merger-v9-stuff Respect buildV9Directly in PlumberSchools, so it works on standalone realtime.	2016-01-19 13:04:00 -08:00
Fangjin Yang	0c31f007fc	Merge pull request #1728 from himanshug/aggregators_in_segment_metadata Store AggregatorFactory[] in segment metadata	2016-01-19 12:55:49 -08:00
Himanshu	fe841fd961	Merge pull request #2118 from guobingkun/fix_segment_loading Fix loading segment for historical	2016-01-19 14:25:48 -06:00
Himanshu Gupta	a99aef29a1	adding aggregators to segment metadata	2016-01-19 14:23:39 -06:00
Gian Merlino	1dcf22edb7	Respect buildV9Directly in PlumberSchools, so it works on standalone realtime nodes. Also parameterize some tests to run with/without buildV9Directly: - IndexGeneratorJobTest - RealtimeIndexTaskTest - RealtimePlumberSchoolTest	2016-01-19 12:15:06 -08:00
Bingkun Guo	c4ad50f92c	Fix loading segment for historical Historical will drop a segment that shouldn't be dropped in the following scenario: Historical node tried to load segmentA, but failed with SegmentLoadingException, then ZkCoordinator called removeSegment(segmentA, blah) to schedule a runnable that would drop segmentA by deleting its files. Now, before that runnable executed, another LOAD request was sent to this historical, this time historical actually succeeded on loading segmentA and announced it. But later on, the scheduled drop-of-segment runnable started executing and removed the segment files, while historical is still announcing segmentA.	2016-01-19 10:29:49 -06:00
Himanshu Gupta	164b0aad7a	removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class instead of a Map to store segment metadata	2016-01-18 22:03:46 -06:00
Himanshu Gupta	637d2605e7	kill unwanted parent directories when a segment is deleted from LocalDataSegmentKiller	2016-01-18 21:51:04 -06:00
Harsh Jain	71f1cd5e34	Fix for issue 2021	2016-01-17 16:10:04 +05:30
Fangjin Yang	f6a1a4ae20	Merge pull request #2138 from KurtYoung/feature-build-v9 build v9 directly	2016-01-16 13:35:46 -06:00
Kurt Young	82ff98c2bf	add config for build v9 directly and update docs	2016-01-16 11:26:34 +08:00
Fangjin Yang	2e54553a8f	Merge pull request #1990 from himanshug/schedule_kill_task support periodic hard delete of segments	2016-01-15 15:22:33 -06:00
David Lim	7c65880e55	Merge pull request #2270 from rasahner/warnOfChatHandlerNoop if chathandler is noop, log using warn instead of info	2016-01-15 13:29:27 -07:00
Harsh Jain	6ec6835b5d	Fix for issue 2021.	2016-01-16 00:58:33 +05:30
David Lim	34cd8f8c72	Merge pull request #2258 from fjy/acl-zk acl for zookeeper is added	2016-01-15 10:27:08 -07:00
Fangjin Yang	a54c726726	Merge pull request #2266 from anubhgup/fix-announce Fix loss in segment announcements when segments do not fit in zNode	2016-01-14 17:10:43 -08:00
Xavier Léauté	dc1a62c3d9	Merge pull request #2248 from metamx/druidNodeHashEquals Add hashCode and equals to DruidNode	2016-01-14 16:04:58 -08:00
Anubhav Gupta	6d09ab839f	Fix for loss in segment announcements when segments do not fit in the znodes during compress mode. Added unit test (from Navis).	2016-01-14 14:44:23 -08:00
Robin	7361cd173f	if chathandler is noop, log using warn instead of info	2016-01-14 10:12:42 -06:00
Nikita Geer	1908d63162	acl for zookeeper is added	2016-01-13 14:56:05 -08:00
navis.ryu	18479bb757	time-descending result of timeseries queries	2016-01-13 12:23:01 +09:00
Himanshu Gupta	eb2d251ac8	support periodic hard delete of segments	2016-01-12 16:55:05 -06:00
Himanshu	01a0715ee2	Merge pull request #2161 from metamx/query-metrics-timeout Fix Query metrics for query timeout	2016-01-12 09:50:40 -06:00
Himanshu Gupta	ef3cddabe9	adding UTs for DruidCoordinatorConfig	2016-01-11 22:03:25 -06:00
Charles Allen	b0e04a9162	Add hashCode and equals to DruidNode	2016-01-11 15:23:45 -08:00
Charles Allen	ea623e43d2	Merge pull request #2240 from metamx/fix-load-rule Fix loadRule when one of the tiers had no available servers	2016-01-11 10:05:31 -08:00
Nishant	32bc2f776e	Fix loadRule when one of the tiers had no servers When one of the tiers have no servers, LoadRule should ignore that tier and continue to load/drop segments in other available tiers. the bug also causes whacky behavior with LoadRule with non existent tier where the segment balancer keeps on moving segments to other nodes in existing tiers but the extra segment copies are never dropped eventually leading to all the tiers getting full .	2016-01-11 15:53:14 +05:30
Himanshu	d255f4baac	Merge pull request #2234 from pjain1/emit_realtime_metrics emit handoff count metrics	2016-01-08 14:24:16 -06:00
Parag Jain	9dba0f67e7	emit handoff count metrics	2016-01-08 12:36:13 -06:00
Fangjin Yang	15fc070232	Merge pull request #2213 from himanshug/fix_curtator_test_base [wip] trying/finding fix for announcer test failures	2016-01-07 18:23:49 -08:00
Nishant	1bfb4e3988	Emit query/time for failed and timeout queries emit query/time metric also add success flag fix success flag for router metrics review comments formatting.	2016-01-07 19:41:54 +05:30
Himanshu Gupta	7ab810f3eb	do not ignore exceptions from curator cleanup in CuratorTestBase	2016-01-06 10:42:53 -06:00
Nishant	14989f272d	Add metrics for ingest/bytes/received for EventReceiverFirehose review comments review comments	2016-01-05 20:06:09 +05:30
fjy	57d91d754d	Comment out buggy unit tests, fix #2185	2016-01-03 09:50:16 -08:00
fjy	8424b2b456	Fix announcer test bad check	2016-01-01 19:55:22 -08:00
Himanshu Gupta	fa5c3bb014	adding decorate(DimensionSelector) to DimensionSpec to enable support for arbitrary filtering/transformations to returned dimension values	2015-12-30 15:06:24 -06:00
Bingkun Guo	3c107c5757	Merge pull request #2150 from himanshug/emit_query_bytes emit query/bytes metric	2015-12-30 13:44:19 -06:00

... 4 5 6 7 8 ...

3078 Commits