2772 Commits

Author SHA1 Message Date
Roman Leventov 5dc95389f7 Add Checkstyle framework (#3551)
* Add Checkstyle framework

* Avoid star import

* Need braces for control flow statements

* Redundant imports

* Add NewLineAtEndOfFile check
2016-10-13 13:37:47 -07:00
Gian Merlino ddc856214d When inserting segments, mark unused if already overshadowed. (#3499)
This is useful for the insert-segment-to-db tool, which would otherwise
potentially insert a lot of overshadowed segments as "used", causing
load and drop churn in the cluster.
2016-10-10 18:10:18 -07:00
Himanshu 7e6824501c fix: QueryResource thread name includes whole inner query string for nested query (#3549)
* print exception details from QueryInterruptedException

* in QueryResource.java, set thread name to include dataSource names and not whole query string e.g. from QueryDataSource
2016-10-06 18:30:52 -07:00
praveev 43cdc675c7 Add support for timezone in segment granularity (#3528)
* Add support for timezone in segment granularity

* CR feedback. Handle null timezone during equals check.

* Include timezone in docs.
Add timezone for ArbitraryGranularitySpec.
2016-10-03 08:15:42 -07:00
Gian Merlino 40f2fe7893 Bump versions to 0.9.3-SNAPSHOT (#3524) 2016-09-29 13:53:32 -07:00
John Zhang 78b06a7d7e make global http client worker threads configurable (#3514) 2016-09-28 23:18:51 -07:00
Xiaoyao 91e6ab4fcf LRU cache guarantee to keep size under limit (#3510)
* LRU cache guarantee to keep size under limit

* address comments

* fix failed tests in jdk7
2016-09-27 17:13:06 -07:00
Parag Jain 56b0586097 secure BrokerQueryResource endpoints (#3506) 2016-09-26 11:27:24 -07:00
David Lim ca9114b41b add supervisor reset API (#3484)
* add supervisor reset API

* CR doc changes and kill running tasks / clear offsets from supervisor
2016-09-22 17:51:06 -07:00
Navis Ryu 49c0fe0e8b Show candidate hosts for the given query (#2282)
* Show candidate hosts for the given query

* Added test cases & minor changes to address comments

* Changed path-param to query-param for intervals/numCandidates
2016-09-22 08:32:38 -07:00
Keuntae Park 54ec4dd584 Support renaming of outputName for cached select and search query results (#3395)
* support renaming of outputName for cached select and search queries

* rebase and resolve conflicts

* rollback CacheStrategy interface change

* updated based on review comments
2016-09-20 08:19:14 -07:00
Charles Allen 95e08b38ea [QTL] Reduced Locking Lookups (#3071)
* Lockless lookups

* Fix compile problem

* Make stack trace throw instead

* Remove non-germane change

* * Add better naming to cache keys. Makes logging nicer
* Fix #3459

* Move start/stop lock to non-interruptible for readability purposes
2016-09-16 11:54:23 -07:00
Gian Merlino 7a2a4bc6de JavaScript: Disable now affects worker selection and router strategy too. (#3458) 2016-09-13 16:37:42 -07:00
Jonathan Wei df766b2bbd Add dimension handling interface for ingestion and segment creation (#3217)
* Add dimension handling interface for ingestion and segment creation

* update javadocs for DimensionHandler/DimensionIndexer

* Move IndexIO row validation into DimensionHandler

* Fix null column skipping in mergerV9

* Add deprecation note for 'numeric_dims' filename pattern in IndexIO v8->v9 conversion

* Fix java7 test failure
2016-09-12 12:54:02 -07:00
Gleb Smirnov 8bee07e81e Respect server-side sorting of tasks in coordinator console (#3404) 2016-08-28 16:38:29 -07:00
jaehong choi 2e0f253c32 introducing lists of existing columns in the fields of select queries' output (#2491)
* introducing lists of existing columns in the fields of select queries' output

* rebase master

* address the comment. add test code for select query caching

* change the cache code in SelectQueryQueryToolChest to 0x16
2016-08-25 21:37:53 +05:30
kaijianding eafafce1aa fix old usage of dimension as string instead of dimensionSchema in DataSchema (#3365) 2016-08-16 09:58:04 -07:00
David Lim ed924bf214 allow registrants to opt out of announcing themselves when registering as a chat handler (#3360) 2016-08-16 10:51:28 +05:30
rajk-tetration 362b9266f8 Adding filters for TimeBoundary on backend (#3168)
* Adding filters for TimeBoundary on backend

Signed-off-by: Balachandar Kesavan <raj.ksvn@gmail.com>

* updating TimeBoundaryQuery constructor in QueryHostFinderTest

* add filter helpers

* update filterSegments + test

* Conditional filterSegment depending on whether a filter exists

* Style changes

* Trigger rebuild

* Adding documentation for timeboundaryquery filtering

* added filter serialization to timeboundaryquery cache

* code style changes
2016-08-15 10:25:24 -07:00
Jonathan Wei 890e3bdd3f More informative query unit test names (#3342) 2016-08-09 22:24:48 -07:00
Gian Merlino 21bce96c4c More useful query errors. (#3335)
Follow-up to #1773, which meant to add more useful query errors but
did not actually do so. Since that patch, any error other than
interrupt/cancel/timeout was reported as `{"error":"Unknown exception"}`.

With this patch, the error fields are:

- error, one of the specific strings "Query interrupted", "Query timeout",
  "Query cancelled", or "Unknown exception" (same behavior as before).
- errorMessage, the message of the topmost non-QueryInterruptedException
  in the causality chain.
- errorClass, the class of the topmost non-QueryInterruptedException
  in the causality chain.
- host, the host that failed the query.
2016-08-09 16:14:52 +08:00
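As a rough illustration of the error fields listed in #3335 above, the following Java sketch walks an exception's cause chain to the topmost exception that is not a wrapper and builds the four fields from it. The names `QueryErrorSketch`, `WrapperException` (a stand-in for QueryInterruptedException) and `errorFields` are hypothetical and are not Druid's actual classes. The sketch always reports the generic "Unknown exception" string and does not model the interrupt, timeout, or cancel cases.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class QueryErrorSketch {
    // Hypothetical stand-in for QueryInterruptedException; not Druid's real class.
    static class WrapperException extends RuntimeException {
        WrapperException(Throwable cause) {
            super(cause);
        }
    }

    // Returns the topmost throwable in the cause chain that is not a WrapperException.
    // If the chain holds only wrappers, the innermost wrapper is returned.
    static Throwable topmostNonWrapper(Throwable t) {
        Throwable current = t;
        while (current instanceof WrapperException && current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Builds the four fields named in the commit message (assumed to be a flat string map).
    static Map<String, String> errorFields(Throwable failure, String host) {
        Throwable cause = topmostNonWrapper(failure);
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("error", "Unknown exception"); // generic case only in this sketch
        fields.put("errorMessage", cause.getMessage());
        fields.put("errorClass", cause.getClass().getName());
        fields.put("host", host);
        return fields;
    }

    public static void main(String[] args) {
        Throwable failure = new WrapperException(
            new WrapperException(new IllegalStateException("segment missing")));
        // "example-host:8083" is a made-up host string for illustration.
        System.out.println(errorFields(failure, "example-host:8083"));
    }
}
```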
Navis Ryu 39351fb8d2 Mask properties from logging (#3332)
* Mask properties from logging

* mask "password" by default
2016-08-08 21:36:10 +05:30
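A minimal sketch of the idea behind #3332 above, assuming masking is decided by a case-insensitive substring match on the property name. The class `MaskedPropertiesSketch`, the method `maskForLogging`, the `<masked>` placeholder and the example property names are all hypothetical; the commit message only states that "password" is masked by default.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class MaskedPropertiesSketch {
    // Returns a sorted copy of the properties that is safe to log: any property
    // whose name contains maskedToken (case-insensitive) has its value hidden.
    static Map<String, String> maskForLogging(Properties props, String maskedToken) {
        String token = maskedToken.toLowerCase(Locale.ROOT);
        Map<String, String> safe = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            boolean sensitive = name.toLowerCase(Locale.ROOT).contains(token);
            safe.put(name, sensitive ? "<masked>" : props.getProperty(name));
        }
        return safe;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Made-up property names for illustration only.
        props.setProperty("example.db.user", "admin");
        props.setProperty("example.db.password", "s3cret");
        for (Map.Entry<String, String> e : maskForLogging(props, "password").entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```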
kaijianding 50d52a24fc ability to not rollup at index time, make pre aggregation an option (#3020)
* ability to not rollup at index time, make pre aggregation an option

* rename getRowIndexForRollup to getPriorIndex

* fix doc misspelling

* test query using no-rollup indexes

* fix benchmark fail due to jmh bug
2016-08-02 11:13:05 -07:00
Jonathan Wei a6105cbb86 Add numeric StringComparator (#3270)
* Add numeric StringComparator

* Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query

* Address PR comments, add multithreaded BoundDimFilter test

* Add comment on strlen tie handling

* Add timeseries interval filter benchmark

* Adjust docs

* Use jackson for StringComparator, address PR comments

* Add new TopNMetricSpec and SearchSortSpec with tests (WIP)

* More TopNMetricSpec and SearchSortSpec tests

* Fix NewSearchSortSpec serde

* Update docs for new DimensionTopNMetricSpec

* Delete NumericDimensionTopNMetricSpec

* Delete old SearchSortSpec

* Rename NewSearchSortSpec to SearchSortSpec

* Add TopN numeric comparator benchmark, address PR comments

* Refactor OrderByColumnSpec

* Add null checks to NumericComparator and String->BigDecimal conversion function

* Add more OrderByColumnSpec serde tests
2016-07-29 15:44:16 -07:00
Charles Allen d04af6aee4 Add `slf4j` request logger (#3146)
* Add `slf4j` request logger

* Address comments

* Fix conflicts with master

* Fix removed map value
2016-07-29 15:15:41 -07:00
Charles Allen 546e4f79b0 Add size of pending deletes to historical metrics (#3295)
* Add size of pending deletes to historical metrics
2016-07-27 11:30:47 -07:00
Charles Allen b1e3fe77f5 More logging around how the coordinator balancer is happening (#3279)
* More logging around how the coordinator balancer is happening

* Address comments

* Address code review comments and add actual logging
2016-07-27 13:24:32 +05:30
Gian Merlino 2f275497b6 Fix caching of extension classloaders. (#3289) 2016-07-26 15:19:15 -07:00
Gian Merlino 8030f1cb67 Be more respectful of maxRowsInMemory. (#3284)
- Appenderator: Respect maxRowsInMemory across all sinks.
- KafkaIndexTask: Respect maxRowsInMemory across all partitions.
2016-07-26 15:02:35 -06:00
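To make the difference between a per-sink limit and a limit across all sinks concrete, here is a small Java sketch related to #3284 above. It assumes that a persist is triggered for everything in memory once the total row count reaches the limit; that policy, the class `RowLimitSketch` and its methods are illustrative assumptions and are not taken from the Appenderator or KafkaIndexTask code.

```java
import java.util.HashMap;
import java.util.Map;

class RowLimitSketch {
    private final int maxRowsInMemory;
    private final Map<String, Integer> rowsPerSink = new HashMap<>();
    private int totalRows = 0;
    private int persistCount = 0;

    RowLimitSketch(int maxRowsInMemory) {
        this.maxRowsInMemory = maxRowsInMemory;
    }

    // Records one row for the given sink and returns true if that row
    // pushed the total across ALL sinks up to the limit, triggering a persist.
    boolean add(String sinkId) {
        rowsPerSink.merge(sinkId, 1, Integer::sum);
        totalRows++;
        if (totalRows >= maxRowsInMemory) {
            persistAll();
            return true;
        }
        return false;
    }

    // Stand-in for writing in-memory rows to disk: only resets the counters.
    private void persistAll() {
        rowsPerSink.clear();
        totalRows = 0;
        persistCount++;
    }

    int totalRows() {
        return totalRows;
    }

    int persistCount() {
        return persistCount;
    }

    public static void main(String[] args) {
        RowLimitSketch limiter = new RowLimitSketch(3);
        limiter.add("sink-a");
        limiter.add("sink-b");
        // No single sink holds 3 rows, but the total does.
        System.out.println("persisted: " + limiter.add("sink-c"));
    }
}
```

With a per-sink check, none of the three sinks in `main` would ever reach the limit on its own, so memory use could grow with the number of sinks.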
Charles Allen 188a4bc89a Revert "Optionally intern ServerInventoryView inventory objects. (#3238)" (#3286)
This reverts commit a931debf79.
Fixes #3283

The core issue here is that realtime nodes announce their size as 0, so a coordinator which interns the realtime version of the data segment will not be able to see the new sized announcement when handoff occurs.

This is caused by the `equals` method on a `DataSegment` only evaluating the identifier. The `equals` method *should* be correct for object equivalence, and things which need to check equivalence of some sub-portion of the object should do so explicitly.
2016-07-26 11:47:34 -07:00
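The interning problem described in the revert above can be reproduced in miniature. The Java sketch below uses a hypothetical `Segment` class whose `equals` and `hashCode` look only at the identifier, plus a simple map-based interner; neither is Druid's `DataSegment` or its actual interning code. Because the second object is considered equal to the first, the interner hands back the first instance and the updated size is lost.

```java
import java.util.HashMap;
import java.util.Map;

class InternSketch {
    // Hypothetical simplified segment; NOT Druid's DataSegment.
    static final class Segment {
        final String identifier;
        final long size;

        Segment(String identifier, long size) {
            this.identifier = identifier;
            this.size = size;
        }

        // Equality deliberately ignores size, mirroring the issue in the commit message.
        @Override
        public boolean equals(Object o) {
            return o instanceof Segment && identifier.equals(((Segment) o).identifier);
        }

        @Override
        public int hashCode() {
            return identifier.hashCode();
        }
    }

    private final Map<Segment, Segment> pool = new HashMap<>();

    // Returns the canonical instance for anything equal to the argument.
    Segment intern(Segment segment) {
        Segment existing = pool.putIfAbsent(segment, segment);
        return existing == null ? segment : existing;
    }

    public static void main(String[] args) {
        InternSketch interner = new InternSketch();
        // A realtime announcement with size 0, then a handoff announcement with the real size.
        Segment realtime = interner.intern(new Segment("example_segment_id", 0L));
        Segment handedOff = interner.intern(new Segment("example_segment_id", 1024L));
        System.out.println("same instance: " + (realtime == handedOff));
        System.out.println("visible size: " + handedOff.size);
    }
}
```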
kaijianding 3dc2974894 Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227)
* save TimestampSpec in metadata.drd

* add timestampSpec info in SegmentMetadataQuery
2016-07-25 15:45:30 -07:00
Gian Merlino b316cde554 Appenderator tests for disjoint query intervals. (#3281) 2016-07-23 19:48:15 -07:00
Charles Allen c58bbfa0c6 Intern DataSegments in SQLMetadataSegmentManager (#3267)
* Helps with heap pressure on coordinator
2016-07-21 16:46:08 -07:00
Parag Jain fd798d32bc fix testSecuredGetServer ut (#3262) 2016-07-20 10:20:13 -07:00
Gian Merlino 06624c40c0 Share query handling between Appenderator and RealtimePlumber. (#3248)
Fixes inconsistent metric handling between the two implementations. Formerly,
RealtimePlumber only emitted query/segmentAndCache/time and query/wait and
Appenderator only emitted query/partial/time and query/wait (all per sink).

Now they both do the same thing:
- query/segmentAndCache/time, query/segment/time are the time spent per sink.
- query/cpu/time is the CPU time spent per query.
- query/wait/time is the executor waiting time per sink.

These generally match historical metrics, except segmentAndCache & segment
mean the same thing here, because one Sink may be partially cached and
partially uncached and we aren't splitting that out.
2016-07-19 22:15:13 -05:00
Himanshu 3f82108d15 optionally enable coordinator auto kill tasks on all dataSources via dynamic config (#3250) 2016-07-17 18:47:52 -07:00
Gian Merlino 6cd1f5375b Better harmonized dimensions for query metrics. (#3245)
All query metrics now start with toolChest.makeMetricBuilder, and all of
*those* now start with DruidMetrics.makePartialQueryTimeMetric. Also, "id"
moved to common code, since all query metrics added it anyway.

In particular this will add query-type specific dimensions like "threshold"
and "numDimensions" to servlet-originated metrics like query/time.
2016-07-14 11:55:51 -07:00
Hyukjin Kwon 55e7a52475 Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215) 2016-07-14 09:19:17 -07:00
Nishant a1715c8cda fix-3237 (#3244)
DruidBroker use FilteredServerInventoryView instead of
ServerInventoryView
2016-07-13 22:30:35 -07:00
Charles Allen a931debf79 Optionally intern ServerInventoryView inventory objects. (#3238) 2016-07-14 08:49:26 +05:30
Charles Allen 5d9fd0a713 Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming (#3043)
* Migrate IndexerSQLMetadataStorageCoordinator.getUnusedSegmentsForInterval to streaming
* Missed query from #2859

* Make inReadOnlyTransaction part of SQLMetadataConnector
2016-07-06 16:55:27 -07:00
Himanshu e1313e4b90 add log msg when event recvr firehose buffer is full (#3209) 2016-07-01 17:35:30 -05:00
Xavier Léauté 485e381387 remove datasource from hadoop output path (#3196)
fixes #2083, follow-up to #1702
2016-06-29 08:53:45 -07:00
Gian Merlino 4c9aeb7353 Revert "update druid console version (#3189)" (#3203)
This reverts commit 496b801bc3.
2016-06-29 08:29:57 -07:00
Xavier Léauté 496b801bc3 update druid console version (#3189) 2016-06-27 18:02:40 -07:00
Hyukjin Kwon 45f553fc28 Replace the deprecated usage of NoneShardSpec (#3166) 2016-06-25 10:27:25 -07:00
Gian Merlino 4cc39b2ee7 Alternative groupBy strategy. (#2998)
This patch introduces a GroupByStrategy concept and two strategies: "v1"
is the current groupBy strategy and "v2" is a new one. It also introduces
a merge buffers concept in DruidProcessingModule, to try to better
manage memory used for merging.

Both of these are described in more detail in #2987.

There are two goals of this patch:

1. Make it possible for historical/realtime nodes to return larger groupBy
   result sets, faster, with better memory management.
2. Make it possible for brokers to merge streams when there are no order-by
   columns, avoiding materialization.

This patch does not do anything to help with memory management on the broker
when there are order-by columns or when there are nested queries. That could
potentially be done in a future patch.
2016-06-24 18:06:09 -07:00
Dave Li 8a08398977 Add segment pruning based on secondary partition dimension (#2982)
* add get dimension rangeset to filters

* add get domain to ShardSpec and added chunk filter in caching clustered client

* add null check and modified not filter, started with unit test

* add filter test with caching

* refactor and some comments

* extract filtershard to helper function

* fixup

* minor changes

* update javadoc
2016-06-24 14:52:19 -07:00
Charles Allen 15f833a861 Make extension classloader caching keyed on directory (#3165)
* Make extension classloaders keyed by extension directory
* Fixes #3163

* Add in same-directory-name unit test
2016-06-23 17:13:19 -07:00
michaelschiff 66d8ad36d7 adds new coordinator metrics 'segment/unavailable/count' and (#3176)
'segment/underReplicated/count' (#3173)
2016-06-23 14:53:15 -07:00