8244 Commits

Author SHA1 Message Date
Charles Allen
2fe55d30c5 Remove LogLevelAdjuster from GuiceRunnable (#4236)
* Redundant since log4j2 allows config on the fly via jmx
* Fix #1805
2017-06-28 13:06:34 -05:00
Gian Merlino
4c33d0a00f Add some new expression functions and macros. (#4442)
* Add some new expression functions and macros.

See misc/math-expr.md for the list of added functions, except for
"like", which previously existed but was not documented.

* Add easymock to datasketches tests.

* Add easymock to distinctcount tests.

* Add easymock to virtual-columns tests.

* Code review comments.

* Clean up code a bit.

* Add easymock to scan-query tests.

* Rework ExprMacros that have multiple impls.

* Improve test coverage.
2017-06-28 10:15:58 -07:00
Roman Leventov
2fa4b10145 More fine-grained DI for management node types. Don't allocate processing resources on Router (#4429)
* Remove DruidProcessingModule, QueryableModule and QueryRunnerFactoryModule from DI for coordinator, overlord, middle-manager. Add RouterDruidProcessing not to allocate processing resources on router

* Fix examples

* Fixes

* Revert Peon configs and add comments

* Remove qualifier
2017-06-27 22:58:01 -07:00
Jihoon Son
82140c2f86 Reduce number of tasks in ITUnionQueryTest (#4476) 2017-06-27 22:28:55 -07:00
dgolitsyn
e04b8be52e maxSegmentsInQueue in CoordinatorDinamicConfig (#4445)
* Add maxSegmentsInQueue parameter to CoordinatorDinamicConfig and use it in LoadRule to improve segments loading and replication time

* Rename maxSegmentsInQueue to maxSegmentsInNodeLoadingQueue

* Make CoordinatorDynamicConfig constructor private; add/fix tests; set default maxSegmentsInNodeLoadingQueue to 0 (unbounded)

* Docs added for maxSegmentsInNodeLoadingQueue parameter in CoordinatorDynamicConfig

* More docs for maxSegmentsInNodeLoadingQueue and style fixes
2017-06-27 22:58:36 -05:00
Jihoon Son
79fd5338e3 Get s3 objects directly from prefixes when listing is failed due to permission (#4444)
* Fall back to getObject when listing is failed due to permission

* Throws an exception when listing is not allowed on directory

* Fix error messages
2017-06-27 18:58:37 -07:00
Jihoon Son
e3c13c246a Respect reportParseExceptions option in IndexTask.determineShardSpecs() (#4467)
* Respect reportParseExceptions option in IndexTask.determineShardSpecs()

* Fix typo
2017-06-27 10:28:22 -07:00
Jihoon Son
7a261c8311 Split travis test (#4468) 2017-06-26 18:51:48 -07:00
Roman Leventov
05d58689ad Remove the ability to create segments in v8 format (#4420)
* Remove ability to create segments in v8 format

* Fix IndexGeneratorJobTest

* Fix parameterized test name in IndexMergerTest

* Remove extra legacy merging stuff

* Remove legacy serializer builders

* Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest
2017-06-26 13:21:39 -07:00
Jihoon Son
5fec619284 Make KafkaLookupExtractorFactoryTest fast (#4466)
* Make KafkaLookupExtractorFactoryTest fast

* Use list

* Use Bytes
2017-06-26 10:15:28 -05:00
David Lim
0f99467cfb rollback to previous httpclient/httpcore versions (#4457) 2017-06-23 21:43:49 -05:00
Jihoon Son
b37c9b5fe0 Fix a bug of CSV/TSV parsers when extracting columns from header (#4443)
* Reset fieldNames whenever a new file begins

* Fix test failure

* Fix test failure
2017-06-23 14:29:26 -07:00
jeffhartley
3e7f7720a1 update aggregations.md re: rollup (#4455)
noted that rollup could be on or off
2017-06-23 14:28:59 -07:00
Himanshu
61c38b66ad exclude aws-java-sdk from hadoop-aws dep in hdfs-storage module (#4437)
* exclude aws-java-sdk from hdfs-storage module

* address review comments
2017-06-22 15:56:35 -05:00
Fokko Driesprong
ff501e8f13 Add Date support to the parquet reader (#4423)
* Add Date support to the parquet reader

Add support for the Date logical type. Currently this is not supported. Since the parquet
date is number of days since epoch gets interpreted as seconds since epoch, it will fails
on indexing the data because it will not map to the appriopriate bucket.

* Cleaned up code and tests

Got rid of unused json files in the examples, cleaned up the tests by
using try-with-resources. Now get the filenames from the json file
instead of hard coding them and integrated general improvements from
the feedback provided by leventov.

* Got rid of the caching

Remove the caching of the logical type of the time dimension column
and cleaned up the code a bit.
2017-06-22 15:56:08 -05:00
Jihoon Son
3e60c9125d Increase timeout of GroupByQueryMergeBufferTest and AppenderatorDriverTest (#4441) 2017-06-22 06:09:52 -07:00
Jihoon Son
3a5c480405 Split IndexMergerTest and ImmutableConciseSetTest (#4427) 2017-06-21 20:55:51 -07:00
Gian Merlino
34d2f9ebfe Queries: Restore old prepareAggregations method. (#4432)
For backwards compatibility, post-#4394.
2017-06-21 05:36:32 -07:00
Gian Merlino
679cf277c0 Add ExpressionFilter. (#4405)
* Add ExpressionFilter.

The expression filter expects a single argument, "expression", and matches
rows where that expression is true.

* Code review comments.

* CR comment.

* Fix logic.

* Fix test.

* Remove unused import.
2017-06-20 12:42:26 -07:00
Jonathan Wei
b333deae9d Add script for getting milestone contributors (#4396)
* Add script for getting milestone contributors

* Use resp.text instead of resp.content

* Add usage notes
2017-06-17 13:25:19 -07:00
Gian Merlino
22aad08a59 ExpressionPostAggregator: Automatically finalize inputs. (#4406)
* ExpressionPostAggregator: Automatically finalize inputs.

Raw HyperLogLogCollectors and such aren't very useful. When writing
expressions like `x / y` users will expect `x` and `y` to be finalized.

* Fix un-merge.

* Code review comments.

* Remove unnecessary ImmutableMap.copyOf.
2017-06-17 13:22:47 -07:00
Roman Leventov
b87f037b77 Update git workflow (#4418)
* Update git workflow

Make Git workflow more elaborate. Fix enumeration. Emphasize what to do on conflicts with master.

* Fix
2017-06-16 19:27:46 -07:00
Jonathan Wei
3b70995bb3 Configurable row limit for JDBC frames (#4417) 2017-06-16 17:07:40 -07:00
Jonathan Wei
cc815eec81 Create/close yielder in same thread for JDBC queries (#4415)
* Create/close yielder in same thread for JDBC queries

* PR comments

* More PR comments

* Add connectionId to DruidStatement executor
2017-06-16 16:50:33 -07:00
Goh Wei Xiang
f68a0693f3 Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397)
* Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe.

* - Use of computeIfAbsent over putIfAbsent
- Replace Maps.newXXXMap() with normal instantiation
- Documentations on when is thread-safe required.
- Use Builders for On/OffheapIncrementalIndex

* - Optimization on computeIfAbsent
- Constant EMPTY DimensionsSpec
- Improvement on IncrementalIndexSchema.Builder
  - Remove setting of default values
  - Use var args for metrics
- Correction on On/OffheapIncrementalIndex Builders
- Combine On/OffheapIncrementalIndex Builders

* - Removing unused imports.

* - Helper method for testing with IncrementalIndex.Builder

* - Correction on javadoc.

* Style fix
2017-06-16 16:04:19 -05:00
Gian Merlino
e78d8584a1 JettyQosTest, DruidAvaticaHandlerTest: Extend timeout. (#4416)
Fixes #4408, probably.
2017-06-15 18:28:50 -07:00
Gian Merlino
054cf8a183 Limit random access in compressed column tests. (#4414)
* Limit random access in compressed column tests.

Random access leads to lots of block decompressions for reading single elements,
which is time prohibitive for the large column tests. For those tests, limit the
number of randomly accessed elements to 1000.

* Random -> ThreadLocalRandom
2017-06-15 14:48:06 -07:00
Jihoon Son
bf537f9226 Increase publish timeout (#4407) 2017-06-15 14:55:30 -05:00
Gian Merlino
17ef785618 Speed up sketch tests by merging fewer indexes. (#4413)
The tests go from 5 minutes to about 10 seconds. 1000 maxRowCount is still
low enough to get a few merges, so we're still exercising that functionality.
2017-06-15 14:47:55 -05:00
Jonathan Wei
7fe295009e Faster ByteBufferMinMaxOffsetHeapTest (#4404) 2017-06-15 13:14:29 -05:00
Gian Merlino
6edee7f434 Expressions work better with strings. (#4394)
* Expressions work better with strings.

- ExpressionObjectSelector able to read from string columns, and able to
  return strings.
- ExpressionVirtualColumn able to offer string (and long for that matter)
  as its native type.
- ExpressionPostAggregator able to return strings.
- groupBy, topN: Allow post-aggregators to accept dimensions as inputs,
  making ExpressionPostAggregator more useful.
- topN: Use DimExtractionTopNAlgorithm for STRING columns that do not
  have dictionaries, allowing it to work with STRING-type expression
  virtual columns.
- Adjusts null handling to better match the rest of Druid: null and
  empty string treated the same; nulls implicitly treated as zeroes in
  numeric context.

* Code review comments.

* More code review.

* Fix test.

* Adjust annotations.
2017-06-14 14:50:18 -07:00
Roman Leventov
976492c186 Make PolyBind to fail if property value is not found (fixes #4369) (#4374)
* Make PolyBind to fail if property value is not found

* Fix test

* Add onHeap option in NamespaceExtractionModule

* Add PolyBind.createChoiceWithDefaultNoScope()

* Fix NPE

* Fix

* Configure MetadataStorageProvider option for MySQL, PostgreSQL and SQLServer

* Deprecate PolyBind.createChoiceWithDefault form with unused defaultKey

* Fix NPE
2017-06-13 09:45:43 -07:00
Jihoon Son
073695d311 Simple improvement for realtime manager (#4377)
* Simple improvement for realtime manager

* Address comments

* tmp

* Address comments and add more tests

* Add catch for InterruptedException

* Address comments
2017-06-12 15:25:34 -07:00
Roman Leventov
113b8007b7 Increase timeout for QueryGranularityTest.testDeadLock() (#4395) 2017-06-12 13:28:21 -07:00
Roman Leventov
c121845102 Avoid using Guava in DataSegmentPushers because of incompatibilities (#4391)
* Avoid using Guava in DataSegmentPushers because of Hadoop incompatibilities

* Clarify comments
2017-06-12 09:58:34 -07:00
Roman Leventov
5285eb961b Update dependencies (#4313)
* Update dependencies

* Downgrade curator

* Rollback aws-java-sdk dependency to 1.10.77

* Revert exclusions in integration-tests

* Depend only on aws-java-sdk-ec2 instead of umbrella aws-java-sdk (fixes #4382)
2017-06-09 14:32:07 -07:00
Amar Ramachandran
fc80df339e Fix incorrect name (#4386) 2017-06-09 13:32:17 -04:00
Jihoon Son
bbc307b30e Add legacy constructor to CsvParseSpec and DelimitedParseSpec for backward compatibility (#4388)
* Add legacy constructor to CsvParseSpec

* Remove JsonProperty annotations

* Add legacy constructor to DelimitedParseSpec
2017-06-08 19:33:02 -07:00
Niketh Sabbineni
2cd91b64d0 Uncompress streams without having to download to tmp first (#4364)
* Uncompress streams without having to download to tmp first

* Remove unused file
2017-06-08 18:08:38 -07:00
Akash Dwivedi
f22e646168 Use MetadataSegmentManager info during segment move. (#4260)
* Use MetadataSegmentManager info to move segments.

* Use original segment to drop.

* Formatting.

* Addressed comments.
2017-06-08 12:12:21 -07:00
Gian Merlino
1f2afccdf8 Expressions: Add ExprMacros. (#4365)
* Expressions: Add ExprMacros, which have the same syntax as functions, but
can convert themselves to any kind of Expr at parse-time.

ExprMacroTable is an extension point for adding new ExprMacros. Anything
that might need to parse expressions needs an ExprMacroTable, which can
be injected through Guice.

* Address code review comments.
2017-06-08 09:32:10 -04:00
Jonathan Wei
9ae04b7375 Remove queryMetricsFactory from GroupByQueryConfig (#4383) 2017-06-07 21:35:26 -05:00
Roman Leventov
4ef3a770a3 Fix compilation of ExpressionBenchmark (#4379) 2017-06-07 14:18:09 -05:00
Jordan Pittier
6697f3a62b [Doc]Update configuration/historical.md: correct default numBootstrapThreads value (#4376)
According to 4ace65a2af/server/src/main/java/io/druid/segment/loading/SegmentLoaderConfig.java (L81) if numBootstrapThreads is not set, it default to numLoadingThreads.
2017-06-07 10:24:42 -07:00
Roman Leventov
63a897c278 Enable most IntelliJ 'Probable bugs' inspections (#4353)
* Enable most IntelliJ 'Probable bugs' inspections

* Fix in RemoteTestNG

* Fix IndexSpec's equals() and hashCode() to include longEncoding

* Fix inspection errors

* Extract global isntance of natural().nullsFirst(); address comments

* Fix

* Use noinspection comments instead of SuppressWarnings on method for IntelliJ-specific inspections

* Prohibit Ordering.natural().nullsFirst() using Checkstyle
2017-06-07 09:54:25 -07:00
Roman Leventov
b487fa355b More methods in QueryMetrics and TopNQueryMetrics (the last part of #3798) (#4284)
* Add more methods to QueryMetrics and TopNQueryMetrics, add BitmapResultFactory

* Add implementor expectations section to BitmapResultFactory javadoc
2017-06-07 09:49:08 -07:00
Himanshu
4ace65a2af fix NPE in IndexGeneratorJob (#4371)
* fix NPE in IndexGeneratorJob

* address review comment

* review comments
2017-06-07 05:54:03 -07:00
Roman Leventov
2d15215cd0 Add TeamCity inspections badge (#4351) 2017-06-06 20:53:25 -04:00
Gian Merlino
67b162a337 SQL: More forgiving Avatica server. (#4368)
* SQL: More forgiving Avatica server.

- Automatically close statements that are fully iterated or that have
  errors, to prevent dangling statements from causing clients to hit
  open statement limits.
- Empower client auto-reconnects by throwing NoSuchConnectionException
  when appropriate.
- Try to close empty connections when we hit the open connection limit,
  rather than failing the newly opened connection. Client
  auto-reconnections mean this shouldn't cause problems in practice.
- Improve concurrency of the server by making "connections" a
  concurrent map.
- Lower default connection timeout to PT5M from PT30M.

* Fix DruidStatement test.
2017-06-06 10:11:40 -07:00
kaijianding
551a89bd67 serialize DateTime As Long to improve json serde performance (#4038) 2017-06-06 10:08:51 -07:00