1012 Commits

Author SHA1 Message Date
Jihoon Son
56fb11ce0b Lazy initialization for JavaScript functions (#4871)
* Lazy initialization of JavaScript functions

* Fix test failure

* Fix thread-safety and postpone js conf check

* Fix test fail

* Fix test

* Fix KafkaIndexTaskTest

* Move config check
2017-10-10 21:52:42 -07:00
Gian Merlino
b20e3038b6 SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889)
* SQL: Upgrade to Calcite 1.14.0, some refactoring of internals.

This brings benefits:
- Ability to do GROUP BY and ORDER BY with ordinals.
- Ability to support IN filters beyond 19 elements (fixes #4203).

Some refactoring of druid-sql internals:
- Builtin aggregators and operators are implemented as SqlAggregators
  and SqlOperatorConversions rather than being special cases. This simplifies
  the Expressions and GroupByRules code, which were becoming complex.
- SqlAggregator implementations are no longer responsible for filtering.

Added new functions:
- Expressions: strpos.
- SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR,
  and DATE_TRUNC.

* Add missing @Override annotation.

* Adjustments for forbidden APIs.

* Adjustments for forbidden APIs.

* Disable GROUP BY alias.

* Doc reword.
2017-10-10 12:44:05 -07:00
chunghochen
0614b92df1 adding new post aggregators for test statistics to druid-stats extension (#4532)
* adding new post aggregators of test stats to druid-stats extension

* changes to address code review comments

* fix checkstyle violations using druid_intellij_formatting.xml after merge upstream/master

* add @Override annotation per CI log

* make changes per review comments/discussions

* remove some blocks per review comments
2017-10-09 23:43:27 -07:00
Guillaume Balaine
35944d24ae Fix JdbcCacheGenerator, null values shouldn't be allowed (#4881)
* Fix JdbcCacheGenerator, null values shouldn't be allowed

* Add a test case for null values
2017-10-06 09:31:48 -07:00
Alexander Saydakov
bba96f59f8 added missing synchronized keyword (#4894)
* added missing synchronized keyword

* added missing synchronized keyword
2017-10-03 12:16:54 -05:00
Jonathan Wei
5e60ccade1 Add context map to AuthenticationResult (#4870) 2017-10-02 17:08:14 -05:00
Gian Merlino
1f2074c247 Bump versions in master to 0.11.1-SNAPSHOT. (#4878)
* Bump versions in master to 0.11.1-SNAPSHOT.

* Missed a few.
2017-09-28 17:09:51 -05:00
Himanshu
f69c9280c4 remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858)
* remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form

* sanitize output of /druid/coordinator/v1/cluster endpoint
2017-09-28 10:40:59 -05:00
Goh Wei Xiang
2c30d5ba55 Add org.joda.time.DateTime.parse() to forbidden APIs (#4857)
* Added org.joda.time.DateTime#(java.lang.String) to forbidden API.

* Added org.joda.time.DateTime#(java.lang.String, org.joda.time.format.DateTimeFormatter) to forbidden API.

* Add additional APIs that may create DateTime with default time zone

* Add helper function that accepts formatter to parse String.

* Add additional forbidden APIs

* Replace existing usage of forbidden APIs

* Use wrapper class to enforce Chronology on DateTimeFormatter.

* Creates constant UtcFormatter for constant ISODateTimeFormat.
2017-09-27 17:46:44 -05:00
Alexander Saydakov
c3fbe5158d use latest sketches-core-0.10.1 and memory-0.10.3 (#4828)
* use latest sketches-core-0.10.1 and memory-0.10.3

* style fix

* better variable name

* removed explicit dependency on memory
2017-09-27 15:18:33 -05:00
Roman Leventov
c702ac771f Fix formatting in ApproximateHistogramTest (#4853) 2017-09-26 15:14:25 -05:00
Gino Ledesma
e60bc0cabc bug: getQuantiles() returns values that exceed max (#4744)
Fixes https://github.com/druid-io/druid/issues/3972
2017-09-26 10:43:56 -07:00
Gian Merlino
bf8fd4c203 Add flattenSpec support to the Avro parser. (#4832)
* Add flattenSpec support to the Avro parser.

Also:

- Refactor the JSONPathParser a bit so it can share flattening code
  with Avro (see ObjectFlatteners).
- Remove the JSONParser. It was only used in two places: by
  UriNamespaceExtractor, and as a base for JSONToLowerParser. Migrated
  the former to JSONPathParser and made the latter a standalone.
- Move GenericRecordAsMap to the Parquet extension, since the Avro
  extension no longer uses it.

* Fix indentation.

* Fix equals/hashCode.
2017-09-26 09:26:06 -07:00
Roman Leventov
b56a907145 Add namespace extraction thread config (#4833) 2017-09-25 09:52:36 -07:00
Parag Jain
07446ef32c warn if topic not found (#4834) 2017-09-25 12:21:46 +09:00
Charles Allen
a6470c1d03 Move caffeine out of extension and make it the default cache implementation. (#4810)
* Move caffeine out of extension.

* Remove `JsonTypeName` from the class itself

* Fix bad docs

* Fix distribution pom

* Fix unused import

* Make caffeine default

* Address code comments

* Add more description around the jre version in the readme

* Add suggested comments
2017-09-22 10:46:55 -07:00
Roman Leventov
e267f3901b Enforce Indentation with Checkstyle (#4799) 2017-09-21 13:06:48 -07:00
Roman Leventov
88e9a80636 Rename ObjectValueSelector.get() to getObject(); Add getObject() and classOfObject() to ColumnValueSelector (#4801) 2017-09-19 14:47:20 -05:00
Charles Allen
e38705e348 Add timing to log for URI based Lookup fetching (#4805)
* Add timing to log for URI based metrics

* Reformat
2017-09-18 11:18:32 -05:00
Gian Merlino
96612cc665 Fix incorrect log formatting in DruidKerberosAuthenticationHandler. (#4817) 2017-09-17 22:41:36 -07:00
Jonathan Wei
c2a0e753b6 Extension points for authentication/authorization (#4271)
* Extension points for authentication/authorization

* Address some PR comments

* Authorization result caching

* Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter

* Use Set for auth caching, close outputstreams in filters

* Don't close output stream on success in sanity check filter

* Add ConfigResourceFilter to coordinator lookups

* Fix filtering authorization check for empty resource list

* HttpClient users must explicitly escalate the client

* Remove response modification from PreResponseAuthorizationCheckFilter

* Remove extraneous pom.xml

* Fix unit test

* Better lifecycle management

* Rename AuthorizationManager to Authorizer

* Fix authorization denials for empty supervisor list

* Address some PR comments

* Address more PR comments

* Small cleanup

* Add Jetty HttpClient wrapper to Authenticator

* Remove Authorizer start/stop

* Restore immutable context map in DruidConnection, UT fix

* Fix/update docs

* Add authorization checks to EventReceiverFirehose

* Fix router authorization check failure, restore PreResponseAuthorizationFilter changes

* Compile fixes

* Test fixes

* Update Authenticator/Authorizer doc comments

* Merge fixes

* PR comments

* Fix test

* Fix IT

* More PR comments

* PR comments

* SSL fix
2017-09-15 23:45:48 -07:00
Roman Leventov
3f92184dd8 Inspection fixes (#4809) 2017-09-15 17:48:29 -07:00
Roman Leventov
cd5de123bd Self-checking S3DataSegmentMover.safeMove() (#4725)
* Self-checking S3DataSegmentMover.safeMove()

* Remove unused in S3DataSegmentMoverTest

* Address comments

* More specific exceptions

* Remove delete check
2017-09-14 13:49:21 -07:00
Jonathan Wei
3a29521273 Fix GroupBy limit push down error when buffer is too small (#4745)
* Fix GroupBy limit push down error when buffer is too small

* Address PR comments
2017-09-12 12:34:50 -07:00
Gian Merlino
34a03b8e6c SQL: EXPLAIN improvements. (#4733)
* SQL: EXPLAIN improvements.

- Include query JSON in explain output.
- Fix a bug where semi-joins and nested groupBys were not fully explained.
- Fix a bug where limits were not included in "select" query explanations.

* Fix compile error.

* Fix compile error.

* Fix tests.
2017-09-01 09:35:13 -07:00
Himanshu
4c04083926 kafkaIndexTask unannounce service in final block (#4736) 2017-09-01 09:31:15 -07:00
Charles Allen
bdfc6fe25e Move common TypeReference into JacksonUtils (#4738) 2017-08-31 13:40:16 -07:00
Parag Jain
594a66f3c0 add scheme to AsyncQueryForwardingServlet (#4688)
* add scheme to AsyncQueryForwardingServlet

* add sslContext binding for Router
2017-08-28 15:03:43 -07:00
hzy001
4f61dc66a9 Remove the deprecated variable localChildren (#4357)
Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>
2017-08-24 15:27:34 -05:00
Roman Leventov
cacf63b007 Add AggregateCombiners (#4676)
* Add MetricCombiners

* Rename MetricCombiner to AggregateCombiner

* Spelling

* Fix TimestampAggregatorFactory.combine() and add makeAggregateCombiner() implementation

* Rename AggregateCombiner.combine() to fold()
2017-08-21 16:45:29 -07:00
Roman Leventov
cbd1902db8 Add forbidden-apis plugin; prohibit using system time zone (#4611)
* Forbidden APIs WIP

* Remove some tests

* Restore io.druid.math.expr.Function

* Integration tests fix

* Add comments

* Fix in SimpleWorkerProvisioningStrategy

* Formatting

* Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest

* Address comments

* Fix GroupByMultiSegmentTest
2017-08-21 13:02:42 -07:00
Himanshu
74a64c88ab internal-discovery: interfaces for announcement/discovery, curator based impls (#4634)
* internal-discovery: interfaces for announcement/discovery, curator impls

* more tests

* address some review comments

* more fixes

* address more review comments

* simplify ObjectMapper setup in CuratorDruidNodeAnnouncerAndDiscoveryTest

* fix KafkaIndexTaskTest

* make lookupTier overridable via RealtimeIndexTask and KafkaIndexTask context

* make teamcity build happy
2017-08-16 13:07:16 -07:00
Parag Jain
725a144096 add localhost as advertised hostname (#4689)
* add localhost as advertised hostname

* set advertised.host.name to localhost for test kafka broker
2017-08-14 16:59:26 -07:00
Roman Leventov
bf28d0775b Remove QueryRunner.run(Query, responseContext) and related legacy methods (#4482)
* Remove QueryRunner.run(Query, responseContext) and related legacy methods

* Remove local var
2017-08-11 09:12:38 +09:00
Yuewen Wang
c821bc9a5a Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 (#4607)
* Implement "earlyMessageRejectionPeriod" config discussed in issue #4599
    * implement the logic of this param
    * Added doc of this config
    * Added unit tests of it

* Update KafkaSupervisor.java

improve comment

* fix format

* fix bug when rebasing
2017-08-11 09:12:08 +09:00
Peter Cunningham
ede7cf9eef Added support for where clauses to JDBC lookups. (#4643)
* Added support for where clauses to filter lookup values on ingestion.

Added a filter field to the JDBC lookups that is used to generate a
where clause so that only rows matching the filter value will be
brought into Druid. Example being filter="SOMECOLUMN=1"
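
A minimal sketch of how a filter field could turn into a where clause, for illustration only. The class name, the method signature, and the table and column names are hypothetical and not taken from the Druid source; only the idea (append `WHERE <filter>` when a filter is configured) comes from the description above.

```java
public class LookupQuerySketch {
    /**
     * Builds the SELECT used to load key/value pairs for a lookup.
     * Hypothetical helper: the signature is an assumption for this sketch.
     * The filter is appended verbatim, so it must come from trusted
     * configuration, never from end-user input.
     */
    public static String buildLookupQuery(
            String table, String keyColumn, String valueColumn, String filter) {
        String query = "SELECT " + keyColumn + ", " + valueColumn + " FROM " + table;
        // No filter configured: load every row, as before the change.
        if (filter == null || filter.trim().isEmpty()) {
            return query;
        }
        // Filter configured: only rows matching it are brought in.
        return query + " WHERE " + filter.trim();
    }

    public static void main(String[] args) {
        System.out.println(buildLookupQuery("lookupTable", "k", "v", "SOMECOLUMN=1"));
        System.out.println(buildLookupQuery("lookupTable", "k", "v", null));
    }
}
```

Treating a missing or blank filter as "no where clause" keeps existing lookup configurations working unchanged.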

* Required changes based on code review.

* Required changes based on code review.

* Added support for where clauses to filter lookup values on ingestion.

Added a filter field to the JDBC lookups that is used to generate a
where clause so that only rows matching the filter value will be
brought into Druid. Example being filter="SOMECOLUMN=1"

* Updates based on code review, mainly formatting and small refactor of
the buildLookupQuery method.

* Fixed broken buildLookupQuery method

* Removed empty line.

* Updates per review comments
2017-08-09 10:47:46 -07:00
Roman Leventov
7454fd86a0 Polymorphic numeric getters for ColumnValueSelector (#4623)
* Add methods getFloat(), getDouble() and getLong() to ColumnValueSelector

* Fix copy-paste mistake in docs

* Spelling
2017-08-08 18:38:06 -07:00
Jihoon Son
d5606bc558 Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549)
* Passing lockTimeout as a parameter for TaskLockbox.lock()

* Remove TIME_UNIT

* Fix tc fail

* Add taskLockTimeout to TaskContext

* Add caution
2017-08-08 18:21:07 -07:00
Roman Leventov
f5d4171459 Prohibit for loops which could be foreach with IntelliJ (#4653)
* Replace for with foreach

* Replace for with for-each in GroupByQueryEngineV2

* Remove io.druid.collections.IntList
2017-08-08 18:05:33 -07:00
Charles Allen
bbe7fb8c46 Better logging for S3DataSegmentPuller getVersion (#4657)
* Eventual consistency of S3 means a `404` can be thrown. It helps to know the URI that was attempted.
2017-08-08 16:21:22 +03:00
Roman Leventov
aa7e4ae5e4 Enforce correct spacing with Checkstyle (#4651) 2017-08-05 10:18:25 -07:00
Jihoon Son
f3f2cd35e1 Array-based aggregation for groupBy query (#4576)
* Array-based aggregation

* Fix handling missing grouping key

* Handle invalid offset

* Fix compilation

* Add cardinality check

* Fix cardinality check

* Address comments

* Address comments

* Address comments

* Address comments

* Cleanup GroupByQueryEngineV2.process

* Change to Byte.SIZE

* Add flatMap
2017-08-03 20:04:54 +03:00
Charles Allen
8921538251 Make AWSCredentialsConfig use PasswordProvider for the string matter (#4613)
* Make AWSCredentialsConfig use PasswordProvider for the string matter
* Fixes https://github.com/druid-io/druid/issues/3911

* Add unit tests
2017-07-29 15:48:49 -07:00
Roman Leventov
5929066dfb Add NamespaceLookupExtractorFactory.toString() (#4606) 2017-07-26 12:02:07 -07:00
Gian Merlino
5048ab3e96 Add metrics to the native queries underpinning SQL. (#4561)
* Add metrics to the native queries underpinning SQL.

This is done by factoring out the metrics and request log emitting
code from QueryResource into a new QueryLifecycle class. That class
is used by both QueryResource and the SQL DruidSchema and QueryMaker.

Also fixes a couple of bugs in QueryResource:

- RequestLogLine start time was set to `TimeUnit.NANOSECONDS.toMillis(startNs)`,
  which is incorrect since absolute nanos cannot be converted to millis.
- DruidMetrics.makeRequestMetrics was called with null `query` on
  unparseable queries, which led to spurious "Unable to log query"
  errors.
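
A minimal sketch of the first bug, for illustration only. `System.nanoTime()` has an arbitrary origin, so only the difference between two readings is meaningful; converting a single reading to milliseconds does not give a wall-clock time. The class and method names below are hypothetical, and this is one possible way to recover a start timestamp, not necessarily how the commit fixed it.

```java
import java.util.concurrent.TimeUnit;

public class RequestTimingSketch {
    /** Elapsed time between two System.nanoTime() readings, in milliseconds. */
    public static long elapsedMillis(long startNs, long endNs) {
        return TimeUnit.NANOSECONDS.toMillis(endNs - startNs);
    }

    /**
     * Wall-clock start time in epoch milliseconds, derived from the current
     * wall-clock time minus the elapsed time measured with nanoTime.
     */
    public static long startWallClockMillis(long nowMillis, long startNs, long nowNs) {
        return nowMillis - elapsedMillis(startNs, nowNs);
    }

    public static void main(String[] args) throws InterruptedException {
        long startNs = System.nanoTime();
        Thread.sleep(20);
        long nowNs = System.nanoTime();
        long nowMillis = System.currentTimeMillis();

        System.out.println("elapsed ms: " + elapsedMillis(startNs, nowNs));
        System.out.println("start (epoch ms): "
                + startWallClockMillis(nowMillis, startNs, nowNs));
        // Incorrect: this value is not related to the epoch at all.
        System.out.println("not a timestamp: " + TimeUnit.NANOSECONDS.toMillis(startNs));
    }
}
```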

Partial fix for #4047.

* Code style

* Remove unused imports.

* Fix tests.

* Remove unused import.
2017-07-24 21:26:27 -07:00
Roman Leventov
c0beb78ffd Enforce brace formatting with Checkstyle (#4564) 2017-07-21 10:26:59 -05:00
Gian Merlino
2be7068f6e Fixes and improvements to SQL metadata caching. (#4551)
* Fixes and improvements to SQL metadata caching.

Also adds support for MultipleSpecificSegmentSpec to CachingClusteredClient.

SQL changes:
- Cache metadata on a per-segment level, in addition to per-dataSource, so
  we don't need to re-query all segments whenever a single new one appears.
  This should lower the load placed on the cluster by metadata queries.
- Fix race condition in DruidSchema that can cause us to miss metadata. It was
  possible to notice new segments, then issue a query, and have that query
  not actually hit those segments, and not notice that it didn't hit those segments.
  Then, the metadata from those segments would be ignored.
- Fix assumption in DruidSchema that all segments are immutable. Now, mutable
  segments are periodically re-queried.
- Fix inappropriate re-use of SchemaPlus. Now we create one for each planning
  cycle, rather than sharing one. It caches table objects, which we want to
  avoid, since it can cause stale metadata. We do the caching in DruidSchema
  so we don't need the SchemaPlus caching.

Server changes:
- Add a TimelineCallback to TimelineServerView, for callers that want to get updates
  when the timeline has been modified.
- Change CachingClusteredClient from a QueryRunner to a QuerySegmentWalker. This
  allows it to accept queries that are segment-descriptor-based rather than
  intervals-based. In particular it will now support MultipleSpecificSegmentSpec.

* Fix DruidSchema, and unused imports.

* Remove unused import.

* Fix SqlBenchmark.
2017-07-20 10:14:15 -07:00
Slim
71e7a4c054 Adding double columns support (#4491)
* add double columns support

* Fix numbers and expected results in UTs

* adding float aggregators

* fix IT expected test results

* fix comments

* more fixes

* fix comp

* fix test

* refactor double and float aggregator factories

* fix

* fix UTs

* fix comments

* clean unused code

* fix more comments

* undo unnecessary changes

* fix null issue

* refactor TopNColumnSelectorStrategyFactory

* fix docs

* refactor NumericTopNColumnSelectorStrategy

* fix return

* fix comments

* handle the null case in DimensionIndexer

* more null fixing

* cosmetic changes
2017-07-20 10:14:14 +03:00
Gian Merlino
441ee56ba9 DataSegmentPusher: Add allowed hadoop property prefixes. (#4562)
* DataSegmentPusher: Add allowed hadoop property prefixes.

* Fix dots.
2017-07-18 10:16:12 -07:00
Roman Leventov
60cdf94677 Add PMD and prohibit unnecessary fully qualified class names in code (#4350)
* Add PMD and prohibit unnecessary fully qualified class names in code

* Extra fixes

* Remove extra unnecessary fully-qualified names

* Remove qualifiers

* Remove qualifier
2017-07-17 22:22:29 +09:00