Commit Graph

194 Commits

Author SHA1 Message Date
Dayue Gao 2325844a38 fix incorrect check of maxSemiJoinRowsInMemory (#6242) 2018-08-27 16:28:29 -07:00
Gian Merlino 0172326c62 SQL: Support more result formats, add columns header. (#6191)
* SQL: Support more result formats, add columns header.

- Add result formats for line-based JSON and CSV.
- Add X-Druid-Sql-Columns header with a list of all columns that
the response will contain.
- Add more comprehensive documentation on what callers should expect
when making Druid SQL queries.

* Fix some tests.

* Adjust tests.

* Adjust trailer, add types header.

* Fix trailers.
2018-08-26 23:00:14 -06:00
Gian Merlino 28e6ae3664
SQL: Finalize aggregations for inner queries when necessary. (#6221)
* SQL: Finalize aggregations for inner queries when necessary.

Fixes #5779.

* Fixed test method name.
2018-08-25 13:56:23 -07:00
Jihoon Son ecee3e0a24 Further optimize memory for Travis jobs (#6150)
* Further optimize memory for Travis jobs

* fix build

* sudo false
2018-08-10 22:03:36 -07:00
Nishant Bangarwa 75c8a87ce1 Part 2 of changes for SQL Compatible Null Handling (#5958)
* Part 2 of changes for SQL Compatible Null Handling

* Review comments - break lines longer than 120 characters

* review comments

* review comments

* fix license

* fix test failure

* fix CalciteQueryTest failure

* Null Handling - Review comments

* review comments

* review comments

* fix checkstyle

* fix checkstyle

* remove unrelated change

* fix test failure

* fix failing test

* fix travis failures

* Make StringLast and StringFirst aggregators nullable and fix travis failures
2018-08-02 08:20:25 -07:00
Roman Leventov 0754d78a2e Prohibit Lists.newArrayList() with a single argument (#6068)
* Prohibit Lists.newArrayList() with a single argument

* Test fixes

* Add Javadoc to Node constructor
2018-07-31 20:09:10 -07:00
Benedict Jin 331a0afb98 Remove redundant type parameters and enforce some other style and inspection rules (#5980)
* Various changes about druid-services module

* Patch improvements from reviewer

* Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile

* Fix ArraysAsListWithZeroOrOneArgument

* Fix conflict

* Fix ToArrayCallWithZeroLengthArrayArgument

* Fix AliEqualsAvoidNull

* Remove blank line

* Remove unused import clauses

* Fix code style in TopNQueryRunnerTest

* Fix conflict

* Don't use Collections.singletonList when converting the type of array type

* Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase

* Roll back the latest commit

* Add java.io.File#toURL() into druid-forbidden-apis

* Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord

* Add a new regexp element into stylecode xml file

* Fix style error for new regexp

* Set the level of ArraysAsListWithZeroOrOneArgument as WARNING

* Fix style error for new regexp

* Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile

* Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR

* Add toArray(new Object[0]) regexp into checkstyle config file & fix them

* Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it

* Add a comment for string equals regexp in checkstyle config

* Fix code format

* Add RedundantTypeArguments as ERROR level inspection

* Fix cannot resolve symbol datasource
2018-07-27 16:56:49 -05:00
Jonathan Wei efab3b0160 Add concat and textcat SQL functions (#6005) 2018-07-20 11:21:04 -07:00
Gian Merlino cd8ea3da8d
SQL: Add server-wide default time zone config. (#5993)
* SQL: Add server-wide default time zone config.

* Switch API.
2018-07-18 13:12:40 -07:00
Gian Merlino 04ea3c9f8c
Update license headers. (#5976)
* Update license headers.

For compliance with http://www.apache.org/legal/src-headers.html.

* More license adjustments.

* Fix mistakenly edited package line.
2018-07-11 09:55:18 -07:00
Gian Merlino 948e73da77 Extend various test timeouts. (#5978)
False failures on Travis due to spurious timeout (in turn due to noisy
neighbors) is a bigger problem than legitimate failures taking too long
to time out. So it makes sense to extend timeouts.
2018-07-10 13:02:14 -07:00
Surekha 441c9819d9 Support limit for timeseries query (#5894) (#5931)
* Support limit for timeseries query (#5894)

* Fix tests

* Address PR comments

* Try to fix teamcity inspection checks

* Remove unused method from VirtualColumns

* Remove unused import statement
2018-07-09 08:58:42 -07:00
Jihoon Son 10a01d6846 [SQL] Fix missing postAggregations for Timeseries and TopN (#5912)
* [SQL] Fix missing postAggregations for Timeseries and TopN

* fix build

* fix test
2018-06-29 10:36:55 -07:00
Jonathan Wei 0eae89170e
Make DruidPlanner constructor public again (#5891) 2018-06-20 11:10:50 -07:00
Gian Merlino 6d0dd2fd0f CalciteQueryTest: Add more subquery tests. (#5880)
None of them actually work right now, but this is useful to help document,
via tests, what works and what doesn't.
2018-06-18 11:54:29 -07:00
Jihoon Son fe4d678aac Support projection after sorting in SQL (#5788)
* Add sort project

* add more test

* address comments
2018-06-11 11:33:47 -04:00
Slim Bouguerra 8aa8d9fa5b
Kerberos Spnego Authentication Router Issue (#5706)
* Adding decoration method to proxy servlet

Change-Id: I872f9282fb60bfa20524271535980a36a87b9621

* moving the proxy request decoration to authenticators

Change-Id: I7f94b9ff5ecf08e8abf7169b58bc410f33148448

* added docs

Change-Id: I901543e52f0faf4666bfea6256a7c05593b1ae70

* use the authentication result to decorate request

Change-Id: I052650de9cd02b4faefdbcdaf2332dd3b2966af5

* adding authenticated by name

Change-Id: I074d2933460165feeddb19352eac9bd0f96f42ca

* ensure that authenticator is not null

Change-Id: Idb58e308f90db88224a06f3759114872165b24f5

* fix types and minor bug

Change-Id: I6801d49a05d5d8324406fc0280286954eb66db10

* fix typo

Change-Id: I390b12af74f44d760d0812a519125fbf0df4e97b

* use actual type names

Change-Id: I62c3ee763363781e52809ec912aafd50b8486b8e

* set authenitcatedBy to null for AutheticationResults created by
Escalator.

Change-Id: I4a675c372f59ebd8a8d19c61b85a1e4bf227a8ba
2018-05-05 20:33:51 -07:00
Gian Merlino dc786ebc4c SQL: Remove some unused code. (#5690) 2018-04-24 11:42:16 -07:00
Slim Bouguerra 73da7426da
Timeseries results are incoherent for case interval is out of range and case false filter. (#5649)
* adding some tests

Change-Id: I92180498e2e6695212b286d980e349c136c78c86

* added empty sequence runner

Change-Id: I20c83095072bbf3b4a3a57dfc1934d528e2c7a1a

* treat only granularity ALL

Change-Id: I1d88fab500c615bc46db4f4497ce93089976441f

* moving toList within If and add expected queries

Change-Id: I56cdd980e44f0685806efb45e29031fa2e328ec4

* typo

Change-Id: I42fdd28da5471f6ae57d3962f671741b106300cd

* adding tests and fix logic of intervals

Change-Id: I0bd414d2278e3eddc2810e4f5080e6cf6a117f12

* fix style

Change-Id: I99a2380934c9ab350ca934c56041dc343c08b99f

* comments review

Change-Id: I726a3b905a9520d8b1db70e4ba17853c65c414a4
2018-04-23 15:55:18 -07:00
Nishant Bangarwa e6efd75a3d Add config to allow setting up custom unsecured paths for druid nodes. (#5614)
* Add config to allow setting up custom unsecured paths for druid nodes.

* return all resources for Unsecured paths

* review comment - Add test

* fix tests

* fix test
2018-04-11 17:10:07 -07:00
Gian Merlino ff27c54774 SQL: Remove useless boolean CASTs in filters. (#5619) 2018-04-10 23:05:09 +05:30
Roman Leventov 693e3575f9
Remove unused code and exception declarations (#5461)
* Remove unused code and exception declarations

* Address comments

* Remove redundant Exception declarations

* Make FirehoseFactoryV2.connect() to throw IOException again
2018-03-16 22:11:12 +01:00
Gian Merlino fdd55538e1 SQL: Remove unused escalator, authConfig from various classes. (#5483)
DruidPlanner.plan is responsible for checking authorization, so these objects
weren't needed in as many places as they were injected.
2018-03-14 13:28:51 -07:00
Gian Merlino 0f03ab0c74 SQL: Fix precision of TIMESTAMP types. (#5464)
Druid stores timestamps down to the millisecond, so we should use
precision = 3. Setting this wrong sometimes caused milliseconds
to be ignored in timestamp literals.

Fixes #5337.
2018-03-05 18:56:52 -08:00
Gian Merlino ff0de21fc5 SQL: Fix assumption that AND, OR have two arguments. (#5470)
Calcite can deliver an AND or OR operator with > 2 arguments.
Fixes #5468.
2018-03-05 18:56:35 -08:00
Nishant Bangarwa e0d456b1ba Uniformly set Calcite systemProperties for All Unit tests (#5451)
Fixes test failures reported in -
https://github.com/druid-io/druid/issues/4909

Issue is that If some test skips setting up Calcite system properties
with proper encoding and loads calcite classes that use that property,
All subsequent tests in the same JVM fails.

To reproduce the issue - ExpressionsTest and CalciteQueryTest from IDE
in this order.

A better fix would be to not use System Properties in calcite, This
will work for now.

All new Calcite Unit tests that are added need to inherit
CalciteTestBase.
2018-03-01 12:56:32 -08:00
Jonathan Wei c23b723510 Skip normal authentication for JDBC requests in Router (#5435)
* Skip normal authentication for JDBC requests in Router

* Add integration test

* PR comments
2018-02-28 12:25:32 -08:00
Gian Merlino f3796bc81b SQL: Lower default JDBC frame size. (#5409)
The previous default of 100,000 was a bit excessive and could easily
lead to OOM errors on "select *" style queries.
2018-02-21 10:00:48 -08:00
Gian Merlino 818ce51964 SQL: Fix selecting BOOLEAN type in JDBC. (#5401) 2018-02-21 09:59:56 -08:00
Slim 37c09ce3f8 Use both Joad Ids and Java IDs as Timezone to string readers (#5349)
* Use both Joad Ids and Java IDs as Timezone to string readers

Change-Id: Ieb5c18559879f3f3a0104912ce2f0a354ad0aac3

* move the function to DateTimes and add org.joda.time.DateTimeZone#forID as part of forbidden api

Change-Id: Iff97fa044758019ed0c231587d10e31a9cc18da0

* exclude class and remove other usage

Change-Id: Ib458c2caaa1865535767e1009fbf017a92c8f615

* remove it from test classes

Change-Id: I9b576324f6c7e17a74bd8b13879232c9a8cd40b4

* remove unused

Change-Id: If1c5b70c26c2b7c83c20434cb72b2060653f5052
2018-02-06 16:34:11 +05:30
Gian Merlino 7051230a41 SQL: Throttle metadata refreshes when they fail. (#5328) 2018-02-02 15:17:39 -08:00
Gian Merlino 7e02408510 Update versions to 0.13.0-SNAPSHOT. (#5323) 2018-02-02 12:06:38 -06:00
Jonathan Wei 80419752b5 Add metamx emitter, http clients, and metrics packages to druid java-util (#5289)
* Add metamx java-util emitter, http clients, and metrics packages to druid java-util

* Remove metamx java-util from pom.xml files

* Checkstyle fixes

* Import fix

* TeamCity inspection fixes

* Use slf4j, move some version defs to master pom.xml

* Use parent jvm-attach-api and maven-surefire-plugin versions

* Add ] to log msg, suppress inspection
2018-01-24 22:10:36 +01:00
Clint Wylie 491f8cca81 fix timewarp query results when using timezones and crossing DST transitions (#5157)
* timewarp and timezones
changes:
* `TimewarpOperator` will now compensate for daylight savings time shifts between date translation ranges for queries using a `PeriodGranularity` with a timezone defined
* introduces a new abstract query type `TimeBucketedQuery` for all queries which have a `Granularity` (100% not attached to this name). `GroupByQuery`, `SearchQuery`, `SelectQuery`, `TimeseriesQuery`, and `TopNQuery` all extend `TimeBucke
tedQuery`, cutting down on some duplicate code and providing a mechanism for `TimewarpOperator` (and anything else) that needs to be aware of granularity

* move precondition check to TimeBucketedQuery, add Granularities.nullToAll, add getTimezone to TimeBucketQuery

* formatting

* more formatting

* unused import

* changes:
* add 'getGranularity' and 'getTimezone' to 'Query' interface
* merge 'TimeBucketedQuery' into 'BaseQuery'
* fixup tests from resulting serialization changes

* dedupe

* fix after merge

* suppress warning
2018-01-11 12:39:33 -08:00
Roman Leventov 8877ce38d6
Enforce modifier order with Checkstyle (#5246) 2018-01-11 09:50:42 +01:00
Jonathan Wei 9186547689 Exclude sketches-core from druid-sql (#5223) 2018-01-05 17:12:20 -06:00
Jonathan Wei 935ac646f4
Upgrade to Calcite 1.15.0 (#5210)
* Upgrade to Calcite 1.15.0

* Use Filtration.eternity()
2018-01-04 12:11:24 -08:00
Roman Leventov 579f9fbedf Add IndexedInts.debugToString() and AbstractIndex.toString(); Add Sequence.toList() and limit() (#5175)
* Add IndexedInts.debugToString() and AbstractIndex.toString()

* Fix AppenderatorTest
2018-01-04 09:56:47 +09:00
Jonathan Wei ba873c614b Fix max connections in DruidAvaticaHandlerTest (#5188)
* Fix max connections in DruidAvaticaHandlerTest

* Fix additional tests

* Added comment
2017-12-21 15:32:14 -06:00
Roman Leventov 5787d04fad Bump Druid version to 0.12.0 (#5138) 2017-12-15 07:37:01 -08:00
Roman Leventov 64848c7ebf DataSegment memory optimizations (#5094)
* Deduplicate DataSegments contents (loadSpec's keys, dimensions and metrics lists as a whole) more aggressively; use ArrayMap instead of default LinkedHashMap for DataSegment.loadSpec, because they have only 3 entries on average; prune DataSegment.loadSpec on brokers

* Fix DataSegmentTest

* Refinements

* Try to fix

* Fix the second DataSegmentTest

* Nullability

* Fix tests

* Fix tests, unify to use TestHelper.getJsonMapper()

* Revert TestUtil as ServerTestHelper, fix tests

* Add newline

* Fix indexing tests

* Fix s3 tests

* Try to fix tests, remove lazy caching of ObjectMapper in TestHelper, rename TestHelper.getJsonMapper() to makeJsonMapper()

* Fix HDFS tests

* Fix HdfsDataSegmentPusherTest

* Capitalize constant names
2017-12-12 11:41:40 -08:00
Roman Leventov a7a6a0487e Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762)
* Replace IOPeon with OutputMedium; Improve compression

* Fix test

* Cleanup CompressionStrategy

* Javadocs

* Add OutputBytesTest

* Address comments

* Random access in OutputBytes and GenericIndexedWriter

* Fix bugs

* Fixes

* Test OutputBytes.readFully()

* Address comments

* Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes

* Add comments to ByteBufferInputStream

* Remove unused declarations
2017-12-04 18:04:27 -08:00
Parag Jain 7c01f77b04 Parse Batch support (#5081)
* add parseBatch and deprecate parse method in InputRowParser

add addAll method, skip max rows in memory check for it

remove parse method from implemetations

transform transformers

add string multiplier input row parser

fix withParseSpec

fix kafka batch indexing

fix isPersistRequired

comments

* add unit test

* make persist async

* review comments
2017-12-04 16:06:16 -06:00
Gian Merlino 5f6bdd940b SQL: Improve translation of time floor expressions. (#5107)
* SQL: Improve translation of time floor expressions.

The main change is to TimeFloorOperatorConversion.applyTimestampFloor.

- Prefer timestamp_floor expressions to timeFormat extractionFns, to
  avoid turning things into strings when it isn't necessary.
- Collapse CAST(FLOOR(X TO Y) AS DATE) to FLOOR(X TO Y) if appropriate.

* Fix tests.
2017-11-29 12:06:03 -08:00
Gian Merlino 486159ba8c SQL: Add TIMESTAMPADD. (#5079) 2017-11-16 12:00:34 -08:00
Gian Merlino 7722401cb3 SQL: Add rule to prune unused aggregations. (#5049) 2017-11-13 20:24:45 -08:00
Gian Merlino 77df5e0673 ExpressionSelectors: Add optimized selectors. (#5048)
* ExpressionSelectors: Add caching selectors.

- SingleLongInputCaching selector for expressions on the __time column,
  using a similar optimization to SingleScanTimeDimSelector
- SingleStringInputDimensionSelector for expressions on string columns
  that return strings, using a similar optimization to ExtractionFn
  based DimensionSelectors.
- SingleStringInputCaching selector for expressions on string columns
  that return primitives.

Also, in the SQL planner, prefer expressions for time operations
rather than extractionFns.

* Code review comments.
2017-11-13 20:24:24 -08:00
Gian Merlino 4fd4444b42 SQL: Add "array" result format, and document result formats. (#5032)
* SQL: Add "array" result format, and document result formats.

* Code style.
2017-11-13 20:24:06 -08:00
Gian Merlino 6c0c858913 SQL: Support CASE-style filtered count distinct. (#5047)
i.e., aggregations like COUNT(DISTINCT CASE WHEN x THEN y END). This
patch also changes complex columns to report as nullable, which
is required for them to type-check properly when used in these kinds
of filtered aggregations.
2017-11-13 20:23:54 -08:00
Jonathan Wei 9ac150c23a
Split internal client escalation from Authenticator interface (#5073)
* Split internal client escalation from Authenticator interface

* PR comments
2017-11-13 19:29:08 -08:00
Akash Dwivedi c1538f29fc maxQueryTimeout property in runtime properties. (#4852)
* maxQueryTimeout property in runtime properties.

* extra line

* move withTimeoutAndMaxScatterGatherBytes method to QueryLifeCycle.

* Fix initialize method.

* remove unused import.

* doc update.

* some more details in doc about query failure..

* minor fix.

* decorating QueryRunner to set and verify context. Added by servers.

* remove whitespace.
2017-11-13 19:23:11 -06:00
Gian Merlino 9444da5038 SQL: Improved behavior when implicitly casting strings to date/time literals. (#5023)
* SQL: Improved behavior when implicitly casting strings to date/time literals.

- Handle all flavors of ISO8601 and SQL literals.
- Throw errors on other literals instead of silently transforming them to 0.

* Respect timeZone when format is null.
2017-11-10 17:43:22 +09:00
Roman Leventov 3541b7544b Prohibit and remove unused declarations in the processing module (#4930)
* Prohibit and remove unused declarations in the processing module

* Fix tests

* Fix integration tests

* Suppress unused

* Try to remove SuppressWarnings unused in VirtualColumn

* Remove reset 'false positives'

* Annotate CliCommandCreator as ExtensionPoint

* Unused import warning instead of error in IntelliJ

* Fixes

* Add comment

* Fix AzureBlob

* Fix CloudFilesBlob

* Address comments

* Add Project SDK section to INTELLIJ_SETUP.md

* Fix image
2017-11-09 09:27:27 -08:00
Gian Merlino 6c725a7e06 Fix havingSpec on complex aggregators. (#5024)
* Fix havingSpec on complex aggregators.

- Uses the technique from #4883 on DimFilterHavingSpec too.
- Also uses Transformers from #4890, necessitating a move of that and other
  related classes from druid-server to druid-processing. They probably make
  more sense there anyway.
- Adds a SQL query test.

Fixes #4957.

* Remove unused import.
2017-11-01 12:58:08 -04:00
Gian Merlino b7fc1424dd
SQL: Avoid using timeseries with "having" clauses. (#5017) 2017-10-30 17:42:54 -07:00
Jonathan Wei 3e0a6fc374 Filter unauthorized datasources in INFORMATION_SCHEMA queries (#4998)
* Filter unauthorized datasources in INFORMATION_SCHEMA queries

* PR comments
2017-10-26 12:36:47 -07:00
Gian Merlino 5fc6891404 Reduce code duplication between test ExprMacroTables. (#4979) 2017-10-18 15:57:49 -05:00
Gian Merlino 43051829f2 Regression test for #4208. (#4968) 2017-10-17 15:54:00 -05:00
Jihoon Son 8d9902831e Refactoring PrefetchableTextFilesFirehoseFactory (#4836)
* Refactoring prefetchable firehose

* Fix to read cache when prefetch is disabled

* More tests

* Cleanup codes

* Add Fetcher

* Fix test failure

* Count file size

* Fix test

* rename generic parameter

* address comments

* address comments

* reuse buffer

* move Execs to java-util

* use execs

* Fix build
2017-10-13 21:39:28 -05:00
Gian Merlino f51f346e36 SQL: Fix POWER doc, add test. (#4953) 2017-10-13 14:38:15 -07:00
Jihoon Son 675c6c00dd Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958)
* add checkstyle and intellij rule

* fix tc fail
2017-10-13 07:56:19 -07:00
Atul Mohan c07678b143 Synchronization of lookups during startup of druid processes (#4758)
* Changes for lookup synchronization

* Refactor of Lookup classes

* Minor refactors and doc update

* Change coordinator instance to be retrieved by DruidLeaderClient

* Wait before thread shutdown

* Make disablelookups flag true by default

* Update docs

* Rename flag

* Move executorservice shutdown to finally block

* Update LookupConfig

* Refactoring and doc changes

* Remove lookup config constructor

* Revert Lookupconfig constructor changes

* Add tests to LookupConfig

* Make executorservice local

* Update LRM

* Move ListeningScheduledExecutorService to ExecutorCompletionService

* Move exception to outer block

* Remove check to see future is done

* Remove unnecessary assignment

* Add logging
2017-10-12 21:22:24 -05:00
Gian Merlino 57a4038379 SQL: Fix CASE-filtered aggregations with GROUP BY. (#4943) 2017-10-12 15:40:43 -07:00
Gian Merlino b20e3038b6 SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889)
* SQL: Upgrade to Calcite 1.14.0, some refactoring of internals.

This brings benefits:
- Ability to do GROUP BY and ORDER BY with ordinals.
- Ability to support IN filters beyond 19 elements (fixes #4203).

Some refactoring of druid-sql internals:
- Builtin aggregators and operators are implemented as SqlAggregators
  and SqlOperatorConversions rather being special cases. This simplifies
  the Expressions and GroupByRules code, which were becoming complex.
- SqlAggregator implementations are no longer responsible for filtering.

Added new functions:
- Expressions: strpos.
- SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR,
  and DATE_TRUNC.

* Add missing @Override annotation.

* Adjustments for forbidden APIs.

* Adjustments for forbidden APIs.

* Disable GROUP BY alias.

* Doc reword.
2017-10-10 12:44:05 -07:00
Jihoon Son e6eabac385 Implement repalceInput and add tpch dataset (#4848) 2017-10-03 08:00:59 -07:00
Gian Merlino 1f2074c247 Bump versions in master to 0.11.1-SNAPSHOT. (#4878)
* Bump versions in master to 0.11.1-SNAPSHOT.

* Missed a few.
2017-09-28 17:09:51 -05:00
Gian Merlino fbd4cd633b SQL: Delay query translation until the end of planning. (#4846)
* SQL: Delay query translation until the end of planning.

This fixes a bug in which input rels to nested queries could get swapped
out by the optimizer, leading to incorrect nested query planning.

This also, I hope, makes the query translation code easier to understand. At
least for me, the PartialDruidQuery -> DruidQuery -> Query chain is easier
to understand than the previous-existing rule spaghetti.

* Make test more consistent.

* Fix test.
2017-09-28 11:43:20 -07:00
Himanshu f69c9280c4 remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858)
* remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form

* sanitize output of /druid/coordinator/v1/cluster endpoint
2017-09-28 10:40:59 -05:00
Roman Leventov 9c126e2aa9 Forbid MapMaker (#4845)
* Forbid MapMaker

* Shorter syntax

* Forbid Maps.newConcurrentMap()
2017-09-27 06:49:47 -07:00
Roman Leventov e267f3901b Enforce Indentation with Checkstyle (#4799) 2017-09-21 13:06:48 -07:00
Jonathan Wei c2a0e753b6 Extension points for authentication/authorization (#4271)
* Extension points for authentication/authorization

* Address some PR comments

* Authorization result caching

* Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter

* Use Set for auth caching, close outputstreams in filters

* Don't close output stream on success in sanity check filter

* Add ConfigResourceFilter to coordinator lookups

* Fix filtering authorization check for empty resource list

* HttpClient users must explicitly escalate the client

* Remove response modification from PreResponseAuthorizationCheckFilter

* Remove extraneous pom.xml

* Fix unit test

* Better lifecycle management

* Rename AuthorizationManager to Authorizer

* Fix authorization denials for empty supervisor list

* Address some PR comments

* Address more PR comments

* Small cleanup

* Add Jetty HttpClient wrapper to Authenticator

* Remove Authorizer start/stop

* Restore immutable context map in DruidConnection, UT fix

* Fix/update docs

* Add authorization checks to EventReceiverFirehose

* Fix router authorization check failure, restore PreResponseAuthorizationFilter changes

* Compile fixes

* Test fixes

* Update Authenticator/Authorizer doc comments

* Merge fixes

* PR comments

* Fix test

* Fix IT

* More PR comments

* PR comments

* SSL fix
2017-09-15 23:45:48 -07:00
Gian Merlino 2ce8123bdb Move scan-query from a contrib extension into core. (#4751)
* Move scan-query from a contrib extension into core.

Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion

This patch also adds support for virtual columns to the Scan query,
and updates Druid SQL to use Scan instead of Select.

This patch also makes some behavioral changes to handling of the __time
column. In particular, it is now is returned as "__time" rather than
"timestamp"; it is no longer included if you do not specifically ask for
it in your "columns"; and it is returned as a long rather than a string.

Users can revert time handling to the legacy extension behavior by
setting "legacy" : true in their queries, or setting the property
druid.query.scan.legacy = true. This is meant to provide a migration
path for users that were formerly using the contrib extension.

* Adjustments from review.

* Add back Select query.

* Adjust SQL docs.

* Restore SelectQuery link.
2017-09-13 09:51:24 -07:00
Gian Merlino c3a1ce6933 SQL: Fix toTimeseriesQuery and toTopNQuery. (#4780)
The former would sometimes eat limits, and the latter would sometimes
use the wrong dimension comparator.
2017-09-12 14:37:27 -07:00
Gian Merlino 4909c48b0c SQL: Full TRIM support. (#4750)
* SQL: Full TRIM support.

- Support trimming arbitrary characters
- Support BOTH, LEADING, and TRAILING

* Remove unused import.

* Fix tests, add RTRIM / LTRIM.

* Remove unused imports.

* BTRIM and docs.

* Replace for with foreach.
2017-09-12 11:49:08 -07:00
dgolitsyn 752151f6cb Add CachingCostBalancerStrategy (#4731)
* Add CachingCostBalancerStrategy; Rename ServerView.ServerCallback to ServerRemovedCallback

* Fix benchmark units

* Style, forbidden-api, review, bug fixes

* Add docs

* Address comments
2017-09-08 12:23:04 -05:00
Gian Merlino 34a03b8e6c SQL: EXPLAIN improvements. (#4733)
* SQL: EXPLAIN improvements.

- Include query JSON in explain output.
- Fix a bug where semi-joins and nested groupBys were not fully explained.
- Fix a bug where limits were not included in "select" query explanations.

* Fix compile error.

* Fix compile error.

* Fix tests.
2017-09-01 09:35:13 -07:00
Gian Merlino daf3c5f927 Add "round" option to cardinality and hyperUnique aggregators. (#4720)
* Add "round" option to cardinality and hyperUnique aggregators.

Also turn it on by default in SQL, to make math on distinct counts
work more as expected.

* Fix some compile errors.

* Fix test.

* Formatting.
2017-08-28 14:52:11 -07:00
Roman Leventov cbd1902db8 Add forbidden-apis plugin; prohibit using system time zone (#4611)
* Forbidden APIs WIP

* Remove some tests

* Restore io.druid.math.expr.Function

* Integration tests fix

* Add comments

* Fix in SimpleWorkerProvisioningStrategy

* Formatting

* Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest

* Address comments

* Fix GroupByMultiSegmentTest
2017-08-21 13:02:42 -07:00
Gian Merlino 5ff8c52f16 SQL: Fix race with metadata caching. (#4674)
If DruidSchema started too long after the BrokerServerView, its
initialization callback would never get called, and it would sit
there not knowing about any tables.

This moves the registration of the callback into the constructor,
where it belongs.
2017-08-10 18:27:10 -07:00
Gian Merlino d4ef0f6d94 Improved SQL support for floats and doubles. (#4598)
* Improved SQL support for floats and doubles.

- Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL,
  and DECIMAL.
- Use float* aggregators when appropriate.
- Add tests involving both float and double columns.
- Adjust documentation accordingly.

* CR comments.

* Fix braces.
2017-07-25 13:54:44 -07:00
Gian Merlino 5048ab3e96 Add metrics to the native queries underpinning SQL. (#4561)
* Add metrics to the native queries underpinning SQL.

This is done by factoring out the metrics and request log emitting
code from QueryResource into a new QueryLifecycle class. That class
is used by both QueryResource and the SQL DruidSchema and QueryMaker.

Also fixes a couple of bugs in QueryResource:

- RequestLogLine start time was set to `TimeUnit.NANOSECONDS.toMillis(startNs)`,
  which is incorrect since absolute nanos cannot be converted to millis.
- DruidMetrics.makeRequestMetrics was called with null `query` on
  unparseable queries, which led to spurious "Unable to log query"
  errors.

Partial fix for #4047.

* Code style

* Remove unused imports.

* Fix tests.

* Remove unused import.
2017-07-24 21:26:27 -07:00
Roman Leventov c0beb78ffd Enforce brace formatting with Checkstyle (#4564) 2017-07-21 10:26:59 -05:00
Gian Merlino 2be7068f6e Fixes and improvements to SQL metadata caching. (#4551)
* Fixes and improvements to SQL metadata caching.

Also adds support for MultipleSpecificSegmentSpec to CachingClusteredClient.

SQL changes:
- Cache metadata on a per-segment level, in addition to per-dataSource, so
  we don't need to re-query all segments whenever a single new one appears.
  This should lower the load placed on the cluster by metadata queries.
- Fix race condition in DruidSchema that can cause us to miss metadata. It was
  possible to notice new segments, then issue a query, and have that query
  not actually hit those segments, and not notice that it didn't hit those segments.
  Then, the metadata from those segments would be ignored.
- Fix assumption in DruidSchema that all segments are immutable. Now, mutable
  segments are periodically re-queried.
- Fix inappropriate re-use of SchemaPlus. Now we create one for each planning
  cycle, rather than sharing one. It caches table objects, which we want to
  avoid, since it can cause stale metadata. We do the caching in DruidSchema
  so we don't need the SchemaPlus caching.

Server changes:
- Add a TimelineCallback to TimelineServerView, for callers that want to get updates
  when the timeline has been modified.
- Change CachingClusteredClient from a QueryRunner to a QuerySegmentWalker. This
  allows it to accept queries that are segment-descriptor-based rather than
  intervals-based. In particular it will now support MultipleSpecificSegmentSpec.

* Fix DruidSchema, and unused imports.

* Remove unused import.

* Fix SqlBenchmark.
2017-07-20 10:14:15 -07:00
Slim 71e7a4c054 Adding double colums supports (#4491)
* add double columns support

* Fix numbers and expected results in UTs

* adding float aggregators

* fix IT expected test results

* fix comments

* more fixes

* fix comp

* fix test

* refactor double and float aggregator factories

* fix

* fix UTs

* fix comments

* clean unused code

* fix more comments

* undo unnecessary changes

* fix null issue

* refactor TopNColumnSelectorStrategyFactory

* fix docs

* refactor NumericTopNColumnSelectorStrategy

* fix return

* fix comments

* handle the null case in DimesionIndexer

* more null fixing

* cosmetic changes
2017-07-20 10:14:14 +03:00
Roman Leventov 60cdf94677 Add PMD and prohibit unnecessary fully qualified class names in code (#4350)
* Add PMD and prohibit unnecessary fully qualified class names in code

* Extra fixes

* Remove extra unnecessary fully-qualified names

* Remove qualifiers

* Remove qualifier
2017-07-17 22:22:29 +09:00
Gian Merlino 16817e408d SQL + Expressions = Best friends forever. (#4360)
* SQL + Expressions = Best friends forever.

- Use expressions as a projection layer for anything that can't be
  expressed using traditional Druid extractionFns. Sometimes they're
  embedded directly (like "expression" filters, builtin aggregators,
  or "expression" post-aggregators). Sometimes they're referenced
  through virtual columns (like dimensionSpecs, which can't innately
  reference functions of more than one column without the virtual
  column layer).
- Add many new functions and operators, taking advantage of the
  expression capability (see the querying/sql.md doc).
- Improve consistency of constant reduction and of casting by
  using Druid expressions for this instead of Calcite's RexExecutor.

* Fix casting bug, and other code review comments.

* Fix docs.
2017-07-07 08:48:26 -07:00
Parag Jain 6e2f78f552 TLS support (#4270) 2017-07-06 17:40:12 -07:00
Roman Leventov 9ae457f7ad Avoid using the default system Locale and printing to System.out in production code (#4409)
* Avoid usages of Default system Locale and printing to System.out or System.err in production code

* Fix Charset in DruidKerberosUtil

* Remove redundant string format in GenericIndexed

* Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format()

* Fix testSafeFormat()

* More fixes of redundant StringUtils.format() inside ISE

* Rename unimportantSafeFormat() to nonStrictFormat()
2017-06-29 14:06:19 -07:00
Roman Leventov ae900a4934 Update versions to 0.11.0-SNAPSHOT (#4483) 2017-06-28 17:05:58 -07:00
Gian Merlino 22aad08a59 ExpressionPostAggregator: Automatically finalize inputs. (#4406)
* ExpressionPostAggregator: Automatically finalize inputs.

Raw HyperLogLogCollectors and such aren't very useful. When writing
expressions like `x / y` users will expect `x` and `y` to be finalized.

* Fix un-merge.

* Code review comments.

* Remove unnecessary ImmutableMap.copyOf.
2017-06-17 13:22:47 -07:00
Jonathan Wei 3b70995bb3 Configurable row limit for JDBC frames (#4417) 2017-06-16 17:07:40 -07:00
Jonathan Wei cc815eec81 Create/close yielder in same thread for JDBC queries (#4415)
* Create/close yielder in same thread for JDBC queries

* PR comments

* More PR comments

* Add connectionId to DruidStatement executor
2017-06-16 16:50:33 -07:00
Goh Wei Xiang f68a0693f3 Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397)
* Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe.

* - Use of computeIfAbsent over putIfAbsent
- Replace Maps.newXXXMap() with normal instantiation
- Documentations on when is thread-safe required.
- Use Builders for On/OffheapIncrementalIndex

* - Optimization on computeIfAbsent
- Constant EMPTY DimensionsSpec
- Improvement on IncrementalIndexSchema.Builder
  - Remove setting of default values
  - Use var args for metrics
- Correction on On/OffheapIncrementalIndex Builders
- Combine On/OffheapIncrementalIndex Builders

* - Removing unused imports.

* - Helper method for testing with IncrementalIndex.Builder

* - Correction on javadoc.

* Style fix
2017-06-16 16:04:19 -05:00
Gian Merlino e78d8584a1 JettyQosTest, DruidAvaticaHandlerTest: Extend timeout. (#4416)
Fixes #4408, probably.
2017-06-15 18:28:50 -07:00
Gian Merlino 1f2afccdf8 Expressions: Add ExprMacros. (#4365)
* Expressions: Add ExprMacros, which have the same syntax as functions, but
can convert themselves to any kind of Expr at parse-time.

ExprMacroTable is an extension point for adding new ExprMacros. Anything
that might need to parse expressions needs an ExprMacroTable, which can
be injected through Guice.

* Address code review comments.
2017-06-08 09:32:10 -04:00
Gian Merlino 67b162a337 SQL: More forgiving Avatica server. (#4368)
* SQL: More forgiving Avatica server.

- Automatically close statements that are fully iterated or that have
  errors, to prevent dangling statements from causing clients to hit
  open statement limits.
- Empower client auto-reconnects by throwing NoSuchConnectionException
  when appropriate.
- Try to close empty connections when we hit the open connection limit,
  rather than failing the newly opened connection. Client
  auto-reconnections mean this shouldn't cause problems in practice.
- Improve concurrency of the server by making "connections" a
  concurrent map.
- Lower default connection timeout to PT5M from PT30M.

* Fix DruidStatement test.
2017-06-06 10:11:40 -07:00
Jonathan Wei d49e53e6c2 Timeout and maxScatterGatherBytes handling for queries run by Druid SQL (#4305)
* Timeout and maxScatterGatherBytes handling for queries run by Druid SQL

* Address PR comments

* Fix contexts in CalciteQueryTest

* Fix contexts in QuantileSqlAggregatorTest
2017-05-23 16:57:51 +09:00
Jonathan Wei e043bf88ec Add a ServerType for peons (#4295)
* Add a ServerType for peons

* Add toString() method, toString() test, unsupported type check

* Use ServerType enum in DruidServer and DruidServerMetadata
2017-05-22 17:24:59 -05:00
Gian Merlino 8ca7f9410e SQL: Add test for concurrent JDBC queries. (#4290) 2017-05-18 12:25:15 -07:00
Jihoon Son 5c0a7ad2f8 Make realtimes available for loading segments (#4148)
* Add ServerType

* Add realtimes to DruidCluster

* fix test fails

* Add SegmentManager

* Fix equals and hashCode of ServerHolder

* Address comments and add more tests

* Address comments
2017-05-18 10:03:39 -05:00