Commit Graph

20 Commits

Author SHA1 Message Date
Gian Merlino d4ef0f6d94 Improved SQL support for floats and doubles. (#4598)
* Improved SQL support for floats and doubles.

- Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL,
  and DECIMAL.
- Use float* aggregators when appropriate.
- Add tests involving both float and double columns.
- Adjust documentation accordingly.

* CR comments.

* Fix braces.
2017-07-25 13:54:44 -07:00
Gian Merlino 16817e408d SQL + Expressions = Best friends forever. (#4360)
* SQL + Expressions = Best friends forever.

- Use expressions as a projection layer for anything that can't be
  expressed using traditional Druid extractionFns. Sometimes they're
  embedded directly (like "expression" filters, builtin aggregators,
  or "expression" post-aggregators). Sometimes they're referenced
  through virtual columns (like dimensionSpecs, which can't innately
  reference functions of more than one column without the virtual
  column layer).
- Add many new functions and operators, taking advantage of the
  expression capability (see the querying/sql.md doc).
- Improve consistency of constant reduction and of casting by
  using Druid expressions for this instead of Calcite's RexExecutor.

* Fix casting bug, and other code review comments.

* Fix docs.
2017-07-07 08:48:26 -07:00
Jonathan Wei 3b70995bb3 Configurable row limit for JDBC frames (#4417) 2017-06-16 17:07:40 -07:00
Gian Merlino 3216134f8c SQL: Make row extractions extensible and add one for lookups. (#3991)
This is a reopening of #3989, since that PR was merged to master prematurely
and accidentally.
2017-03-13 21:56:16 -07:00
Gian Merlino 960769c583 SQL: Fix example INFORMATION_SCHEMA query. (#4017) 2017-03-06 16:07:47 -08:00
Gian Merlino af5a4cce3c SQL: Clarify approximate distinct count behavior. (#4000) 2017-03-03 13:42:30 -08:00
Gian Merlino 4a56d7d8a0 SQL: Ability to generate exact distinct count queries. (#3999) 2017-03-03 23:40:36 +05:30
Gian Merlino e63eefd7ff Revert "SQL: Make row extractions extensible and add one for lookups. (#3989)"
The PR was merged to master accidentally.

This reverts commit 23927a3c96.
2017-03-01 17:06:12 -08:00
Gian Merlino 23927a3c96 SQL: Make row extractions extensible and add one for lookups. (#3989)
* SQL: Make row extractions extensible and add one for lookups.

* Fix QuantileSqlAggregatorTest.
2017-03-01 17:03:43 -08:00
Jihoon Son 7200dce112 Atomic merge buffer acquisition for groupBys (#3939)
* Atomic merge buffer acquisition for groupBys

* documentation

* documentation

* address comments

* address comments

* fix test failure

* Addressed comments

- Add InsufficientResourcesException
- Renamed GroupByQueryBrokerResource to GroupByQueryResource

* addressed comments

* Add takeBatch() to BlockingPool
2017-02-22 14:49:37 -06:00
Gian Merlino e7d01b67b6 Move SQL configs to sql.md. (#3959)
This puts all the SQL stuff in one place. It also makes life easier by
pointing out that configs be made in either common.runtime.properties
or the broker runtime.properties.
2017-02-22 08:37:24 -08:00
Gian Merlino ca6053d045 SQL: Resolve column type conflicts in favor of newer segments. (#3930)
* SQL: Resolve column type conflicts in favor of newer segments.

Helps with schema evolution from e.g. long -> float, which is supported
on the query side.

* Take columns from highest timestamp instead of max segment id.

* Fixes and docs.
2017-02-15 17:48:49 -08:00
Gian Merlino 16ef513c7d SQL: Add context and contextual functions to planner. (#3919)
* SQL: Add context and contextual functions to planner.

Added support for context parameters specified as JDBC connection properties
or a JSON object for SQL-over-JSON-over-HTTP.

Also added features that depend on context functionality:

- Added CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP functions.
- Added support for time zones other than UTC via a "timeZone" context.
- Pass down query context to Druid queries too.

Also some bug fixes:

- Fix DATE handling, it was largely done incorrectly before.
- Fix CAST(__time TO DATE) which should do a floor-to-day.
- Fix non-equality comparisons to FLOOR(__time TO X).
- Fix maxQueryCount property.

* Pass down context to nested queries too.
2017-02-15 14:09:14 -08:00
Jihoon Son a459db68b6 Fine grained buffer management for groupby (#3863)
* Fine-grained buffer management for group by queries

* Remove maxQueryCount from GroupByRules

* Fix code style

* Merge master

* Fix compilation failure

* Address comments

* Address comments

- Revert Sequence
- Add isInitialized() to Grouper
- Initialize the grouper in RowBasedGrouperHelper.Accumulator
- Simple refactoring RowBasedGrouperHelper.Accumulator
- Add tests for checking the number of used merge buffers
- Improve docs

* Revert unnecessary changes

* change to visible to testing

* fix misspelling
2017-02-14 12:55:54 -08:00
Jonathan Wei ca2b04f0fd Add long/float ColumnSelectorStrategy implementations (#3838)
* Add long/float ColumnSelectorStrategy implementations

* Address PR comments

* Add String strategy with internal dictionary to V2 groupby, remove dict from numeric wrapping selectors, more tests

* PR comments

* Use BaseSingleValueDimensionSelector for long/float wrapping

* remove unused import

* Address PR comments

* PR comments

* PR comments

* More PR comments

* Fix failing calcite histogram subquery tests

* ScanQuery test and comment about isInputRaw

* Add outputType to extractionDimensionSpec, tweak SQL tests

* Fix limit spec optimization for numerics

* Add cardinality sanity checks to TopN

* Fix import from merge

* Add tests for filtered dimension spec outputType

* Address PR comments

* Allow filtered dimspecs on numerics

* More comments
2017-02-08 20:39:29 -08:00
Gian Merlino ac84a3e011 SQL: Add resolution parameter, fix filtering bug with APPROX_QUANTILE (#3868)
* SQL: Add resolution parameter to quantile agg, rename to APPROX_QUANTILE.

* Fix bug with re-use of filtered approximate histogram aggregators.

Also add APPROX_QUANTILE tests for filtering and running on complex columns.
Includes some slight refactoring to allow tests to make DruidTables that
include complex columns.

* Remove unused import
2017-01-25 18:39:26 -08:00
Gian Merlino d51f5e058d SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852)
* SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators.

Switched from CalciteConnection to Planner, bringing benefits:

- CalciteConnection's JDBC interface no longer sits between the SQL server
  (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use
  Druid Sequence objects directly, reducing overhead in the query return path.

- Implemented our own Planner-based Avatica Meta, letting us control
  connection timeouts and connection / statement limits. The previous
  CalciteConnection-based implementation didn't have any limits or timeouts.

- The Planner interface lets us override the operator table, opening up
  SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT
  in core, and a QUANTILE aggregator in the druid-histogram extension.

Also:

- Added INFORMATION_SCHEMA metadata schema.

- Added tests for Unicode literals and escapes.

* Verify statement is actually open before closing it.

* More detailed INFORMATION_SCHEMA docs.
2017-01-19 16:32:20 -08:00
Gian Merlino e86859b228 SQL support for nested groupBys. (#3806)
* SQL support for nested groupBys.

Allows, for example, doing exact count distinct by writing:

  SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo)

Contrast with approximate count distinct, which is:

  SELECT COUNT(DISTINCT col) FROM druid.foo

* Add deeply-nested groupBy docs, tests, and maxQueryCount config.

* Extract magic constants into statics.

* Rework rules to put preconditions in the "matches" method.
2017-01-11 18:32:53 -08:00
Gian Merlino dd63f54325 Built-in SQL. (#3682) 2016-12-16 17:15:59 -08:00
fjy 42ac41d55e add more docs based on proposed wishlist 2015-07-02 17:46:08 -07:00