Commit Graph

380 Commits

Author SHA1 Message Date
Gian Merlino c44452f0c1 Tidy up lifecycle, query, and ingestion logging. (#8889)
* Tidy up lifecycle, query, and ingestion logging.

The goal of this patch is to improve the clarity and usefulness of
Druid's logging for cluster operators. For more information, see
https://twitter.com/cowtowncoder/status/1195469299814555648.

Concretely, this patch does the following:

- Changes a lot of INFO logs to DEBUG, and DEBUG to TRACE, with the
  goal of reducing redundancy and improving clarity by avoiding
  showing rarely-useful log messages. This includes most "starting"
  and "stopping" messages, and most messages related to individual
  columns.
- Adds new log4j2 templates that show operators how to enable DEBUG
  logging for certain important packages.
- Eliminates stack traces for query errors, unless log level is DEBUG
  or more verbose. This is useful because query errors often indicate
  user error rather than system error, but dumping stack traces often
  gave operators the impression that there was a system failure.
- Adds task id to Appenderator, AppenderatorDriver thread names. In
  the default log4j2 configuration, this will put them in log lines
  as well. It's very useful if a user is using the Indexer, where
  multiple tasks run in the same JVM.
- More consistent terminology when it comes to "sequences" (sets of
  segments that are handed-off together by Kafka ingestion) and
  "offsets" (cursors in partitions). These terms had been confused in
  some log messages due to the fact that Kinesis calls offsets
  "sequence numbers".
- Replaces some ugly toString calls with either the JSONification or
  something more operator-accessible (like a URL or segment identifier,
  instead of JSON object representing the same).

* Adjustments.

* Adjust integration test.
2019-11-19 13:57:58 -08:00
Clint Wylie cc54b2a9df support for array expressions in TransformSpec with ExpressionTransform (#8744)
* transformSpec + array expressions

changes:
* added array expression support to transformSpec
* removed ParseSpec.verify since its only use afaict was preventing transform expr that did not replace their input from functioning
* hijacked index task test to test changes

* remove docs about being unsupported

* re-arrange test assert

* unused imports

* imports

* fix tests

* preserve types

* suppress warning, fixes, add test

* formatting

* cleanup

* better list to array type conversion and tests

* fix oops
2019-11-13 11:04:37 -08:00
Gian Merlino 0e8c3f74d0 SQL: EARLIEST, LATEST aggregators. (#8815)
* SQL: EARLIEST, LATEST aggregators.

I chose these names instead of FIRST, LAST because those are already
reserved functions in Calcite that mean something different. I think
these are also better names anyway.

* Finalify.

* SQL updates.

* Adjust aggregator calls.

* Validations, test updates.

* Review docs.
2019-11-08 16:29:25 -08:00
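As a rough illustration of the semantics named in the commit above (not the patch's implementation), EARLIEST can be read as "the value from the row with the smallest timestamp" and LATEST as the value from the row with the largest. The class `EarliestLatestSketch` and all of its members are hypothetical, and the handling of ties and empty input is an assumption.

```java
import java.util.List;

// Hypothetical sketch; these names are illustrative and are not Druid's aggregator API.
final class EarliestLatestSketch
{
  static final class TimedValue
  {
    final long timestamp;
    final double value;

    TimedValue(long timestamp, double value)
    {
      this.timestamp = timestamp;
      this.value = value;
    }
  }

  // Returns the value whose timestamp is smallest; on ties the first row seen wins (an assumption).
  static double earliest(List<TimedValue> rows)
  {
    if (rows.isEmpty()) {
      throw new IllegalArgumentException("no rows");
    }
    TimedValue best = rows.get(0);
    for (TimedValue row : rows) {
      if (row.timestamp < best.timestamp) {
        best = row;
      }
    }
    return best.value;
  }

  // Returns the value whose timestamp is largest; on ties the first row seen wins (an assumption).
  static double latest(List<TimedValue> rows)
  {
    if (rows.isEmpty()) {
      throw new IllegalArgumentException("no rows");
    }
    TimedValue best = rows.get(0);
    for (TimedValue row : rows) {
      if (row.timestamp > best.timestamp) {
        best = row;
      }
    }
    return best.value;
  }
}
```

Unlike MIN and MAX, the result depends on row time rather than on the magnitude of the value itself.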
Roman Leventov 5c0fc0a13a Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564)
* IndexerSQLMetadataStorageCoordinator.getTimelineForIntervalsWithHandle() don't fetch abutting intervals; simplify getUsedSegmentsForIntervals()

* Add VersionedIntervalTimeline.findNonOvershadowedObjectsInInterval() method; Propagate the decision about whether only visible segments or visible and overshadowed segments should be returned from IndexerMetadataStorageCoordinator's methods to the user logic; Rename SegmentListUsedAction to RetrieveUsedSegmentsAction, SegmentListUnusedAction to RetrieveUnusedSegmentsAction, and UsedSegmentLister to UsedSegmentsRetriever

* Fix tests

* More fixes

* Add javadoc notes about returning Collection instead of Set. Add JacksonUtils.readValue() to reduce boilerplate code

* Fix KinesisIndexTaskTest, factor out common parts from KinesisIndexTaskTest and KafkaIndexTaskTest into SeekableStreamIndexTaskTestBase

* More test fixes

* More test fixes

* Add a comment to VersionedIntervalTimelineTestBase

* Fix tests

* Set DataSegment.size(0) in more tests

* Specify DataSegment.size(0) in more places in tests

* Fix more tests

* Fix DruidSchemaTest

* Set DataSegment's size in more tests and benchmarks

* Fix HdfsDataSegmentPusherTest

* Doc changes addressing comments

* Extended doc for visibility

* Typo

* Typo 2

* Address comment
2019-11-06 11:07:04 -08:00
Clint Wylie 3ff5e02237 remove select query (#8739)
* remove select query

* thanks teamcity

* oops

* oops

* add back a SelectQuery class that throws RuntimeExceptions linking to docs

* adjust text

* update docs per review

* deprecated
2019-10-30 19:29:56 -07:00
Surekha 98f59ddd7e Add `sys.supervisors` table to system tables (#8547)
* Add supervisors table to SystemSchema

* Add docs

* fix checkstyle

* fix test

* fix CI

* Add comments

* Fix javadoc teamcity error

* comments

* fix links in docs

* fix links

* rename fullStatus query param to system and remove it from docs
2019-10-18 15:16:42 -07:00
Jonathan Wei d88075237a Add initial SQL support for non-expression sketch postaggs (#8487)
* Add initial SQL support for non-expression sketch postaggs

* Checkstyle, spotbugs

* checkstyle

* imports

* Update SQL docs

* Checkstyle

* Fix theta sketch operator docs

* PR comments

* Checkstyle fixes

* Add missing entries for HLL sketch module

* PR comments, add round param to HLL estimate operator, fix optional HLL param
2019-10-18 14:59:44 -07:00
Jihoon Son 4046c86d62 Stateful auto compaction (#8573)
* Stateful auto compaction

* javadoc

* add removed test back

* fix test

* adding indexSpec to compactionState

* fix build

* add lastCompactionState

* address comments

* extract CompactionState

* fix doc

* fix build and test

* Add a task context to store compaction state; add javadoc

* fix it test
2019-10-15 22:57:42 -07:00
Chi Cao Minh 5f61374cb3 Fix dependency analyze warnings (#8230)
* Fix dependency analyze warnings

Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports and
updated druid-forbidden-apis to prevent regressions.

* Address review comments

* Adjust scope for org.glassfish.jaxb:jaxb-runtime

* Fix dependencies for hdfs-storage

* Consolidate netty4 versions
2019-09-09 14:37:21 -07:00
Clint Wylie c73a489335 bump master version to 0.17.0-incubating-SNAPSHOT (#8421) 2019-08-28 01:58:36 -07:00
Himanshu 4d87a19547 Logging emitter to publish query and other metric events as valid json objects (#8359)
* LoggingEmitter: print event as json

* use DefaultRequestLogEventBuilderFactory in emitting request logger by default

* print context in query metric as json

* removed unused jsonMapper from DefaultQueryMetrics

* add comment

* remove change to DefaultRequestLogEventBuilderFactory.java
2019-08-27 15:00:23 -07:00
Jihoon Son e5ef5ddafa Fix the shuffle with TLS enabled for parallel indexing; add an integration test; improve unit tests (#8350)
* Fix shuffle with tls enabled; add an integration test; improve unit tests

* remove debug log

* fix tests

* unused import

* add javadoc

* rename to getContent
2019-08-26 19:27:41 -07:00
SandishKumarHN 33f0753a70 Add Checkstyle for constant name static final (#8060)
* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* checkstyle for constant field name

* merging with upstream

* review-1

* unknown changes

* unknown changes

* review-2

* merging with master

* review-2 1 changes

* review changes-2 2

* bug fix
2019-08-23 13:13:54 +03:00
Surekha cf2a2dd917 Add group_id to the sys.tasks table (#8304)
* Add group_id to overlord tasks API and sys.tasks table

* adjust test

* modify docs

* Make groupId nullable

* fix integration test

* fix toString

* Remove groupId from TaskInfo

* Modify docs and tests

* modify TaskMonitorTest
2019-08-22 15:28:23 -07:00
Clint Wylie 1054d85171 add mechanism to control filter optimization in historical query processing (#8209)
* add support for mechanism to control filter optimization in historical query processing

* oops

* adjust

* woo

* javadoc

* review comments

* fix

* default

* oops

* oof

* this will fix it

* more nullable, refactor DimFilter.getRequiredColumns to use Set, formatting

* extract class DimFilterToStringBuilder with common code from custom DimFilter toString implementations

* adjust variable naming

* missing nullable

* more nullable

* fix javadocs

* nullable

* address review comments

* javadocs, precondition

* nullable

* rename method to be consistent

* review comments

* remove tuning from ColumnComparisonFilter/ColumnComparisonDimFilter
2019-08-09 16:36:18 -07:00
Jihoon Son 8fa114c349 Fix bugs in overshadowableManager and add unit tests (#8222)
* Fix bugs in overshadowableManager and add unit tests

* Fix SegmentManager

* add segment manager test

* Address comments

* Address comments
2019-08-07 15:51:21 -05:00
Clint Wylie e7c6deac76 optimize single input column multi-value expressions (#8047)
* optimize single input column multi-value expressions

* javadocs

* merge fixup

* vectorization fixup

* more fixes

* more docs

* more links

* empty

* javadocs are hard

* suppress javadoc refs issue

* fix it
2019-08-02 13:21:25 -07:00
Chi Cao Minh 4bd3bad8ba Add IPv4 SQL functions (#8223)
* Add IPv4 SQL functions

New SQL functions for filtering IPv4 addresses:
- IPV4_MATCH: Check if IP address belongs to a subnet
- IPV4_PARSE: Convert string IP address to integer
- IPV4_STRINGIFY: Convert integer IP address to string

These are the SQL analogs of the druid expressions with the same name.
Filtering is more efficient when operating on IP addresses as integers
instead of strings.

* Refactor operator conversions into named constants
2019-08-01 21:29:58 -07:00
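The efficiency point in the commit above comes from treating an IPv4 address as a 32-bit integer, so a subnet check becomes a mask-and-compare. The following is a minimal sketch of that idea only; `Ipv4Sketch` and its methods are hypothetical helpers and are not Druid's implementation of IPV4_PARSE, IPV4_STRINGIFY, or IPV4_MATCH.

```java
// Hypothetical helper; illustrates the technique and is not taken from the patch.
final class Ipv4Sketch
{
  // Converts dotted-quad text such as "192.168.0.1" to its unsigned 32-bit value, held in a long.
  static long parse(String address)
  {
    String[] parts = address.split("\\.", -1);
    if (parts.length != 4) {
      throw new IllegalArgumentException("Not an IPv4 address: " + address);
    }
    long value = 0;
    for (String part : parts) {
      int octet = Integer.parseInt(part);
      if (octet < 0 || octet > 255) {
        throw new IllegalArgumentException("Not an IPv4 address: " + address);
      }
      value = (value << 8) | octet;
    }
    return value;
  }

  // Converts the integer form back to dotted-quad text.
  static String stringify(long value)
  {
    return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
           + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
  }

  // True if the address falls inside the CIDR block, e.g. "10.0.0.0/8".
  static boolean matches(long address, String cidr)
  {
    int slash = cidr.indexOf('/');
    if (slash < 0) {
      throw new IllegalArgumentException("Not a CIDR block: " + cidr);
    }
    long network = parse(cidr.substring(0, slash));
    int prefixLength = Integer.parseInt(cidr.substring(slash + 1));
    if (prefixLength < 0 || prefixLength > 32) {
      throw new IllegalArgumentException("Not a CIDR block: " + cidr);
    }
    // Keep only the top prefixLength bits of the 32-bit address space.
    long mask = prefixLength == 0 ? 0L : (0xFFFFFFFFL << (32 - prefixLength)) & 0xFFFFFFFFL;
    return (address & mask) == (network & mask);
  }
}
```

Once addresses are stored as integers, a subnet filter needs one bitwise AND and one comparison per row, with no string parsing at query time.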
Surekha f0ecdfee30 Fix `is_realtime` column behavior in sys.segments table (#8154)
* Fix is_realtime flag

* make variable final

* minor changes

* Modify is_realtime behavior based on review comment

* Fix UT
2019-07-31 22:26:49 -06:00
Gian Merlino 77297f4e6f GroupBy array-based result rows. (#8196)
* GroupBy array-based result rows.

Fixes #8118; see that proposal for details.

Other than the GroupBy changes, the main other "interesting" classes are:

- ResultRow: The array-based result type.
- BaseQuery: T is no longer required to be Comparable.
- QueryToolChest: Adds "decorateObjectMapper" to enable query-aware serialization
  and deserialization of result rows (necessary due to their positional nature).
- QueryResource: Uses the new decoration functionality.
- DirectDruidClient: Also uses the new decoration functionality.
- QueryMaker (in Druid SQL): Modifications to read ResultRows.

These classes weren't changed, but got some new javadocs:

- BySegmentQueryRunner
- FinalizeResultsQueryRunner
- Query

* Adjustments for TC stuff.
2019-07-31 16:15:12 -07:00
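The remark above about result rows being positional can be illustrated with a small sketch: an array-based row carries no column names, so a reader needs the query's column order to interpret it. `PositionalRowSketch` is a hypothetical class written for this illustration; it is not the `ResultRow` class from the patch, and its methods are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the ResultRow API from the patch.
final class PositionalRowSketch
{
  // Column order implied by the query; without it a row is just an array of values.
  private final List<String> columns;

  PositionalRowSketch(List<String> columns)
  {
    this.columns = new ArrayList<>(columns);
  }

  // Looks a value up by name by translating the name to its position.
  Object get(Object[] row, String column)
  {
    int position = columns.indexOf(column);
    if (position < 0) {
      throw new IllegalArgumentException("Unknown column: " + column);
    }
    return row[position];
  }

  // Rebuilds a map-shaped row, which is what a name-keyed consumer would expect.
  Map<String, Object> toMap(Object[] row)
  {
    Map<String, Object> result = new LinkedHashMap<>();
    for (int i = 0; i < columns.size(); i++) {
      result.put(columns.get(i), row[i]);
    }
    return result;
  }
}
```

This is why serialization has to be query-aware: the same array means different things under different column orders.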
Jonathan Wei 640b7afc1c Add CliIndexer process type and initial task runner implementation (#8107)
* Add CliIndexer process type and initial task runner implementation

* Fix HttpRemoteTaskRunnerTest

* Remove batch sanity check on PeonAppenderatorsManager

* Fix parallel index tests

* PR comments

* Adjust Jersey resource logging

* Additional cleanup

* Fix SystemSchemaTest

* Add comment to LocalDataSegmentPusherTest absolute path test

* More PR comments

* Use Server annotated with RemoteChatHandler

* More PR comments

* Checkstyle

* PR comments

* Add task shutdown to stopGracefully

* Small cleanup

* Compile fix

* Address PR comments

* Adjust TaskReportFileWriter and fix nits

* Remove unnecessary closer

* More PR comments

* Minor adjustments

* PR comments

* ThreadingTaskRunner: cancel task run future not shutdownFuture and remove thread from workitem
2019-07-29 17:06:33 -07:00
Chi Cao Minh ab71a2e1e4 Revert "Fix dependency analyze warnings (#8128)" (#8189)
This reverts commit 5dd0d8e873.
2019-07-29 11:42:16 -07:00
Chi Cao Minh 5dd0d8e873 Fix dependency analyze warnings (#8128)
* Fix dependency analyze warnings

Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports.

* Fix licenses and dependencies

* Fix licenses and dependencies again

* Fix integration test dependency

* Address review comments

* Fix unit test dependencies

* Fix integration test dependency

* Fix integration test dependency again

* Fix integration test dependency third time

* Fix integration test dependency fourth time

* Fix compile error

* Fix assert package
2019-07-26 10:49:03 -07:00
Jihoon Son db14946207 Add support for minor compaction with segment locking (#7547)
* Segment locking

* Allow both timeChunk and segment lock in the same group

* fix it test

* Fix adding same chunk to atomicUpdateGroup

* resolving todos

* Fix segments to lock

* fix segments to lock

* fix kill task

* resolving todos

* resolving todos

* fix teamcity

* remove unused class

* fix single map

* resolving todos

* fix build

* fix SQLMetadataSegmentManager

* fix findInputSegments

* adding more tests

* fixing task lock checks

* add SegmentTransactionalOverwriteAction

* changing publisher

* fixing something

* fix for perfect rollup

* fix test

* adjust package-lock.json

* fix test

* fix style

* adding javadocs

* remove unused classes

* add more javadocs

* unused import

* fix test

* fix test

* Support forceTimeChunk context and force timeChunk lock for parallel index task if intervals are missing

* fix travis

* fix travis

* unused import

* spotbug

* revert getMaxVersion

* address comments

* fix tc

* add missing error handling

* fix backward compatibility

* unused import

* Fix perf of versionedIntervalTimeline

* fix timeline

* fix tc

* remove remaining todos

* add comment for parallel index

* fix javadoc and typos

* typo

* address comments
2019-07-24 17:35:46 -07:00
Sashidhar Thallam ea4bad7836 Druid SQL EXTRACT time function - adding support for additional Time Units (#8068)
* 1. Added TimestampExtractExprMacro.Unit for MILLISECOND 2. expr eval for MILLISECOND 3. Added a test case to test extracting millisecond from expression. #7935

* 1. Adding DATASOURCE4 in tests. 2. Adding test TimeExtractWithMilliseconds

* Fixing testInformationSchemaTables test

* Fixing failing tests in DruidAvaticaHandlerTest

* Adding cannotVectorize() call before the test

* Extract time function - Adding support for MICROSECOND, ISODOW, ISOYEAR and CENTURY time units, documentation changes.

* Adding MILLISECOND in test case

* Adding support DECADE and MILLENNIUM, updating test case and documentation

* Fixing expression eval for DECADE and MILLENNIUM
2019-07-19 20:38:32 -07:00
Samarth Jain ceb3a891bb Fix druid sql group by queries returning complex aggregation type (#8099)
* Fix druid sql group by queries returning complex aggregation type

* Remove unnecessary check
2019-07-19 13:52:14 -07:00
Clint Wylie 03e55d30eb add CachingClusteredClient benchmark, refactor some stuff (#8089)
* add CachingClusteredClient benchmark, refactor some stuff

* revert WeightedServerSelectorStrategy to ConnectionCountServerSelectorStrategy and remove getWeight since felt artificial, default mergeResults in toolchest implementation for topn, search, select

* adjust javadoc

* adjustments

* oops

* use it

* use BinaryOperator, remove CombiningFunction, use Comparator instead of Ordering, other review adjustments

* rename createComparator to createResultComparator, fix typo, firstNonNull nullable parameters
2019-07-18 13:16:28 -07:00
Roman Leventov ceb969903f Refactor SQLMetadataSegmentManager; Change contract of REST met… (#7653)
* Refactor SQLMetadataSegmentManager; Change contract of REST methods in DataSourcesResource

* Style fixes

* Unused imports

* Fix tests

* Fix style

* Comments

* Comment fix

* Remove unresolvable Javadoc references; address comments

* Add comments to ImmutableDruidDataSource

* Merge with master

* Fix bad web-console merge

* Fixes in api-reference.md

* Rename in DruidCoordinatorRuntimeParams

* Fix compilation

* Residual changes
2019-07-17 17:18:48 +03:00
Gian Merlino ffa25b7832 Query vectorization. (#6794)
* Benchmarks: New SqlBenchmark, add caching & vectorization to some others.

- Introduce a new SqlBenchmark geared towards benchmarking a wide
  variety of SQL queries. Rename the old SqlBenchmark to
  SqlVsNativeBenchmark.
- Add (optional) caching to SegmentGenerator to enable easier
  benchmarking of larger segments.
- Add vectorization to FilteredAggregatorBenchmark and GroupByBenchmark.

* Query vectorization.

This patch includes vectorized timeseries and groupBy engines, as well
as some analogs of your favorite Druid classes:

- VectorCursor is like Cursor. (It comes from StorageAdapter.makeVectorCursor.)
- VectorColumnSelectorFactory is like ColumnSelectorFactory, and it has
  methods to create analogs of the column selectors you know and love.
- VectorOffset and ReadableVectorOffset are like Offset and ReadableOffset.
- VectorAggregator is like BufferAggregator.
- VectorValueMatcher is like ValueMatcher.

There are some noticeable differences between vectorized and regular
execution:

- Unlike regular cursors, vector cursors do not understand time
  granularity. They expect query engines to handle this on their own,
  which a new VectorCursorGranularizer class helps with. This is to
  avoid too much batch-splitting and to respect the fact that vector
  selectors are somewhat more heavyweight than regular selectors.
- Unlike FilteredOffset, FilteredVectorOffset does not leverage indexes
  for filters that might partially support them (like an OR of one
  filter that supports indexing and another that doesn't). I'm not sure
  that this behavior is desirable anyway (it is potentially too eager)
  but, at any rate, it'd be better to harmonize it between the two
  classes. Potentially they should both do some different thing that
  is smarter than what either of them is doing right now.
- When vector cursors are created by QueryableIndexCursorSequenceBuilder,
  they use a morphing binary-then-linear search to find their start and
  end rows, rather than linear search.

Limitations in this patch are:

- Only timeseries and groupBy have vectorized engines.
- GroupBy doesn't handle multi-value dimensions yet.
- Vector cursors cannot handle virtual columns or descending order.
- Only some filters have vectorized matchers: "selector", "bound", "in",
  "like", "regex", "search", "and", "or", and "not".
- Only some aggregators have vectorized implementations: "count",
  "doubleSum", "floatSum", "longSum", "hyperUnique", and "filtered".
- Dimension specs other than "default" don't work yet (no extraction
  functions or filtered dimension specs).

Currently, the testing strategy includes adding vectorization-enabled
tests to TimeseriesQueryRunnerTest, GroupByQueryRunnerTest,
GroupByTimeseriesQueryRunnerTest, CalciteQueryTest, and all of the
filtering tests that extend BaseFilterTest. In all of those classes,
there are some test cases that don't support vectorization. They are
marked by special function calls like "cannotVectorize" or "skipVectorize"
that tell the test harness to either expect an exception or to skip the
test case.

Testing should be expanded in the future -- a project in and of itself.

Related to #3011.

* WIP

* Adjustments for unused things.

* Adjust javadocs.

* DimensionDictionarySelector adjustments.

* Add "clone" to BatchIteratorAdapter.

* ValueMatcher javadocs.

* Fix benchmark.

* Fixups post-merge.

* Expect exception on testGroupByWithStringVirtualColumn for IncrementalIndex.

* BloomDimFilterSqlTest: Tag two non-vectorizable tests.

* Minor adjustments.

* Update surefire, bump up Xmx in Travis.

* Some more adjustments.

* Javadoc adjustments

* AggregatorAdapters adjustments.

* Additional comments.

* Remove switching search.

* Only missiles.
2019-07-12 12:54:07 -07:00
Gian Merlino 613f09b45a SQL: Add TIME_CEIL function. (#8027)
Also simplify conversions for CEIL, FLOOR, and TIME_FLOOR by allowing them to
share more code.
2019-07-04 15:40:03 -07:00
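One plausible way for a ceiling to share code with a floor is sketched here for fixed-length periods in epoch milliseconds. This is an assumption about the general technique, not a description of what the patch does; `TimeCeilSketch` is a hypothetical class, and calendar-aware periods and time zones are ignored.

```java
// Hypothetical sketch; not the conversion code from the patch.
final class TimeCeilSketch
{
  // Rounds down to the start of the period containing the timestamp.
  // Math.floorDiv keeps the result correct for timestamps before the epoch.
  static long floor(long timestampMillis, long periodMillis)
  {
    return Math.floorDiv(timestampMillis, periodMillis) * periodMillis;
  }

  // Rounds up to the next period boundary, leaving exact boundaries unchanged.
  static long ceil(long timestampMillis, long periodMillis)
  {
    long floored = floor(timestampMillis, periodMillis);
    return floored == timestampMillis ? timestampMillis : floored + periodMillis;
  }
}
```

The boundary case matters: a timestamp already on a period boundary must map to itself, not to the following period.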
Clint Wylie e6ba258197 multi-value string expression transformation fix (#8019)
* multi-value string expression transformation fix

* fixes

* more docs and test

* revert unintended doc change

* formatting

* change tostring to print binding identifier

* review fixup

* oops
2019-07-03 23:03:47 -07:00
Clint Wylie c556d44a19 more sql support for expression array functions (#7974)
* more sql support for expression array functions

* prepend/slice

* doc fixes

* fix imports

* fix tests

* add null numeric expr for proper conversions between ExprEval and Expr and back to ExprEval

* re-arrange

* imports :(

* add append/prepend test
2019-07-02 21:39:26 -07:00
Clint Wylie 93b738bbfa expression language array constructor and sql multi-value string filtering support (#7973)
* expr array constructor and sql multi-value string support

* doc fix

* checkstyle

* change from feedback
2019-07-01 15:14:50 -07:00
Xue Yu 2831944056 support NVL sql function (#7965)
* sql nvl

* add nvl in sql doc
2019-06-30 13:14:30 -07:00
Clint Wylie 151edeec3c expression virtual column selector fix for expressions which produce array types (#7958)
* fix bug in multi-value string expression column selector

* more test

* imports!!

* fixes
2019-06-26 16:57:13 -07:00
Fokko Driesprong 82b248cc17 Spotbugs: Enable MS_SHOULD_BE_FINAL (#7946) 2019-06-23 15:42:18 -07:00
Clint Wylie 494b8ebe56 multi-value string column support for expressions (#7588)
* array support for expression language for multi-value string columns

* fix tests?

* fixes

* more tests

* fixes

* cleanup

* more better, more test

* ignore inspection

* license

* license fix

* inspection

* remove dumb import

* more better

* some comments

* add expr rewrite for arrayfn args for more magic, tests

* test stuff

* more tests

* fix test

* fix test

* castfunc can deal with arrays

* needs more empty array

* more tests, make cast to long array more forgiving

* refactor

* simplify ExprMacro Expr implementations with base classes in core

* oops

* more test

* use Shuttle for Parser.flatten, javadoc, cleanup

* fixes and more tests

* unused import

* fixes

* javadocs, cleanup, refactors

* fix imports

* more javadoc

* more javadoc

* more

* more javadocs, nonnullbydefault, minor refactor

* markdown fix

* adjustments

* more doc

* move initial filter out

* docs

* map empty arg lambda, apply function argument validation

* check function args at parse time instead of eval time

* more immutable

* more more immutable

* clarify grammar

* fix docs

* empty array is string test, we need a way to make arrays better maybe in the future, or define empty arrays as other types..
2019-06-19 13:57:37 -07:00
SandishKumarHN 01881e3a98 Use only com.google.errorprone.annotations.concurrent.GuardedBy, not javax.annotations.concurrent.GuardedBy (#7889) 2019-06-17 15:58:51 +02:00
Sashidhar Thallam 3bee6adcf7 Use map.putIfAbsent() or map.computeIfAbsent() as appropriate instead of containsKey() + put() (#7764)
* https://github.com/apache/incubator-druid/issues/7316 Use Map.putIfAbsent() instead of containsKey() + put()

* fixing indentation

* Using map.computeIfAbsent() instead of map.putIfAbsent() where appropriate

* fixing checkstyle

* Changing the recommendation text

* Reverting auto changes made by IDE

* Implementing recommendation: A ConcurrentHashMap on which computeIfAbsent() is called should be assigned into variables of ConcurrentHashMap type, not ConcurrentMap

* Removing unused import
2019-06-14 17:59:36 +02:00
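The idiom this change standardizes on can be sketched as follows. This is an illustration and not code from the patch; `MapIdiomSketch`, `register`, and `addToGroup` are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical example class; not part of Druid.
final class MapIdiomSketch
{
  // Declared as ConcurrentHashMap rather than ConcurrentMap, following the
  // recommendation quoted in the commit message above.
  private final ConcurrentHashMap<String, String> owners = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<String, List<String>> groups = new ConcurrentHashMap<>();

  // Before: if (!owners.containsKey(key)) { owners.put(key, owner); }
  // After: one call that returns the existing value, or null if the key was absent.
  String register(String key, String owner)
  {
    return owners.putIfAbsent(key, owner);
  }

  // computeIfAbsent fits better when the value is a container or is costly to build:
  // the mapping function only runs when the key is missing.
  List<String> addToGroup(String group, String member)
  {
    List<String> members = groups.computeIfAbsent(group, k -> new CopyOnWriteArrayList<>());
    members.add(member);
    return members;
  }
}
```

Besides being shorter, the single-call forms avoid the check-then-act gap of `containsKey()` followed by `put()`. The preference for the `ConcurrentHashMap` type is presumably because its `computeIfAbsent` applies the mapping function at most once per key, whereas the default `ConcurrentMap` implementation may call it more than once under contention.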
Surekha ea752ef562 Optimize overshadowed segments computation (#7595)
* Move the overshadowed segment computation to SQLMetadataSegmentManager's poll

* rename method in MetadataSegmentManager

* Fix tests

* PR comments

* PR comments

* PR comments

* fix indentation

* fix tests

* fix test

* add test for SegmentWithOvershadowedStatus serde format

* PR comments

* PR comments

* fix test

* remove snapshot updates outside poll

* PR comments

* PR comments

* PR comments

* removed unused import
2019-06-07 19:15:54 +02:00
Clint Wylie 12a1ecfc2b allow sql lookup function to take advantage of injective lookups (#7655) 2019-06-06 14:36:10 -07:00
Xue Yu d482da6e9b fix timestamp ceil lower bound bug (#7823) 2019-06-04 01:16:31 -07:00
Eyal Yurman 69e9b8a464 Enables SQL by default. (#7808) 2019-05-31 20:53:42 -07:00
Jihoon Son 7abfbb066a Bump up snapshot version to 0.16.0 (#7802) 2019-05-30 17:17:33 -07:00
Gian Merlino 58a571ccda SQL: Use SegmentId instead of DataSegment as set/map keys. (#7796)
Recently we've been talking about using SegmentIds as map keys rather than
DataSegments, because its sense of equality is more well-defined. This is
a refactor that does this in the druid-sql module, which mostly involves
DruidSchema and some related classes. It should have no user-visible effects.
2019-05-30 12:58:36 -07:00
Gian Merlino 8649b8ab4c SQL: Allow select-sort-project query shapes. (#7769)
* SQL: Allow select-sort-project query shapes.

Fixes #7768.

Design changes:

- In PartialDruidQuery, allow projection after select + sort by removing
  the SELECT_SORT query stage and instead allowing the SORT and
  SORT_PROJECT stages to apply either after aggregation or after a plain
  non-aggregating select. This is different from prior behavior, where
  SORT and SORT_PROJECT were only considered valid after aggregation
  stages. This logic change is in the "canAccept" method.
- In DruidQuery, represent either kind of sorting with a single "Sorting"
  class (instead of DefaultLimitSpec). The Sorting class is still
  convertible into a DefaultLimitSpec, but is also convertible into the
  sorting parameters accepted by a Scan query.
- In DruidQuery, represent post-select and post-sorting projections with
  a single "Projection" class. This obsoletes the SortProject and
  SelectProjection classes, and simplifies the DruidQuery by allowing us
  to move virtual-column and post-aggregator-creation logic into the
  new Projection class.
- Split "DruidQuerySignature" into RowSignature and VirtualColumnRegistry.
  This effectively means that instead of having mutable and immutable
  versions of DruidQuerySignature, we instead have RowSignature (always
  immutable) and VirtualColumnRegistry (always mutable, but sometimes
  null). This change wasn't required, but IMO this makes the logic
  involving them easier to follow, and makes it more clear when the
  virtual column registry is active and when it's not.

Other changes:

- ConvertBoundsToSelectors now just accepts a RowSignature, but we
  use the VirtualColumnRegistry.getFullRowSignature() method to get
  a signature that includes all columns, and therefore allows us to
  simplify the logic (no need to special-case virtual columns).
- Add `__time` to the Scan column list if the query is ordering by time.

* Remove unused import.
2019-05-30 12:56:29 -07:00
Roman Leventov 782863ed0f Fix some problems reported by PVS-Studio (#7738)
* Fix some problems reported by PVS-Studio

* Address comments
2019-05-29 11:20:45 -07:00
Surekha 1fe0de1c96 Fix currSize attribute of historical server type (#7706) 2019-05-21 11:55:58 -07:00
Gian Merlino cbbce955de SQL: Allow NULLs in place of optional arguments in many functions. (#7709)
* SQL: Allow NULLs in place of optional arguments in many functions.

Also adjust SQL docs to describe how to make time literals using
TIME_PARSE (which is now possible in a nicer way).

* Be less forbidden.
2019-05-21 11:54:34 -07:00
Gian Merlino 43c54385f6 SQL: Respect default timezone for TIME_PARSE and TIME_SHIFT. (#7704)
* SQL: Respect default timezone for TIME_PARSE and TIME_SHIFT.

They were inadvertently using UTC rather than the default timezone.
Also, harmonize how time functions handle their parameters.

* Fix tests

* Add another TIME_SHIFT test.
2019-05-21 11:40:44 -07:00
Gian Merlino 69b2ea3ddc SQL: TIME_EXTRACT should have 2 required operands. (#7710)
* SQL: TIME_EXTRACT should have 2 required operands.

Timestamp and time unit are both required.

* Add regression test.
2019-05-21 11:32:36 -07:00
Gian Merlino bcea05e4e8 SQL: Fix exception with OR of impossible filters. (#7707)
Fixes #7671.
2019-05-21 11:32:09 -07:00
Xue Yu dd7dace70a Add TIMESTAMPDIFF sql support (#7695)
* add timestampdiff sql support

* feedback address
2019-05-21 08:05:38 -07:00
Gian Merlino cb6ec2cab8 SqlOperatorConversion Javadoc fix. (#7713)
Appears to be a copypasta error; the toDruidFilter Javadoc referred to
aggregations, but the method does not handle aggregations.
2019-05-20 21:21:21 -07:00
Surekha d3545f5086 Show all server types in sys.servers table (#7654)
* update sys.servers table to show all servers

* update docs

* Fix integration test

* modify test query for batch integration test

* fix case in test queries

* make the server_type lowercase

* Apply suggestions from code review

Co-Authored-By: Himanshu <g.himanshu@gmail.com>

* Fix compilation from git suggestion

* fix unit test
2019-05-15 16:54:02 -07:00
Xue Yu 35a1fbefea upgrade avatica to 1.12.0 (#7644) 2019-05-12 14:38:06 -07:00
Xue Yu f7b8b57c3b simplify DruidConvertletTable.java, remove STANDARD_CONVERTLET declaration (#7632) 2019-05-10 14:08:32 -07:00
Jonathan Wei a013350018 Adjust required permissions for system schema (#7579)
* Adjust required permissions for system schema

* PR comments, fix current_size handling

* Checkstyle

* Set curr_size instead of current_size

* Adjust information schema docs

* Fix merge conflict

* Update tests
2019-05-02 07:18:02 -07:00
Surekha 15d19f3059 Add is_overshadowed column to sys.segments table (#7425)
* Add is_overshadowed column to sys.segments table

* update docs

* Rename class and variables

* PR comments

* PR comments

* remove unused variables in MetadataResource

* move constants together

* add getFullyOvershadowedSegments method to ImmutableDruidDataSource

* Fix compareTo of SegmentWithOvershadowedStatus

* PR comment

* PR comments

* PR comments

* PR comments

* PR comments

* fix issue with already consumed stream

* minor refactoring

* PR comments
2019-05-01 18:00:57 +02:00
Gian Merlino c648775b5b SQL: Remove "useFallback" feature. (#7567)
This feature allows Calcite's Bindable interpreter to be bolted on
top of Druid queries and table scans. I think it should be removed for
a few reasons:

1. It is not recommended for production anyway, because it generates
unscalable query plans (e.g. it will plan a join into two table scans
and then try to do the entire join in memory on the broker).
2. It doesn't work with Druid-specific SQL functions, like TIME_FLOOR,
REGEXP_EXTRACT, APPROX_COUNT_DISTINCT, etc.
3. It makes the SQL planning code needlessly complicated.

With SQL coming out of experimental status soon, it's a good opportunity
to remove this feature.
2019-04-28 18:26:44 -07:00
Xue Yu 2c8a71f883 Support LPAD and RPAD sql function (#7388)
* lpad and rpad sql function

* feedback address

* feedback address

* add doc and format

* update docs
2019-04-22 14:51:32 -07:00
Clint Wylie be65cca248 refactor druid-bloom-filter aggregators (#7496)
* now with 100% more buffer

* there can be only 1

* simplify

* javadoc

* clean up unused test method

* fix exception message

* style

* why does style hate javadocs

* review stuff

* style :(
2019-04-18 11:54:06 -07:00
Kazuhito Takeuchi 7c19c92a81 Add ROUND function in druid-sql. (#7224)
* Implement round function in druid-sql

* Return value according to the type of argument

* Fix code for abnormal inputs, updated math-expr.md

* Fix assert text

* Fix error messages and refactor code

* Fix compile error, update sql.md, refactor code and format tests
2019-04-16 11:15:39 -07:00
Gian Merlino 721191635a SQL: Include virtual columns used for filtering in ScanQuery. (#7472)
PR #6902 introduced the ability to use virtual columns for filters, but they
were being omitted from "scan" queries, so filters would refer to a null column
instead of the intended virtual column.
2019-04-14 15:03:36 -07:00
Surekha 3e5dae9b96 Rename SegmentMetadataHolder to AvailableSegmentMetadata (#7372) 2019-04-14 10:19:48 -07:00
Justin Borromeo 408e3e1b2a Remove select execution code from SQL planner (#7416)
* Removed select execution code from SQL planner

* Update doc
2019-04-10 22:32:57 -07:00
Benedict Jin 2f64414ade Add "REVERSE" / "REPEAT" / "RIGHT" / "LEFT" functions (#7334)
* Add "REVERSE" / "REPEAT" / "RIGHT" / "LEFT" functions

* Fix ImportOrder

* Use RuntimeException instead of OutOfMemoryError according to "Effective Java"

* Simplify

* Patch suggestions
2019-04-10 11:46:29 +08:00
Justin Borromeo 799c66d9ac Allow max rows and max segments for time-ordered scans to be overridden using the scan query JSON spec (#7413)
* Initial changes

* Fixed NPEs

* Fixed failing spec test

* Fixed failing Calcite test

* Move configs to context

* Validated and added docs

* fixed weird indentation

* Update default context vals in doc

* Fixed allowable values
2019-04-07 20:12:52 -07:00
Clint Wylie 76b4a5c62e refactor lookups to be more chill to router (#7222)
* refactor lookups to be more chill to router

* remove accidental change

* fix and combine LookupIntrospectionResourceTest

* fix inspection

* rename RouterLookupModule to LookupSerdeModule and RouterLookupExtractorFactoryContainerProvider to NoopLookupExtractorFactoryContainerProvider

* make comment generic

* use ConfigResourceFilter instead of StateResourceFilter

* fix indentation

* unused import

* another unused import

* refactor some stuff into processing module, split up LookupModule.java classes into their own files
2019-04-05 14:49:41 -07:00
Gian Merlino 8c104a115c SQL: Add STRING_FORMAT function. (#7327) 2019-04-03 17:09:54 -04:00
Atul Mohan c883c52cb1 Fix tests (#7401) 2019-04-02 16:49:21 -07:00
Justin Borromeo 4584b5e139 SQL support for time-ordered scan (#7373)
* Squashed commit of the following:

commit 287a367f41
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Mar 27 20:03:41 2019 -0700

    Implemented Clint's recommendations

commit 07503ea5c0
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Mar 27 17:49:09 2019 -0700

    doc fix

commit 231a72e7d9
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Mar 27 17:38:20 2019 -0700

    Modified sequence limit to accept longs and added test for long limits

commit 1df50de321
Merge: 480e932fd c7fea6ac8
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 26 15:23:01 2019 -0700

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-N-Way-Merge

commit 480e932fdf
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 26 14:58:04 2019 -0700

    Checkstyle and doc update

commit 487f31fcf6
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 26 14:39:25 2019 -0700

    Refixed regression

commit fb858efbb7
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 26 13:14:48 2019 -0700

    Added test for n-way merge

commit 376e8bf906
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 26 11:42:54 2019 -0700

    Refactor n-way merge

commit 8a6bb1127c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 17:17:41 2019 -0700

    Fix docs and flipped boolean in ScanQueryLimitRowIterator

commit 35692680fc
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 16:15:49 2019 -0700

    Fix bug messing up count of rows

commit 219af478c8
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 15:57:55 2019 -0700

    Fix bug in numRowsScanned

commit da4fc66403
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 15:19:45 2019 -0700

    Check type of segment spec before using for time ordering

commit b822fc73df
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 13:19:02 2019 -0700

    Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge"

    This reverts commit 57033f36df, reversing
    changes made to 8f01d8dd16.

commit 57033f36df
Merge: 8f01d8dd1 86d9730fc
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 13:13:52 2019 -0700

    Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge

commit 8f01d8dd16
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 13:13:32 2019 -0700

    Revert "Fixed failing tests -> allow usage of all types of segment spec"

    This reverts commit ec470288c7.

commit ec470288c7
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 11:01:35 2019 -0700

    Fixed failing tests -> allow usage of all types of segment spec

commit 86d9730fc9
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 25 11:01:35 2019 -0700

    Fixed failing tests -> allow usage of all types of segment spec

commit 8b3b6b51ed
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 22 16:01:56 2019 -0700

    Nit comment

commit a87d02127c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 22 15:54:42 2019 -0700

    Fix checkstyle and test

commit 62dcedacde
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 22 15:30:41 2019 -0700

    More comments

commit 1b46b58aec
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 22 15:19:52 2019 -0700

    Added a bit of docs

commit 49472162b7
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 22 10:27:41 2019 -0700

    Rename segment limit -> segment partitions limit

commit 43d490cc3a
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Mar 21 13:16:58 2019 -0700

    Optimized n-way merge strategy

commit 42f5246b8d
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Mar 20 17:40:19 2019 -0700

    Smarter limiting for pQueue method

commit 4823dab895
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Mar 20 16:05:53 2019 -0700

    Finish rename

commit 2528a56142
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 18 14:00:50 2019 -0700

    Renaming

commit 7bfa77d3c1
Merge: a032c46ee 7e49d4739
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 12 16:57:45 2019 -0700

    Merge branch 'Update-Query-Interrupted-Exception' into 6088-Time-Ordering-On-Scans-N-Way-Merge

commit 7e49d47391
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 12 16:51:25 2019 -0700

    Added error message for UOE

commit a032c46ee0
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 12 16:47:17 2019 -0700

    Updated error message

commit 57b5682654
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 12 12:44:02 2019 -0700

    Fixed tests

commit 45e95bb1f4
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Mar 12 11:09:08 2019 -0700

    Optimization

commit cce917ab84
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 8 14:11:07 2019 -0800

    Checkstyle fix

commit 73f4038068
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Mar 7 18:40:00 2019 -0800

    Applied Jon's recommended changes

commit fb966def83
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Mar 7 11:03:01 2019 -0800

    Sorry, checkstyle

commit 6dc53b311c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Mar 6 10:34:13 2019 -0800

    Improved test and appeased TeamCity

commit 35c96d3557
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 4 16:00:44 2019 -0800

    Checkstyle fix

commit 2d1978d571
Merge: 83ec3fe1f 3398d3982
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Mar 4 15:24:49 2019 -0800

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-N-Way-Merge

commit 83ec3fe1f1
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 1 13:40:22 2019 -0800

    Nit-change on javadoc

commit 47c970b5f4
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Mar 1 13:38:29 2019 -0800

    Wrote tests and added Javadoc

commit 5ff59f5ca6
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 28 15:58:20 2019 -0800

    Reset config

commit 806166f977
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 28 15:49:07 2019 -0800

    Fixed failing tests

commit de83b11a1b
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 26 16:40:48 2019 -0800

    Fixed mistakes in merge

commit 5bd0e1a32c
Merge: 18cce9a64 9fa649b3b
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 26 16:39:16 2019 -0800

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-N-Way-Merge

commit 18cce9a646
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 26 13:16:44 2019 -0800

    Change so batching only occurs on broker for time-ordered scans

    Restricted batching to broker for time-ordered queries and adjusted
    tests

    Formatting

    Cleanup

commit 451e2b4365
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 26 11:14:27 2019 -0800

    WIP

commit 69b24bd851
Merge: 763c43df7 417b9f2fe
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 22 18:13:26 2019 -0800

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-N-Way-Merge

commit 763c43df7e
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 22 18:07:06 2019 -0800

    Multi-historical setup works

commit 06a5218917
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 22 16:59:57 2019 -0800

    Wrote docs

commit 3b923dac9c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 22 14:03:22 2019 -0800

    Fixed bug introduced by replacing deque with list

commit 023538d831
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 22 13:30:08 2019 -0800

    Sequence stuff is so dirty :(

commit e1fc2955d3
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 22 10:39:59 2019 -0800

    WIP

commit f57ff253fa
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 21 18:22:06 2019 -0800

    Ordering is correct on n-way merge -> still need to batch events into
    ScanResultValues

commit 1813a5472c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 21 17:06:18 2019 -0800

    Cleanup

commit f83e99655d
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 21 16:56:36 2019 -0800

    Refactor and pQueue works

commit b13ff624a9
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 21 15:13:33 2019 -0800

    Set up time ordering strategy decision tree

commit fba6b022f0
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 21 15:08:27 2019 -0800

    Added config and get # of segments

commit c9142e721c
Merge: cd489a020 554b0142c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 20 10:12:50 2019 -0800

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-V2

commit cd489a0208
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 20 00:16:48 2019 -0800

    Fixed failing test due to null resultFormat

commit 7baeade832
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 19 17:52:06 2019 -0800

    Changes based on Gian's comments

commit 35150fe1a6
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 15 15:57:53 2019 -0800

    Small changes

commit 4e69276d57
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 11 12:09:54 2019 -0800

    Removed unused import to satisfy PMD check

commit ecb0f483a9
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 11 10:37:11 2019 -0800

    improved doc

commit f0eddee665
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 11 10:18:45 2019 -0800

    Added more javadoc

commit 5f92dd7325
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 11 10:05:58 2019 -0800

    Unused import

commit 93e1636287
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 11 10:03:14 2019 -0800

    Added javadoc on ScanResultValueTimestampComparator

commit 134041c479
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 8 13:13:54 2019 -0800

    Renamed sort function

commit 2e3577cd3d
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 7 13:01:25 2019 -0800

    Fixed benchmark queries

commit d3b335af42
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 7 11:08:07 2019 -0800

    added all query types to scan benchmark

commit ab00eade9f
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Thu Feb 7 09:42:48 2019 -0800

    Kicking travis with change to benchmark param

commit b432beaf84
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 17:45:59 2019 -0800

    Fixed failing calcite tests

commit b2c8c77ad4
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 17:39:48 2019 -0800

    Fixing tests WIP

commit 85e72a614e
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 15:42:02 2019 -0800

    Set to spaces over tabs

commit 7e872a8ebc
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 15:36:24 2019 -0800

    Created an error message for when someone tries to time order a result
    set > threshold limit

commit e8a4b49044
Merge: 305876a43 8e3a58f72
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 15:05:11 2019 -0800

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-V2

commit 305876a434
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 15:02:02 2019 -0800

    nit

commit 8212a21caf
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 14:40:35 2019 -0800

    Improved conciseness

commit 10b5e0ca93
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 13:42:12 2019 -0800

    .

commit dfe4aa9681
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 13:41:18 2019 -0800

    Fixed codestyle and forbidden API errors

commit 148939e88b
Merge: 4f51024b3 5edbe2ae1
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 13:26:17 2019 -0800

    Merge branch '6088-Create-Scan-Benchmark' into 6088-Time-Ordering-On-Scans-V2

commit 5edbe2ae12
Merge: 60b7684db 315ccb76b
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 13:18:55 2019 -0800

    Merge github.com:apache/incubator-druid into 6088-Create-Scan-Benchmark

commit 60b7684db7
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 13:02:13 2019 -0800

    Committing a param change to kick teamcity

commit 4f51024b31
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 12:08:12 2019 -0800

    Wrote more tests for scan result value sort

commit 8b7d5f5081
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Wed Feb 6 11:55:09 2019 -0800

    Wrote tests for heapsort scan result values and fixed bug where iterator
    wasn't returning elements in correct order

commit b6d4df3864
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 5 16:45:20 2019 -0800

    Decrease segment size for less memory usage

commit d1a1793f36
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 5 12:40:26 2019 -0800

    nit

commit 7deb06f6df
Merge: b7d3a4900 86c5eee13
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 5 10:53:38 2019 -0800

    Merge branch '6088-Create-Scan-Benchmark' into 6088-Time-Ordering-On-Scans-V2

commit 86c5eee13b
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 5 10:31:27 2019 -0800

    Broke some long lines into two lines

commit b7d3a4900a
Merge: 796083f2b 8bc5eaa90
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 5 10:23:32 2019 -0800

    Merge branch 'master' into 6088-Time-Ordering-On-Scans-V2

commit 737a83321d
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Tue Feb 5 10:15:32 2019 -0800

    Made Jon's changes and removed TODOs

commit 796083f2bb
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 15:37:42 2019 -0800

    Benchmark param change

commit 20c36644db
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 15:36:35 2019 -0800

    More param changes

commit 9e6e71616b
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 15:31:21 2019 -0800

    Changed benchmark params

commit 01b25ed112
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 14:36:18 2019 -0800

    Added time ordering to the scan benchmark

commit 432acaf085
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 12:03:14 2019 -0800

    Change number of benchmark iterations

commit 12e51a2721
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 12:02:13 2019 -0800

    Added TimestampComparator tests

commit e66339cd76
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 10:56:41 2019 -0800

    Remove todos

commit ad731a362b
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 10:55:56 2019 -0800

    Change benchmark

commit 989bd2d50e
Merge: 7b5847139 26930f8d2
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Mon Feb 4 10:46:38 2019 -0800

    Merge branch '6088-Create-Scan-Benchmark' into 6088-Time-Ordering-On-Scans-V2

commit 7b58471394
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Sat Feb 2 03:48:18 2019 -0800

    Licensing stuff

commit 79e8319383
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 1 18:22:58 2019 -0800

    Move ScanResultValue timestamp comparator to a separate class for testing

commit 7a6080f636
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 1 18:00:58 2019 -0800

    Stuff for time-ordered scan query

commit 26930f8d20
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 1 16:38:49 2019 -0800

    It runs.

commit dd4ec1ac9c
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 1 15:12:17 2019 -0800

    Need to form queries

commit dba6e492a0
Merge: 10e57d5f9 7d4cc2873
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 1 14:13:39 2019 -0800

    Merge branch 'master' into 6088-Create-Scan-Benchmark

commit 10e57d5f9e
Author: Justin Borromeo <jborrome@edu.uwaterloo.ca>
Date:   Fri Feb 1 14:04:13 2019 -0800

    Moved Scan Builder to Druids class and started on Scan Benchmark setup

* Changed SQL planning to use scan over select

* Fixed some bugs

* Removed unused imports

* Updated calcite query test and test segment walker

* Fixed formatting recommendations
2019-04-02 15:46:01 -07:00
Xue Yu 78fd5aff21 support radians and degrees in sql (#7336)
* support radians and degrees in sql

* update test case
2019-04-02 12:47:49 -07:00
Justin Borromeo ad7862c58a Time Ordering On Scans (#7133)
* Moved Scan Builder to Druids class and started on Scan Benchmark setup

* Need to form queries

* It runs.

* Stuff for time-ordered scan query

* Move ScanResultValue timestamp comparator to a separate class for testing

* Licensing stuff

* Change benchmark

* Remove todos

* Added TimestampComparator tests

* Change number of benchmark iterations

* Added time ordering to the scan benchmark

* Changed benchmark params

* More param changes

* Benchmark param change

* Made Jon's changes and removed TODOs

* Broke some long lines into two lines

* nit

* Decrease segment size for less memory usage

* Wrote tests for heapsort scan result values and fixed bug where iterator
wasn't returning elements in correct order

* Wrote more tests for scan result value sort

* Committing a param change to kick teamcity

* Fixed codestyle and forbidden API errors

* .

* Improved conciseness

* nit

* Created an error message for when someone tries to time order a result
set > threshold limit

* Set to spaces over tabs

* Fixing tests WIP

* Fixed failing calcite tests

* Kicking travis with change to benchmark param

* added all query types to scan benchmark

* Fixed benchmark queries

* Renamed sort function

* Added javadoc on ScanResultValueTimestampComparator

* Unused import

* Added more javadoc

* improved doc

* Removed unused import to satisfy PMD check

* Small changes

* Changes based on Gian's comments

* Fixed failing test due to null resultFormat

* Added config and get # of segments

* Set up time ordering strategy decision tree

* Refactor and pQueue works

* Cleanup

* Ordering is correct on n-way merge -> still need to batch events into
ScanResultValues

* WIP

* Sequence stuff is so dirty :(

* Fixed bug introduced by replacing deque with list

* Wrote docs

* Multi-historical setup works

* WIP

* Change so batching only occurs on broker for time-ordered scans

Restricted batching to broker for time-ordered queries and adjusted
tests

Formatting

Cleanup

* Fixed mistakes in merge

* Fixed failing tests

* Reset config

* Wrote tests and added Javadoc

* Nit-change on javadoc

* Checkstyle fix

* Improved test and appeased TeamCity

* Sorry, checkstyle

* Applied Jon's recommended changes

* Checkstyle fix

* Optimization

* Fixed tests

* Updated error message

* Added error message for UOE

* Renaming

* Finish rename

* Smarter limiting for pQueue method

* Optimized n-way merge strategy

* Rename segment limit -> segment partitions limit

* Added a bit of docs

* More comments

* Fix checkstyle and test

* Nit comment

* Fixed failing tests -> allow usage of all types of segment spec

* Fixed failing tests -> allow usage of all types of segment spec

* Revert "Fixed failing tests -> allow usage of all types of segment spec"

This reverts commit ec470288c7.

* Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge"

This reverts commit 57033f36df, reversing
changes made to 8f01d8dd16.

* Check type of segment spec before using for time ordering

* Fix bug in numRowsScanned

* Fix bug messing up count of rows

* Fix docs and flipped boolean in ScanQueryLimitRowIterator

* Refactor n-way merge

* Added test for n-way merge

* Refixed regression

* Checkstyle and doc update

* Modified sequence limit to accept longs and added test for long limits

* doc fix

* Implemented Clint's recommendations
2019-03-28 14:37:09 -07:00
Roman Leventov bca40dcdaf Fix some IntelliJ inspections (#7273)
Prepare TeamCity for IntelliJ 2018.3.1 upgrade. Mostly removed redundant exception declarations in `throws` clauses.
2019-03-25 21:11:01 -03:00
Gian Merlino 4ca5fe0f60 SQL: Add PARSE_LONG function. (#7326)
* SQL: Add PARSE_LONG function.

* Fix test.
2019-03-22 15:40:10 -07:00
Roman Leventov dfd27e00c0 Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality (#7185)
* Avoid many unnecessary materializations of collections of 'all segments in cluster' cardinality

* Fix DruidCoordinatorTest; Renamed DruidCoordinator.getReplicationStatus() to computeUnderReplicationCountsPerDataSourcePerTier()

* More Javadocs, typos, refactor DruidCoordinatorRuntimeParams.createAvailableSegmentsSet()

* Style

* typo

* Disable StaticPseudoFunctionalStyleMethod inspection because of too many false positives

* Fixes
2019-03-19 18:22:56 -03:00
Furkan KAMACI 7ada1c49f9 Prohibit Throwables.propagate() (#7121)
* Throw caught exception.

* Throw caught exceptions.

* Related checkstyle rule is added to prevent further bugs.

* RuntimeException() is used instead of Throwables.propagate().

* Missing import is added.

* Throwables are propagated if possible.

* Throwables are propagated if possible.

* Throwables are propagated if possible.

* Throwables are propagated if possible.

* * Checkstyle definition is improved.
* Throwables.propagate() usages are removed.

* Checkstyle pattern is changed for only scanning "Throwables.propagate(" instead of checking lookbehind.

* Throwable is kept before firing a Runtime Exception.

* Fix unused assignments.
2019-03-14 18:28:33 -03:00
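
The replacement described in this commit (wrapping the caught exception in a `RuntimeException` rather than calling Guava's `Throwables.propagate()`) might look roughly like the sketch below. The class and method names are hypothetical and are not taken from the Druid code base.

```java
import java.io.IOException;

public class PropagateExample {
  // Hypothetical helper that can fail with a checked exception.
  public static String readConfig(boolean fail) throws IOException {
    if (fail) {
      throw new IOException("config not found");
    }
    return "ok";
  }

  // Before (now prohibited): catch (IOException e) { throw Throwables.propagate(e); }
  // After: wrap explicitly, keeping the original exception as the cause.
  public static String readConfigUnchecked(boolean fail) {
    try {
      return readConfig(fail);
    }
    catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(readConfigUnchecked(false));
    try {
      readConfigUnchecked(true);
    }
    catch (RuntimeException e) {
      // The original IOException is still reachable through getCause().
      System.out.println(e.getCause().getMessage());
    }
  }
}
```

Wrapping explicitly keeps the control flow visible to the compiler, since `throw new RuntimeException(e)` is a real `throw` statement rather than a method call that happens to never return.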
Clint Wylie d7ba19d477 sql, filters, and virtual columns (#6902)
* refactor sql planning to re-use expression virtual columns when possible when constructing a DruidQuery, allowing virtual columns to be defined in filter expressions, and making resulting native druid queries more concise. also minor refactor of built-in sql aggregators to maximize code re-use

* fix it

* fix it in the right place

* fixup for base64 stuff

* fixup tests

* fix merge conflict on import order

* fixup

* fix imports

* fix tests

* review comments

* refactor

* re-arrange

* better javadoc

* fixup merge

* fixup tests

* fix accidental changes
2019-03-11 11:37:58 -07:00
Xue Yu 65118277a3 support sin cos etc trigonometric function in sql (#7182)
* support triangle function in sql

* feedback address
2019-03-04 19:18:22 -08:00
Himanshu Pandey 8b803cbc22 Added checkstyle for "Methods starting with Capital Letters" (#7118)
* Added checkstyle for "Methods starting with Capital Letters" and changed the method names violating this.

* Un-abbreviate the method names in the calcite tests

* Fixed checkstyle errors

* Changed asserts position in the code
2019-02-23 20:10:31 -08:00
Surekha 02ef14f262 Fix num_rows in sys.segments (#6888)
* Fix the bug with num_rows in sys.segments

* Fix segmentMetadataInfo update in DruidSchema
* Add numRows to SegmentMetadataHolder builder's constructor, so it's not overwritten
* Rename SegSegmentSignature to setSegmentMetadataHolder and fix it so nested map is appended instead of recreated
* Replace Map<String, Set<String>> segmentServerMap with Set<String> for num_replica

* Remove unnecessary code and update test

* Add unit test for num_rows

* PR comments

* change access modifier to default package level

* minor changes to comments

* PR comments
2019-02-11 16:21:19 -08:00
Jonathan Wei fafbc4a80e Set version to 0.15.0-incubating-SNAPSHOT (#7014) 2019-02-07 14:02:52 -08:00
Justin Borromeo 6723243ed2 Create Scan Benchmark (#6986)
* Moved Scan Builder to Druids class and started on Scan Benchmark setup

* Need to form queries

* It runs.

* Remove todos

* Change number of benchmark iterations

* Changed benchmark params

* More param changes

* Made Jon's changes and removed TODOs

* Broke some long lines into two lines

* Decrease segment size for less memory usage

* Committing a param change to kick teamcity
2019-02-06 14:45:01 -08:00
Surekha ef451d3603 Add null checks in DruidSchema (#6830)
* Add null checks in DruidSchema

* Add unit tests

* Add VisibleForTesting annotation

* PR comments

* unused import
2019-02-05 13:42:20 -08:00
Jonathan Wei 8bc5eaa908 Set version to 0.14.0-incubating-SNAPSHOT (#7003) 2019-02-04 19:36:20 -08:00
Roman Leventov 0e926e8652 Prohibit assigning concurrent maps into Map-typed variables and fields and fix a race condition in CoordinatorRuleManager (#6898)
* Prohibit assigning concurrent maps into Map-typed variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool

* Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Initialization.java

* Remove unnecessary comment

* Checkstyle

* Fix getFromExtensions()

* Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization

* Fix UriCacheGeneratorTest

* Workaround issue with MaterializedViewQueryQueryToolChest

* Strengthen Appenderator's contract regarding concurrency
2019-02-04 09:18:12 -08:00
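
The get()-before-computeIfAbsent() optimization mentioned in this commit can be sketched as follows. This is only an illustration under assumptions: the class, field, and method names are hypothetical, and the motivation given in the comments (that `computeIfAbsent()` may take a lock on some JDK versions even when the key is already present) is a general observation, not a statement about the Druid code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class GuardedComputeIfAbsent {
  // Declared as ConcurrentHashMap, not Map or ConcurrentMap, so the atomicity
  // guarantees of computeIfAbsent() are visible at the declaration site.
  private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

  public long increment(String key) {
    // Fast path: a plain get() is a non-blocking read.
    AtomicLong counter = counters.get(key);
    if (counter == null) {
      // Slow path: computeIfAbsent() creates the value atomically, so two
      // racing threads still end up sharing a single AtomicLong.
      counter = counters.computeIfAbsent(key, k -> new AtomicLong());
    }
    return counter.incrementAndGet();
  }

  public static void main(String[] args) {
    GuardedComputeIfAbsent c = new GuardedComputeIfAbsent();
    c.increment("a");
    System.out.println(c.increment("a"));
  }
}
```

The guard only pays off when most calls hit keys that already exist; for maps that are mostly written once per key it adds a redundant lookup.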
Surekha 7baa33049c Introduce published segment cache in broker (#6901)
* Add published segment cache in broker

* Change the DataSegment interner so it's not based on DataSegment's equals only and size is preserved if set

* Added a trueEquals to DataSegment class

* Use separate interner for realtime and historical segments

* Remove trueEquals as it's not used anymore, change log message

* PR comments

* PR comments

* Fix tests

* PR comments

* Few more modification to

* change the coordinator api
* remove all segments at once from MetadataSegmentView in order to serve a more consistent view of published segments
* Change the poll behaviour to avoid multiple poll execution at same time

* minor changes

* PR comments

* PR comments

* Make the segment cache in broker off by default

* Added a config to PlannerConfig
* Moved MetadataSegmentView to sql module

* Add doc for new planner config

* Update documentation

* PR comments

* some more changes

* PR comments

* fix test

* remove unintentional change, whether to synchronize on lifecycleLock is still in discussion in PR

* minor changes

* some changes to initialization

* use pollPeriodInMS

* Add boolean cachePopulated to check if first poll succeeds

* Remove poll from start()

* take the log message out of condition in stop()
2019-02-02 22:27:13 -08:00
Clint Wylie 7a5827e12e bloom filter sql aggregator (#6950)
* adds sql aggregator for bloom filter, adds complex value serde for sql results

* fix tests

* checkstyle

* fix copy-paste
2019-02-01 13:54:46 -08:00
Clint Wylie af3cbc3687 add bloom filter druid expression (#6904)
* add "bloom_filter_test" druid expression to support bloom filters in ExpressionVirtualColumn and ExpressionDimFilter and sql expressions

* more docs

* use java.util.Base64, doc fixes
2019-01-28 08:41:45 -05:00
Benedict Jin 2b73644340 * Use `@SuppressWarnings("GuardedBy")` instead of `noinspection FieldAccessNotGuarded` comment (#6903)
* Remove `@GuardedBy("connectionLock")` from `connectionLock` itself

* Add FieldAccessNotGuarded into inspection profile and set the level to ERROR
2019-01-27 12:42:45 -08:00
Clint Wylie 66f64cd8bd fix long/float/double dimension filtering for columns with nulls (#6906)
* fix long, float, double dimension filtering when sql compatible null handling is enabled and the column has null values

* revert unintended change

* fix tests
2019-01-23 22:36:52 -08:00
Roman Leventov 8eae26fd4e Introduce SegmentId class (#6370)
* Introduce SegmentId class

* tmp

* Fix SelectQueryRunnerTest

* Fix indentation

* Fixes

* Remove Comparators.inverse() tests

* Refinements

* Fix tests

* Fix more tests

* Remove duplicate DataSegmentTest, fixes #6064

* SegmentDescriptor doc

* Fix SQLMetadataStorageUpdaterJobHandler

* Fix DataSegment deserialization for ignoring id

* Add comments

* More comments

* Address more comments

* Fix compilation

* Restore segment2 in SystemSchemaTest according to a comment

* Fix style

* fix testServerSegmentsTable

* Fix compilation

* Add comments about why SegmentId and SegmentIdWithShardSpec are separate classes

* Fix SystemSchemaTest

* Fix style

* Compare SegmentDescriptor with SegmentId in Javadoc and comments rather than with DataSegment

* Remove a link, see https://youtrack.jetbrains.com/issue/IDEA-205164

* Fix compilation
2019-01-21 11:11:10 -08:00
zhaojiandong 9f0fdcfef6 Fix deadlock in DruidStatement & DruidConnection (#6868)
* Fix deadlock in DruidStatement & DruidConnection

* change statements type to ConcurrentMap
2019-01-17 10:16:35 -08:00
Dayue Gao 5b8a221713 Add SQL id, request logs, and metrics (#6302)
* use SqlLifecycle to manage sql execution, add sqlId

* add sql request logger

* fix UT

* rename sqlId to sqlQueryId, sql/time to sqlQuery/time, etc

* add docs and more sql request logger impls

* add UT for http and jdbc

* fix forbidden use of com.google.common.base.Charsets

* fix UT in QuantileSqlAggregatorTest, suppressed unused warning of getSqlQueryId

* do not use default method in QueryMetrics interface

* capitalize 'sql' everywhere in the non-property parts of the docs

* use RequestLogger interface to log sql query

* minor bugfixes and add switching request logger

* add filePattern configs for FileRequestLogger

* address review comments, adjust sql request log format

* fix inspection error

* try SuppressWarnings("RedundantThrows") to fix inspection error on ComposingRequestLoggerProvider
2019-01-15 23:12:59 -08:00
Surekha f72f33f84a Fix num_replicas count in sys.segments table (#6804)
* Fix num_replicas count from sys.segments

* Adjust unit test for num_replica > 1

* Pass named arguments instead of passing boolean constants

* Address PR comments

* PR comments
2019-01-15 08:31:29 -08:00
Charles Allen 5d2947cd52 Use Guava Compatible immediate executor service (#6815)
* Use multi-guava version friendly direct executor implementation

* Don't use a singleton

* Fix strict compilation complaints

* Copy Guava's DirectExecutor

* Fix javadoc

* Imports are the devil
2019-01-11 10:42:19 -08:00
Gian Merlino bc671ac436 SQL: Fix ordering of sort, sortProject in DruidSemiJoin. (#6769)
They were added in the wrong order, leading to this error message
when evaluating rules:

"Cannot move from stage[AGGREGATE] to stage[SORT_PROJECT]"
2019-01-03 10:36:28 -08:00
Surekha 5e5aad49e6 Set is_available to false by default for published segment (#6757)
* Set is_available to false by default for published segment

* Address comments

Fix the is_published value for segments not in metadata store

* Remove unused import

* Use non-null shardSpec for a segment in test

* Fix checkstyle

* Modify comment
2018-12-20 13:29:00 -08:00
Gian Merlino f0b7c272b9 Broker: Start up DruidSchema immediately if there are no segments. (#6765)
Fixes a bug introduced in #6742, where the broker would delay startup
indefinitely if there were no segments at all being served by any
data servers.
2018-12-20 11:07:35 -07:00
Gian Merlino 7a09cde4de Broker: Await initialization before finishing startup. (#6742)
* Broker: Await initialization before finishing startup.

In particular, hold off on announcing the service and starting the
HTTP server until the server view and SQL metadata cache are finished
initializing. This closes a window of time where a Broker could return
partial results shortly after startup.

As part of this, some simplification of server-lifecycle service
announcements. This helps ensure that the two different kinds of
announcements we do (legacy and new-style) stay in sync.

* Remove unused imports.

* Fix NPE in ServerRunnable.
2018-12-18 20:32:31 -08:00
Gian Merlino f12a1aa993 SQL: Add support for queries with project-after-semijoin. (#6756)
* SQL: Add support for queries with project-after-semijoin.

These didn't work before, since the top Project rel wasn't getting
merged into the DruidSemiJoin rel. This patch allows that to happen.

* Null handling

* Null handling

* Null handling
2018-12-18 17:53:14 -08:00
Roman Leventov ec38df7575 Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() (#6606)
* Simplify DruidNodeDiscoveryProvider; add DruidNodeDiscovery.Listener.nodeViewInitialized() method; prohibit and eliminate some suboptimal Java 8 patterns

* Fix style

* Fix HttpEmitterTest.timeoutEmptyQueue()

* Add DruidNodeDiscovery.Listener.nodeViewInitialized() calls in tests

* Clarify code
2018-12-01 01:12:56 +01:00
Clint Wylie efdec50847 bloom filter sql (#6502)
* bloom filter sql support

* docs

* style fix

* style fixes after rebase

* use copied/patched bloomkfilter

* remove context literal lookup function, changes from review

* fix build

* rename LookupOperatorConversion to QueryLookupOperatorConversion

* remove doc

* revert unintended change

* add internal exception to bloom filter deserialization exception
2018-11-27 14:11:18 +08:00
Roman Leventov 87b96fb1fd Add checkstyle rules about imports and empty lines between members (#6543)
* Add checkstyle rules about imports and empty lines between members

* Add suppressions

* Update Eclipse import order

* Add empty line

* Fix StatsDEmitter
2018-11-20 12:42:15 +01:00
Gian Merlino e9c3d3e651 SystemSchema: Fix data types for various fields. (#6642)
* SystemSchema: Fix data types for various fields.

- segments: start, end, partition_num
- servers: plaintext_port, tls_port
- tasks: plaintext_port, tls_port

The declared and actual types did not match, but they must or
else queries may generate ClassCastExceptions.

Also adjusted some of the code for generating values to be more
robust in the face of nulls or malformed strings.

* Fix style.
2018-11-19 09:24:19 +08:00
Roman Leventov 8f3fe9cd02 Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies (#6607)
* Prohibit String.replace() and String.replaceAll(), fix and prohibit some toString()-related redundancies

* Fix bug

* Replace checkstyle regexp with IntelliJ inspection
2018-11-15 13:21:34 -08:00
Gian Merlino 80173b5d29 SQL: Set INFORMATION_SCHEMA catalog name to "druid". (#6595)
* SQL: Set INFORMATION_SCHEMA catalog name to "druid".

Some third party tools ignore catalogs with empty names. So using
the name "druid" for the catalog makes integration easier.

* Update tests.
2018-11-14 06:32:40 +08:00
Gian Merlino ab518781bb SQL: Support AVG on system tables. (#6601) 2018-11-14 06:31:33 +08:00
Gian Merlino 154b6fbcef SQL: Add "POSITION" function. (#6596)
Also add a "fromIndex" argument to the strpos expression function. There
are some -1 and +1 adjustment terms due to the fact that the strpos
expression behaves like Java indexOf (0-indexed), but the POSITION SQL
function is 1-indexed.
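
The index adjustment described above can be illustrated with a minimal sketch. It is not Druid's implementation: the class `PositionSketch` and the methods `strpos` and `position` are hypothetical names, and the sketch only assumes that the expression behaves like Java's `String.indexOf` while SQL POSITION counts from 1 and returns 0 when there is no match.

```java
final class PositionSketch {
    // Hypothetical stand-in for the strpos expression: 0-indexed, like
    // String.indexOf, returning -1 when the needle is not found.
    static int strpos(String haystack, String needle, int fromIndex) {
        return haystack.indexOf(needle, fromIndex);
    }

    // Hypothetical stand-in for SQL POSITION(needle IN haystack FROM from):
    // 1-indexed, returning 0 when the needle is not found.
    static int position(String needle, String haystack, int from) {
        // -1 converts the 1-indexed start into a 0-indexed fromIndex;
        // +1 converts the 0-indexed result back (and maps -1 to 0).
        return strpos(haystack, needle, from - 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println(position("b", "abc", 1));  // prints 2
        System.out.println(position("z", "abc", 1));  // prints 0
        System.out.println(position("a", "abca", 2)); // prints 4
    }
}
```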
2018-11-13 13:39:00 -08:00
Roman Leventov 54351a5c75 Fix various bugs; Enable more IntelliJ inspections and update error-prone (#6490)
* Fix various bugs; Enable more IntelliJ inspections and update error-prone

* Fix NPE

* Fix inspections

* Remove unused imports
2018-11-06 14:38:08 -08:00
Surekha bcb754d066 Use current coordinator leader instead of cached one (#6551) (#6552)
* Use current coordinator leader instead of cached one (#6551)

Check the response status and throw exception if not OK

* Modify tests

* PR comment

* Add the correct check for status of BytesAccumulatingResponseHandler

* Move the status check into JsonParserIterator so sql query outputs meaningful message on failure

* Fix tests
2018-11-06 13:09:51 -08:00
QiuMM 676f5e6d7f Prohibit some guava collection APIs and use JDK collection APIs directly (#6511)
* Prohibit some guava collection APIs and use JDK APIs directly

* reset files that changed by accident

* sort codestyle/druid-forbidden-apis.txt alphabetically
2018-10-29 13:02:43 +01:00
Roman Leventov 84ac18dc1b Catch some incorrect method parameter or call argument formatting patterns with checkstyle (#6461)
* Catch some incorrect method parameter or call argument formatting patterns with checkstyle

* Fix DiscoveryModule

* Inline parameters_and_arguments.txt

* Fix a bug in PolyBind

* Fix formatting
2018-10-23 07:17:38 -03:00
QiuMM 85a89e2703 make druid node bind address configurable (#6464)
* make druid node bind address configurable

* fix tests

* fix travis-ci
2018-10-15 14:19:40 -07:00
Gian Merlino f537c0069a SQL: Support for selecting multi-value dimensions. (#6462)
* SQL: Support for selecting multi-value dimensions.

Fixes #4637. Doesn't completely address everything mentioned in #4638,
but at least fixes one issue on the way there.

* Fix null cases in tests.
2018-10-15 14:01:21 -07:00
QiuMM 6c71ee5ed5 fix type mismatch caused by #6377 (#6466) 2018-10-15 17:34:18 +09:00
Clint Wylie 84598fba3b combine druid-api, druid-common, java-util into druid-core (#6443)
* combine druid-api, druid-common, java-util

* spacing
2018-10-14 20:37:37 -07:00
Roman Leventov e3397ba00f Enforce Druid's exception class use (#6456) 2018-10-13 16:35:14 -07:00
Surekha e908fd6db7 Add check for nullable numRows (#6460)
* Add check for nullable numRows

* Make numRows long instead of Long type

* Add check for numRows in unit test
* small refactoring

* Modify test

PR comment from https://github.com/apache/incubator-druid/pull/6094#pullrequestreview-163937783

* Add a test for serverSegments table

* update tests
2018-10-13 15:08:42 -07:00
Surekha 3be4a97150 Fix inconsistent segment size(#6448) (#6451)
* Fix inconsistent segment size(#6448)

* Fix the segment size for published segments
* Changes to get numReplicas
* Make coordinator segments API truly streaming

* Changes to store partial segment data

* Simplify SegmentMetadataHolder
* Store partial the columns from available segments

* Address comments
2018-10-12 12:55:20 -07:00
David Lim 20ab213ba6 change project versions to 0.13.0-incubating-SNAPSHOT (#6453) 2018-10-11 19:28:01 -07:00
Surekha 3a0a667fe0 Introduce SystemSchema tables (#5989) (#6094)
* Added SystemSchema with following tables (#5989)

* SEGMENTS table provides details on served and published segments
* SERVERS table provides details on data servers
* SERVERSEGMENTS table is the JOIN of SEGMENTS and SERVERS
* TASKS table provides details on tasks

* Add documentation for system schema

* Fix static-analysis warnings

* Address PR comments

* Add unit tests

* Fix a test

* Try to fix a test

* Fix a bug around replica count

* rename io.druid to org.apache.druid

* Major change is to make tasks and segment queries streaming

* Made tasks/segments stream to calcite instead of storing it in memory
* Add num_rows to segments table
* Refactor JsonParserIterator
* Replace with closeable iterator

* Fix docs, make num_rows column nullable, some unit test changes

* make num_rows column type long, allow it to be null

fix a compile error after merge, add TrafficCop param to InputStreamResponseHandler

* Filter null rows for segments table from Linq4j enumerable

* change num_replicas datatype to long in segments table

* Fix some tests and address comments

* Doc updates, other PR comments

* Update tests

* Address comments

* Add auth check
* Update docs
* Refactoring

* Fix teamcity warning, change the getQueryableServer in TimelineServerView

* Fix compilation after rebase

* Use the stream API from AuthorizationUtils

* Added LeaderClient interface and NoopDruidLeaderClient class

* Revert "Added LeaderClient interface and NoopDruidLeaderClient class"

This reverts commit 100fa46e39.

* Make the naming consistent to server_segments for the join table

* Add ForbiddenException on auth check failure
* Remove static block from SystemSchema

* Try to fix a test in CalciteQueryTest due to rename of server_segments

* Fix the json output format in the coordinator API

* Add auth check in the segments API
* Add null check to avoid NPE

* Use anonymous class object instead of mock for DruidLeaderClient in SqlBenchmark

* Fix test failures, type long/BIGINT can be nullable

* Revert long nullability to fix tests

* Fix style for tests

* PR comments

* Address PR comments

* Add the missing BytesAccumulatingResponseHandler class

* Use Sequences.withBaggage in DruidPlanner

* Fix docs, add comments

* Close the iterator if hasNext returns false
2018-10-10 17:17:29 -07:00
Roman Leventov 3ae563263a Renamed 'Generic Column' -> 'Numeric Column'; Fixed a few resource leaks in processing; misc refinements (#5957)
This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR.

Some of the changes:
 - Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names.
 - Generified `ComplexMetricExtractor`
2018-10-02 14:50:22 -03:00
Gian Merlino 244046fda5 SQL: Fix too-long headers in http responses. (#6411)
Fixes #6409 by moving column name info from HTTP headers into the
result body.
2018-10-01 18:13:08 -07:00
Gian Merlino 3548396a45 SQL: Update to Calcite 1.17.0. (#6404)
* SQL: Update to Calcite 1.17.0.

Other than keeping things fresh, another motivation is that
this fixes CALCITE-1436 (AggregateNode NPE for aggregators other
than SUM/COUNT), which affects aggregate functions on our system
tables.

Also sets shouldConvertRaggedUnionTypesToVarying = true, a new
type system parameter that prefers VARCHAR over CHAR. This is
better for Druid, because we don't really have support for a
true CHAR type.

* Remove unused import.
2018-09-29 18:33:29 -07:00
Gian Merlino 3922582d8c SQL: Fix too-strict check in SortProject. (#6403)
The "Duplicate field name" check on inputRowSignature is too strict:
it is actually fine for a row signature to have the same field name
twice. It happens when the same expression is selected twice, and
both selections map to the same Druid object (dimension, aggregator,
etc).

I did not succeed in writing a test that triggers this, but I did see
it occur in production for a complex query with hundreds of aggregators.
2018-09-29 13:54:34 -07:00
Gian Merlino 0da042cdd9 SQL: Unwrap IS_TRUE, IS_FALSE and friends when building a filter. (#6374)
* SQL: Unwrap IS_TRUE, IS_FALSE and friends when building a filter.

* Fix test.
2018-09-25 10:37:02 -07:00
Dayue Gao edf0c13807 add a sql option to force user to specify time condition (#6246)
* add a sql option to force user to specify time condition

* rename forceTimeCondition to requireTimeCondition, refine error message
2018-09-17 13:52:24 -07:00
Roman Leventov 0c4bd2b57b Prohibit some Random usage patterns (#6226)
* Prohibit Random usage patterns

* Fix FlattenJSONBenchmarkUtil
2018-09-14 13:35:51 -07:00
Gian Merlino 4669f0878f SQL: UNION ALL operator. (#6314)
* SQL: UNION ALL operator.

* Remove unused import.
2018-09-09 22:32:56 -07:00
Dayue Gao 743547fc3b Unauthorized sql request should return 403 (#6279) 2018-09-01 09:17:18 -07:00
Gian Merlino 431d3d8497 Rename io.druid to org.apache.druid. (#6266)
* Rename io.druid to org.apache.druid.

* Fix META-INF files and remove some benchmark results.

* MonitorsConfig update for metrics package migration.

* Reorder some dimensions in inner queries for some reason.

* Fix protobuf tests.
2018-08-30 09:56:26 -07:00
Himanshu 1fae6513e1 add "subtotalsSpec" attribute to groupBy query (#5280)
* add subtotalsSpec attribute to groupBy query

* don't send subtotalsSpec to downstream nodes from broker and other updates

* address review comment

* fix checkstyle issues after merge to master

* add docs for subtotalsSpec feature

* address doc review comments
2018-08-28 17:46:38 -07:00
Gian Merlino 80224df36a SQL: Fix post-aggregator naming logic for sort-project. (#6250)
The old code assumes that post-aggregator prefixes are one character
long followed by numbers. This isn't always true (we may pad with
underscores to avoid conflicts). Instead, the new code uses a different
base prefix for sort-project postaggregators ("s" instead of "p") and
uses the usual Calcites.findUnusedPrefix function to avoid conflicts.
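
The underscore-padding idea described above can be sketched as follows. This is an illustrative sketch only, not the code of `Calcites.findUnusedPrefix`: the class name `PrefixSketch`, the method signature, and the sample names are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class PrefixSketch {
    // Hypothetical helper: pads basePrefix with leading underscores until no
    // existing name starts with it, so generated names such as prefix + "0"
    // cannot collide with names already in use.
    static String findUnusedPrefix(String basePrefix, Set<String> usedNames) {
        String prefix = basePrefix;
        while (anyStartsWith(usedNames, prefix)) {
            prefix = "_" + prefix;
        }
        return prefix;
    }

    private static boolean anyStartsWith(Set<String> names, String prefix) {
        for (String name : names) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> used = new TreeSet<>(Arrays.asList("a0", "p0", "s0"));
        // "s" collides with "s0", so one underscore is added.
        System.out.println(findUnusedPrefix("s", used)); // prints _s
        // "x" is not a prefix of any used name, so it is returned unchanged.
        System.out.println(findUnusedPrefix("x", used)); // prints x
    }
}
```

Because the resulting prefix can be longer than one character, code that consumes the generated names cannot assume a fixed one-character prefix, which is the assumption the commit message says the old code made.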
2018-08-28 10:59:32 -07:00
Dayue Gao a879022bc8 fix AssertionError of semi join query (#6244) 2018-08-27 17:49:51 -07:00
Dayue Gao 2325844a38 fix incorrect check of maxSemiJoinRowsInMemory (#6242) 2018-08-27 16:28:29 -07:00
Gian Merlino 0172326c62 SQL: Support more result formats, add columns header. (#6191)
* SQL: Support more result formats, add columns header.

- Add result formats for line-based JSON and CSV.
- Add X-Druid-Sql-Columns header with a list of all columns that
the response will contain.
- Add more comprehensive documentation on what callers should expect
when making Druid SQL queries.

* Fix some tests.

* Adjust tests.

* Adjust trailer, add types header.

* Fix trailers.
2018-08-26 23:00:14 -06:00
Gian Merlino 28e6ae3664 SQL: Finalize aggregations for inner queries when necessary. (#6221)
* SQL: Finalize aggregations for inner queries when necessary.

Fixes #5779.

* Fixed test method name.
2018-08-25 13:56:23 -07:00
Jihoon Son ecee3e0a24 Further optimize memory for Travis jobs (#6150)
* Further optimize memory for Travis jobs

* fix build

* sudo false
2018-08-10 22:03:36 -07:00
Nishant Bangarwa 75c8a87ce1 Part 2 of changes for SQL Compatible Null Handling (#5958)
* Part 2 of changes for SQL Compatible Null Handling

* Review comments - break lines longer than 120 characters

* review comments

* review comments

* fix license

* fix test failure

* fix CalciteQueryTest failure

* Null Handling - Review comments

* review comments

* review comments

* fix checkstyle

* fix checkstyle

* remove unrelated change

* fix test failure

* fix failing test

* fix travis failures

* Make StringLast and StringFirst aggregators nullable and fix travis failures
2018-08-02 08:20:25 -07:00
Roman Leventov 0754d78a2e Prohibit Lists.newArrayList() with a single argument (#6068)
* Prohibit Lists.newArrayList() with a single argument

* Test fixes

* Add Javadoc to Node constructor
2018-07-31 20:09:10 -07:00
Benedict Jin 331a0afb98 Remove redundant type parameters and enforce some other style and inspection rules (#5980)
* Various changes about druid-services module

* Patch improvements from reviewer

* Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile

* Fix ArraysAsListWithZeroOrOneArgument

* Fix conflict

* Fix ToArrayCallWithZeroLengthArrayArgument

* Fix AliEqualsAvoidNull

* Remove blank line

* Remove unused import clauses

* Fix code style in TopNQueryRunnerTest

* Fix conflict

* Don't use Collections.singletonList when converting the type of array type

* Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase

* Roll back the latest commit

* Add java.io.File#toURL() into druid-forbidden-apis

* Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord

* Add a new regexp element into stylecode xml file

* Fix style error for new regexp

* Set the level of ArraysAsListWithZeroOrOneArgument as WARNING

* Fix style error for new regexp

* Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile

* Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR

* Add toArray(new Object[0]) regexp into checkstyle config file & fix them

* Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it

* Add a comment for string equals regexp in checkstyle config

* Fix code format

* Add RedundantTypeArguments as ERROR level inspection

* Fix cannot resolve symbol datasource
2018-07-27 16:56:49 -05:00
Jonathan Wei efab3b0160 Add concat and textcat SQL functions (#6005) 2018-07-20 11:21:04 -07:00
Gian Merlino cd8ea3da8d SQL: Add server-wide default time zone config. (#5993)
* SQL: Add server-wide default time zone config.

* Switch API.
2018-07-18 13:12:40 -07:00
Gian Merlino 04ea3c9f8c Update license headers. (#5976)
* Update license headers.

For compliance with http://www.apache.org/legal/src-headers.html.

* More license adjustments.

* Fix mistakenly edited package line.
2018-07-11 09:55:18 -07:00
Gian Merlino 948e73da77 Extend various test timeouts. (#5978)
False failures on Travis due to spurious timeout (in turn due to noisy
neighbors) is a bigger problem than legitimate failures taking too long
to time out. So it makes sense to extend timeouts.
2018-07-10 13:02:14 -07:00
Surekha 441c9819d9 Support limit for timeseries query (#5894) (#5931)
* Support limit for timeseries query (#5894)

* Fix tests

* Address PR comments

* Try to fix teamcity inspection checks

* Remove unused method from VirtualColumns

* Remove unused import statement
2018-07-09 08:58:42 -07:00
Jihoon Son 10a01d6846 [SQL] Fix missing postAggregations for Timeseries and TopN (#5912)
* [SQL] Fix missing postAggregations for Timeseries and TopN

* fix build

* fix test
2018-06-29 10:36:55 -07:00
Jonathan Wei 0eae89170e Make DruidPlanner constructor public again (#5891) 2018-06-20 11:10:50 -07:00