343 Commits

Author SHA1 Message Date
AmatyaAvadhanula
f970757efc
Optimize overlord GET /tasks memory usage (#12404)
The web-console (indirectly) calls the Overlord’s GET tasks API to fetch the tasks' summary which in turn queries the metadata tasks table. This query tries to fetch several columns, including payload, of all the rows at once. This introduces a significant memory overhead and can cause unresponsiveness or overlord failure when the ingestion tab is opened multiple times (due to several parallel calls to this API)

Another thing to note is that the task table (the payload column in particular) can be very large. Extracting large payloads from such tables can be very slow, leading to slow UI. While we are fixing the memory pressure in the overlord, we can also fix the slowness in UI caused by fetching large payloads from the table. Fetching large payloads also puts pressure on the metadata store as reported in the community (Metadata store query performance degrades as the tasks in druid_tasks table grows · Issue #12318 · apache/druid )

The task summaries returned as a response for the API are several times smaller and can fit comfortably in memory. So, there is an opportunity here to fix the memory usage, slow ingestion, and under-pressure metadata store by removing the need to handle large payloads in every layer we can. Of course, the solution becomes complex as we try to fix more layers. With that in mind, this page captures two approaches. They vary in complexity and also in the degree to which they fix the aforementioned problems.
2022-06-16 22:30:37 +05:30
superivaj
f9bdb3b236
Fix usage of maxColumnsToMerge in auto-compaction tuning config (#12551)
Issue: 
Even though `CompactionTuningConfig` allows a `maxColumnsToMerge` config
(to optimize memory usage, particulary for datasources with many dimensions),
the corresponding client object `ClientCompactionTaskQueryTuningConfig`
(used by the coordinator duty `CompactSegments` to trigger auto-compaction)
does not contain this field. Thus, the value of `maxColumnsToMerge` specified
in any datasource compaction config is ignored.

Changes:
- Add field `maxColumnsToMerge` in `ClientCompactionTaskQueryTuningConfig`
  and `UserCompactionTaskQueryTuningConfig`
- Fix tests
2022-05-20 22:23:08 +05:30
Gian Merlino
65a1375b67
SQL: Add is_active to sys.segments, update examples and docs. (#11550)
* SQL: Add is_active to sys.segments, update examples and docs.

is_active is short for:

  (is_published = 1 AND is_overshadowed = 0) OR is_realtime = 1

It's important because this represents "all the segments that should
be queryable, whether or not they actually are right now". Most of the
time, this is the set of segments that people will want to look at.

The web console already adds this filter to a lot of its queries,
proving its usefulness.

This patch also reworks the caveat at the bottom of the sys.segments
section, so its information is mixed into the description of each result
field. This should make it more likely for people to see the information.

* Wording updates.

* Adjustments for spellcheck.

* Adjust IT.
2022-05-19 14:23:28 -07:00
Jihoon Son
73ce5df22d
Add support for authorizing query context params (#12396)
The query context is a way that the user gives a hint to the Druid query engine, so that they enforce a certain behavior or at least let the query engine prefer a certain plan during query planning. Today, there are 3 types of query context params as below.

Default context params. They are set via druid.query.default.context in runtime properties. Any user context params can be default params.
User context params. They are set in the user query request. See https://druid.apache.org/docs/latest/querying/query-context.html for parameters.
System context params. They are set by the Druid query engine during query processing. These params override other context params.
Today, any context params are allowed to users. This can cause 
1) a bad UX if the context param is not matured yet or 
2) even query failure or system fault in the worst case if a sensitive param is abused, ex) maxSubqueryRows.

This PR adds an ability to limit context params per user role. That means, a query will fail if you have a context param set in the query that is not allowed to you. To do that, this PR adds a new built-in resource type, QUERY_CONTEXT. The resource to authorize has a name of the context param (such as maxSubqueryRows) and the type of QUERY_CONTEXT. To allow a certain context param for a user, the user should be granted WRITE permission on the context param resource. Here is an example of the permission.

{
  "resourceAction" : {
    "resource" : {
      "name" : "maxSubqueryRows",
      "type" : "QUERY_CONTEXT"
    },
    "action" : "WRITE"
  },
  "resourceNamePattern" : "maxSubqueryRows"
}
Each role can have multiple permissions for context params. Each permission should be set for different context params.

When a query is issued with a query context X, the query will fail if the user who issued the query does not have WRITE permission on the query context X. In this case,

HTTP endpoints will return 403 response code.
JDBC will throw ForbiddenException.
Note: there is a context param called brokerService that is used only by the router. This param is used to pin your query to run it in a specific broker. Because the authorization is done not in the router, but in the broker, if you have brokerService set in your query without a proper permission, your query will fail in the broker after routing is done. Technically, this is not right because the authorization is checked after the context param takes effect. However, this should not cause any user-facing issue and thus should be OK. The query will still fail if the user doesn’t have permission for brokerService.

The context param authorization can be enabled using druid.auth.authorizeQueryContextParams. This is disabled by default to avoid any hassle when someone upgrades his cluster blindly without reading release notes.
2022-04-21 14:21:16 +05:30
Maytas Monsereenusorn
c25a556827
Fix bug in auto compaction preserveExistingMetrics feature (#12438)
* fix bug

* fix test

* fix IT
2022-04-15 15:47:47 -07:00
Agustin Gonzalez
0460d45e92
Make tombstones ingestible by having them return an empty result set. (#12392)
* Make tombstones ingestible by having them return an empty result set.

* Spotbug

* Coverage

* Coverage

* Remove unnecessary exception (checkstyle)

* Fix integration test and add one more to test dropExisting set to false over tombstones

* Force dropExisting to true in auto-compaction when the interval contains only tombstones

* Checkstyle, fix unit test

* Changed flag by mistake, fixing it

* Remove method from interface since this method is specific to only DruidSegmentInputentity

* Fix typo

* Adapt to latest code

* Update comments when only tombstones to compact

* Move empty iterator to a new DruidTombstoneSegmentReader

* Code review feedback

* Checkstyle

* Review feedback

* Coverage
2022-04-15 09:08:06 -07:00
Maytas Monsereenusorn
36e17a20ea
Improve metrics for Auto Compaction (#12413)
* add impl

* add docs

* fix
2022-04-08 20:14:36 -07:00
Maytas Monsereenusorn
8edea5a82d
Add a new flag for ingestion to preserve existing metrics (#12185)
* add impl

* add impl

* fix checkstyle

* add impl

* add unit test

* fix stuff

* fix stuff

* fix stuff

* add unit test

* add more unit tests

* add more unit tests

* add IT

* add IT

* add IT

* add IT

* add ITs

* address comments

* fix test

* fix test

* fix test

* address comments

* address comments

* address comments

* fix conflict

* fix checkstyle

* address comments

* fix test

* fix checkstyle

* fix test

* fix test

* fix IT
2022-04-08 11:02:02 -07:00
AmatyaAvadhanula
c5531be553
Add feature flag for Kinesis listShards API usage (#12383)
listShards API was used to get all the shards for kinesis ingestion to improve its resiliency as part of #12161.

However, this may require additional permissions in the IAM policy where the stream is present. (Please refer to: https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListShards.html).

A dynamic configuration useListShards has been added to KinesisSupervisorTuningConfig to control the usage of this API and prevent issues upon upgrade. It can be safely turned on (and is recommended when using kinesis ingestion) by setting this configuration to true.
2022-04-04 14:58:10 +05:30
Jihoon Son
49a3f4291a
Add an integration test for null-only columns (#12365)
* integration test for null-only-columns

* metadata query

* fix test
2022-03-28 16:40:45 -07:00
Jihoon Son
b6eeef31e5
Store null columns in the segments (#12279)
* Store null columns in the segments

* fix test

* remove NullNumericColumn and unused dependency

* fix compile failure

* use guava instead of apache commons

* split new tests

* unused imports

* address comments
2022-03-23 16:54:04 -07:00
Maytas Monsereenusorn
dbb9518f50
Fix auto compaction by adjusting compaction task's interval to align with segmentGranularity when segmentGranularity is set (#12334)
* add impl

* add ITs

* address comments

* address comments

* address comments

* fix failure

* fix checkstyle

* fix checkstyle
2022-03-18 12:46:16 -07:00
Jihoon Son
5e23674fe5
Fix a race condition in the '/tasks' Overlord API (#12330)
* finds complete and active tasks from the same snapshot

* overlord resource

* unit test

* integration test

* javadoc and cleanup

* more cleanup

* fix test and add more
2022-03-17 10:47:45 +09:00
Agustin Gonzalez
abe76ccb90
Batch ingestion replace (#12137)
* Tombstone support for replace functionality

* A used segment interval is the interval of a current used segment that overlaps any of the input intervals for the spec

* Update compaction test to match replace behavior

* Adapt ITAutoCompactionTest to work with tombstones rather than dropping segments. Add support for tombstones in the broker.

* Style plus simple queriableindex test

* Add segment cache loader tombstone test

* Add more tests

* Add a method to the LogicalSegment to test whether it has any data

* Test filter with some empty logical segments

* Refactor more compaction/dropexisting tests

* Code coverage

* Support for all empty segments

* Skip tombstones when looking-up broker's timeline. Discard changes made to tool chest to avoid empty segments since they will no longer have empty segments after lookup because we are skipping over them.

* Fix null ptr when segment does not have a queriable index

* Add support for empty replace interval (all input data has been filtered out)

* Fixed coverage & style

* Find tombstone versions from lock versions

* Test failures & style

* Interner was making this fail since the two segments were consider equal due to their id's being equal

* Cleanup tombstone version code

* Force timeChunkLock whenever replace (i.e. dropExisting=true) is being used

* Reject replace spec when input intervals are empty

* Documentation

* Style and unit test

* Restore test code deleted by mistake

* Allocate forces TIME_CHUNK locking and uses lock versions. TombstoneShardSpec added.

* Unused imports. Dead code. Test coverage.

* Coverage.

* Prevent killer from throwing an exception for tombstones. This is the killer used in the peon for killing segments.

* Fix OmniKiller + more test coverage.

* Tombstones are now marked using a shard spec

* Drop a segment factory.json in the segment cache for tombstones

* Style

* Style + coverage

* style

* Add TombstoneLoadSpec.class to mapper in test

* Update core/src/main/java/org/apache/druid/segment/loading/TombstoneLoadSpec.java

Typo

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Update docs/configuration/index.md

Missing

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>

* Typo

* Integrated replace with an existing test since the replace part was redundant and more importantly, the test file was very close or exceeding the 10 min default "no output" CI Travis threshold.

* Range does not work with multi-dim

Co-authored-by: Jonathan Wei <jon-wei@users.noreply.github.com>
2022-03-08 20:07:02 -07:00
Xavier Léauté
1434197ee1
update airline dependency to 2.x (#12270)
* upgrade Airline to Airline 2
  https://github.com/airlift/airline is no longer maintained, updating to
  https://github.com/rvesse/airline (Airline 2) to use an actively
  maintained version, while minimizing breaking changes.

  Note, this is a backwards incompatible change, and extensions relying on
  the CliCommandCreator extension point will also need to be updated.

* fix dependency checks where jakarta.inject is now resolved first instead
  of javax.inject, due to Airline 2 using jakarta
2022-02-27 15:19:28 -08:00
AmatyaAvadhanula
1ec57cb935
Improve kinesis task assignment after resharding (#12235)
Problem:
- When a kinesis stream is resharded, the original shards are closed.
   Any intermediate shard created in the process is eventually closed as well.
- If a shard is closed before any record is put into it, it can be safely ignored for ingestion.
- It is expensive to determine if a closed shard is empty, since it requires a call to the Kinesis cluster.

Changes:
- Maintain a cache of closed empty and closed non-empty shards in `KinesisSupervisor`
- Add config `skipIngorableShards` to `KinesisSupervisorTuningConfig`
- The caches are used and updated only when `skipIgnorableShards = true`
2022-02-18 12:37:06 +05:30
Abhishek Agarwal
575874705f
Fix the flakiness in getLockedIntervals test (#12172)
Fix the flakiness in getLockedIntervals test
2022-02-17 12:08:46 +05:30
Daniel Koepke
47153cd7bd
Increase retries for Kinesis sharding integration tests. (#12255)
This fixes intermittent, spurious failures that we've observed in
the Kinesis sharding integration tests due to Kinesis taking
longer than the code expected to start a sharding operation. The
method that's changed is part of the integration test suite and
only used by the test cases that we've seen are flaky.

Prior to this change, the tests expected a sharding operation to
start in 9 seconds (30 retries * 300ms delay/retry). This change
bumps the number of retries to 100, giving Kinesis 30 seconds to
start the sharding.

This PR also makes a small, clarifying change to the condition
used to determine if sharding has started. Instead of checking if
the number of shards has increased (which was technically correct
even if the test is reducing the number of shards due to a Kinesis
implementation detail), we now just check if the shard count has
changed.
2022-02-14 23:33:13 -08:00
Maytas Monsereenusorn
2b8e7fc0b4
Add a flag to allow auto compaction task slot ratio to consider auto scaler slots (#12228)
* add impl

* fix checkstyle

* add unit tests

* checkstyle

* add IT

* fix IT

* add comments

* fix checkstyle
2022-02-06 20:46:05 -08:00
Jihoon Son
20347e0c86
Wait for datasource to be ready for SQL in integration tests (#12189)
* Wait for datasource to be ready for SQL in integration tests

* add limit to the check query
2022-01-25 10:14:26 -08:00
AmatyaAvadhanula
1f63b447c4
Mitigate Kinesis stream LimitExceededException by using listShards API (#12161)
Makes kinesis ingestion resilient to `LimitExceededException` caused by resharding.
Replace `describeStream` with `listShards` (recommended) to get shard related info.
`describeStream` has a limit (100) to the number of shards returned per call and a low default TPS limit of 10.
`listShards` returns the info for at most 1000 shards and has a higher TPS limit of 100 as well.

Key changed/added classes in this PR
 * `KinesisRecordSupplier`
 * `KinesisAdminClient`
2022-01-21 10:15:51 +05:30
Maytas Monsereenusorn
bd7fe45da0
Support adding metrics in Auto Compaction (#12125)
* add impl

* add impl

* add unit tests

* add unit tests

* add unit tests

* add unit tests

* add unit tests

* add integration tests

* add integration tests

* fix LGTM

* fix test

* remove doc
2022-01-17 20:19:31 -08:00
Jihoon Son
4a74c5adcc
Use Druid's extension loading for integration test instead of maven (#12095)
* Use Druid's extension loading for integration test instead of maven

* fix maven command

* override config path

* load input format extensions and kafka by default; add prepopulated-data group

* all docker-composes are overridable

* fix s3 configs

* override config for all

* fix docker_compose_args

* fix security tests

* turn off debug logs for overlord api calls

* clean up stuff

* revert docker-compose.yml

* fix override config for query error test; fix circular dependency in docker compose

* add back some dependencies in docker compose

* new maven profile for integration test

* example file filter
2022-01-05 23:33:04 -08:00
Maytas Monsereenusorn
b53e7f4d12
Support overlapping segment intervals in auto compaction (#12062)
* add impl

* add impl

* fix more bugs

* add tests

* fix checkstyle

* address comments

* address comments

* fix test
2022-01-04 11:47:38 -08:00
Frank Chen
58245b4617
Support JsonPath functions in JsonPath expressions (#11722)
* Add jsonPath functions support

* Add jsonPath function test for Avro

* Add jsonPath function length() to Orc

* Add jsonPath function length() to Parquet

* Add more tests to ORC format

* update doc

* Fix exception during ingestion

* Add IT test case

* Revert "Fix exception during ingestion"

This reverts commit 5a5484b9ea9d984622149c8113a566269cc10842.

* update IT test case

* Add 'keys()'

* Commit IT test case

* Fix UT
2021-12-10 10:53:23 +08:00
Jihoon Son
fc9513b6cd
Make NodeRole available during binding; add support for dynamic registration of DruidService (#12012)
* Make nodeRole available during binding; add support for dynamic registration of DruidService

* fix checkstyle and test

* fix customRole test

* address comments

* add more javadoc
2021-12-03 11:59:00 -08:00
Paul Rogers
a66f10eea1
Code cleanup from query profile project (#11822)
* Code cleanup from query profile project

* Fix spelling errors
* Fix Javadoc formatting
* Abstract out repeated test code
* Reuse constants in place of some string literals
* Fix up some parameterized types
* Reduce warnings reported by Eclipse

* Reverted change due to lack of tests
2021-11-30 11:35:38 -08:00
Frank Chen
98957be044
Return HTTP 404 instead of 400 for supervisor/task endpoints (#11724)
* Use 404 instead of 400

* Use 404 instead of 400

* Add UT test cases

* Add IT testcases

* add UT for task resource filter

Signed-off-by: frank chen <frank.chen021@outlook.com>

* Using org.testing.Assert instead of org.junit.Assert

* Resolve comments and fix test

* Fix test

* Fix tests

* Resolve comments
2021-11-25 13:09:47 +08:00
Maytas Monsereenusorn
bb3d2a433a
Support filtering data in Auto Compaction (#11922)
* add impl

* fix checkstyle

* add test

* add test

* add unit tests

* fix unit tests

* fix unit tests

* fix unit tests

* add IT

* add IT

* add comments

* fix spelling
2021-11-24 10:56:38 -08:00
Gian Merlino
b13f07a057
Harmonize local input sources; fix batch index integration test. (#11965)
* Make LocalInputSource.files a List instead of Set and adjust wikipedia_index_task to use file list.

Rationale: the behavior of wikipedia_index_task.json is order-dependent with regard to its input
files; some orders produce 4 segments and some produce 5 segments. Some integration tests, like
ITSystemTableBatchIndexTaskTest and ITAutoCompactionTest, are written assuming that the
4-segment case will always happen. Providing the file list in a specific order ensures that this
will happen as expected by the tests.

I didn't see a specific reason why the LocalInputSource.files parameter needed to be a Set, so
changing it to a List was the simplest way to achieve the consistent ordering. I think it will
also make the behavior make more sense if someone does specify the same input file multiple
times in a spec: I think they'd expect it to be loaded multiple times instead of deduped. This
is consistent with the behavior of other input sources like S3, GCS, HTTP.

* Sort files in LocalFirehoseFactory.
2021-11-21 22:26:31 -08:00
TSFenwick
1487f558b1
Use a simple class to sanitize JDBC exceptions and also log them (#11843)
* Use a simple class to sanitize sanitizable errors and log them

The purpose of this is to sanitize JDBC errors, but can sanitize other errors
if they implement SanitizableError Interface

add a class to log errors and sanitize them
added a simple test that tests out that the error gets sanitized
add @NonNull annotation to serverconfig's ErrorResponseTransfromStrategy

* return less information as part of too many connections, and instead only log specific details

This is so an end user gets relevant information but not too much info since they might now how
many brokers they have

* return only runtime exceptions

added new error types that need to be sanitized
also sanitize deprecated and unsupported exceptions.

* dont reqrewite exceptions unless necessary for checked exceptions

add docs
avoid blanket turning all exceptions into runtime exceptions

* address comments, to fix up docs.

add more javadocs
add support UOE sanitization

* use try catch instead and sanitize at public methods

* checkstyle fixes

* throw noSuchStatement and NoSuchConnection as Avatica is affected by those

* address comments. move log error back to druid meta

clean up bad formatting and commented code. add missed catch for NoSuchStatementException
clean up comments for error handler and add comment explainging not wanting to santize avatica exceptions

* alter test to reflect new error message
2021-11-16 13:13:03 -08:00
Gian Merlino
6f6e88e02e
SQL: Add type headers to response formats. (#11914)
This allows clients to interpret the results of SQL queries without having to guess types.
2021-11-13 11:30:57 +05:30
Clint Wylie
5baa22148e
revert ColumnAnalysis type, add typeSignature and use it for DruidSchema (#11895)
* revert ColumnAnalysis type, add typeSignature and use it for DruidSchema

* review stuffs

* maybe null

* better maybe null

* Update docs/querying/segmentmetadataquery.md

* Update docs/querying/segmentmetadataquery.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* fix null right

* sad

* oops

* Update batch_hadoop_queries.json

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2021-11-10 18:46:29 -08:00
Jihoon Son
13bec7468a
Fix NPE for SQL queries when a query parameter is missing in the mid (#11900)
* Fix NPE for SQL queries when a query parameter is missing in the mid

* checkstyle

* Throw SqlPlanningException instead of IAE
2021-11-10 10:02:26 -08:00
Maytas Monsereenusorn
ddc68c6a81
Support changing dimension schema in Auto Compaction (#11874)
* add impl

* add unit tests

* fix checkstyle

* add impl

* add impl

* add impl

* add impl

* add impl

* add impl

* fix test

* add IT

* add IT

* fix docs

* add test

* address comments

* fix conflict
2021-11-08 21:17:08 -08:00
Maytas Monsereenusorn
ba2874ee1f
Support changing query granularity in Auto Compaction (#11856)
* add queryGranularity

* fix checkstyle

* fix test
2021-11-01 15:18:44 -07:00
Karan Kumar
90640bb316
Support for hadoop 3 via maven profiles (#11794)
Add support for hadoop 3 profiles . Most of the details are captured in #11791 .
We use a combination of maven profiles and resource filtering to achieve this. Hadoop2 is supported by default and a new maven profile with the name hadoop3 is created. This will allow the user to choose the profile which is best suited for the use case.
2021-10-30 22:46:24 +05:30
Maytas Monsereenusorn
33d9d9bd74
Add rollup config to auto and manual compaction (#11850)
* add rollup to auto and manual compaction

* add unit tests

* add unit tests

* add IT

* fix checkstyle
2021-10-29 10:22:25 -07:00
Kashif Faraz
abac9e39ed
Revert permission changes to Supervisor and Task APIs (#11819)
* Revert "Require Datasource WRITE authorization for Supervisor and Task access (#11718)"

This reverts commit f2d6100124dbe7cbc92ad91d28bd12a1800a1f2a.

* Revert "Require DATASOURCE WRITE access in SupervisorResourceFilter and TaskResourceFilter (#11680)"

This reverts commit 6779c4652d531b4d2c7056a69660f4e318f4aef6.

* Fix docs for the reverted commits

* Fix and restore deleted tests

* Fix and restore SystemSchemaTest
2021-10-25 14:50:38 +05:30
Agustin Gonzalez
887cecf29e
Simplify ITHttpInputSourceTest to mitigate flakiness (#11751)
* Increment retry count to add more time for tests to pass

* Re-enable ITHttpInputSourceTest

* Restore original count

* This test is about input source, hash partitioning takes longer and not required thus changing to dynamic

* Further simplify by removing sketches
2021-10-12 11:51:27 -05:00
Kashif Faraz
f2d6100124
Require Datasource WRITE authorization for Supervisor and Task access (#11718)
Follow up PR for #11680

Description
Supervisor and Task APIs are related to ingestion and must always require Datasource WRITE
authorization even if they are purely informative.

Changes
Check Datasource WRITE in SystemSchema for tables "supervisors" and "tasks"
Check Datasource WRITE for APIs /supervisor/history and /supervisor/{id}/history
Check Datasource for all Indexing Task APIs
2021-10-08 10:39:48 +05:30
Jihoon Son
1c0b76ba93
Add killAndRestart for container for integration tests (#11754) 2021-09-30 13:47:57 -07:00
Clint Wylie
11017ef00a
support jdbc even if trailing / is missing (#11737)
* support jdbc even if trailing / is missing

* fix tests
2021-09-29 13:59:26 -07:00
Maytas Monsereenusorn
a04b08e45c
Add new config to filter internal Druid-related messages from Query API response (#11711)
* add impl

* add impl

* add tests

* add unit test

* fix checkstyle

* address comments

* fix checkstyle

* fix checkstyle

* fix checkstyle

* fix checkstyle

* fix checkstyle

* address comments

* address comments

* address comments

* fix test

* fix test

* fix test

* fix test

* fix test

* change config name

* change config name

* change config name

* address comments

* address comments

* address comments

* address comments

* address comments

* address comments

* fix compile

* fix compile

* change config

* add more tests

* fix IT
2021-09-29 12:55:49 +07:00
Agustin Gonzalez
988623b7ae
ITHttpInputSourceTest instability blocking the development pipeline (#11749) 2021-09-28 13:42:01 -07:00
Clint Wylie
3525c0b195
make authorization integration test more extensible (#11730) 2021-09-22 08:15:30 -07:00
Clint Wylie
5de26cf6d9
add optional system schema authorization (#11720)
* add optional system schema authorization

* remove unused

* adjust docs

* doc fixes, missing ldap config change for integration tests

* style
2021-09-21 13:28:26 -07:00
Lucas Capistrant
5c3f3da146
Add handoff wait time to IngestionStatsAndErrorsTaskReportData (#11090)
* Add handoff wait time to ingestion stats report. Refactor some code for batch handoff

* fix checkstyle

* Add assertion to AbstractITBatchIndexTask to make sure report reflects wait for segments happened

* add docs to the task reports section of doc
2021-09-20 22:48:44 -07:00
Jihoon Son
82049bbf0a
Cancel API for sqls (#11643)
* initial work

* reduce lock in sqlLifecycle

* Integration test for sql canceling

* javadoc, cleanup, more tests

* log level to debug

* fix test

* checkstyle

* fix flaky test; address comments

* rowTransformer

* cancelled state

* use lock

* explode instead of noop

* oops

* unused import

* less aggressive with state

* fix calcite charset

* don't emit metrics when you are not authorized
2021-09-05 10:57:45 -07:00
Jihoon Son
7e90d00cc0
Configurable maxStreamLength for doubles sketches (#11574)
* Configurable maxStreamLength for doubles sketches

* fix equals/hashcode and it test failure

* fix test

* fix it test

* benchmark

* doc

* grouping key

* fix comment

* dependency check

* Update docs/development/extensions-core/datasketches-quantiles.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/sql.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2021-08-31 14:56:37 -07:00