Commit Graph

1225 Commits

Author SHA1 Message Date
Clint Wylie 277aaa5c57
remove druid.processing.columnCache.sizeBytes and CachingIndexed, combine string column implementations (#14500)
* combine string column implementations
changes:
* generic indexed, front-coded, and auto string columns now all share the same column and index supplier implementations
* remove CachingIndexed implementation, which I think is largely no longer needed by the switch of many things to directly using ByteBuffer, avoiding the cost of creating Strings
* remove ColumnConfig.columnCacheSizeBytes since CachingIndexed was the only user
2023-07-02 19:37:15 -07:00
Gian Merlino 58f3faf299
SortMergeJoinFrameProcessor: Fix two bugs with buffering. (#14196)
1) Fix a problem where the fault wasn't reported when the left-hand side
   had too many buffered frames. (Instead, frames continued to be buffered,
   eventually running the server out of memory.)

2) Always update the mark when rewinding isn't necessary. It fixes a problem where
   frames would be needlessly buffered when there isn't a key match across
   the two sides.

3) Memory reserved for building the trackers now changes based on the heap size.
2023-07-02 19:52:52 +05:30
Gian Merlino 048dbcee88
MSQ: Improve InsertTimeOutOfBounds error message. (#14511)
Nicer, actionable error message for the `InsertTimeOutOfBounds` fault.
2023-07-02 01:44:19 +05:30
Gian Merlino 67fbd8e7fc
Add "stringEncoding" parameter to DataSketches HLL. (#11201)
* Add "stringEncoding" parameter to DataSketches HLL.

Builds on the concept from #11172 and adds a way to feed HLL sketches
with UTF-8 bytes.

This must be an option rather than always-on, because prior to this
patch, HLL sketches used UTF-16LE encoding when hashing strings. To
remain compatible with sketch images created prior to this patch -- which
matters during rolling updates and when reading sketches that have been
written to segments -- we must keep UTF-16LE as the default.

Not currently documented, because I'm not yet sure how best to expose
this functionality to users. I think the first place would be in the SQL
layer: we could have it automatically select UTF-8 or UTF-16LE when
building sketches at query time. We need to be careful about this, though,
because UTF-8 isn't always faster. Sometimes, like for the results of
expressions, UTF-16LE is faster. I expect we will sort this out in
future patches.
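
For illustration only, a native HLLSketchBuild aggregator spec that opts into the new behavior might look like the following sketch; the metric and field names are hypothetical, and "utf8" is an assumed value for the new parameter, since the option is not yet documented:

  {
    "type": "HLLSketchBuild",
    "name": "unique_users",
    "fieldName": "user",
    "stringEncoding": "utf8"
  }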

* Fix benchmark.

* Fix style issues, improve test coverage.

* Put round back, to make IT updates easier.

* Fix test.

* Fix issue with filtered aggregators and add test.

* Use DS native update(ByteBuffer) method. Improve test coverage.

* Add another suppression.

* Fix ITAutoCompactionTest.

* Update benchmarks.

* Updates.

* Fix conflict.

* Adjustments.
2023-06-30 12:45:55 -07:00
Gian Merlino a6cabbe10f
SQL: Avoid "intervals" for non-table-based datasources. (#14336)
In these other cases, stick to plain "filter". This simplifies lots of
logic downstream, and doesn't hurt since we don't have intervals-specific
optimizations outside of tables.

Fixes an issue where we couldn't properly filter on a column from an
external datasource if it was named __time.
2023-06-29 09:57:11 +05:30
Gian Merlino c798d3fb2e
Fix flaky SqlStatementResourceTest. (#14498)
Mocks generally have state and should not be static. In particular, the
"Yielder" included in one of the mocks can only be iterated once, which
made the test suite order-dependent.
2023-06-29 05:42:44 +05:30
Jonathan Wei c36f12f1d8
Support complex variance object inputs for variance SQL agg function (#14463)
* Support complex variance object inputs for variance SQL agg function

* Add test

* Include complexTypeChecker, address PR comments

* Checkstyle, javadoc link
2023-06-28 13:14:19 -05:00
Adarsh Sanjeev 233233c92d
Add query context parameter to control limiting select rows (#14476)
* Add query context parameter to control limiting select rows

* Add unit tests

* Address review comments

* Address review comments

* Address review comments
2023-06-28 17:54:24 +05:30
Karan Kumar cb3a9d2b57
Adding Interactive APIs for MSQ engine (#14416)
This PR aims to expose a new API, "/druid/v2/sql/statements/", which takes the same payload as the current "/druid/v2/sql" endpoint and allows users to fetch results in an async manner.
2023-06-28 17:51:58 +05:30
Adarsh Sanjeev 0335aaa279
Add query results directory and prevent the auto cleaner from cleaning it (#14446)
Adds support for automatic cleaning of a "query-results" directory in durable storage. This directory will be cleaned up only if the task id is not known to the overlord. This will allow the storage of query results after the task has finished running.
2023-06-28 10:14:04 +05:30
Laksh Singla f546cd64a9
MSQ: Ensure that the allocated segment aligns with the requested granularity (#14475)
Changes:
- Throw an `InsertCannotAllocateSegmentFault` if the allocated segment is not aligned with
the requested granularity.
- Tests to verify new behaviour
2023-06-27 09:25:32 +05:30
Gian Merlino 8211379de6
MSQ: Change default clusterStatisticsMergeMode to SEQUENTIAL. (#14310)
* MSQ: Change default clusterStatisticsMergeMode to SEQUENTIAL.

This is an undocumented parameter that controls how cluster-by statistics
are merged. In PARALLEL mode, statistics are gathered from workers all
at once. In SEQUENTIAL mode, statistics are gathered time chunk by time
chunk. This improves accuracy for jobs with many time chunks, and reduces
memory usage.

The main downside of SEQUENTIAL is that it can take longer, but in most
situations I've seen, PARALLEL is only really usable in cases where the
sketches are small enough that SEQUENTIAL would also run relatively
quickly. So it seems like SEQUENTIAL is a better default.
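
A job that prefers the old behavior can opt back into it via the query context. As a minimal sketch (the INSERT statement and datasource names are hypothetical):

  {
    "query": "INSERT INTO sales SELECT __time, item, price FROM input_data PARTITIONED BY DAY",
    "context": {"clusterStatisticsMergeMode": "PARALLEL"}
  }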

* Switch off-test from SEQUENTIAL to PARALLEL.

* Fix sequential merge for situations where there are no time chunks at all.

* Add a couple more tests.
2023-06-26 10:54:28 -07:00
Laksh Singla 114380749d
MSQ: Improve the parse exception errors and the handling of null UTF characters in Strings in Frames (#14398) 2023-06-26 18:14:29 +05:30
Laksh Singla 1647d5f4a0
Limit the subquery results by memory usage (#13952)
Users can now add a guardrail to prevent a subquery's results from exceeding a set number of bytes by setting druid.server.http.maxSubqueryBytes in the Broker's config or maxSubqueryBytes in the query context. This feature is experimental for now and falls back to row-based limiting if it cannot determine the accurate size of the results consumed by the query.
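
As a minimal sketch of the query-context form (the query and the 100 MB limit are hypothetical):

  {
    "query": "SELECT COUNT(*) FROM events WHERE user IN (SELECT user FROM events GROUP BY user HAVING COUNT(*) > 100)",
    "context": {"maxSubqueryBytes": 100000000}
  }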
2023-06-26 18:12:28 +05:30
Tejaswini Bandlamudi 72cf91fbc0
Upgrade Avro to latest version (#14440)
Upgraded Avro to 1.11.1
2023-06-24 14:51:30 +05:30
Gian Merlino 3d19b748fb
SQL OperatorConversions: Introduce aggregatorBuilder, allow CAST-as-literal. (#14249)
* SQL OperatorConversions: Introduce aggregatorBuilder, allow CAST-as-literal.

Four main changes:

1) Provide aggregatorBuilder, a more consistent way of defining the
   SqlAggFunction we need for all of our SQL aggregators. The mechanism
   is analogous to the one we already use for SQL functions
   (OperatorConversions.operatorBuilder).

2) Allow CASTs of constants to be considered as "literalOperands". This
   fixes an issue where various of our operators are defined with
   OperandTypes.LITERAL as part of their checkers, which doesn't allow
   casts. However, in these cases we generally _do_ want to allow casts.
   The important piece is that the value must be reducible to a constant,
   not that the SQL text is literally a literal.

3) Update DataSketches SQL aggregators to use the new aggregatorBuilder
   functionality. The main user-visible effect here is [2]: the aggregators
   would now accept, for example, "CAST(0.99 AS DOUBLE)" as a literal
   argument (see the example after this list). Other aggregators could be
   updated in a future patch.

4) Rename "requiredOperands" to "requiredOperandCount", because the
   old name was confusing. (It rhymes with "literalOperands" but the
   arguments mean different things.)
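
A sketch of [2] and [3] together, assuming the DataSketches quantiles aggregator APPROX_QUANTILE_DS and hypothetical table and column names; a statement like this would previously have been rejected because the literal checker did not accept the CAST:

  {
    "query": "SELECT APPROX_QUANTILE_DS(price, CAST(0.99 AS DOUBLE)) FROM trades"
  }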

* Adjust method calls.
2023-06-23 16:25:04 -07:00
Gian Merlino ddd0fc1b85
S3: Attach SSE key to doesObjectExist calls. (#14290)
* S3: Attach SSE key to doesObjectExist calls.

We did not previously attach the SSE key to the doesObjectExist request,
leading to an inconsistency that may cause problems on "S3-compatible"
implementations. This patch implements doesObjectExist using similar
logic to the S3 client itself, but calls our implementation of
getObjectMetadata rather than the S3 client's, ensuring the request
is decorated with the SSE key.

* Fix tests.
2023-06-23 15:23:59 -07:00
imply-cheddar 7e2cf35d7b
Fix compatibility issue with SqlTaskResource (#14466)
* Fix compatibility issue with SqlTaskResource

The DruidException changes broke the response format
for errors coming back from the SqlTaskResource, so fix those.
2023-06-23 01:15:32 -07:00
imply-cheddar cfd07a95b7
Errors take 3 (#14004)
Introduce DruidException, an exception whose goal in life is to be delivered to a user.

DruidException itself has javadoc on it to describe how it should be used.  This commit both introduces the Exception and adjusts some of the places that are generating exceptions to generate DruidException objects instead, as a way to show how the Exception should be used.

This work was a 3rd iteration on top of work that was started by Paul Rogers.  I don't know if his name will survive the squash-and-merge, so I'm calling it out here and thanking him for starting on this.
2023-06-19 01:11:13 -07:00
Abhishek Radhakrishnan 04fb75719e
Fail query planning if a `CLUSTERED BY` column contains descending order (#14436)
* Throw ValidationException if CLUSTERED BY column descending order is specified.

- Fails query planning (see the example after this list)

* Some more tests.

* fixup existing comment

* Update comment

* checkstyle fix: remove unused imports

* Remove InsertCannotOrderByDescendingFault and deprecate the fault in readme.

* move deprecated field to the bottom
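
For instance, a statement along these lines (names hypothetical) now fails at planning time instead of surfacing the InsertCannotOrderByDescendingFault at runtime:

  {
    "query": "INSERT INTO pageviews SELECT __time, channel, cnt FROM src PARTITIONED BY DAY CLUSTERED BY channel DESC"
  }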
2023-06-16 18:10:12 -04:00
Gian Merlino 85656a467c
MSQ: Load broadcast tables on workers. (#14437)
They were not previously loaded because supportsQueries was false.
This patch sets supportsQueries to true, and clarifies in Task
javadocs that supportsQueries can be true for tasks that aren't
directly queryable over HTTP.
2023-06-16 12:02:20 +05:30
Laksh Singla 4935f2470a
Limit results generated by SELECT queries in MSQ (#14370)
* Limit select results in MSQ

* reduce number of files in test

* add truncated flag

* avoid materializing select results to list, use iterable instead

* javadocs
2023-06-15 13:13:11 +05:30
Clint Wylie ff5ae4db6c
fix kafka input format reader schema discovery and partial schema discovery (#14421)
* fix kafka input format reader schema discovery and partial schema discovery to actually work right, by re-using dimension filtering logic of MapInputRowParser
2023-06-15 00:11:04 -07:00
Pranav 5314db9f85
Adding the file mapper to handle v2 buffer deserialization (#14429) 2023-06-14 19:41:44 -07:00
Clint Wylie 61120dc49a
fix Kafka input format to throw ParseException if timestamp is missing (#14413) 2023-06-13 09:00:11 -07:00
Adarsh Sanjeev 267cbac6ff
Add logs for deleting files using storage connector (#14350)
* Add logs for deleting files using storage connector

* Address review comments

* Update log message format
2023-06-11 21:24:30 +05:30
Kashif Faraz 6e158704cb
Do not retry INSERT task into metadata if max_allowed_packet limit is violated (#14271)
Changes
- Add a `DruidException` which contains a user-facing error message, HTTP response code
- Make `EntryExistsException` extend `DruidException`
- If metadata store max_allowed_packet limit is violated while inserting a new task, throw
`DruidException` with response code 400 (bad request) to prevent retries
- Add `SQLMetadataConnector.isRootCausePacketTooBigException` with impl for MySQL
2023-06-10 12:15:44 +05:30
Atul Mohan 6a4cbab4b8
Upgrade parquet-mr version (#14070)
* Upgrade parquet version

* Move parquet version to hadoop3

* Fix license

* Exclude audience annotations
2023-06-07 08:54:54 -07:00
Soumyava 01b22ca022
HLL sketch and Theta sketch estimates can now be used as expressions (#14312)
* HLL sketch estimate can now be used as an expression
* Theta sketch estimate can now be used as an expression (example below)
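
As a sketch, the new expression could be used in a native query virtual column like this (column names hypothetical; theta_sketch_estimate would be used the same way):

  {
    "type": "expression",
    "name": "estimated_uniques",
    "expression": "hll_sketch_estimate(hll_users)"
  }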
2023-06-06 20:14:25 -07:00
Abhishek Radhakrishnan 2d258a95ad
Fix `EARLIEST_BY`/`LATEST_BY` signature and include function name in signature. (#14352)
* Fix EarliestLatestBySqlAggregator signature; Include function name for all signatures.

* Single quote function signatures, space between args and remove \n.

* fixup UT assertion
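
For reference, a minimal sketch of the two-argument form whose signature these messages now report (table and column names hypothetical):

  {
    "query": "SELECT item, LATEST_BY(price, __time) AS last_price FROM sales GROUP BY item"
  }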
2023-06-06 09:41:05 -07:00
Laksh Singla 5da601c47e
fix npe (#14369) 2023-06-06 17:01:42 +05:30
Gian Merlino a0d49baad6
MSQ: Fix issue with rollup ingestion and aggregators with multiple names. (#14367)
The same aggregator can have two output names for a SQL like:

  INSERT INTO foo
  SELECT x, COUNT(*) AS y, COUNT(*) AS z
  FROM t
  GROUP BY 1
  PARTITIONED BY ALL

In this case, the SQL planner will create a query with a single "count"
aggregator mapped to output names "y" and "z". The prior MSQ code did
not properly handle this case, instead throwing an error like:

  Expected single output for query column[a0] but got [[1, 2]]
2023-06-06 10:28:41 +05:30
zachjsh 04a82da63d
Input source security fixes (#14266)
It was found that several supported tasks / input sources did not have implementations for the methods used by the input source security feature, causing these tasks and input sources to fail when used with this feature. This PR adds the needed missing implementations and also secures the sampling endpoint with input source security, when enabled.
2023-06-01 16:37:19 -07:00
zachjsh e75fb8e8e3
Account for data format and compression in MSQ auto taskAssignment (#14307)
### Description

This change allows for consideration of the input format and compression when computing how to split the input files among available tasks in MSQ ingestion, under the `maxInputBytesPerWorker` query context parameter. This parameter allows users to control the maximum number of bytes, with granularity of input file / object, that ingestion tasks will be assigned to ingest. With this change, the parameter now denotes the estimated weighted size in bytes of the input to split on, with consideration for input format and compression format, rather than the actual file size reported by the file system.

We assume uncompressed newline-delimited JSON as a baseline, with a scaling factor of `1`. This means that when computing the byte weight a file has towards the input splitting, we take the file size as-is if it is uncompressed JSON, 1:1. It was found during testing that gzip-compressed JSON and Parquet have scale factors of `4` and `8` respectively, meaning that each byte of data is weighted 4x and 8x respectively when computing input splits. This weighted byte scaling is only considered for MSQ ingestion that uses either LocalInputSource or CloudObjectInputSource at the moment. The default value of the `maxInputBytesPerWorker` query context parameter has been updated from 10 GiB to 512 MiB.
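
As a minimal sketch of setting the parameter explicitly (the statement and names are hypothetical; 536870912 bytes is the new 512 MiB default):

  {
    "query": "INSERT INTO pageviews SELECT __time, page FROM source_data PARTITIONED BY DAY",
    "context": {"maxInputBytesPerWorker": 536870912}
  }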
2023-06-01 12:53:49 -07:00
panhongan c244c3de53
fix hdfs initialization issue (#14276)
* fix hdfs initialization issue

* add PR

* remove conf settings

* Improve comments

* move hdfs storage validation to start handler

* restore exception
2023-05-30 12:41:54 -07:00
Alexander Saydakov 4131c0df13
use the latest datasketches-java-4.0.0 (#14334)
* use the latest datasketches-java-4.0.0

* updated versions of datasketches

* adjusted expectation

* fixed the expectations

---------

Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>
2023-05-27 22:19:18 -07:00
Karan Kumar 8d256e35b4
MSQ ignores tombstone segments for downloads. (#14342) 2023-05-27 14:21:52 +05:30
Abhishek Radhakrishnan a5e04d95a4
Add `TYPE_NAME` to the complex serde classes and replace the hardcoded names. (#14317)
* Add TYPE_NAME to the serde classes and reuse them instead of hardcoded strings.

* Static check fixes.
2023-05-23 00:54:47 -05:00
Adarsh Sanjeev e8ef31fe92
Fix condition for timeout in worker task launcher (#14270)
* Fix condition for timeout in worker task launcher
2023-05-16 08:30:00 +05:30
Adarsh Sanjeev 10bce22e68
Configure maxBytesPerWorker directly instead of using StageDefinition (#14257)
* Configure maxBytesPerWorker directly instead of using StageDefinition
2023-05-15 16:51:57 +05:30
imply-cheddar f9861808bc
Be able to load segments on Peons (#14239)
* Be able to load segments on Peons

This change introduces a new config on WorkerConfig
that indicates how many bytes of each storage
location to use for storage of a task.  Said config
is divided up amongst the locations and slots
and then used to set TaskConfig.tmpStorageBytesPerTask

The Peons use their local task dir and
tmpStorageBytesPerTask as their StorageLocations for
the SegmentManager such that they can accept broadcast
segments.
2023-05-12 16:51:00 -07:00
Kashif Faraz ba11b3d462
Refactor: Add OverlordDuty to replace OverlordHelper and align with CoordinatorDuty (#14235)
Changes:
- Replace `OverlordHelper` with `OverlordDuty` to align with `CoordinatorDuty`
  - Each duty has a `run()` method and defines a `Schedule` with an initial delay and period.
  - Update existing duties `TaskLogAutoCleaner` and `DurableStorageCleaner`
- Add utility class `Configs`
- Update log, error messages and javadocs
- Other minor style improvements
2023-05-12 22:39:56 +05:30
Clint Wylie 625c4745b1
add context flag "useAutoColumnSchemas" to use new auto types for MSQ segment generation (#14175) 2023-05-10 15:37:14 -07:00
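
As a minimal sketch of enabling the flag from the commit above in an MSQ query context (the INSERT statement and names are hypothetical):

  {
    "query": "INSERT INTO events SELECT __time, ip, attributes FROM source_data PARTITIONED BY DAY",
    "context": {"useAutoColumnSchemas": true}
  }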
Adarsh Sanjeev fb38085ddb
Add wait for worker shutdown to MSQ task cancel (#14198)
* Add wait for worker shutdown to MSQ task cancel

* Fix checkstyle
2023-05-05 16:29:59 -07:00
Abhishek Radhakrishnan 46dabab36d
Fix NPE in test parse exception report. Add more tests with different thresholds. (#14209) 2023-05-05 10:05:41 -07:00
zachjsh 48cde236c4
Add columnMappings to explain plan output (#14187)
* Add columnMappings to explain plan output

* * fix checkstyle
* add tests

* * improve test coverage

* * temporarily remove unit-test need to run ITs

* * depend on build

* * temporarily lower unit test threshold

* * add back dependency on unit-tests

* * add license headers

* * fix header order

* * review comments

* * fix intellij inspection errors

* * revert code coverage change
2023-05-04 10:36:28 -07:00
Abhishek Radhakrishnan 68f908e511
Fix uncaught `ParseException` when reading Avro from Kafka (#14183)
In StreamChunkParser#parseWithInputFormat, we call byteEntityReader.read() without handling a potential ParseException, which is thrown during this function call by the delegate AvroStreamReader#intermediateRowIterator.
A ParseException can be thrown if an Avro stream has corrupt data or data that doesn't conform to the specified schema, or for other decoding reasons. This exception, if uncaught, can cause ingestion to fail.
2023-05-04 12:35:36 +05:30
Abhishek Radhakrishnan 954f3917ef
Add check for required avroBytesDecoder property that otherwise causes NPE. (#14177) 2023-05-03 09:53:58 -07:00
Karan Kumar 6f0cdd0c3f
`TaskStartTimeoutFault` now depends on the last successful worker launch time. (#14172)
* `TaskStartTimeoutFault` now depends on the last successful worker launch time.
2023-05-03 00:05:15 +05:30
Laksh Singla 387e682fbc
Fix memory calculations for WorkerMemoryParameters for machines with relatively less heap space (#14117)
* update worker memory parameters
2023-05-02 09:24:56 +05:30