* update link and title
* Discard changes to website/package.json
* Apply suggestions from code review
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
---------
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Support ListBasedInputRow in Kafka ingestion with header
* Fix up buildBlendedEventMap
* Add new test for KafkaInputFormat with csv value and headers
* Do not use forbidden APIs
* Move utility method to TestUtils
index_realtime tasks were removed from the documentation in #13107. Even
at that time, they weren't really documented per se, just mentioned. They
existed solely to support Tranquility, which is an obsolete ingestion
method that predates the migration of Druid to the ASF and is no longer being
maintained. Tranquility docs were also de-linked from the sidebars and
the other doc pages in #11134. Only a stub remains, so people with
links to the page can see that it's no longer recommended.
index_realtime_appenderator tasks existed in the code base, but were
never documented, nor as far as I am aware were they used for any purpose.
This patch removes both task types completely, along with all
supporting code that was otherwise unused. It also updates the stub
doc for Tranquility to be firmer that it is not compatible. (Previously,
the stub doc said it wasn't recommended, and pointed out that it is
built against an ancient 0.9.2 version of Druid.)
ITUnionQueryTest has been migrated to the new integration tests framework and updated to use Kafka ingestion.
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
* Initial support for bootstrap segments.
- Adds a new API in the coordinator.
- All processes that have storage locations configured (including tasks)
talk to the coordinator if they can, and fetch bootstrap segments from it.
- Then load the segments onto the segment cache as part of startup.
- This addresses the segment bootstrapping logic required by processes before
they can start serving queries or ingesting.
This patch also lays the foundation to speed up upgrades.
* Fail open by default if there are any errors talking to the coordinator.
* Add test for failure scenario and cleanup logs.
* Cleanup and add debug log
* Assert the events so we know the list exactly.
* Revert RunRules test.
The rules aren't evaluated if there are no clusters.
* Revert RunRulesTest too.
* Remove debug info.
* Make the API POST and update log.
* Fix up UTs.
* Throw 503 from MetadataResource; clean up exception handling and DruidException.
* Remove unused logger, add verification of metrics and docs.
* Update error message
* Update server/src/main/java/org/apache/druid/server/coordination/SegmentLoadDropHandler.java
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Apply suggestions from code review
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Adjust test metric expectations with the rename.
* Add BootstrapSegmentResponse container in the response for future extensibility.
* Rename to BootstrapSegmentsInfo for internal consistency.
* Remove unused log.
* Use a member variable for broadcast segments instead of segmentAssigner.
* Minor cleanup
* Add test for loadable bootstrap segments and clarify comment.
* Review suggestions.
---------
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Changes:
- Add new task action `RetrieveSegmentsByIdAction`
- Use new task action to retrieve segments irrespective of their visibility
- During rolling upgrades, this task action would fail as the Overlord would be on the old version
- If new action fails, fall back to just fetching used segments as before
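The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `SegmentFetchFallback` and `fetchWithFallback` are hypothetical names, the two suppliers stand in for the new and old task actions, and it assumes that an Overlord on the old version surfaces the unknown action as a runtime exception.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not a Druid class.
final class SegmentFetchFallback
{
  private SegmentFetchFallback() {}

  // Tries the new retrieve-by-id path first and falls back to the
  // used-segments path if the first call fails.
  static <T> List<T> fetchWithFallback(
      Supplier<List<T>> retrieveById,
      Supplier<List<T>> retrieveUsedSegments
  )
  {
    try {
      return retrieveById.get();
    }
    catch (RuntimeException e) {
      // Assumption: an older Overlord rejects the unknown action with an exception.
      return retrieveUsedSegments.get();
    }
  }
}
```

The trade-off of this shape is that, while the fallback is active, only used segments are returned, which matches the behavior before the new action existed.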
This PR fixes a few bugs with MSQ export. The main change is calling SqlResults#coerce before writing the column. This allows sketches and JSON to be correctly deserialized. The format of the exported complex columns is similar to that produced by Async MSQ queries with CSV format.
Notes:
Fix printing of complex columns during export. Sketches and JSON are now correctly formatted during export.
Fix an NPE if the writer has not been initialized. Empty export queries will create an empty file at the location.
Fix a bug with counters for MSQ export, where rows were reported for only the first partition.
As part of #16481, we have started uploading the chunks in parallel.
That means that it's not necessary for the part that finished uploading last
to be less than or equal to the chunkSize (as the final part could've been uploaded earlier).
This made a test in RetryableS3OutputStreamTest flaky where we were
asserting that the final part should be smaller than chunk size.
This commit fixes the test, and also adds another test where the file size
is such that all chunks would be of equal size.
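To illustrate why the assertion depended on upload order, here is a minimal sketch of how part sizes fall out of a file size and a chunk size. It assumes fixed-size chunking where only the trailing part can be short; `ChunkSizes` and `partSizes` are hypothetical names and are not taken from RetryableS3OutputStream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the S3 output stream code.
final class ChunkSizes
{
  private ChunkSizes() {}

  // Returns the size of each part in offset order. With parallel uploads,
  // completion order is independent of this order, so the short trailing
  // part is not necessarily the one that finishes last.
  static List<Long> partSizes(long fileSize, long chunkSize)
  {
    if (fileSize < 0 || chunkSize <= 0) {
      throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize must be > 0");
    }
    List<Long> sizes = new ArrayList<>();
    for (long remaining = fileSize; remaining > 0; remaining -= chunkSize) {
      sizes.add(Math.min(remaining, chunkSize));
    }
    return sizes;
  }
}
```

For example, `partSizes(25, 10)` yields parts of 10, 10, and 5 bytes, while `partSizes(30, 10)` yields three equal parts, which is the case the added test covers.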
* Fix build
* Push tasklogs along with service logs
* temp changes to run standard its without unit test results
* test
* minor change
* test
* test
* Update datasource name for ITSystemTableBatchIndexTaskTest
* Publish task logs
* Revert other changes
* update standard-it yaml
* Turn invalid periods into user-facing exception providing more context.
The current exception is targeting the ADMIN persona. Catch that and turn
it into a USER persona instead. Also, provide more context in the error
message.
* Review comment: pass the wrapping expression and stringify.
* Update processing/src/main/java/org/apache/druid/query/expression/ExprUtils.java
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
---------
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
Adds versions of
DruidException.defensive(String, Object...)
InvalidInput.exception(String, Object...)
InvalidInput.exception(Throwable, String, Object...)
These versions take a boolean as the first argument and create and throw
an exception only if it is false. They can be used similarly to
Preconditions.checkState/checkArgument.
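The pattern can be sketched as follows. This is a minimal stand-in and not the Druid implementation: `ConditionalChecks`, `conditionalInvalidInput`, `conditionalDefensive`, and `parsePositive` are hypothetical names, and the JDK exceptions are used in place of DruidException and InvalidInput.

```java
import java.util.Locale;

// Hypothetical stand-in, not Druid's DruidException or InvalidInput classes.
final class ConditionalChecks
{
  private ConditionalChecks() {}

  // Formats the message and throws only when the condition is false,
  // so nothing is built on the happy path.
  static void conditionalInvalidInput(boolean condition, String msgFormat, Object... args)
  {
    if (!condition) {
      throw new IllegalArgumentException(String.format(Locale.ROOT, msgFormat, args));
    }
  }

  // Same shape for internal sanity checks.
  static void conditionalDefensive(boolean condition, String msgFormat, Object... args)
  {
    if (!condition) {
      throw new IllegalStateException(String.format(Locale.ROOT, msgFormat, args));
    }
  }

  // Example caller: validates input in one line instead of an if/throw block.
  static int parsePositive(String value)
  {
    int parsed = Integer.parseInt(value);
    conditionalInvalidInput(parsed > 0, "Expected a positive number, got [%s]", parsed);
    return parsed;
  }
}
```

As with Preconditions, the condition comes first, so a call site reads as a statement of what must hold.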
* Fix flaky BrokerClientTest.
The testError() method reliably fails in the IDE. It passes in the Maven build only
because the test runner has
<surefire.rerunFailingTestsCount>3</surefire.rerunFailingTestsCount> set, so Maven
retries this "flaky test" multiple times and the test code returns a successful response
on the third attempt.
The exception handling in BrokerClientTest was broken:
- All non-2xx errors were being turned into 5xx errors. Remove that block of
code. If we need to handle retries of more specific 5xx error codes, that should be
handled explicitly. Or if there's a source of 4xx-class errors that need to be 5xx,
fix that at the source of the error.
* Fix CodeQL warning for unused parameter.
* Link fix
* Update docs/operations/auth.md
Co-authored-by: Andreas Maechler <amaechler@gmail.com>
---------
Co-authored-by: Andreas Maechler <amaechler@gmail.com>
* contains "Make a full copy of the parser and apply our modifications to it" (#16503)
* some minor api changes pair/entry
* some unnecessary aggregation was removed from a set of queries in `CalciteSubqueryTest`
* `AliasedOperatorConversion` was detecting `CHAR_LENGTH` as not a function; I've removed the check
* the field it was using doesn't look maintained that much
* the `kind` is passed for the created `SqlFunction` so I don't think this check is actually needed
* some decoupled test cases are now broken; will be fixed later
* some aggregate-related changes, because SUM() and COUNT() of no inputs are different
* upgrade avatica to 1.25.0
* `CalciteQueryTest#testExactCountDistinctWithFilter` is now executable
Close apache/druid#16503
* fix NestedDataColumnIndexerV4 to not report cardinality
changes:
* fix issue similar to #16489 but for NestedDataColumnIndexerV4, which can report STRING type if it only processes a single type of values. this should be less common than the auto indexer problem
* fix some issues with sql benchmarks
This fixes an issue where, in some cases, a SQL syntax error encountered when parsing / planning a query results in an error returned to the user with persona `admin` when it should instead be `user`.
Update default value of druid.indexer.tasklock.batchAllocationWaitTime to 0.
Thus, a segment allocation request is processed immediately unless there are already some requests queued before this one. While in queue, a segment allocation request may get clubbed together with other similar requests into a batch to reduce load on the metadata store.
* Remove unused constants
* Refactor getBlockBlobLength
* Better link
* Upper-case log
* Mark defaultStorageAccount nullable
This is the case if you do not use Azure for deep-storage but ingest from Azure blobs.
* Do not always create a new container if it doesn't exist
Specifically, only create a container if uploading a blob or writing a blob stream
* Add lots of comments, group methods
* Revert "Mark defaultStorageAccount nullable"
* Add mockito for junit
* Add extra test
* Add comment
Thanks George.
* Pass blockSize as Long
* Test more branches...