* MSQ web console
* fix typo in comments
* remove useless conditional
* wrap SQL_DATA_TYPES
* fixes sus regex
* rewrite regex
* remove problematic regex
* fix UTs
* convert PARTITIONED / CLUSTERED BY to ORDER BY for preview
* fix log
* updated to use shuffle
* Web console: Use Ace.Completion directly (#1405)
* Use Ace.Completion directly
* Another Ace.Completion
* better comment
* fix column ordering in e2e test
* add nested data example also
Co-authored-by: John Gozde <john.gozde@imply.io>
* Fix serialization in TaskReportFileWriters.
For some reason, serializing a Map<String, TaskReport> would omit the
"type" field. Explicitly sending each value through the ObjectMapper
fixes this, because the type information does not get lost.
* Fixes for static analysis.
This commit is a first draft of the revised integration test framework which provides:
- A new directory, integration-tests-ex that holds the new integration test structure. (For now, the existing integration-tests is left unchanged.)
- Maven module druid-it-tools to hold code placed into the Docker image.
- Maven module druid-it-image to build the Druid-only test image from the tarball produced in distribution. (Dependencies live in their "official" image.)
- Maven module druid-it-cases that holds the revised tests and the framework itself. The framework includes file-based test configuration, test-specific clients, test initialization and updated versions of some of the common test support classes.
The integration test setup is primarily a huge mass of details. This approach refactors many of those details: from how the image is built and configured to how the Docker Compose scripts are structured to test configuration. An extensive set of "readme" files explains those details. Rather than repeat that material here, please consult those files for explanations.
Fixes KafkaEmitter not emitting queryType for a native query. The Event to JSON serialization was extracted to the external class: EventToJsonSerializer. This was done to simplify the testing logic for the serialization as well as extract the responsibility of serialization to the separate class.
The logic builds ObjectNode incrementally based on the event .toMap method. Parsing each entry individually ensures that the Jackson polymorphic annotations are respected. Not respecting these annotation caused the missing of the queryType from output event.
The Avro parsing code leaks some "object" representations.
We need to convert them into Maps/Lists so that other code
can understand and expect good things. Previously, these
objects were handled with .toString(), but that's not a
good contract in terms of how to work with objects.
The method wasn't following its contract, leading to pollution of the
overall planner context, when really we just want to create a new
context for a specific query.
Added link to the relevant section of the Basic Cluster Tuning page on each process page.
This is in order to improve access to this information, which is not easy to find through search or nav.
* Fix race in TaskQueue.notifyStatus.
It was possible for manageInternal to relaunch a task while it was
being cleaned up, due to a race that happens when notifyStatus is
called to clean up a successful task:
1) In a critical section, notifyStatus removes the task from "tasks".
2) Outside a critical section, notifyStatus calls taskRunner.shutdown
to let the task runner know it can clear out its data structures.
3) In a critical section, syncFromStorage adds the task back to "tasks",
because it is still present in metadata storage.
4) In a critical section, manageInternalCritical notices that the task
is in "tasks" and is not running in the taskRunner, so it launches
it again.
5) In a (different) critical section, notifyStatus updates the metadata
store to set the task status to SUCCESS.
6) The task continues running even though it should not be.
The possibility for this race was introduced in #12099, which shrunk
the critical section in notifyStatus. Prior to that patch, a single
critical section encompassed (1), (2), and (5), so the ordering above
was not possible.
This patch does the following:
1) Fixes the race by adding a recentlyCompletedTasks set that prevents
the main management loop from doing anything with tasks that are
currently being cleaned up.
2) Switches the order of the critical sections in notifyStatus, so
metadata store updates happen first. This is useful in case of
server failures: it ensures that if the Overlord fails in the midst
of notifyStatus, then completed-task statuses are still available in
ZK or on MMs for the next Overlord. (Those are cleaned up by
taskRunner.shutdown, which formerly ran first.) This isn't related
to the race described above, but is fixed opportunistically as part
of the same patch.
3) Changes the "tasks" list to a map. Many operations require retrieval
or removal of individual tasks; those are now O(1) instead of O(N)
in the number of running tasks.
4) Changes various log messages to use task ID instead of full task
payload, to make the logs more readable.
* Fix format string.
* Update comment.
* SQL: Morph QueryMakerFactory into SqlEngine.
Groundwork for introducing an indexing-service-task-based SQL engine
under the umbrella of #12262. Also includes some other changes related
to improving error behavior.
Main changes:
1) Elevate the QueryMakerFactory interface (an extension point that allows
customization of how queries are made) into SqlEngine. SQL engines
can influence planner behavior through EngineFeatures, and can fully
control the mechanics of query execution using QueryMakers.
2) Remove the server-wide QueryMakerFactory choice, in favor of the choice
being made by the SQL entrypoint. The indexing-service-task-based
SQL engine would be associated with its own entrypoint, like
/druid/v2/sql/task.
Other changes:
1) Adjust DruidPlanner to try either DRUID or BINDABLE convention based
on analysis of the planned rels; never try both. In particular, we
no longer try BINDABLE when DRUID fails. This simplifies the logic
and improves error messages.
2) Adjust error message "Cannot build plan for query" to omit the SQL
query text. Useful because the text can be quite long, which makes it
easy to miss the text about the problem.
3) Add a feature to block context parameters used internally by the SQL
planner from being supplied by end users.
4) Add a feature to enable adding row signature to the context for
Scan queries. This is useful in building the task-based engine.
5) Add saffron.properties file that turns off sets and graphviz dumps
in "cannot plan" errors. Significantly reduces log spam on the Broker.
* Fixes from CI.
* Changes from review.
* Can vectorize, now that join-to-filter is on by default.
* Checkstyle! And variable renames!
* Remove throws from test.
* Error handling improvements for frame channels.
Two changes:
1) Send errors down in-memory channels (BlockingQueueFrameChannel) on
failure. This ensures that in situations where a chain of processors
has been set up on a single machine, all processors see the root
cause error. In particular, this means the final processor in the
chain reports the root cause error, which ensures that someone with
a handle to the final processor will get the proper error.
2) Update FrameFileHttpResponseHandler to expect that the final fetch,
rather than being simply empty, is also empty with a special header.
This ensures that the handler is able to tell the difference between
an empty fetch due to being at EOF, and an empty fetch due to a
truncated HTTP response (after the 200 OK and headers are sent down,
but before any content appears).
* Fix tests, imports.
* Checkstyle!
* Refactor SqlLifecycle into statement classes
Create direct & prepared statements
Remove redundant exceptions from tests
Tidy up Calcite query tests
Make PlannerConfig more testable
* Build fixes
* Added builder to SqlQueryPlus
* Moved Calcites system properties to saffron.properties
* Build fix
* Resolve merge conflict
* Fix IntelliJ inspection issue
* Revisions from reviews
Backed out a revision to Calcite tests that didn't work out as planned
* Build fix
* Fixed spelling errors
* Fixed failed test
Prepare now enforces security; before it did not.
* Rebase and fix IntelliJ inspections issue
* Clean up exception handling
* Fix handling of JDBC auth errors
* Build fix
* More tweaks to security messages
* changes to run examples on macos where cd command returns current directory
* Update examples/bin/run-druid
Co-authored-by: Frank Chen <frankchen@apache.org>
* merging
* sending output of cd command to /dev/null
Co-authored-by: Frank Chen <frankchen@apache.org>
This is used to control access to the EXTERN function, which allows
reading external data in SQL. The EXTERN function is not usable in
production as of today, but it is used by the task-based SQL engine
contemplated in #12262.
* Introduce defaultOnDiskStorage config for groupBy
* add debug log to groupby query config
* Apply config change suggestion from review
* Remove accidental new lines
* update default value of new default disk storage config
* update debug log to have more descriptive text
* Make maxOnDiskStorage and defaultOnDiskStorage HumanRedadableBytes
* improve test coverage
* Provide default implementation to new default method on advice of reviewer
In the current druid code base, we have the interface DataSegmentPusher which allows us to push segments to the appropriate deep storage without the extension being worried about the semantics of how to push too deep storage.
While working on #12262, whose some part of the code will go as an extension, I realized that we do not have an interface that allows us to do basic "write, get, delete, deleteAll" operations on the appropriate deep storage without let's say pulling the s3-storage-extension dependency in the custom extension.
Hence, the idea of StorageConnector was born where the storage connector sits inside the druid core so all extensions have access to it.
Each deep storage implementation, for eg s3, GCS, will implement this interface.
Now with some Jackson magic, we bind the implementation of the correct deep storage implementation on runtime using a type variable.
The Netty pipeline set up by the client can deliver multiple exceptions,
and can deliver chunks even after delivering exceptions. This makes it
difficult to implement HttpResponseHandlers. Looking at existing handler
implementations, I do not see attempts to handle this case, so it's also
a potential source of bugs.
This patch updates the client to track whether an exception was
encountered, and if so, to not call any additional methods on the handler
after exceptionCaught. It also harmonizes exception handling between
exceptionCaught and channelDisconnected.