---
id: known-issues
title: SQL-based ingestion known issues
sidebar_label: Known issues
---

This page describes SQL-based batch ingestion using the `druid-multi-stage-query` extension, new in Druid 24.0. Refer to the ingestion methods table to determine which ingestion method is right for you.
## Multi-stage query task runtime

- Fault tolerance is not implemented. If any task fails, the entire query fails.
- Worker task stage outputs are stored in the working directory given by `druid.indexer.task.baseDir`. Stages that generate a large amount of output data may exhaust all available disk space. In this case, the query fails with an `UnknownError` and a message that includes "No space left on device". See the configuration sketch after this list.
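
As a minimal sketch of the configuration involved (the path below is an assumption, not a recommendation), you can point task working storage, and therefore stage outputs, at a volume with more free space in the Middle Manager or Indexer `runtime.properties`:

```properties
# Hypothetical example: keep task working directories, including
# multi-stage query stage outputs, on a larger volume.
druid.indexer.task.baseDir=/mnt/druid-task-storage
```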
## SELECT

- SELECT from a Druid datasource does not include unpublished real-time data.
- GROUPING SETS and UNION ALL are not implemented. Queries using these features return a QueryNotSupported error.
- For some COUNT DISTINCT queries, you'll encounter a QueryNotSupported error that includes `Must not have 'subtotalsSpec'` as one of its causes. This happens when the planner attempts to use GROUPING SETS, which are not implemented.
- The numeric varieties of the EARLIEST and LATEST aggregators do not work properly. Attempting to use the numeric varieties of these aggregators leads to an error like `java.lang.ClassCastException: class java.lang.Double cannot be cast to class org.apache.druid.collections.SerializablePair`. The string varieties, however, work properly, as in the sketch after this list.
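
As a minimal sketch of the EARLIEST/LATEST issue, assume a hypothetical datasource `wikipedia` with a string column `comment`, a numeric column `added`, and a dimension `channel`. The string variety, which takes a maximum value size in bytes, works, while the numeric variety fails with the ClassCastException above:

```sql
-- Works: string variety of LATEST, with a maximum value size in bytes
SELECT "channel", LATEST("comment", 1024) AS "last_comment"
FROM "wikipedia"
GROUP BY "channel"

-- Known issue: the numeric variety fails with the ClassCastException above
-- SELECT "channel", LATEST("added") AS "last_added"
-- FROM "wikipedia"
-- GROUP BY "channel"
```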
## INSERT and REPLACE

- INSERT and REPLACE with column lists, like `INSERT INTO tbl (a, b, c) SELECT ...`, are not implemented.
- `INSERT ... SELECT` and `REPLACE ... SELECT` insert columns from the SELECT statement based on column name. This differs from SQL standard behavior, where columns are inserted based on position. See the sketch after this list.
- INSERT and REPLACE do not support all options available in ingestion specs, including the `createBitmapIndex` and `multiValueHandling` dimension properties, and the `indexSpec` property of the `tuningConfig`.
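
Because column lists are unavailable and mapping is by name, one workaround is to alias each SELECT output column to the name of the intended target column. A minimal sketch, using hypothetical `wikipedia` (source) and `wikipedia_daily` (target) datasources:

```sql
-- Not supported: INSERT INTO "wikipedia_daily" ("__time", "channel", "added") SELECT ...
-- Instead, rely on name-based mapping by aliasing each SELECT column explicitly.
INSERT INTO "wikipedia_daily"
SELECT
  FLOOR("__time" TO DAY) AS "__time",
  "channel" AS "channel",
  SUM("added") AS "added"
FROM "wikipedia"
GROUP BY 1, 2
PARTITIONED BY DAY
```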
## EXTERN

- The schemaless dimensions feature is not available. All columns and their types must be specified explicitly using the `signature` parameter of the EXTERN function.
- EXTERN with input sources that match large numbers of files may exhaust available memory on the controller task.
- EXTERN does not accept `druid` input sources. Use FROM instead, as shown in the example after this list.
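
For reference, the sketch below shows an EXTERN call with every column declared explicitly in the `signature` argument, and reads data that is already in Druid with a plain FROM on the datasource rather than a `druid` input source. The URL, input format, and column names are hypothetical:

```sql
-- External data: all columns and their types declared in the signature
SELECT *
FROM TABLE(
  EXTERN(
    '{"type": "http", "uris": ["https://example.com/events.json.gz"]}',
    '{"type": "json"}',
    '[{"name": "timestamp", "type": "string"},
      {"name": "channel", "type": "string"},
      {"name": "added", "type": "long"}]'
  )
)
LIMIT 10

-- Data already in Druid: query the datasource directly instead of a druid input source
-- SELECT "__time", "channel", "added" FROM "wikipedia" LIMIT 10
```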