id | title | sidebar_label |
---|---|---|
known-issues | SQL-based ingestion known issues | Known issues |
:::info
This page describes SQL-based batch ingestion using the druid-multi-stage-query
extension, new in Druid 24.0. Refer to the ingestion methods table to determine which
ingestion method is right for you.
:::
Multi-stage query task runtime
- Fault tolerance is partially implemented. Workers get relaunched when they are killed unexpectedly. The controller does not get relaunched if it is killed unexpectedly.
- Worker task stage outputs are stored in the working directory given by `druid.indexer.task.baseDir`. Stages that generate a large amount of output data may exhaust all available disk space. In this case, the query fails with an UnknownError with a message including "No space left on device".
`SELECT` Statement
- `SELECT` from a Druid datasource does not include unpublished real-time data.
- `GROUPING SETS` and `UNION ALL` are not implemented. Queries using these features return a QueryNotSupported error.
- For some `COUNT DISTINCT` queries, you'll encounter a QueryNotSupported error that includes `Must not have 'subtotalsSpec'` as one of its causes. This is caused by the planner attempting to use `GROUPING SETS`, which are not implemented.
- The numeric varieties of the `EARLIEST` and `LATEST` aggregators do not work properly. Attempting to use the numeric varieties of these aggregators leads to an error like `java.lang.ClassCastException: class java.lang.Double cannot be cast to class org.apache.druid.collections.SerializablePair`. The string varieties, however, do work properly.
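To illustrate the last item, the sketch below contrasts the two varieties of `LATEST`. The `wikipedia` datasource and its `channel`, `page`, and `added` columns are hypothetical, and the example assumes that the two-argument form `LATEST(expr, maxBytesPerString)` is the string variety in your Druid version.

```sql
-- Hypothetical datasource and columns, for illustration only.

-- String variety: the second argument is the maximum number of bytes per string.
-- Per the known issue above, this form is expected to work.
SELECT
  channel,
  LATEST(page, 1024) AS latest_page
FROM wikipedia
GROUP BY channel

-- Numeric variety: per the known issue above, this form is expected to fail
-- with a ClassCastException when run as a multi-stage query.
-- SELECT
--   channel,
--   LATEST(added) AS latest_added
-- FROM wikipedia
-- GROUP BY channel
```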
`INSERT` and `REPLACE` Statements
- The `INSERT` and `REPLACE` statements with column lists, like `INSERT INTO tbl (a, b, c) SELECT ...`, are not implemented.
- `INSERT ... SELECT` and `REPLACE ... SELECT` insert columns from the `SELECT` statement based on column name. This differs from SQL standard behavior, where columns are inserted based on position.
- `INSERT` and `REPLACE` do not support every option available in ingestion specs. Unsupported options include the `createBitmapIndex` and `multiValueHandling` dimension properties and the `indexSpec` property of `tuningConfig`.
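Because columns are matched by name rather than by position, one way to work without a column list is to alias each `SELECT` expression to the target column name. The following is a minimal sketch, not a definitive recipe: the `tbl` and `src` tables and the `ts`, `x`, and `y` columns are hypothetical.

```sql
-- Hypothetical tables and columns, for illustration only.

-- Not implemented: an explicit column list on the target table.
-- INSERT INTO tbl (__time, a, b)
-- SELECT TIME_PARSE(ts), x, y FROM src
-- PARTITIONED BY DAY

-- Name-based alternative: alias each SELECT expression to the target column name.
-- The aliases, not the positions, determine which column each value is written to.
INSERT INTO tbl
SELECT
  TIME_PARSE(ts) AS __time,
  y AS b,
  x AS a
FROM src
PARTITIONED BY DAY
```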
`EXTERN` Function
- The schemaless dimensions feature is not available. All columns and their types must be specified explicitly using the `signature` parameter of the `EXTERN` function.
- `EXTERN` with input sources that match large numbers of files may exhaust available memory on the controller task.
- `EXTERN` refers to external files. Use `FROM` to access `druid` input sources.
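As a rough sketch of the first and last items, the query below reads an external file with every column and type declared in the signature argument, then shows the plain `FROM` form for an existing datasource. The URI, the `wikipedia` datasource, and the column names are hypothetical.

```sql
-- Hypothetical URI, datasource, and column names, for illustration only.

-- External files: the third argument is the signature, which must list
-- every column and its type explicitly.
SELECT
  TIME_PARSE("timestamp") AS __time,
  page,
  added
FROM TABLE(
  EXTERN(
    '{"type": "http", "uris": ["https://example.com/data.json.gz"]}',
    '{"type": "json"}',
    '[{"name": "timestamp", "type": "string"}, {"name": "page", "type": "string"}, {"name": "added", "type": "long"}]'
  )
)

-- Existing Druid datasource: read it directly with FROM instead of EXTERN.
-- SELECT __time, page, added FROM wikipedia
```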