id | title | sidebar_label |
---|---|---|
known-issues | SQL-based ingestion known issues | Known issues |
:::info
This page describes SQL-based batch ingestion using the druid-multi-stage-query
extension, new in Druid 24.0. Refer to the ingestion methods table to determine which
ingestion method is right for you.
:::
Multi-stage query task runtime
- Fault tolerance is partially implemented. Workers get relaunched when they are killed unexpectedly. The controller does not get relaunched if it is killed unexpectedly.
- Worker task stage outputs are stored in the working directory given by `druid.indexer.task.baseDir`. Stages that generate a large amount of output data may exhaust all available disk space. In this case, the query fails with an `UnknownError` with a message including "No space left on device".
`SELECT` Statement
- `GROUPING SETS` is not implemented. Queries that use it return a `QueryNotSupported` error.
`INSERT` and `REPLACE` Statements
- `INSERT` and `REPLACE` statements with column lists, like `INSERT INTO tbl (a, b, c) SELECT ...`, are not implemented.
- `INSERT ... SELECT` and `REPLACE ... SELECT` insert columns from the `SELECT` statement based on column name. This differs from SQL standard behavior, where columns are inserted based on position.
- `INSERT` and `REPLACE` do not support all options available in ingestion specs, including the `createBitmapIndex` and `multiValueHandling` dimension properties, and the `indexSpec` `tuningConfig` property.
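Because column lists are unavailable and columns are matched by name, one way to control which target column receives each value is to alias every expression in the `SELECT`. The following is a minimal sketch, not a definitive recipe. The names `tbl`, `src`, `x`, `y`, and `z` are hypothetical and used only for illustration.

```sql
-- Not implemented: INSERT INTO tbl (a, b, c) SELECT ...
-- Sketch of an alternative: alias each SELECT expression to the target column name.
-- "tbl", "src", "x", "y", and "z" are hypothetical names.
INSERT INTO "tbl"
SELECT
  "__time",
  "x" AS "a",
  "y" AS "b",
  "z" AS "c"
FROM "src"
PARTITIONED BY DAY
```

Since matching is by name rather than position, reordering the aliased expressions in the `SELECT` list should not change which target column each value is written to.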
`EXTERN` Function
- The schemaless dimensions feature is not available. All columns and their types must be specified explicitly using the `signature` parameter of the `EXTERN` function.
- `EXTERN` with input sources that match large numbers of files may exhaust available memory on the controller task.
- `EXTERN` refers to external files. Use `FROM` to access `druid` input sources.
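As an illustration of the points above, the sketch below passes an explicit signature to `EXTERN` and, as a separate query, reads an existing Druid datasource with a plain `FROM`. The URI, the table names `tbl` and `tbl_copy`, and the columns `timestamp`, `page`, and `added` are all hypothetical.

```sql
-- Query 1 (sketch): external data with every column and type listed in the signature.
-- EXTERN arguments, in order: input source, input format, signature.
-- The URI, table, and column names are hypothetical.
INSERT INTO "tbl"
SELECT
  TIME_PARSE("timestamp") AS "__time",
  "page",
  "added"
FROM TABLE(
  EXTERN(
    '{"type": "http", "uris": ["https://example.com/data.json"]}',
    '{"type": "json"}',
    '[{"name": "timestamp", "type": "string"}, {"name": "page", "type": "string"}, {"name": "added", "type": "long"}]'
  )
)
PARTITIONED BY DAY

-- Query 2 (sketch, submitted separately): read a Druid datasource with FROM, not EXTERN.
REPLACE INTO "tbl_copy" OVERWRITE ALL
SELECT * FROM "tbl"
PARTITIONED BY DAY
```

Listing only the columns you need in the signature keeps it short, but any column omitted from it is not available to the query.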
`WINDOW` Function
- The number of elements in a window cannot exceed 100,000.
- To avoid `leafOperators` in the MSQ engine, window functions have an extra scan stage after the window stage in cases where the native engine has a non-empty `leafOperator`.
Automatic compaction
The following known issues and limitations affect automatic compaction with the MSQ task engine:
- The `metricSpec` field is only supported for certain aggregators. For more information, see Supported aggregators.
- Only dynamic and range-based partitioning are supported.
- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
- You can only partition on string dimensions. However, multi-valued string dimensions are not supported.
- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use `maxRowsPerSegment` instead.
- Segments can only be sorted on `__time` as the first column.