---
id: known-issues
title: SQL-based ingestion known issues
sidebar_label: Known issues
---

SQL-based ingestion using the multi-stage query task engine is our recommended solution starting in Druid 24.0. Alternative ingestion solutions, such as native batch and Hadoop-based ingestion systems, will still be supported. We recommend you read all known issues and test the feature in a development environment before rolling it out in production. Using the multi-stage query task engine with `SELECT` statements that do not write to a datasource is experimental.

## General query execution

- There's no fault tolerance. If any task fails, the entire query fails.

- Only one local file system per server is used for stage output data during multi-stage query execution. If your servers have multiple local file systems, this causes queries to exhaust available disk space earlier than expected.

- When `msqMaxNumTasks` is higher than the total capacity of the cluster, more tasks may be launched than can run at once. This leads to a `TaskStartTimeout` error code, because there is never enough capacity to run the query. To avoid this, set `msqMaxNumTasks` to a number of tasks that can run simultaneously on your cluster, as in the sketch after this list.

- When `msqTaskAssignment` is set to `auto`, the system generates one task per input file for certain splittable input sources where file sizes are not known ahead of time. This includes the `http` input source, where the system generates one task per URI.
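
A minimal sketch of capping task parallelism through the query context, assuming the query is submitted as a JSON payload to the SQL task API; the datasource, the query, and the value `4` are illustrative rather than recommendations:

```json
{
  "query": "INSERT INTO my_table SELECT * FROM my_source PARTITIONED BY DAY",
  "context": {
    "msqMaxNumTasks": 4
  }
}
```

With a setting like this, the query never requests more tasks than your cluster can run at once, so it cannot stall waiting on capacity that will never free up.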

## Memory usage

- `INSERT` queries can consume excessive memory when using complex types due to inaccurate footprint estimation. This can appear as an `OutOfMemoryError` during the `SegmentGenerator` stage when using sketches. If you run into this issue, try manually lowering the value of the `msqRowsInMemory` parameter, as in the sketch after this list.

- `EXTERN` loads an entire row group into memory at once when reading from Parquet files. Row groups can be up to 1 GB in size, which can lead to excessive heap usage when reading many files in parallel. This can appear as an `OutOfMemoryError` during stages that read Parquet input files. If you run into this issue, use a smaller number of worker tasks, or increase the heap size of your Indexers or of your Middle Manager-launched indexing tasks.

- Ingesting a very long row may require more memory than is available and cause the service to throw an `OutOfMemoryError`. If you run into this issue, allocate enough memory to the Indexer to store the largest row you expect to ingest.
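
If a sketch-heavy `INSERT` hits the estimation issue above, a hedged starting point is to lower `msqRowsInMemory` in the query context and re-run. Everything here is illustrative: the datasource and column names are hypothetical, `DS_HLL` requires the DataSketches extension, and `50000` is a starting guess rather than a recommendation:

```json
{
  "query": "INSERT INTO my_table SELECT FLOOR(__time TO HOUR) AS __time, channel, DS_HLL(user_id) AS user_sketch FROM my_source GROUP BY 1, 2 PARTITIONED BY DAY",
  "context": {
    "msqRowsInMemory": 50000
  }
}
```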

## SELECT queries

- `SELECT` query results do not include real-time data until it has been published.

- `TIMESTAMP` types are formatted as numbers rather than ISO 8601 timestamp strings, which differs from Druid's standard result format.

- `BOOLEAN` types are formatted as numbers like `1` and `0` rather than `true` or `false`, which differs from Druid's standard result format.

- TopN is not implemented. The context parameter `useApproximateTopN` is ignored and always treated as if it were `false`. Therefore, topN-shaped queries always run using the groupBy engine. There is no loss of functionality, but there may be a performance impact, since these queries run with an exact algorithm instead of an approximate one. See the example after this list.

- `GROUPING SETS` is not implemented. Queries that use `GROUPING SETS` will fail.

- The numeric flavors of the `EARLIEST` and `LATEST` aggregators do not work properly. Attempting to use the numeric flavors of these aggregators leads to an error like `java.lang.ClassCastException: class java.lang.Double cannot be cast to class org.apache.druid.collections.SerializablePair`. The string flavors, however, do work properly.
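
For example, a topN-shaped query like the following (the `wikipedia` datasource is hypothetical) always runs on the exact groupBy engine under the multi-stage query task engine, whatever `useApproximateTopN` is set to:

```sql
SELECT channel, COUNT(*) AS cnt
FROM wikipedia
GROUP BY channel
ORDER BY cnt DESC
LIMIT 10
```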

## INSERT queries

- The schemaless dimensions feature is not available. All columns and their types must be specified explicitly.

- Segment metadata queries on datasources ingested with the multi-stage query engine return values for `timestampSpec` that are not usable for introspection.

- When `INSERT` with `GROUP BY` produces rollup segments, the multi-stage engine generates segments that Druid's compaction functionality is not able to further roll up. This applies to automatic compaction as well as manually issued `compact` tasks. Individual queries executed with the multi-stage engine always guarantee perfect rollup for their output, so this only matters if you are performing a sequence of `INSERT` queries that each append data to the same time chunk. If necessary, you can compact such data using another SQL query instead of a `compact` task, as in the sketch after this list.

- When using `INSERT` with `GROUP BY`, splitting of large partitions is not currently implemented. If a single partition key appears in a very large number of rows, an oversized segment is created. You can mitigate this by adding additional columns to your partition key. Note that partition splitting does work properly when performing `INSERT` without `GROUP BY`.

- `INSERT` with column lists, like `INSERT INTO tbl (a, b, c) SELECT ...`, is not implemented. A workaround is shown after this list.
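
A hedged sketch of the SQL-based compaction mentioned above, assuming a datasource `my_table` built by earlier rollup ingestions with a `channel` dimension and a `cnt` metric; the time bounds must line up with the `PARTITIONED BY` granularity:

```sql
REPLACE INTO my_table
OVERWRITE WHERE __time >= TIMESTAMP '2022-01-01' AND __time < TIMESTAMP '2022-01-02'
SELECT
  FLOOR(__time TO HOUR) AS __time,
  channel,
  SUM(cnt) AS cnt
FROM my_table
WHERE __time >= TIMESTAMP '2022-01-01' AND __time < TIMESTAMP '2022-01-02'
GROUP BY 1, 2
PARTITIONED BY DAY
```

Likewise, until column lists are implemented, the usual workaround is to alias the selected expressions to the target column names (all names here are hypothetical):

```sql
INSERT INTO tbl
SELECT
  TIME_PARSE(ts) AS __time,
  x AS a,
  y AS b
FROM my_source
PARTITIONED BY DAY
```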

## EXTERN queries

- `EXTERN` does not accept `druid` input sources. See the example after this list.
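
For reference, a minimal `EXTERN` call using an input source that is accepted, such as `http`; the URI and column signature are hypothetical:

```sql
SELECT *
FROM TABLE(
  EXTERN(
    '{"type": "http", "uris": ["https://example.com/data.json.gz"]}',
    '{"type": "json"}',
    '[{"name": "timestamp", "type": "string"}, {"name": "page", "type": "string"}]'
  )
)
LIMIT 10
```

To query data that is already in Druid, reference the datasource directly in the `FROM` clause rather than wrapping it in `EXTERN`.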

## Missing guardrails

- Maximum number of input files. Since there's no limit, the controller can potentially run out of memory while tracking all input files.

- Maximum amount of local disk space to use for temporary data. Because there is no guardrail today, worker tasks may exhaust all available disk space. In this case, you will receive an `UnknownError` with a message including "No space left on device".