---
id: sql-query-context
title: "SQL query context"
sidebar_label: "SQL query context"
---

:::info
Apache Druid supports two query languages: Druid SQL and native queries. This document describes the SQL language.
:::

Druid supports query context parameters that affect SQL query planning. For general query context parameters that apply to all query types, see Query context.

SQL query context parameters

The following table lists query context parameters you can use to configure Druid SQL planning. You can override a parameter's default value by setting a runtime property in the format `druid.query.default.context.{query_context_key}`. For example, `druid.query.default.context.sqlTimeZone=America/Los_Angeles` changes the default SQL time zone. For more information, see Overriding default query context values.

| Parameter | Description | Default value |
|---|---|---|
| `sqlQueryId` | SQL query ID. For HTTP clients, Druid returns it in the `X-Druid-SQL-Query-Id` header. To specify a SQL query ID, use `sqlQueryId` instead of `queryId`. Setting `queryId` for a SQL request has no effect. All native queries underlying SQL use an auto-generated `queryId`. | auto-generated |
| `sqlTimeZone` | Time zone for a connection, such as `America/Los_Angeles` or an offset like `-08:00`. This parameter affects how time functions and timestamp literals behave. | `UTC` |
| `sqlStringifyArrays` | If `true`, Druid serializes result columns with array values as JSON strings in the response instead of arrays. | `true`, except for JDBC connections, where it's always `false` |
| `useApproximateCountDistinct` | Whether to use an approximate cardinality algorithm for `COUNT(DISTINCT foo)`. | `true` |
| `useGroupingSetForExactDistinct` | Whether to use grouping sets to execute queries with multiple exact distinct aggregations. | `false` |
| `useApproximateTopN` | If `true`, Druid converts SQL queries to approximate TopN queries wherever possible. If `false`, Druid uses exact GroupBy queries instead. | `true` |
| `enableTimeBoundaryPlanning` | If `true`, Druid converts SQL queries to time boundary queries wherever possible. Time boundary queries are very efficient for min-max calculation on the `__time` column in a datasource. | `false` |
| `useNativeQueryExplain` | If `true`, `EXPLAIN PLAN FOR` returns the explain plan as a JSON representation of the equivalent native query. Otherwise, it returns the original explain plan generated by Calcite. This property is provided for backwards compatibility. We don't recommend setting this parameter unless your application depends on the older behavior. | `true` |
| `sqlFinalizeOuterSketches` | If `false` (default behavior in Druid 25.0.0 and later), `DS_HLL`, `DS_THETA`, and `DS_QUANTILES_SKETCH` return sketches in query results. If `true` (default behavior in Druid 24.0.1 and earlier), Druid finalizes sketches from these functions when they appear in query results. This property is provided for backwards compatibility with behavior in Druid 24.0.1 and earlier. We don't recommend setting this parameter unless your application uses Druid 24.0.1 or earlier. Instead, use a function that doesn't return a sketch, such as `APPROX_COUNT_DISTINCT_DS_HLL`, `APPROX_COUNT_DISTINCT_DS_THETA`, `APPROX_QUANTILE_DS`, `DS_THETA_ESTIMATE`, or `DS_GET_QUANTILE`. | `false` |
| `sqlUseBoundAndSelectors` | If `false` (default behavior in Druid 27.0.0 and later when `druid.generic.useDefaultValueForNull=false`), the SQL planner uses equality, null, and range filters instead of selector and bound filters. For filtering `ARRAY` typed values, `sqlUseBoundAndSelectors` must be `false`. | Same value as `druid.generic.useDefaultValueForNull` |
| `sqlReverseLookup` | Whether to consider the reverse-lookup rewrite of the `LOOKUP` function during SQL planning. Druid reverses calls to `LOOKUP` only when the number of matching keys is lower than both `inSubQueryThreshold` and `sqlReverseLookupThreshold`. | `true` |
| `sqlReverseLookupThreshold` | Maximum size of the `IN` filter to create when applying a reverse-lookup rewrite. If a `LOOKUP` call matches more keys than the specified threshold, it remains unchanged. If `inSubQueryThreshold` is lower than `sqlReverseLookupThreshold`, Druid uses `inSubQueryThreshold` as the threshold instead. | `10000` |
| `sqlPullUpLookup` | Whether to consider the pull-up rewrite of the `LOOKUP` function during SQL planning. | `true` |
| `enableJoinLeftTableScanDirect` | This parameter applies to queries with joins. By default, when the left child is a simple scan with a filter, Druid runs the scan as a query, then joins it with the right child on the Broker. Setting this parameter to `true` overrides that behavior and pushes the join to the data servers instead. Even if a query doesn't explicitly include a join, this parameter may still apply since the SQL planner can translate the query into a join internally. | `false` |
| `maxNumericInFilters` | Maximum number of numeric values that Druid can compare against a string-typed dimension when the entire SQL `WHERE` clause of a query translates only to an OR of bound filters. By default, Druid doesn't restrict the number of numeric bound filters on string columns, although this situation may block other queries from running. Set this parameter to a smaller value to prevent Druid from running queries that have prohibitively long segment processing times. The optimal limit requires some trial and error. We recommend starting with 100. Users who submit a query that exceeds the `maxNumericInFilters` limit should rewrite the query to use strings in the `WHERE` clause instead of numbers. For example, use `WHERE someString IN ('123', '456')` instead of `WHERE someString IN (123, 456)`. This value can't exceed the system configuration `druid.sql.planner.maxNumericInFilters`. If `druid.sql.planner.maxNumericInFilters` isn't set explicitly, Druid ignores this value. | `-1` |
| `inFunctionThreshold` | At or beyond this threshold number of values, Druid converts SQL `IN` to `SCALAR_IN_ARRAY`. A threshold of 0 forces this conversion in all cases. A threshold of `Integer.MAX_VALUE` (2147483647) disables this conversion. The converted function is eligible for fewer planning-time optimizations, which speeds up planning but may prevent optimizations that would otherwise apply. | `100` |
| `inFunctionExprThreshold` | At or beyond this threshold number of values, SQL `IN` is eligible for execution using the native function `scalar_in_array` rather than a chain of `==` comparisons joined by logical OR, even if the number of values is below `inFunctionThreshold`. This property only affects translation of SQL `IN` to a native expression. It doesn't affect translation of SQL `IN` to a native filter. This property is provided for backwards compatibility purposes, and may be removed in a future release. | `2` |
| `inSubQueryThreshold` | At or beyond this threshold number of values, Druid converts SQL `IN` to `JOIN` on an inline table. `inFunctionThreshold` takes priority over this setting. A threshold of 0 forces usage of an inline table in all cases where the size of a SQL `IN` is larger than `inFunctionThreshold`. A threshold of 2147483647 disables the rewrite of SQL `IN` to `JOIN`. | `2147483647` |
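
The threshold rules in the table can be easier to follow as code. The following Java sketch restates two of them as the table describes them: the reverse-lookup condition and the `inFunctionThreshold` condition. It is an illustration only, not Druid's planner code. The class and method names are hypothetical, and the behavior when the key count exactly equals a reverse-lookup threshold is an assumption, since the table says both "lower than" and "more keys than".

```java
/** Illustrative restatement of threshold rules from the table; not Druid's planner code. */
final class SqlThresholdSketch {

  /**
   * LOOKUP is reversed only when the rewrite is enabled and the number of matching keys
   * is lower than both inSubQueryThreshold and sqlReverseLookupThreshold.
   * Assumption: strict less-than at the boundary.
   */
  static boolean reversesLookup(boolean sqlReverseLookup, long matchingKeys,
                                int inSubQueryThreshold, int sqlReverseLookupThreshold) {
    // The lower of the two thresholds is the effective limit.
    int effectiveThreshold = Math.min(inSubQueryThreshold, sqlReverseLookupThreshold);
    return sqlReverseLookup && matchingKeys < effectiveThreshold;
  }

  /**
   * SQL IN converts to SCALAR_IN_ARRAY at or beyond inFunctionThreshold.
   * 0 forces the conversion; Integer.MAX_VALUE disables it.
   */
  static boolean convertsInToScalarInArray(int valueCount, int inFunctionThreshold) {
    if (inFunctionThreshold == Integer.MAX_VALUE) {
      return false;
    }
    return valueCount >= inFunctionThreshold;
  }

  public static void main(String[] args) {
    // Defaults from the table: inSubQueryThreshold=2147483647, sqlReverseLookupThreshold=10000.
    System.out.println(reversesLookup(true, 500, Integer.MAX_VALUE, 10000));   // true
    System.out.println(reversesLookup(true, 20000, Integer.MAX_VALUE, 10000)); // false
    // A lower inSubQueryThreshold becomes the effective limit.
    System.out.println(reversesLookup(true, 500, 100, 10000));                 // false
    // Default inFunctionThreshold is 100.
    System.out.println(convertsInToScalarInArray(100, 100));                   // true
    System.out.println(convertsInToScalarInArray(99, 100));                    // false
  }
}
```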

Set the query context

You can configure query context parameters in the context object of the JSON API or as JDBC connection properties.

The following example shows how to set a query context parameter using the JSON API:

{
  "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'",
  "context" : {
    "sqlTimeZone" : "America/Los_Angeles"
  }
}
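
To send a payload like the one above from Java, one option is the JDK's built-in HTTP client (Java 11 and later). The sketch below builds the JSON body and the request, and is a minimal illustration rather than an official client. It assumes a Druid SQL endpoint at `http://localhost:8082/druid/v2/sql`, reusing the host and port from the JDBC URL in this document. The class name, helper names, and the `sqlQueryId` value are hypothetical, and the hand-rolled JSON escaping covers only quotes and backslashes.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class DruidSqlContextRequest {

  // Assumption: host and port taken from the JDBC example URL in this document.
  static final String SQL_ENDPOINT = "http://localhost:8082/druid/v2/sql";

  /** Minimal JSON string quoting: handles backslashes and double quotes only. */
  static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /** Booleans and numbers are written bare; everything else becomes a JSON string. */
  static String toJsonValue(Object value) {
    return (value instanceof Boolean || value instanceof Number)
        ? value.toString()
        : quote(String.valueOf(value));
  }

  /** Builds a body of the form {"query": ..., "context": {...}}. */
  static String buildBody(String sql, Map<String, Object> context) {
    String contextJson = context.entrySet().stream()
        .map(e -> quote(e.getKey()) + ":" + toJsonValue(e.getValue()))
        .collect(Collectors.joining(","));
    return "{\"query\":" + quote(sql) + ",\"context\":{" + contextJson + "}}";
  }

  /** Builds the POST request without sending it. */
  static HttpRequest buildRequest(String body) {
    return HttpRequest.newBuilder(URI.create(SQL_ENDPOINT))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
  }

  /** Sends the request and returns the X-Druid-SQL-Query-Id response header, if present. */
  static String sendAndGetQueryId(HttpRequest request) throws Exception {
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    return response.headers().firstValue("X-Druid-SQL-Query-Id").orElse(null);
  }

  public static void main(String[] args) {
    Map<String, Object> context = new LinkedHashMap<>();
    context.put("sqlTimeZone", "America/Los_Angeles");
    context.put("sqlQueryId", "example-query-1"); // hypothetical ID
    context.put("useApproximateCountDistinct", false);

    String body = buildBody("SELECT COUNT(*) FROM data_source WHERE foo = 'bar'", context);
    System.out.println(body);
    // With a running cluster, call sendAndGetQueryId(buildRequest(body)).
  }
}
```

Because the sketch sets `sqlQueryId` rather than `queryId`, the value it sends is the one Druid is expected to echo back in the `X-Druid-SQL-Query-Id` header.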

The following example shows how to set query context parameters using JDBC:

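import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

// Connect to Druid through the Avatica JDBC endpoint.
String url = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/";

// Set any query context parameters you need here.
// JDBC connection properties are strings, so pass values such as false as "false".
Properties connectionProperties = new Properties();
connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");
connectionProperties.setProperty("useCache", "false");

// try-with-resources closes the connection automatically.
try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
  // create and execute statements, process result sets, etc
}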
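
As a fuller illustration of the JDBC approach, the sketch below converts a map of context parameters into connection properties and runs a single-value query. It is a minimal sketch, not a reference implementation: it assumes the Avatica JDBC driver is on the classpath and that Druid is reachable at the URL used above. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

final class DruidJdbcContextSketch {

  // Same URL as the example above; adjust host and port for your cluster.
  static final String URL = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/";

  /** JDBC connection properties are strings, so each context value is converted to one. */
  static Properties toConnectionProperties(Map<String, Object> context) {
    Properties properties = new Properties();
    context.forEach((key, value) -> properties.setProperty(key, String.valueOf(value)));
    return properties;
  }

  /** Runs the query with the given context and returns the first column of the first row. */
  static long queryForLong(String sql, Map<String, Object> context) throws SQLException {
    try (Connection connection = DriverManager.getConnection(URL, toConnectionProperties(context));
         Statement statement = connection.createStatement();
         ResultSet resultSet = statement.executeQuery(sql)) {
      if (!resultSet.next()) {
        throw new SQLException("Query returned no rows");
      }
      return resultSet.getLong(1);
    }
  }

  public static void main(String[] args) {
    Map<String, Object> context = new LinkedHashMap<>();
    context.put("sqlTimeZone", "America/Los_Angeles");
    context.put("useApproximateCountDistinct", false);

    Properties properties = toConnectionProperties(context);
    System.out.println(properties.getProperty("useApproximateCountDistinct")); // false
    // With a running cluster and the driver available:
    // long count = queryForLong("SELECT COUNT(*) FROM data_source", context);
  }
}
```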