remove mentions of DruidQueryRel from docs (#13033)

* remove mentions of DruidQueryRel

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/querying/sql-translation.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Vadim Ogievetsky 2022-09-06 13:37:27 -07:00 committed by GitHub
parent 2a039e7e6a
commit 897689c03b
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 127 additions and 25 deletions


@@ -63,37 +63,139 @@ appreciated.
 ## Interpreting EXPLAIN PLAN output
 
 The [EXPLAIN PLAN](sql.md#explain-plan) functionality can help you understand how a given SQL query will
-be translated to native. For simple queries that do not involve subqueries or joins, the output of EXPLAIN PLAN
-is easy to interpret. The native query that will run is embedded as JSON inside a "DruidQueryRel" line:
+be translated to native.
+EXPLAIN PLAN statements return a `RESOURCES` column that describes the resource being queried as well as a `PLAN` column that contains a JSON array of native queries that Druid will run.
+For example, consider the following query:
 
-```
-> EXPLAIN PLAN FOR SELECT COUNT(*) FROM wikipedia
-
-DruidQueryRel(query=[{"queryType":"timeseries","dataSource":"wikipedia","intervals":"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z","granularity":"all","aggregations":[{"type":"count","name":"a0"}]}], signature=[{a0:LONG}])
+```sql
+EXPLAIN PLAN FOR
+SELECT
+  channel,
+  COUNT(*)
+FROM wikipedia
+WHERE channel IN (SELECT page FROM wikipedia GROUP BY page ORDER BY COUNT(*) DESC LIMIT 10)
+GROUP BY channel
 ```
 
-For more complex queries that do involve subqueries or joins, EXPLAIN PLAN is somewhat more difficult to interpret.
-For example, consider this query:
+The EXPLAIN PLAN statement returns the following plan:
 
-```
-> EXPLAIN PLAN FOR
-> SELECT
->   channel,
->   COUNT(*)
-> FROM wikipedia
-> WHERE channel IN (SELECT page FROM wikipedia GROUP BY page ORDER BY COUNT(*) DESC LIMIT 10)
-> GROUP BY channel
-
-DruidJoinQueryRel(condition=[=($1, $3)], joinType=[inner], query=[{"queryType":"groupBy","dataSource":{"type":"table","name":"__join__"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"granularity":"all","dimensions":["channel"],"aggregations":[{"type":"count","name":"a0"}]}], signature=[{d0:STRING, a0:LONG}])
-  DruidQueryRel(query=[{"queryType":"scan","dataSource":{"type":"table","name":"wikipedia"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"resultFormat":"compactedList","columns":["__time","channel","page"],"granularity":"all"}], signature=[{__time:LONG, channel:STRING, page:STRING}])
-  DruidQueryRel(query=[{"queryType":"topN","dataSource":{"type":"table","name":"wikipedia"},"dimension":"page","metric":{"type":"numeric","metric":"a0"},"threshold":10,"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"granularity":"all","aggregations":[{"type":"count","name":"a0"}]}], signature=[{d0:STRING}])
+```json
+[
+  {
+    "query": {
+      "queryType": "topN",
+      "dataSource": {
+        "type": "join",
+        "left": {
+          "type": "table",
+          "name": "wikipedia"
+        },
+        "right": {
+          "type": "query",
+          "query": {
+            "queryType": "groupBy",
+            "dataSource": {
+              "type": "table",
+              "name": "wikipedia"
+            },
+            "intervals": {
+              "type": "intervals",
+              "intervals": [
+                "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+              ]
+            },
+            "granularity": {
+              "type": "all"
+            },
+            "dimensions": [
+              {
+                "type": "default",
+                "dimension": "page",
+                "outputName": "d0",
+                "outputType": "STRING"
+              }
+            ],
+            "aggregations": [
+              {
+                "type": "count",
+                "name": "a0"
+              }
+            ],
+            "limitSpec": {
+              "type": "default",
+              "columns": [
+                {
+                  "dimension": "a0",
+                  "direction": "descending",
+                  "dimensionOrder": {
+                    "type": "numeric"
+                  }
+                }
+              ],
+              "limit": 10
+            },
+            "context": {
+              "sqlOuterLimit": 101,
+              "sqlQueryId": "ee616a36-c30c-4eae-af00-245127956e42",
+              "useApproximateCountDistinct": false,
+              "useApproximateTopN": false
+            }
+          }
+        },
+        "rightPrefix": "j0.",
+        "condition": "(\"channel\" == \"j0.d0\")",
+        "joinType": "INNER"
+      },
+      "dimension": {
+        "type": "default",
+        "dimension": "channel",
+        "outputName": "d0",
+        "outputType": "STRING"
+      },
+      "metric": {
+        "type": "dimension",
+        "ordering": {
+          "type": "lexicographic"
+        }
+      },
+      "threshold": 101,
+      "intervals": {
+        "type": "intervals",
+        "intervals": [
+          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+        ]
+      },
+      "granularity": {
+        "type": "all"
+      },
+      "aggregations": [
+        {
+          "type": "count",
+          "name": "a0"
+        }
+      ],
+      "context": {
+        "sqlOuterLimit": 101,
+        "sqlQueryId": "ee616a36-c30c-4eae-af00-245127956e42",
+        "useApproximateCountDistinct": false,
+        "useApproximateTopN": false
+      }
+    },
+    "signature": [
+      {
+        "name": "d0",
+        "type": "STRING"
+      },
+      {
+        "name": "a0",
+        "type": "LONG"
+      }
+    ]
+  }
+]
 ```
 
-Here, there is a join with two inputs. The way to read this is to consider each line of the EXPLAIN PLAN output as
-something that might become a query, or might just become a simple datasource. The `query` field they all have is
-called a "partial query" and represents what query would be run on the datasource represented by that line, if that
-line ran by itself. In some cases — like the "scan" query in the second line of this example — the query does not
-actually run, and it ends up being translated to a simple table datasource. See the [Join translation](#joins) section
+In this case the JOIN operator gets translated to a `join` datasource. See the [Join translation](#joins) section
 for more details about how this works.
 
 We can see this for ourselves using Druid's [request logging](../configuration/index.md#request-logging) feature. After
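
To illustrate how the JSON in the `PLAN` column described in the added text can be read, here is a minimal Python sketch that is not part of this commit. It assumes each plan entry has the shape shown in the added example: a `query` object whose `dataSource` is either a plain name or an object with a `type` of `table`, `query`, or `join`. `PLAN_JSON` is a hand-trimmed copy of that example, and `describe_plan`, `describe_query`, and `describe_datasource` are hypothetical helper names, not Druid APIs.

```python
import json

# Hand-trimmed stand-in for the PLAN column value shown in the diff above.
# A real plan carries more fields (intervals, aggregations, context, ...).
PLAN_JSON = """
[
  {
    "query": {
      "queryType": "topN",
      "dataSource": {
        "type": "join",
        "left": {"type": "table", "name": "wikipedia"},
        "right": {
          "type": "query",
          "query": {
            "queryType": "groupBy",
            "dataSource": {"type": "table", "name": "wikipedia"}
          }
        },
        "rightPrefix": "j0.",
        "joinType": "INNER"
      }
    },
    "signature": [
      {"name": "d0", "type": "STRING"},
      {"name": "a0", "type": "LONG"}
    ]
  }
]
"""


def describe_datasource(ds, depth=0):
    """Return indented lines describing a datasource and anything nested in it."""
    pad = "  " * depth
    if isinstance(ds, str):
        # Assumption: a bare string names a table datasource.
        return [f"{pad}table: {ds}"]
    kind = ds["type"]
    if kind == "table":
        return [f"{pad}table: {ds['name']}"]
    if kind == "query":
        return [f"{pad}subquery:"] + describe_query(ds["query"], depth + 1)
    if kind == "join":
        lines = [f"{pad}join ({ds['joinType']}):"]
        lines += describe_datasource(ds["left"], depth + 1)
        lines += describe_datasource(ds["right"], depth + 1)
        return lines
    # Any other datasource type is reported by name only.
    return [f"{pad}{kind}"]


def describe_query(query, depth=0):
    """Return lines naming the native query type and the datasource it reads."""
    pad = "  " * depth
    lines = [f"{pad}{query['queryType']} query on:"]
    return lines + describe_datasource(query["dataSource"], depth + 1)


def describe_plan(plan_json):
    """Summarize every native query in a PLAN JSON array."""
    lines = []
    for entry in json.loads(plan_json):
        lines += describe_query(entry["query"])
    return lines


if __name__ == "__main__":
    print("\n".join(describe_plan(PLAN_JSON)))
```

Run against the trimmed plan, the summary shows the outer `topN` query reading from a `join` datasource whose right side is the `groupBy` subquery, which is the JOIN-to-`join` translation the new paragraph describes.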