mirror of https://github.com/apache/druid.git
Docs: Clarify the situation with SELECT. (#13109)
parent 5745e878ab
commit 4b2f1adecf
@@ -42,15 +42,15 @@ You submit queries to the MSQ task engine using the `POST /druid/v2/sql/task/` e

 #### Request

-Currently, the MSQ task engine ignores the provided values of `resultFormat`, `header`,
-`typesHeader`, and `sqlTypesHeader`. SQL SELECT queries write out their results into the task report (in the `multiStageQuery.payload.results.results` key) formatted as if `resultFormat` is an `array`.
+The SQL task endpoint accepts [SQL requests in the JSON-over-HTTP form](../querying/sql-api.md#request-body) using the
+`query`, `context`, and `parameters` fields, but ignoring the `resultFormat`, `header`, `typesHeader`, and
+`sqlTypesHeader` fields.

-For task queries similar to the [example queries](./examples.md), you need to escape characters such as quotation marks (") if you use something like `curl`.
-You don't need to escape characters if you use a method that can parse JSON seamlessly, such as Python.
-The Python example in this topic escapes quotation marks although it's not required.
+This endpoint accepts [INSERT](reference.md#insert) and [REPLACE](reference.md#replace) statements.

-The following example is the same query that you submit when you complete [Convert a JSON ingestion
-spec](../tutorials/tutorial-msq-convert-spec.md) where you insert data into a table named `wikipedia`.
+As an experimental feature, this endpoint also accepts SELECT queries. SELECT query results are collected from workers
+by the controller, and written into the [task report](#get-the-report-for-a-query-task) as an array of arrays. The
+behavior and result format of plain SELECT queries (without INSERT or REPLACE) is subject to change.

 <!--DOCUSAURUS_CODE_TABS-->

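As a rough illustration of the request behavior described in the hunk above, the following Python sketch builds a JSON body that carries only the `query`, `context`, and `parameters` fields. The helper name `build_task_request`, the sample SQL text, and the `maxNumTasks` context value are assumptions chosen for illustration; they do not come from this commit. Because `json.dumps` performs the serialization, the quotation marks inside the SQL text are escaped automatically.

```python
import json


def build_task_request(query, context=None, parameters=None):
    """Return a JSON string holding only the request fields the endpoint uses.

    Hypothetical helper for illustration; not part of any Druid client library.
    """
    body = {"query": query}
    if context is not None:
        body["context"] = context
    if parameters is not None:
        body["parameters"] = parameters
    # json.dumps escapes the double quotes inside the SQL text automatically.
    return json.dumps(body)


# Sample SQL with quoted identifiers (illustrative; the table name is assumed).
sample_sql = 'SELECT "channel", COUNT(*) AS "cnt" FROM "wikipedia" GROUP BY 1'
request_body = build_task_request(sample_sql, context={"maxNumTasks": 3})
print(request_body)
```

The resulting string could then be sent as the body of a `POST /druid/v2/sql/task/` request with a JSON content type.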
@@ -199,9 +199,12 @@ A report provides detailed information about a query task, including things like

 Keep the following in mind when using the task API to view reports:

-- For SELECT queries, the report includes the results. At this time, if you want to view results for SELECT queries, you need to retrieve them as a generic map from the report and extract the results.
-- The task report stores query details for controller tasks.
-- If you encounter `500 Server Error` or `404 Not Found` errors, the task may be in the process of starting up or shutting down.
+- The task report for an entire job is associated with the `query_controller` task. The `query_worker` tasks do not have
+their own reports; their information is incorporated into the controller report.
+- The task report API may report `404 Not Found` temporarily while the task is in the process of starting up.
+- As an experimental feature, the SQL task engine supports running SELECT queries. SELECT query results are written into
+the `multiStageQuery.payload.results.results` task report key as an array of arrays. The behavior and result format of plain
+SELECT queries (without INSERT or REPLACE) is subject to change.

 For an explanation of the fields in a report, see [Report response fields](#report-response-fields).

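To make the report layout concrete, here is a minimal Python sketch that pulls SELECT results out of an already-parsed report by walking the `multiStageQuery.payload.results.results` key path named in the hunk above. The function name `extract_select_results` and the contents of `sample_report` are made up for illustration. A real report carries many more fields, and the result format is described as subject to change.

```python
def extract_select_results(report):
    """Walk multiStageQuery.payload.results.results in a parsed report.

    Returns the array of arrays, or None if any key on the path is missing.
    Hypothetical helper for illustration only.
    """
    node = report
    for key in ("multiStageQuery", "payload", "results", "results"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Invented, heavily trimmed report used only to exercise the helper.
sample_report = {
    "multiStageQuery": {
        "payload": {
            "results": {
                "results": [["#en.wikipedia", 10], ["#fr.wikipedia", 4]]
            }
        }
    }
}

rows = extract_select_results(sample_report)
for row in rows:
    print(row)
```

With the `requests` example from this topic, `report` would typically be the parsed response, for example `response.json()`. Returning `None` on a missing key keeps the caller from failing while the task is still starting up and the report is incomplete.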
@@ -230,11 +233,8 @@ import requests
 # Make sure you replace `username`, `password`, `your-instance`, `port`, and `taskId` with the values for your deployment.
 url = "https://<username>:<password>@<hostname>:<port>/druid/indexer/v1/task/<taskId>/reports"

-payload={}
 headers = {}
-
-response = requests.request("GET", url, headers=headers, data=payload)
-
+response = requests.request("GET", url, headers=headers)
 print(response.text)
 ```

@@ -29,14 +29,15 @@ sidebar_label: "Key concepts"

 ## SQL task engine

-The `druid-multi-stage-query` extension adds a multi-stage query (MSQ) task engine that executes SQL SELECT,
-[INSERT](reference.md#insert), and [REPLACE](reference.md#replace) statements as batch tasks in the indexing service,
-which execute on [Middle Managers](../design/architecture.md#druid-services). INSERT and REPLACE tasks publish
+The `druid-multi-stage-query` extension adds a multi-stage query (MSQ) task engine that executes SQL statements as batch
+tasks in the indexing service, which execute on [Middle Managers](../design/architecture.md#druid-services).
+[INSERT](reference.md#insert) and [REPLACE](reference.md#replace) tasks publish
 [segments](../design/architecture.md#datasources-and-segments) just like [all other forms of batch
 ingestion](../ingestion/index.md#batch). Each query occupies at least two task slots while running: one controller task,
-and at least one worker task.
+and at least one worker task. As an experimental feature, the MSQ task engine also supports running SELECT queries as
+batch tasks. The behavior and result format of plain SELECT (without INSERT or REPLACE) is subject to change.

-You can execute queries using the MSQ task engine through the **Query** view in the [web
+You can execute SQL statements using the MSQ task engine through the **Query** view in the [web
 console](../operations/web-console.md) or through the [`/druid/v2/sql/task` API](api.md).

 For more details on how SQL queries are executed using the MSQ task engine, see [multi-stage query

@@ -30,11 +30,12 @@ description: Introduces multi-stage query architecture and its task engine

 Apache Druid supports SQL-based ingestion using the bundled [`druid-multi-stage-query` extension](#load-the-extension).
 This extension adds a [multi-stage query task engine for SQL](concepts.md#sql-task-engine) that allows running SQL
-[INSERT](concepts.md#insert) and [REPLACE](concepts.md#replace) statements as batch tasks.
+[INSERT](concepts.md#insert) and [REPLACE](concepts.md#replace) statements as batch tasks. As an experimental feature,
+the task engine also supports running SELECT queries as batch tasks.

-Nearly all SELECT capabilities are available for `INSERT ... SELECT` and `REPLACE ... SELECT` queries, with certain
-exceptions listed on the [Known issues](./known-issues.md#select) page. This allows great flexibility to apply
-transformations, filters, JOINs, aggregations, and so on while ingesting data. This also allows in-database
+Nearly all SELECT capabilities are available in the SQL task engine, with certain exceptions listed on the [Known
+issues](./known-issues.md#select) page. This allows great flexibility to apply transformations, filters, JOINs,
+aggregations, and so on as part of `INSERT ... SELECT` and `REPLACE ... SELECT` statements. This also allows in-database
 transformation: creating new tables based on queries of other tables.

 ## Vocabulary