Docs: Clarify the situation with SELECT. (#13109)

2022-09-17 10:47:57 -07:00 · 2022-09-17 10:47:57 -07:00 · 4b2f1adecf
parent 5745e878ab
commit 4b2f1adecf
3 changed files with 25 additions and 23 deletions
--- a/docs/multi-stage-query/api.md
+++ b/docs/multi-stage-query/api.md
@@ -42,15 +42,15 @@ You submit queries to the MSQ task engine using the `POST /druid/v2/sql/task/` e

 #### Request

-Currently, the MSQ task engine ignores the provided values of `resultFormat`, `header`,
-`typesHeader`, and `sqlTypesHeader`. SQL SELECT queries write out their results into the task report (in the `multiStageQuery.payload.results.results` key) formatted as if `resultFormat` is an `array`.
+The SQL task endpoint accepts [SQL requests in the JSON-over-HTTP form](../querying/sql-api.md#request-body) using the
+`query`, `context`, and `parameters` fields. It ignores the `resultFormat`, `header`, `typesHeader`, and
+`sqlTypesHeader` fields.

-For task queries similar to the [example queries](./examples.md), you need to escape characters such as quotation marks (") if you use something like `curl`. 
-You don't need to escape characters if you use a method that can parse JSON seamlessly, such as Python.
-The Python example in this topic escapes quotation marks although it's not required.
+This endpoint accepts [INSERT](reference.md#insert) and [REPLACE](reference.md#replace) statements.

-The following example is the same query that you submit when you complete [Convert a JSON ingestion
-spec](../tutorials/tutorial-msq-convert-spec.md) where you insert data into a table named `wikipedia`. 
+As an experimental feature, this endpoint also accepts SELECT queries. The controller collects SELECT query results from
+the workers and writes them into the [task report](#get-the-report-for-a-query-task) as an array of arrays. The
+behavior and result format of plain SELECT queries (without INSERT or REPLACE) are subject to change.

 <!--DOCUSAURUS_CODE_TABS-->

@@ -199,9 +199,12 @@ A report provides detailed information about a query task, including things like

 Keep the following in mind when using the task API to view reports:

- For SELECT queries, the report includes the results. At this time, if you want to view results for SELECT queries, you need to retrieve them as a generic map from the report and extract the results.
- The task report stores query details for controller tasks.
- If you encounter `500 Server Error` or `404 Not Found` errors, the task may be in the process of starting up or shutting down.
+- The task report for an entire job is associated with the `query_controller` task. The `query_worker` tasks do not have
+  their own reports; their information is incorporated into the controller report.
+- The task report API may report `404 Not Found` temporarily while the task is in the process of starting up.
+- As an experimental feature, the SQL task engine supports running SELECT queries. SELECT query results are written into
+  the `multiStageQuery.payload.results.results` task report key as an array of arrays. The behavior and result format of
+  plain SELECT queries (without INSERT or REPLACE) are subject to change.

 For an explanation of the fields in a report, see [Report response fields](#report-response-fields).

@@ -230,11 +233,8 @@ import requests
 # Make sure you replace `username`, `password`, `your-instance`, `port`, and `taskId` with the values for your deployment.
 url = "https://<username>:<password>@<hostname>:<port>/druid/indexer/v1/task/<taskId>/reports"

-payload={}
 headers = {}
-
-response = requests.request("GET", url, headers=headers, data=payload)
-
+response = requests.request("GET", url, headers=headers)
 print(response.text)
 ```

--- a/docs/multi-stage-query/concepts.md
+++ b/docs/multi-stage-query/concepts.md
@@ -29,14 +29,15 @@ sidebar_label: "Key concepts"

 ## SQL task engine

-The `druid-multi-stage-query` extension adds a multi-stage query (MSQ) task engine that executes SQL SELECT,
-[INSERT](reference.md#insert), and [REPLACE](reference.md#replace) statements as batch tasks in the indexing service,
-which execute on [Middle Managers](../design/architecture.md#druid-services). INSERT and REPLACE tasks publish
+The `druid-multi-stage-query` extension adds a multi-stage query (MSQ) task engine that executes SQL statements as batch
+tasks in the indexing service, which execute on [Middle Managers](../design/architecture.md#druid-services).
+[INSERT](reference.md#insert) and [REPLACE](reference.md#replace) tasks publish
 [segments](../design/architecture.md#datasources-and-segments) just like [all other forms of batch
 ingestion](../ingestion/index.md#batch). Each query occupies at least two task slots while running: one controller task,
-and at least one worker task.
+and at least one worker task. As an experimental feature, the MSQ task engine also supports running SELECT queries as
+batch tasks. The behavior and result format of plain SELECT queries (without INSERT or REPLACE) are subject to change.

-You can execute queries using the MSQ task engine through the **Query** view in the [web
+You can execute SQL statements using the MSQ task engine through the **Query** view in the [web
 console](../operations/web-console.md) or through the [`/druid/v2/sql/task` API](api.md).

 For more details on how SQL queries are executed using the MSQ task engine, see [multi-stage query
--- a/docs/multi-stage-query/index.md
+++ b/docs/multi-stage-query/index.md
@@ -30,11 +30,12 @@ description: Introduces multi-stage query architecture and its task engine

 Apache Druid supports SQL-based ingestion using the bundled [`druid-multi-stage-query` extension](#load-the-extension).
 This extension adds a [multi-stage query task engine for SQL](concepts.md#sql-task-engine) that allows running SQL
-[INSERT](concepts.md#insert) and [REPLACE](concepts.md#replace) statements as batch tasks.
+[INSERT](concepts.md#insert) and [REPLACE](concepts.md#replace) statements as batch tasks. As an experimental feature,
+the task engine also supports running SELECT queries as batch tasks.

-Nearly all SELECT capabilities are available for `INSERT ... SELECT` and `REPLACE ... SELECT` queries, with certain
-exceptions listed on the [Known issues](./known-issues.md#select) page. This allows great flexibility to apply
-transformations, filters, JOINs, aggregations, and so on while ingesting data. This also allows in-database
+Nearly all SELECT capabilities are available in the SQL task engine, with certain exceptions listed on the [Known
+issues](./known-issues.md#select) page. This allows great flexibility to apply transformations, filters, JOINs,
+aggregations, and so on as part of `INSERT ... SELECT` and `REPLACE ... SELECT` statements. This also allows in-database
 transformation: creating new tables based on queries of other tables.

 ## Vocabulary
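
Reviewer note: the `api.md` changes above say that the task endpoint reads only the `query`, `context`, and `parameters` fields, and that SELECT results land in the `multiStageQuery.payload.results.results` report key as an array of arrays. The sketch below illustrates both points without touching a live cluster. The helper names `build_task_request` and `extract_select_results` are hypothetical, and the nested-dictionary shape of `sample_report` is an assumption inferred from the dotted key path in the docs, not a captured report. Because the docs mark SELECT support as experimental, the result layout may change.

```python
import json
from typing import Any, Optional

# Key path named in the updated docs; assumed here to map to nested JSON objects.
RESULTS_PATH = ("multiStageQuery", "payload", "results", "results")


def build_task_request(
    query: str,
    context: Optional[dict] = None,
    parameters: Optional[list] = None,
) -> str:
    """Build a JSON body that uses only the fields the task endpoint reads.

    `resultFormat`, `header`, `typesHeader`, and `sqlTypesHeader` are left out
    because the docs state that the endpoint ignores them.
    """
    body: dict = {"query": query}
    if context:
        body["context"] = context
    if parameters:
        body["parameters"] = parameters
    return json.dumps(body)


def extract_select_results(report: Any) -> list:
    """Walk RESULTS_PATH and return the rows, or [] if any key is missing."""
    node = report
    for key in RESULTS_PATH:
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    return node if isinstance(node, list) else []


# Hypothetical report fragment, for illustration only.
sample_report = {
    "multiStageQuery": {
        "payload": {
            "results": {
                "results": [["a", 1], ["b", 2]],
            }
        }
    }
}

rows = extract_select_results(sample_report)
for row in rows:
    print(row)
```

In practice the dictionary passed to `extract_select_results` would come from parsing the body returned by the reports endpoint shown in the Python example above, for instance with `response.json()`.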