Translate the documentation on how to run queries in Druid

YuCheng Hu 2021-07-31 11:22:42 -04:00
parent 0842729271
commit 7219b643e1
GPG Key ID: C395DC68EF030B59
1 changed file with 44 additions and 73 deletions


@@ -1,76 +1,47 @@
----
-id: tutorial-query
-title: "Tutorial: Querying data"
-sidebar_label: "Querying data"
----
-<!--
-~ Licensed to the Apache Software Foundation (ASF) under one
-~ or more contributor license agreements. See the NOTICE file
-~ distributed with this work for additional information
-~ regarding copyright ownership. The ASF licenses this file
-~ to you under the Apache License, Version 2.0 (the
-~ "License"); you may not use this file except in compliance
-~ with the License. You may obtain a copy of the License at
-~
-~ http://www.apache.org/licenses/LICENSE-2.0
-~
-~ Unless required by applicable law or agreed to in writing,
-~ software distributed under the License is distributed on an
-~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-~ KIND, either express or implied. See the License for the
-~ specific language governing permissions and limitations
-~ under the License.
--->
-This tutorial demonstrates how to query data in Apache Druid using SQL.
-It assumes that you've completed the [Quickstart](../tutorials/index.md)
-or one of the following tutorials, since we'll query datasources that you would have created
-by following one of them:
-* [Tutorial: Loading a file](../tutorials/tutorial-batch.md)
-* [Tutorial: Loading stream data from Kafka](../tutorials/tutorial-kafka.md)
-* [Tutorial: Loading a file using Hadoop](../tutorials/tutorial-batch-hadoop.md)
-There are various ways to run Druid SQL queries: from the Druid console, using a command line utility
-and by posting the query by HTTP. We'll look at each of these.
-## Query SQL from the Druid console
-The Druid console includes a view that makes it easier to build and test queries, and
-view their results.
-1. Start up the Druid cluster, if it's not already running, and open the Druid console in your web
-browser.
-2. Click **Query** from the header to open the Query view:
+# Querying data
+This tutorial explains how to query data in Apache Druid using SQL.
+It assumes that you have completed the [Quickstart](../tutorials/index.md) or one of the tutorials below, because you need to
+load data into Druid before you can query it:
+* [Tutorial: Loading a file](../tutorials/tutorial-batch.md)
+* [Tutorial: Loading stream data from Kafka](../tutorials/tutorial-kafka.md)
+* [Tutorial: Loading a file using Hadoop](../tutorials/tutorial-batch-hadoop.md)
+There are several ways to run SQL queries in Druid: from the Druid console, with a command line utility, or over HTTP.
+The sections below describe each of these.
+## Querying from the Druid console
+The Druid console provides a view that makes it easier to test queries in Druid and to view their results.
+1. If Druid is not already running, start the Druid cluster, then open the Druid console in your web browser.
+2. Click **Query** in the top navigation to open the Query view:
 ![Query view](../assets/tutorial-query-01.png "Query view")
-You can always write queries directly in the edit pane, but the Query view also provides
-facilities to help you construct SQL queries, which we will use to generate a starter query.
-3. Expand the wikipedia datasource tree in the left pane. We'll
-create a query for the page dimension.
-4. Click `page` and then **Show:page** from the menu:
+You can write queries and scripts directly in the editor; the Query view also provides auto-completion to help you write the query you need.
+3. Expand the wikipedia datasource in the left pane. We will query the page dimension of this datasource.
+4. Click `page`, then choose **Show:page** from the menu:
 ![Query select page](../assets/tutorial-query-02.png "Query select page")
-A SELECT query appears in the query edit pane and immediately runs. However, in this case, the query
-returns no data, since by default the query filters for data from the last day, while our data is considerably
-older than that. Let's remove the filter.
-5. In the datasource tree, click `__time` and **Remove Filter**.
+A SELECT query appears in the query editor. However, running it now returns no data, because by default the query filters for data from the last day,
+while our data is considerably older than that. So we need to remove this filter.
+5. In the datasource tree, click `__time` and select **Remove Filter**.
 ![Clear WHERE filter](../assets/tutorial-query-03.png "Clear WHERE filter")
-6. Click **Run** to run the query.
-You should now see two columns of data, a page name and the count:
+6. Click **Run** to run the query.
+You should now see two columns of data in the results pane: a page name and a count:
 ![Query results](../assets/tutorial-query-04.png "Query results")
@@ -111,7 +82,7 @@ returns the number of edits for the page. Make the same column name change in th
 8. Click the countryName dimension in the left pane and choose the first filtering option. It's not exactly what we want, but
 we'll edit it by hand. The new WHERE clause should appear in your query.
-8. Modify the WHERE clause to exclude results that do not have a value for countryName:
+9. Modify the WHERE clause to exclude results that do not have a value for countryName:
 ```sql
 WHERE "countryName" IS NOT NULL
@@ -120,29 +91,29 @@ we'll edit it by hand. The new WHERE clause should appear in your query.
 ![Finished query](../assets/tutorial-query-035.png "Finished query")
-9. Under the covers, every Druid SQL query is translated into a query in the JSON-based _Druid native query_ format before it runs
+10. Under the covers, every Druid SQL query is translated into a query in the JSON-based _Druid native query_ format before it runs
 on data nodes. You can view the native query for this query by clicking `...` and **Explain SQL Query**.
 While you can use Druid SQL for most purposes, familiarity with native query is useful for composing complex queries and for troubleshooting
 performance issues. For more information, see [Native queries](../querying/querying.md).
 ![Explain query](../assets/tutorial-query-06.png "Explain query")
 > Another way to view the explain plan is by adding EXPLAIN PLAN FOR to the front of your query, as follows:
 >
 >```sql
 >EXPLAIN PLAN FOR
 >SELECT
 > "page",
 > "countryName",
 > COUNT(*) AS "Edits"
 >FROM "wikipedia"
 >WHERE "countryName" IS NOT NULL
 >GROUP BY 1, 2
 >ORDER BY "Edits" DESC
 >```
 >This is particularly useful when running queries
 from the command line or over HTTP.
 11. Finally, click `...` and **Edit context** to see how you can add parameters that control the execution of the query. In the field, enter query context options as JSON key-value pairs, as described in [Context flags](../querying/query-context.md).
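
The last two steps (prefixing a query with EXPLAIN PLAN FOR, and passing context options as JSON key-value pairs) also apply when the query is sent over HTTP. The following is a minimal Python sketch of that, not part of this commit. It assumes a local quickstart cluster whose router listens on `localhost:8888` and the SQL endpoint `/druid/v2/sql`. The helper names `build_sql_request` and `run_query` are hypothetical, and the `timeout` context value is only an illustrative choice.

```python
import json
import urllib.request

# Assumption: default router address of a local quickstart deployment.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"

# The finished query from the tutorial steps above.
QUERY = """SELECT
  "page",
  "countryName",
  COUNT(*) AS "Edits"
FROM "wikipedia"
WHERE "countryName" IS NOT NULL
GROUP BY 1, 2
ORDER BY "Edits" DESC"""


def build_sql_request(query, context=None, explain=False):
    """Build (but do not send) an HTTP POST request for a Druid SQL query.

    Hypothetical helper: `context` is a dict of query context options,
    and `explain=True` prefixes the query with EXPLAIN PLAN FOR.
    """
    sql = "EXPLAIN PLAN FOR\n" + query if explain else query
    body = {"query": sql}
    if context:
        # Context options travel as JSON key-value pairs next to the query.
        body["context"] = dict(context)
    return urllib.request.Request(
        DRUID_SQL_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(request):
    """Send the request and decode the JSON response (needs a running cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a cluster running, `run_query(build_sql_request(QUERY, context={"timeout": 30000}, explain=True))` would return the explain plan rather than the query results. Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.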