Translate the Kafka tutorial: Querying your data, Cleanup, Further reading
This commit is contained in: parent 7c25345065, commit 73fa800ec1
You can also view the current supervisors and tasks from the Druid console. For a local server, the address is [http://localhost:8888/unified-console.html#tasks](http://localhost:8888/unified-console.html#tasks).
## Querying your data
After data is sent to the Kafka stream, it is immediately available for querying.
Please follow the [query tutorial](../tutorials/tutorial-query.md) to run some example queries on the newly loaded data.
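As a minimal illustration of such a query, a Druid SQL statement can be posted to the quickstart router's SQL endpoint. The SQL text below is illustrative, not taken from the query tutorial; the payload is written to a file first so it can be inspected before the cluster is involved:

```shell
# A sample Druid SQL query against the "wikipedia" datasource loaded above.
# The SQL text is illustrative and assumes the quickstart router on localhost:8888.
cat > /tmp/wikipedia-query.json <<'EOF'
{"query": "SELECT page, COUNT(*) AS edits FROM wikipedia GROUP BY page ORDER BY edits DESC LIMIT 5"}
EOF

# With the quickstart cluster running, submit it to the router's SQL endpoint:
# curl -XPOST -H'Content-Type: application/json' -d @/tmp/wikipedia-query.json http://localhost:8888/druid/v2/sql
cat /tmp/wikipedia-query.json
```

The response is a JSON array of result rows; an empty array usually means the supervisor's tasks have not yet published or served any data.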
## Cleanup
To go through any of the other ingestion tutorials, you will need to shut down the cluster and reset the cluster state by removing the contents of the `var` directory in the Druid home, as the other tutorials will write to the same "wikipedia" datasource. If you use a different datasource, this cleanup is not necessary.

You should additionally clear out any Kafka state. Do so by shutting down the Kafka broker with <kbd>CTRL-C</kbd> before stopping ZooKeeper and the Druid services, and then deleting the Kafka log directory at `/tmp/kafka-logs`:
```bash
rm -rf /tmp/kafka-logs
```
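Put together, the reset between tutorials can be sketched as below. The `DRUID_HOME` default is an assumption of this sketch, not something the tutorial defines; point it at your actual installation:

```shell
# Stop the Druid services first (CTRL-C in the terminal running them),
# then wipe the cluster state. DRUID_HOME is an assumed path -- adjust it.
DRUID_HOME="${DRUID_HOME:-$HOME/apache-druid}"
rm -rf "$DRUID_HOME/var"      # segments, task logs, local metadata from this tutorial

# Kafka's state lives separately; after stopping the broker with CTRL-C:
rm -rf /tmp/kafka-logs
```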
## Further reading
For more information on loading data from Kafka streams, please see the [Druid Kafka indexing service documentation](../development/extensions-core/kafka-ingestion.md).
#### Submit a supervisor via the console
In the console, click `Submit supervisor` to open the supervisor submission dialog:
![](img-2/tutorial-kafka-submit-supervisor-01.png)
Paste the following spec and click `Submit`:
```json
{
  "type": "kafka",
  "spec" : {
    "dataSchema": {
      "dataSource": "wikipedia",
      "timestampSpec": {
        "column": "time",
        "format": "auto"
      },
      "dimensionsSpec": {
        "dimensions": [
          "channel",
          "cityName",
          "comment",
          "countryIsoCode",
          "countryName",
          "isAnonymous",
          "isMinor",
          "isNew",
          "isRobot",
          "isUnpatrolled",
          "metroCode",
          "namespace",
          "page",
          "regionIsoCode",
          "regionName",
          "user",
          { "name": "added", "type": "long" },
          { "name": "deleted", "type": "long" },
          { "name": "delta", "type": "long" }
        ]
      },
      "metricsSpec" : [],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "NONE",
        "rollup": false
      }
    },
    "tuningConfig": {
      "type": "kafka",
      "reportParseExceptions": false
    },
    "ioConfig": {
      "topic": "wikipedia",
      "inputFormat": {
        "type": "json"
      },
      "replicas": 2,
      "taskDuration": "PT10M",
      "completionTimeout": "PT20M",
      "consumerProperties": {
        "bootstrap.servers": "localhost:9092"
      }
    }
  }
}
```
This launches the supervisor, which in turn spawns tasks that begin listening for incoming data.
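As a quick sanity check of the spec above, the event below is shaped the way its `timestampSpec` (column `time`) and JSON `inputFormat` expect; the field values are made up for illustration and are not from the tutorial's dataset. The producer invocation is commented out because it needs the Kafka broker from this tutorial to be running:

```shell
# An illustrative event with a "time" column plus some of the declared
# dimensions (values are invented for this sketch).
EVENT='{"time":"2015-09-12T00:46:58.771Z","channel":"#en.wikipedia","page":"Druid","user":"demo","added":17,"deleted":0,"delta":17,"isRobot":false}'

# Confirm it is valid JSON before sending it anywhere:
echo "$EVENT" | python3 -m json.tool > /dev/null && echo "event is valid JSON"

# With the tutorial's broker running, it could be produced to the topic like so:
# echo "$EVENT" | ./bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic wikipedia
```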
#### Submit a supervisor directly
To start the service directly, run the following command from the Druid root directory to submit a supervisor spec to the Druid Overlord:
```bash
curl -XPOST -H'Content-Type: application/json' -d @quickstart/tutorial/wikipedia-kafka-supervisor.json http://localhost:8081/druid/indexer/v1/supervisor
```
If the supervisor is created successfully, an ID is returned; in this case you should see `{"id":"wikipedia"}`.
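To confirm the supervisor is running, the Overlord's supervisor API can be polled. The sketch below assumes the tutorial's local Overlord on port 8081; the fallback message is only there for when the cluster is not up:

```shell
OVERLORD=http://localhost:8081

# List all supervisor IDs -- expect ["wikipedia"] after the submission above:
curl -s --max-time 2 "$OVERLORD/druid/indexer/v1/supervisor" \
  || echo "Overlord not reachable -- is the cluster running?"

# Detailed state of the wikipedia supervisor:
curl -s --max-time 2 "$OVERLORD/druid/indexer/v1/supervisor/wikipedia/status" \
  || echo "Overlord not reachable -- is the cluster running?"
```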
For more details, see the [Druid Kafka indexing service documentation](../ingestion/kafka.md).
You can view the existing supervisors and tasks in the [Druid console](http://localhost:8888/unified-console.html#tasks).