Update nested columns doc (#13314)

* Updated nested columns doc

* Update nested-columns.md

* Update nested-columns.md
Jill Osborne 2022-11-10 01:53:28 +00:00 committed by GitHub
parent 0512ae4922
commit 965e41538e
1 changed files with 78 additions and 0 deletions


@@ -217,6 +217,84 @@ FROM (
PARTITIONED BY ALL
```

## Streaming ingestion

You can ingest nested data into Druid using the [streaming method](../ingestion/index.md#streaming), for example, from a Kafka topic.

When you [define your supervisor spec](../development/extensions-core/kafka-ingestion.md#define-a-supervisor-spec), include a dimension with type `json` for each nested column. For example, the following supervisor spec from the [Kafka ingestion tutorial](../tutorials/tutorial-kafka.md) contains dimensions for the nested columns `event`, `agent`, and `geo_ip` in datasource `kttm-kafka`.

```json
{
  "type": "kafka",
  "spec": {
    "ioConfig": {
      "type": "kafka",
      "consumerProperties": {
        "bootstrap.servers": "localhost:9092"
      },
      "topic": "kttm",
      "inputFormat": {
        "type": "json"
      },
      "useEarliestOffset": true
    },
    "tuningConfig": {
      "type": "kafka"
    },
    "dataSchema": {
      "dataSource": "kttm-kafka",
      "timestampSpec": {
        "column": "timestamp",
        "format": "iso"
      },
      "dimensionsSpec": {
        "dimensions": [
          "session",
          "number",
          "client_ip",
          "language",
          "adblock_list",
          "app_version",
          "path",
          "loaded_image",
          "referrer",
          "referrer_host",
          "server_ip",
          "screen",
          "window",
          {
            "type": "long",
            "name": "session_length"
          },
          "timezone",
          "timezone_offset",
          {
            "type": "json",
            "name": "event"
          },
          {
            "type": "json",
            "name": "agent"
          },
          {
            "type": "json",
            "name": "geo_ip"
          }
        ]
      },
      "granularitySpec": {
        "queryGranularity": "none",
        "rollup": false,
        "segmentGranularity": "day"
      }
    }
  }
}
```
The [Kafka tutorial](../tutorials/tutorial-kafka.md) guides you through the steps to load sample nested data into a Kafka topic, then ingest the data into Druid.
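
If you generate supervisor specs programmatically, the nested-column rule above (one dimension of type `json` per nested column) can be sketched roughly as follows. This is an illustrative Python sketch, not part of Druid: the helper names `build_dimensions` and `nested_columns` are hypothetical, and the spec is a trimmed-down stand-in for the full Kafka example.

```python
import json


def build_dimensions(plain, nested):
    """Build a dimensionsSpec "dimensions" list (hypothetical helper).

    Plain dimensions stay as bare strings; each nested column becomes
    an object with type "json", as in the supervisor spec above.
    """
    return list(plain) + [{"type": "json", "name": name} for name in nested]


def nested_columns(spec):
    """Return the names of dimensions declared with type "json" (hypothetical helper)."""
    dims = spec["spec"]["dataSchema"]["dimensionsSpec"]["dimensions"]
    return [d["name"] for d in dims if isinstance(d, dict) and d.get("type") == "json"]


# Trimmed-down spec for illustration only; a real spec also needs
# ioConfig, tuningConfig, timestampSpec, and granularitySpec.
spec = {
    "type": "kafka",
    "spec": {
        "dataSchema": {
            "dataSource": "kttm-kafka",
            "dimensionsSpec": {
                "dimensions": build_dimensions(
                    ["session", "path"], ["event", "agent", "geo_ip"]
                )
            },
        }
    },
}

# List the nested columns, then show the generated dimensionsSpec as JSON.
print(nested_columns(spec))
print(json.dumps(spec["spec"]["dataSchema"]["dimensionsSpec"], indent=2))
```

A check like `nested_columns` can help catch a nested field that was accidentally listed as a plain string dimension before the spec is submitted.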

### Transform data during SQL-based ingestion

You can use the [JSON nested columns functions](./sql-json-functions.md) to transform JSON data in your ingestion query.