mirror of https://github.com/apache/druid.git
fix docs (#7556)
parent 2a65431b08
commit 09b7700d13
@ -33,17 +33,20 @@ Note: `druid-parquet-extensions` depends on the `druid-avro-extensions` module,
## Parquet Hadoop Parser

This extension provides two ways to parse Parquet files:

* `parquet` - using a simple conversion contained within this extension
* `parquet-avro` - conversion to avro records with the `parquet-avro` library and using the `druid-avro-extensions`
module to parse the avro data

Selection of conversion method is controlled by parser type, and the correct hadoop input format must also be set in
-the `ioConfig`, `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet` and
-`org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro`.
+the `ioConfig`:
+
+* `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet`
+* `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro`
+

Both parse options support auto field discovery and flattening if provided with a
-[flattenSpec](../../ingestion/flatten-json.html) with `parquet` or `avro` as the `format`. Parquet nested list and map
+[flattenSpec](../../ingestion/flatten-json.html) with `parquet` or `avro` as the format. Parquet nested list and map
[logical types](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md) _should_ operate correctly with
json path expressions for all supported types. `parquet-avro` sets a hadoop job property
`parquet.avro.add-list-element-records` to `false` (which normally defaults to `true`), in order to 'unwrap' primitive
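
For readers of this hunk, the pairing it describes (parser type in the `dataSchema`, matching input format in the `ioConfig`) can be sketched as a spec fragment. This is an illustrative sketch, not part of the commit: the overall spec layout is assumed from the Hadoop batch ingestion format, and the data source name, file path, timestamp column, and empty dimension list are hypothetical placeholders. Only the `parquet` parser type, the `parquet` flattenSpec format, and the input format class name come from the text above.

```json
{
  "type": "index_hadoop",
  "spec": {
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "inputFormat": "org.apache.druid.data.input.parquet.DruidParquetInputFormat",
        "paths": "path/to/example.parquet"
      }
    },
    "dataSchema": {
      "dataSource": "example_datasource",
      "parser": {
        "type": "parquet",
        "parseSpec": {
          "format": "parquet",
          "flattenSpec": {
            "useFieldDiscovery": true
          },
          "timestampSpec": {
            "column": "timestamp",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": []
          }
        }
      }
    }
  }
}
```

For the `parquet-avro` route, the same sketch would presumably use `"type": "parquet-avro"`, `"format": "avro"`, and `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` as the `inputFormat`, per the list in the diff. Mixing a parser type with the other input format is the misconfiguration the reworded passage is warning about.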