druid/parquet.md at 8885805bb397f4b12647ffeecb7f8e532b838ed1

mirror of https://github.com/apache/druid.git synced 2025-02-07 10:38:18 +00:00

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Victoria Lim <lim.t.victoria@gmail.com>

2023-05-19 09:42:27 -07:00


---
id: parquet
title: "Apache Parquet Extension"
---

This Apache Druid module extends Druid's Hadoop-based indexing to ingest data directly from offline Apache Parquet files.

Note: If you use the `parquet-avro` parser for Apache Hadoop-based indexing, `druid-parquet-extensions` depends on the `druid-avro-extensions` module, so be sure to include both.
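As a rough illustration of the dependency noted above, the following Python sketch renders a `druid.extensions.loadList` property line. It assumes extensions are enabled through that property in the cluster's common runtime properties; the helper name `load_list_property` is hypothetical and not part of Druid.

```python
import json


def load_list_property(use_parquet_avro):
    """Render a druid.extensions.loadList line (hypothetical helper).

    The parquet-avro parser needs druid-avro-extensions in addition to
    druid-parquet-extensions, so both are listed in that case.
    """
    extensions = ["druid-parquet-extensions"]
    if use_parquet_avro:
        extensions.append("druid-avro-extensions")
    # json.dumps produces the JSON array syntax the property expects.
    return "druid.extensions.loadList=" + json.dumps(extensions)


print(load_list_property(use_parquet_avro=True))
# prints: druid.extensions.loadList=["druid-parquet-extensions", "druid-avro-extensions"]
```

If other extensions are already in the load list, the entries would be appended to the existing array rather than replacing it.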

The `druid-parquet-extensions` module provides three components: the Parquet input format, the Parquet Hadoop parser, and, together with `druid-avro-extensions`, the Parquet Avro Hadoop parser. The Parquet input format is for native batch ingestion; the two parsers are for Hadoop batch ingestion. See the corresponding documentation for details.
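For native batch ingestion, the Parquet input format is selected in the `ioConfig` of the ingestion spec. The sketch below builds a minimal `ioConfig` fragment in Python and prints it as JSON. It assumes a `local` input source; the function name, directory path, and file filter are illustrative placeholders, and a complete spec also needs a `dataSchema` and optionally a `tuningConfig`.

```python
import json


def native_parquet_io_config(base_dir, file_filter="*.parquet"):
    """Build an ioConfig fragment for native batch ingestion (hypothetical helper).

    Only the inputFormat type is specific to this extension; the input
    source shown here is an assumption chosen for illustration.
    """
    return {
        "type": "index_parallel",
        "inputSource": {
            "type": "local",
            "baseDir": base_dir,      # placeholder directory
            "filter": file_filter,    # placeholder glob for Parquet files
        },
        # Selects the Parquet input format from druid-parquet-extensions.
        "inputFormat": {"type": "parquet"},
    }


io_config = native_parquet_io_config("/path/to/parquet/files")
print(json.dumps(io_config, indent=2))
```

The Hadoop parsers are configured differently, inside a Hadoop ingestion spec, and are not covered by this fragment.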
