druid

Commit Graph

Author	SHA1	Message	Date
Abhishek Agarwal	efb32810c4	Clean up the core API required for Iceberg extension (#14614) Changes: - Replace `AbstractInputSourceBuilder` with `InputSourceFactory` - Move iceberg specific logic to `IcebergInputSource`	2023-07-21 13:01:33 +05:30
Kashif Faraz	993d8a9bf6	Bump up version in iceberg pom (#14605)	2023-07-18 18:07:19 +05:30
Atul Mohan	03d6d395a0	Extension to read and ingest iceberg data files (#14329) This adds a new contrib extension: druid-iceberg-extensions which can be used to ingest data stored in Apache Iceberg format. It adds a new input source of type iceberg that connects to a catalog and retrieves the data files associated with an iceberg table and provides these data file paths to either an S3 or HDFS input source depending on the warehouse location. Two important dependencies associated with Apache Iceberg tables are: Catalog: This extension supports reading from either a Hive Metastore catalog or a Local file-based catalog. Support for AWS Glue is not available yet. Warehouse: This extension supports reading data files from either HDFS or S3. Adapters for other cloud object locations should be easy to add by extending the AbstractInputSourceAdapter.	2023-07-18 08:59:57 +05:30
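
As a rough illustration of the flow described in #14329 (data file paths retrieved from the catalog are handed to either an S3 or an HDFS input source depending on the warehouse location), the sketch below picks a delegate type from a path's URI scheme. The class `WarehouseDispatchSketch`, the method `delegateTypeFor`, the sample paths, and the scheme-to-type mapping are all hypothetical; none of them are taken from the Druid codebase, and the real extension may decide this differently.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not Druid code: choose which kind of input source
// would read a data file, based only on the scheme of its path.
final class WarehouseDispatchSketch {

    // Assumed mapping: s3/s3a/s3n -> "s3", hdfs -> "hdfs"; anything else is rejected.
    static String delegateTypeFor(String dataFilePath) {
        String scheme = URI.create(dataFilePath).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("No scheme in path: " + dataFilePath);
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "s3":
            case "s3a":
            case "s3n":
                return "s3";
            case "hdfs":
                return "hdfs";
            default:
                throw new IllegalArgumentException("Unsupported warehouse scheme: " + scheme);
        }
    }

    public static void main(String[] args) {
        // Made-up paths standing in for what a catalog scan might return.
        List<String> paths = List.of(
            "s3://example-bucket/warehouse/db/tbl/data/00000.parquet",
            "hdfs://namenode:8020/warehouse/db/tbl/data/00001.parquet"
        );
        for (String path : paths) {
            System.out.println(delegateTypeFor(path) + " <- " + path);
        }
    }
}
```

Rejecting unknown schemes outright keeps the sketch honest about the commit's stated scope: only HDFS and S3 warehouses are supported at this point.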

3 Commits