druid/docs
Didip Kerabat 6ddb828c7a
Able to filter Cloud objects with glob notation. (#12659)
In a heterogeneous environment, you sometimes have no control over the input folder layout: upstream systems can write to whatever folders they want. In that situation, S3InputSource.java is unusable.

Most people, myself included, have worked around this by using Airflow to fetch the full list of Parquet files and pass it to Druid. But doing that explodes the JSON spec: in one case a single JSON spec grew to 16 MB, which is simply too much for the Overlord.

This patch allows users to pass {"filter": "*.parquet"} and let Druid perform the filtering of the input files.
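As a rough illustration, the new property might sit in an S3 input source spec like this (the bucket and prefix below are placeholders; only the "filter" key comes from this patch):

```json
{
  "inputSource": {
    "type": "s3",
    "prefixes": ["s3://example-bucket/upstream-folder/"],
    "filter": "*.parquet"
  }
}
```

With a spec like this, the folder listing stays server-side and the spec no longer needs to enumerate every file.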

I am using glob notation to be consistent with the LocalFirehose syntax.
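The intended matching behavior can be sketched with Python's fnmatch as a stand-in for glob matching (the object keys below are hypothetical; this is an illustration of the filtering idea, not Druid's actual implementation):

```python
from fnmatch import fnmatch

# Hypothetical object keys listed under an S3 prefix.
keys = [
    "data/2022-06-01/part-0000.parquet",
    "data/2022-06-01/_SUCCESS",
    "data/2022-06-01/part-0001.parquet",
    "data/2022-06-01/notes.txt",
]

# Keep only objects whose filename matches the glob, mirroring
# a {"filter": "*.parquet"} setting on the input source.
pattern = "*.parquet"
matched = [k for k in keys if fnmatch(k.rsplit("/", 1)[-1], pattern)]
print(matched)
```

Only the two `.parquet` objects survive the filter; bookkeeping files such as `_SUCCESS` are dropped before ingestion.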
2022-06-24 11:40:08 +05:30
_bin           De-incubation cleanup in code, docs, packaging (#9108)     2020-01-03 12:33:19 -05:00
assets         Update screenshots for Druid console doc (#12593)          2022-06-15 16:42:20 -07:00
comparisons    Update druid-vs-kudu.md (#11470)                           2021-07-21 22:58:14 +08:00
configuration  Disable autokill of segments by default. (#12693)          2022-06-23 17:17:11 -07:00
dependencies   Doc updates for metadata cleanup and storage (#12190)      2022-01-27 11:40:54 -08:00
design         Segments doc update (#12344)                               2022-06-16 13:25:17 -07:00
development    Update screenshots for Druid console doc (#12593)          2022-06-15 16:42:20 -07:00
ingestion      Able to filter Cloud objects with glob notation. (#12659)  2022-06-24 11:40:08 +05:30
misc           Docs – expressions link back and timestamp hint (#11674)   2022-03-29 09:12:30 -07:00
operations     Update screenshots for Druid console doc (#12593)          2022-06-15 16:42:20 -07:00
querying       Add TIME_IN_INTERVAL SQL operator. (#12662)                2022-06-21 13:05:37 -07:00
tutorials      Add TIME_IN_INTERVAL SQL operator. (#12662)                2022-06-21 13:05:37 -07:00