druid/docs/development
Pramod Immaneni 59bca0951a
Parallelize storage of incremental segments (#13982)
During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (maximum number of rows, maximum memory, incremental persist period, etc.). When there are many dimensions and metrics (1000+), it was observed that serializing an incremental segment into its file format and persisting that file took long enough to block ingestion of new data, which affected real-time ingestion. This serialization and persistence can be parallelized across the different time chunks, and this update aims to do that.

The patch adds a simple configuration parameter to the ingestion tuning configuration that specifies the number of persistence threads. If it is not specified, the default value is 1, which keeps the behavior the same as it is today.
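
The idea described above can be illustrated with a minimal sketch. This is not the code from the patch (Druid itself is written in Java); it is a language-neutral illustration in Python, and every name in it (`persist_chunk`, `persist_all`, `num_persist_threads`, the JSON file layout) is hypothetical. It only shows the general shape: one persist task per time chunk, submitted to a pool whose size comes from a tuning parameter that defaults to 1.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


def persist_chunk(directory, chunk_id, rows):
    """Serialize one time chunk's in-memory rows and write them to disk.

    Hypothetical stand-in for building and persisting an incremental segment.
    """
    path = os.path.join(directory, f"{chunk_id}.json")
    with open(path, "w") as f:
        json.dump(rows, f)
    return path


def persist_all(directory, chunks, num_persist_threads=1):
    """Persist every time chunk using a pool of num_persist_threads workers.

    With the default of 1, chunks are persisted one at a time, which mirrors
    the pre-patch behavior described in the commit message.
    """
    with ThreadPoolExecutor(max_workers=num_persist_threads) as pool:
        # One task per time chunk; tasks for different chunks are independent.
        futures = {
            chunk_id: pool.submit(persist_chunk, directory, chunk_id, rows)
            for chunk_id, rows in chunks.items()
        }
        # result() re-raises any exception raised while persisting a chunk.
        return {chunk_id: fut.result() for chunk_id, fut in futures.items()}
```

Because each time chunk is serialized and written independently, raising the thread count lets slow, wide segments (many dimensions and metrics) be persisted concurrently instead of queuing behind one another.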
2024-02-07 10:43:05 +05:30
extensions-contrib Extension to read and ingest Delta Lake tables (#15755) 2024-01-30 21:53:50 -08:00
extensions-core Parallelize storage of incremental segments (#13982) 2024-02-07 10:43:05 +05:30
build.md Update Hadoop3 as default build version (#14005) 2023-04-26 12:52:51 +05:30
docs-contribute.md docs: Anchor link checker (#15624) 2024-01-08 15:19:05 -08:00
experimental-features.md docs: Anchor link checker (#15624) 2024-01-08 15:19:05 -08:00
experimental.md Docusaurus build framework + ingestion doc refresh. (#8311) 2019-08-20 21:48:59 -07:00
javascript.md cleaning up and fixing links (#10528) 2020-12-17 13:37:43 -08:00
modules.md Revamp design page (#15486) 2023-12-08 11:40:24 -08:00
overview.md Fix a broken link in the development doc (#11226) 2021-05-10 16:14:06 +08:00
versioning.md De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00