druid/extensions-core
Pramod Immaneni 59bca0951a
Parallelize storage of incremental segments (#13982)
During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (max number of rows, max memory, incremental persist period, etc.). With a large number of dimensions and metrics (1000+), it was observed that serializing the incremental segment into its file format and persisting the file took long enough to block ingestion of new data, which affected real-time ingestion. This serialization and persistence can be parallelized across the different time chunks, and this update aims to do that.

The patch adds a simple configuration parameter to the ingestion tuning configuration to specify the number of persistence threads. The default value is 1 if it is not specified, which keeps the behavior the same as it is today.
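
The idea described above can be illustrated with a minimal sketch. It is not the code from the patch: the class `ParallelPersistSketch`, the method `persistChunk`, and the parameter name `numPersistThreads` are all hypothetical stand-ins chosen for illustration, and the "persist" step only counts rows instead of writing a segment file. It shows one persist task per time chunk submitted to a fixed-size pool, so a value of 1 reproduces sequential persistence.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelPersistSketch {

    // Hypothetical stand-in for serializing one time chunk's in-memory rows
    // and writing them to disk; here it just reports how many rows it "persisted".
    static int persistChunk(String timeChunk, List<String> rows) {
        return rows.size();
    }

    // Persists every time chunk using a pool of numPersistThreads workers
    // (assumed parameter name). With 1 thread the chunks are persisted one at a time.
    static Map<String, Integer> persistAll(Map<String, List<String>> chunks, int numPersistThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, numPersistThreads));
        try {
            // Submit one independent persist task per time chunk.
            Map<String, Future<Integer>> pending = new TreeMap<>();
            for (Map.Entry<String, List<String>> entry : chunks.entrySet()) {
                Callable<Integer> task = () -> persistChunk(entry.getKey(), entry.getValue());
                pending.put(entry.getKey(), pool.submit(task));
            }
            // Wait for all chunks to finish and collect the per-chunk row counts.
            Map<String, Integer> persisted = new TreeMap<>();
            for (Map.Entry<String, Future<Integer>> entry : pending.entrySet()) {
                persisted.put(entry.getKey(), entry.getValue().get());
            }
            return persisted;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each time chunk is persisted by its own task, a slow serialization of one wide chunk no longer delays the others, provided more than one thread is configured.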
2024-02-07 10:43:05 +05:30
..
avro-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
azure-extensions Batch kill in azure (#15770) 2024-01-31 13:41:15 -05:00
datasketches Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
druid-aws-rds-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
druid-basic-security Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
druid-bloom-filter Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
druid-catalog Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
druid-kerberos Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
druid-pac4j pac4j: fix incompatible dependencies + authorization regression (#15753) 2024-02-01 09:35:23 -08:00
druid-ranger-security Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
ec2-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
google-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
hdfs-storage Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
histogram Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
kafka-extraction-namespace Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
kafka-indexing-service Parallelize storage of incremental segments (#13982) 2024-02-07 10:43:05 +05:30
kinesis-indexing-service Parallelize storage of incremental segments (#13982) 2024-02-07 10:43:05 +05:30
kubernetes-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
lookups-cached-global Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
lookups-cached-single Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
multi-stage-query Fix serialization bug in PassthroughAggregatorFactory (#15830) 2024-02-05 15:11:10 +05:30
mysql-metadata-storage Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
orc-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
parquet-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
postgresql-metadata-storage Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
protobuf-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
s3-extensions Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
simple-client-sslcontext Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
stats Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
testing-tools Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30