druid/docs/development/extensions-core
Pramod Immaneni 59bca0951a
Parallelize storage of incremental segments (#13982)
During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (maximum number of rows, maximum memory, incremental persist period, etc.). When there are many dimensions and metrics (1000+), creating and serializing the incremental segment file format and persisting the file was observed to take long enough to block ingestion of new data, which affected real-time ingestion. This serialization and persistence can be parallelized across the different time chunks, and this update aims to do that.

The patch adds a simple configuration parameter to the ingestion tuning configuration that specifies the number of persistence threads. The default value is 1 if it is not specified, which keeps the behavior the same as it is today.
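
The idea described above can be illustrated with a minimal, language-agnostic sketch. This is not the patch's code: the function names, the `num_persist_threads` parameter name, and the JSON file layout are all hypothetical and chosen only for illustration. It shows one time chunk being serialized and written per task, with a bounded thread pool whose size defaults to 1 so that the default behavior stays sequential.

```python
from concurrent.futures import ThreadPoolExecutor
import json
import os
import tempfile


def persist_chunk(directory, time_chunk, rows):
    """Serialize one time chunk's in-memory rows and write them to disk.

    Hypothetical stand-in for building and writing an incremental segment.
    """
    path = os.path.join(directory, f"{time_chunk}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f)
    return path


def persist_all(directory, chunks, num_persist_threads=1):
    """Persist every time chunk using up to num_persist_threads workers.

    num_persist_threads is an assumed name for the new tuning parameter.
    With the default of 1, chunks are written one after another.
    """
    if num_persist_threads < 1:
        raise ValueError("num_persist_threads must be at least 1")
    with ThreadPoolExecutor(max_workers=num_persist_threads) as pool:
        futures = {
            chunk: pool.submit(persist_chunk, directory, chunk, rows)
            for chunk, rows in chunks.items()
        }
        # result() re-raises any exception raised in the worker thread.
        return {chunk: future.result() for chunk, future in futures.items()}
```

Because each time chunk is written to its own file, the tasks share no mutable state, which is what makes it safe to run them concurrently in this sketch.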
2024-02-07 10:43:05 +05:30
approximate-histograms.md Docusaurus2 upgrade for master (#14411) 2023-08-16 19:01:21 -07:00
avro.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
azure.md Add support for Azure Government storage (#15523) 2024-01-09 22:33:32 +05:30
bloom-filter.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
datasketches-extension.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
datasketches-hll.md Docusaurus2 upgrade for master (#14411) 2023-08-16 19:01:21 -07:00
datasketches-kll.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
datasketches-quantiles.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
datasketches-theta.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
datasketches-tuple.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
druid-aws-rds.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
druid-basic-security.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
druid-kerberos.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
druid-lookups.md Docusaurus2 upgrade for master (#14411) 2023-08-16 19:01:21 -07:00
druid-pac4j.md Update OIDCConfig with scope information (#13973) 2023-03-28 14:50:00 +05:30
druid-ranger-security.md ranger-security: exclude jackson-jaxrs from + fix outdated documentation (#15481) 2023-12-05 08:24:37 -08:00
examples.md De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00
google.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
hdfs.md Removes support for Hadoop 2 (#14763) 2023-08-09 17:47:52 +05:30
kafka-extraction-namespace.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
kafka-ingestion.md Add docs for ingesting Kafka topic name (#14894) 2023-08-24 19:19:59 +05:30
kafka-supervisor-operations.md docs: fix code tabs (#15403) 2023-11-20 11:16:10 -08:00
kafka-supervisor-reference.md Parallelize storage of incremental segments (#13982) 2024-02-07 10:43:05 +05:30
kinesis-ingestion.md Kinesis adaptive memory management (#15360) 2024-01-19 14:30:21 -05:00
kubernetes.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
lookups-cached-global.md Reverse, pull up lookups in the SQL planner. (#15626) 2024-01-12 00:06:31 -08:00
mysql.md docs: suggest metadata store with instant ADD COLUMN semantics (#15334) 2023-11-09 12:56:30 -08:00
orc.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
parquet.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
postgresql.md docs: suggest metadata store with instant ADD COLUMN semantics (#15334) 2023-11-09 12:56:30 -08:00
protobuf.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
s3.md Update Ingestion section (#14023) 2023-05-19 09:42:27 -07:00
simple-client-sslcontext.md Fix broken links to Oracle JDK docs (#13687) 2023-01-18 14:46:08 +05:30
stats.md Docusaurus2 upgrade for master (#14411) 2023-08-16 19:01:21 -07:00
test-stats.md De-incubation cleanup in code, docs, packaging (#9108) 2020-01-03 12:33:19 -05:00