mirror of https://github.com/apache/druid.git
59bca0951a
During ingestion, incremental segments are created in memory for the different time chunks and persisted to disk when certain thresholds are reached (maximum number of rows, maximum memory, incremental persist period, etc.). With a large number of dimensions and metrics (1000+), creating and serializing the incremental segment file format and persisting the file was observed to take long enough to block ingestion of new data, which affected real-time ingestion. This serialization and persistence can be parallelized across the different time chunks, and this update aims to do that. The patch adds a simple configuration parameter to the ingestion tuning configuration to specify the number of persistence threads. The default value is 1 if it is not specified, which keeps the behavior the same as it is today. |
||
---|---|---|
approximate-histograms.md | ||
avro.md | ||
azure.md | ||
bloom-filter.md | ||
datasketches-extension.md | ||
datasketches-hll.md | ||
datasketches-kll.md | ||
datasketches-quantiles.md | ||
datasketches-theta.md | ||
datasketches-tuple.md | ||
druid-aws-rds.md | ||
druid-basic-security.md | ||
druid-kerberos.md | ||
druid-lookups.md | ||
druid-pac4j.md | ||
druid-ranger-security.md | ||
examples.md | ||
google.md | ||
hdfs.md | ||
kafka-extraction-namespace.md | ||
kafka-ingestion.md | ||
kafka-supervisor-operations.md | ||
kafka-supervisor-reference.md | ||
kinesis-ingestion.md | ||
kubernetes.md | ||
lookups-cached-global.md | ||
mysql.md | ||
orc.md | ||
parquet.md | ||
postgresql.md | ||
protobuf.md | ||
s3.md | ||
simple-client-sslcontext.md | ||
stats.md | ||
test-stats.md |