mirror of https://github.com/apache/druid.git
Commit 23b9039a02:
This patch reworks memory management to better support multi-threaded workers running in shared JVMs. There are two main changes.

First, processing buffers and threads are moved from a per-JVM model to a per-worker model. This enables queries to hold processing buffers without blocking other concurrently-running queries. Changes:

- Introduce ProcessingBuffersSet and ProcessingBuffers to hold the per-worker and per-work-order processing buffers (respectively). On Peons, this is the JVM-wide processing pool. On Indexers, this is a per-worker pool of on-heap buffers. (This change fixes a bug on Indexers where excessive processing buffers could be used if MSQ tasks ran concurrently with realtime tasks.)
- Add "bufferPool" argument to GroupingEngine#process so a per-worker pool can be passed in.
- Add "druid.msq.task.memory.maxThreads" property, which controls the maximum number of processing threads to use per task. This allows usage of multiple processing buffers per task if admins desire.
- IndexerWorkerContext acquires processingBuffers when creating the FrameContext for a work order, and releases them when closing the FrameContext (a minimal sketch of this acquire/release pattern follows this message).
- Add "usesProcessingBuffers()" to FrameProcessorFactory so workers know how many sets of processing buffers are needed to run a given query.

Second, adjustments to how WorkerMemoryParameters slices up bundles, to favor more memory for sorting and segment generation. Changes:

- Instead of using same-sized bundles for processing and for sorting, workers now use minimally-sized processing bundles (just enough to read inputs plus a little overhead). The rest is devoted to broadcast data buffering, sorting, and segment-building.
- Segment-building is now limited to 1 concurrent segment per work order. This allows each segment-building action to use more memory. Note that segment-building is internally multi-threaded to a degree. (Build and persist can run concurrently.)
- Simplify frame size calculations by removing the distinction between "standard" and "large" frames. The new default frame size is the same as the old "standard" frames, 1 MB. The original goal of the large frames was to reduce the number of temporary files during sorting, but I think we can achieve the same thing by simply merging a larger number of standard frames at once.
- Remove the small worker adjustment that was added in #14117 to account for an extra frame involved in writing to durable storage. Instead, account for the extra frame whenever we are actually using durable storage.
- Cap super-sorter parallelism using the number of output partitions, rather than using a hard-coded cap at 4. Note that in practice, so far, this cap has not been relevant for tasks because they have only been using a single processing thread anyway.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
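The acquire-on-create / release-on-close lifecycle described above can be illustrated with a minimal, self-contained sketch. This is not Druid's actual ProcessingBuffersSet or IndexerWorkerContext code; the class names (BufferPoolSketch, BufferPool, BufferSet) and the sizing numbers in main() are hypothetical, chosen only to show the per-work-order checkout pattern under the assumptions stated in the comments.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Illustrative sketch only: models the lifecycle described in the commit message, where each
 * work order checks out a set of processing buffers and returns it when its frame context
 * closes. These class and method names are hypothetical, not Druid APIs.
 */
public class BufferPoolSketch
{
  /** A fixed group of processing buffers handed to a single work order. */
  static class BufferSet
  {
    final List<ByteBuffer> buffers = new ArrayList<>();

    BufferSet(int numThreads, int bufferBytes)
    {
      for (int i = 0; i < numThreads; i++) {
        buffers.add(ByteBuffer.allocate(bufferBytes));
      }
    }
  }

  /** Per-worker pool holding the sets that concurrent work orders may check out. */
  static class BufferPool
  {
    private final BlockingQueue<BufferSet> available;

    BufferPool(int numSets, int threadsPerSet, int bufferBytes)
    {
      available = new ArrayBlockingQueue<>(numSets);
      for (int i = 0; i < numSets; i++) {
        available.add(new BufferSet(threadsPerSet, bufferBytes));
      }
    }

    /** Blocks until a set is free; queries that already hold a set are unaffected. */
    BufferSet acquire() throws InterruptedException
    {
      return available.take();
    }

    /** Returns a set to the pool so another work order can use it. */
    void release(BufferSet set)
    {
      available.add(set);
    }
  }

  public static void main(String[] args) throws InterruptedException
  {
    // Hypothetical sizing: 2 concurrent work orders, 2 processing threads each, 1 MB buffers.
    final BufferPool pool = new BufferPool(2, 2, 1024 * 1024);

    // A work order acquires buffers when its frame context is created...
    final BufferSet set = pool.acquire();
    try {
      System.out.println("Acquired " + set.buffers.size() + " processing buffers");
      // ... run frame processors using set.buffers ...
    }
    finally {
      // ... and returns them when the frame context closes.
      pool.release(set);
    }
  }
}
```

Modeling the pool as a blocking queue mirrors the stated goal of the per-worker model: a work order that cannot get buffers simply waits, without blocking queries that already hold a set.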
aliyun-oss-extensions
ambari-metrics-emitter
cassandra-storage
cloudfiles-extensions
compressed-bigdecimal
ddsketch
distinctcount
dropwizard-emitter
druid-deltalake-extensions
druid-iceberg-extensions
gce-extensions
graphite-emitter
influx-extensions
influxdb-emitter
kafka-emitter
kubernetes-overlord-extensions
materialized-view-maintenance
materialized-view-selection
momentsketch
moving-average-query
opentelemetry-emitter
opentsdb-emitter
prometheus-emitter
rabbit-stream-indexing-service
redis-cache
spectator-histogram
sqlserver-metadata-storage
statsd-emitter
tdigestsketch
thrift-extensions
time-min-max
virtual-columns
README.md
README.md
Community Extensions
Please contribute all community extensions in this directory and include documentation describing how your extension can be used under docs/development/extensions-contrib/.
Please note that community extensions are maintained by their original contributors and are not packaged with the core Druid distribution. If you'd like to take on maintenance for a community extension, please post on dev@druid.apache.org to let us know!