druid/extensions-contrib
Kashif Faraz 23b9039a02
MSQ: Rework memory management. (#17057) (#17210)
This patch reworks memory management to better support multi-threaded
workers running in shared JVMs. There are two main changes.

First, processing buffers and threads are moved from a per-JVM model to
a per-worker model. This enables queries to hold processing buffers
without blocking other concurrently-running queries. Changes:

- Introduce ProcessingBuffersSet and ProcessingBuffers to hold the
  per-worker and per-work-order processing buffers (respectively). On Peons,
  this is the JVM-wide processing pool. On Indexers, this is a per-worker
  pool of on-heap buffers. (This change fixes a bug on Indexers where
  excessive processing buffers could be used if MSQ tasks ran concurrently
  with realtime tasks.)

- Add "bufferPool" argument to GroupingEngine#process so a per-worker pool
  can be passed in.

- Add "druid.msq.task.memory.maxThreads" property, which controls the
  maximum number of processing threads to use per task. This allows usage of
  multiple processing buffers per task if admins desire.

- IndexerWorkerContext acquires processingBuffers when creating the FrameContext
  for a work order, and releases them when closing the FrameContext.

- Add "usesProcessingBuffers()" to FrameProcessorFactory so workers know
  how many sets of processing buffers are needed to run a given query.
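
As a rough illustration of the per-worker model described above, the sketch below shows a fixed pool of on-heap buffers that are acquired when a work order starts and returned when it closes. The names WorkerBufferPool and Lease are hypothetical and are not the ProcessingBuffersSet or ProcessingBuffers classes from this patch; the sketch also fails fast when the pool is empty, where a real pool would more likely wait.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; not the classes introduced by this patch.
public class WorkerBufferPool {
    private final BlockingQueue<ByteBuffer> free;

    public WorkerBufferPool(int numBuffers, int bufferSizeBytes) {
        this.free = new ArrayBlockingQueue<>(numBuffers);
        for (int i = 0; i < numBuffers; i++) {
            // On-heap buffers owned by this worker alone, not shared JVM-wide.
            free.add(ByteBuffer.allocate(bufferSizeBytes));
        }
    }

    // Number of buffers currently not held by any work order.
    public int available() {
        return free.size();
    }

    // Take one buffer for a work order. Assumption: fail fast instead of blocking.
    public Lease acquire() {
        ByteBuffer buffer = free.poll();
        if (buffer == null) {
            throw new IllegalStateException("No processing buffer available");
        }
        return new Lease(buffer);
    }

    // Holds a buffer for the lifetime of a work order; close() returns it to the pool.
    public final class Lease implements AutoCloseable {
        private ByteBuffer buffer;

        private Lease(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        public ByteBuffer buffer() {
            return buffer;
        }

        @Override
        public void close() {
            if (buffer != null) {
                buffer.clear();
                free.add(buffer);
                buffer = null; // makes a second close() a no-op
            }
        }
    }
}
```

Using the lease in a try-with-resources block mirrors the acquire-on-create, release-on-close lifecycle that the patch describes for FrameContext.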

Second, WorkerMemoryParameters now slices up bundles differently, to
favor more memory for sorting and segment generation. Changes:

- Instead of using same-sized bundles for processing and for sorting,
  workers now use minimally-sized processing bundles (just enough to read
  inputs plus a little overhead). The rest is devoted to broadcast data
  buffering, sorting, and segment-building.

- Segment-building is now limited to 1 concurrent segment per work order.
  This allows each segment-building action to use more memory. Note that
  segment-building is internally multi-threaded to a degree. (Build and
  persist can run concurrently.)

- Simplify frame size calculations by removing the distinction between
  "standard" and "large" frames. The new default frame size is the same
  as the old "standard" frames, 1 MB. The original goal of the large
  frames was to reduce the number of temporary files during sorting, but
  I think we can achieve the same thing by simply merging a larger number
  of standard frames at once.

- Remove the small worker adjustment that was added in #14117 to account
  for an extra frame involved in writing to durable storage. Instead,
  account for the extra frame whenever we are actually using durable storage.

- Cap super-sorter parallelism using the number of output partitions, rather
  than using a hard coded cap at 4. Note that in practice, so far, this cap
  has not been relevant for tasks because they have only been using a single
  processing thread anyway.
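
The change to the super-sorter cap can be pictured with the small sketch below. It assumes that parallelism is otherwise bounded by the number of processing threads; that bound, the class name SorterParallelism, and both method names are hypothetical and are not taken from the patch.

```java
// Hypothetical illustration only; not the calculation used by this patch.
public class SorterParallelism {
    // The previous hard-coded limit mentioned in the commit message.
    public static final int OLD_HARD_CODED_CAP = 4;

    // Old behavior: never more than 4, regardless of the shape of the output.
    public static int oldCap(int processingThreads) {
        return Math.max(1, Math.min(processingThreads, OLD_HARD_CODED_CAP));
    }

    // New behavior: limited by the number of output partitions instead.
    public static int newCap(int processingThreads, int outputPartitions) {
        return Math.max(1, Math.min(processingThreads, outputPartitions));
    }
}
```

With a single processing thread both methods return 1, which matches the note that the cap has not mattered for tasks so far.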

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
2024-10-01 19:50:24 +05:30
aliyun-oss-extensions Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
ambari-metrics-emitter Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
cassandra-storage Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
cloudfiles-extensions Bump jclouds.version from 2.5.0 to 2.6.0 (#16796) 2024-07-29 14:49:26 +08:00
compressed-bigdecimal generic block compressed complex columns (#16863) 2024-08-27 00:34:41 -07:00
ddsketch generic block compressed complex columns (#16863) 2024-08-27 00:34:41 -07:00
distinctcount transition away from StorageAdapter (#16985) (#17024) 2024-09-09 21:43:41 -07:00
dropwizard-emitter Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
druid-deltalake-extensions Update Delta Kernel to 3.2.1 (#17179) (#17198) 2024-09-30 11:50:04 -07:00
druid-iceberg-extensions Support Iceberg ingestion from REST based catalogs (#17124) (#17145) 2024-09-24 12:09:27 -07:00
gce-extensions Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
graphite-emitter Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
influx-extensions Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
influxdb-emitter Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
kafka-emitter Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
kubernetes-overlord-extensions Add support for selective loading of broadcast datasources in the task layer (#17027) (#17206) 2024-10-01 11:09:52 +05:30
materialized-view-maintenance remove Firehose and FirehoseFactory (#16758) 2024-07-19 14:37:21 -07:00
materialized-view-selection remove isDescending from Query interface, move to TimeseriesQuery (#16917) 2024-08-19 23:02:45 -07:00
momentsketch generic block compressed complex columns (#16863) 2024-08-27 00:34:41 -07:00
moving-average-query rework cursor creation (#16533) 2024-08-16 11:34:10 -07:00
opentelemetry-emitter Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.65.1 (#16731) 2024-07-29 14:51:39 +08:00
opentsdb-emitter Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
prometheus-emitter remove Firehose and FirehoseFactory (#16758) 2024-07-19 14:37:21 -07:00
rabbit-stream-indexing-service generic block compressed complex columns (#16863) 2024-08-27 00:34:41 -07:00
redis-cache Remove incorrect utf8 conversion of ResultCache keys (#16569) 2024-06-12 13:12:05 -07:00
spectator-histogram generic block compressed complex columns (#16863) 2024-08-27 00:34:41 -07:00
sqlserver-metadata-storage Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
statsd-emitter Add indexer task success and failure metrics (#16829) 2024-08-05 16:21:27 +05:30
tdigestsketch generic block compressed complex columns (#16863) 2024-08-27 00:34:41 -07:00
thrift-extensions Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
time-min-max Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
virtual-columns MSQ: Rework memory management. (#17057) (#17210) 2024-10-01 19:50:24 +05:30
README.md fix broken links (#9537) 2020-03-22 17:41:18 -07:00

README.md

Community Extensions

Please contribute all community extensions in this directory, and include documentation on how to use your extension under docs/development/extensions-contrib/.

Please note that community extensions are maintained by their original contributors and are not packaged with the core Druid distribution. If you'd like to take on maintenance for a community extension, please post on dev@druid.apache.org to let us know!