druid/extensions-core
AmatyaAvadhanula f970757efc
Optimize overlord GET /tasks memory usage (#12404)
The web console (indirectly) calls the Overlord's GET tasks API to fetch task summaries, which in turn queries the metadata tasks table. This query fetches several columns, including the payload, for all rows at once. This introduces significant memory overhead and can cause unresponsiveness or Overlord failure when the ingestion tab is opened multiple times, because each open triggers parallel calls to this API.

Note also that the tasks table (the payload column in particular) can be very large. Extracting large payloads from such a table can be very slow, which makes the UI slow. While fixing the memory pressure in the Overlord, we can also fix the UI slowness caused by fetching large payloads from the table. Fetching large payloads also puts pressure on the metadata store, as reported by the community (see apache/druid issue #12318, "Metadata store query performance degrades as the tasks in druid_tasks table grows").

The task summaries returned in the API response are several times smaller than the payloads and fit comfortably in memory. So there is an opportunity here to fix the memory usage, the slow ingestion tab, and the pressure on the metadata store by removing the need to handle large payloads in every layer we can. Of course, the solution becomes more complex as we try to fix more layers. With that in mind, this page captures two approaches. They vary in complexity and in the degree to which they fix the aforementioned problems.
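
The difference between fetching whole rows and fetching only summary columns can be sketched as follows. This is a minimal illustration, not the change made in this commit: it uses an in-memory SQLite database, and the table name `tasks`, its columns, and the helper functions are all hypothetical stand-ins for the real metadata schema and Overlord code.

```python
import sqlite3


def make_demo_db(num_tasks=50, payload_size=100_000):
    """Build a hypothetical, simplified tasks table with large payloads."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        "id TEXT PRIMARY KEY, datasource TEXT, type TEXT, "
        "status TEXT, payload BLOB)"
    )
    big_payload = b"x" * payload_size
    rows = [
        (f"task_{i}", "wiki", "index", "SUCCESS", big_payload)
        for i in range(num_tasks)
    ]
    conn.executemany("INSERT INTO tasks VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def fetch_all_columns(conn):
    """Problematic pattern: every payload is materialized in memory at once."""
    return conn.execute(
        "SELECT id, datasource, type, status, payload FROM tasks"
    ).fetchall()


def iter_summaries(conn):
    """Lighter pattern: select only summary columns and stream the rows."""
    cursor = conn.execute("SELECT id, datasource, type, status FROM tasks")
    for task_id, datasource, task_type, status in cursor:
        yield {
            "id": task_id,
            "datasource": datasource,
            "type": task_type,
            "status": status,
        }


conn = make_demo_db()
# Total payload bytes pulled into memory by the full-row query.
full_bytes = sum(len(row[4]) for row in fetch_all_columns(conn))
# Summaries never touch the payload column.
summaries = list(iter_summaries(conn))
```

In this sketch the full-row query holds every payload in memory at the same time, while the summary query never reads the payload column, so both the caller and the database do less work per request.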
2022-06-16 22:30:37 +05:30
avro-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
azure-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
datasketches Use datasketches version 3.2.0 (#12509) 2022-05-13 11:28:15 +05:30
druid-aws-rds-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
druid-basic-security Improve build performance of modules (#12486) 2022-05-01 22:43:11 +08:00
druid-bloom-filter Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
druid-kerberos Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
druid-pac4j Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
druid-ranger-security Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
ec2-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
google-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
hdfs-storage Add authentication call before cleaning up intermediate files in hadoop ingestions (#12030) 2022-05-02 08:40:44 -05:00
histogram Free ByteBuffers in tests and fix some bugs. (#12521) 2022-05-19 07:42:29 -07:00
kafka-extraction-namespace Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
kafka-indexing-service Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
kinesis-indexing-service Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
kubernetes-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
lookups-cached-global Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
lookups-cached-single Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
mysql-metadata-storage Optimize overlord GET /tasks memory usage (#12404) 2022-06-16 22:30:37 +05:30
orc-extensions Upgrade ORC to 1.7.4 (#12572) 2022-05-28 17:44:36 +05:30
parquet-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
postgresql-metadata-storage Optimize overlord GET /tasks memory usage (#12404) 2022-06-16 22:30:37 +05:30
protobuf-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
s3-extensions Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
simple-client-sslcontext Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
stats Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30
testing-tools Bump up the versions (#12480) 2022-04-27 14:28:20 +05:30