From b652f66378590e43ae49fc8f58070c13ac846ab3 Mon Sep 17 00:00:00 2001 From: Alan Guo Xiang Tan Date: Wed, 11 Sep 2024 07:58:36 +0800 Subject: [PATCH] DEV: Cap number of `thread-loader` workers in `assets:precompile:build` (#28830) We were running into errors running `ember build` on machines with high CPU counts. It was then noted that `thread-loader`, which embroider uses, defaults to spinning up x workers where x is number of physical CPU cores - 1. That is probably too much so we set out to find out an optimial count to set for the `JOBS` env which embroider will use to set the number of `thread-loader` workers. I first built an image using the following Dockerfile. ``` FROM discourse/base:release RUN cd /var/www/discourse && sudo -EH -u discourse bundle exec rake plugin:install_all_official RUN cd /var/www/discourse && sudo -EH -u discourse bundle exec rake assets:precompile:prereqs ``` I then ran the following command on my M3 Max Macbook Pro that has 14 phyisal CPU cores. ``` for j in 1 2 4 8 14; do echo "JOBS=$j"; time docker run --rm -it -e JOBS=$j test:latest /bin/bash -c "su discourse -c 'cd /var/www/discourse && bundle exec rake assets:precompile:build'"; done ``` These are the results I got: ``` JOBS=1 0.04s user 0.03s system 0% cpu 1:01.92 total JOBS=2 0.04s user 0.02s system 0% cpu 42.605 total JOBS=4 0.04s user 0.02s system 0% cpu 37.012 total JOBS=8 0.04s user 0.02s system 0% cpu 35.199 total JOBs=14 0.04s user 0.02s system 0% cpu 37.941 total ``` We think JOBS=2 is a good default when the `JOBS` env has not been set. Anything above just consumes more resources for little benefit. --- lib/tasks/assets.rake | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/lib/tasks/assets.rake b/lib/tasks/assets.rake index 6d40d92aeaa..ba067db2f97 100644 --- a/lib/tasks/assets.rake +++ b/lib/tasks/assets.rake @@ -12,18 +12,26 @@ task "assets:precompile:build" do raise "Unknown ember version '#{ember_version}'" if !%w[5].include?(ember_version) + # If `JOBS` env is not set, `thread-loader` defaults to the number of CPUs - 1 on the machine but we want to cap it + # at 4 because benchmarking has shown that anything beyond 4 does not improve build times or the increase is marginal. + # Therefore, we cap it so that we don't spawn more processes than necessary. + jobs_env_count = (4 if !ENV["JOBS"].present? && Etc.nprocessors > 4) + compile_command = "CI=1 pnpm --dir=app/assets/javascripts/discourse ember build" heap_size_limit = check_node_heap_size_limit if heap_size_limit < 2048 STDERR.puts "Node.js heap_size_limit (#{heap_size_limit}) is less than 2048MB. Setting --max-old-space-size=2048 and CHEAP_SOURCE_MAPS=1" + jobs_env_count = 0 + compile_command = - "JOBS=0 CI=1 NODE_OPTIONS='--max-old-space-size=2048' CHEAP_SOURCE_MAPS=1 #{compile_command}" + "CI=1 NODE_OPTIONS='--max-old-space-size=2048' CHEAP_SOURCE_MAPS=1 #{compile_command}" end ember_env = ENV["EMBER_ENV"] || "production" compile_command = "#{compile_command} -prod" if ember_env == "production" + compile_command = "JOBS=#{jobs_env_count} #{compile_command}" if jobs_env_count only_ember_precompile_build_remaining = (ARGV.last == "assets:precompile:build") only_assets_precompile_remaining = (ARGV.last == "assets:precompile") @@ -230,11 +238,13 @@ def concurrent? if ENV["SPROCKETS_CONCURRENT"] == "1" concurrent_compressors = [] executor = Concurrent::FixedThreadPool.new(Concurrent.processor_count) + yield( Proc.new do |&block| concurrent_compressors << Concurrent::Future.execute(executor: executor) { block.call } end ) + concurrent_compressors.each(&:wait!) else yield(Proc.new { |&block| block.call })