We were running into errors running `ember build` on machines with high
CPU counts. It was then noted that `thread-loader`, which embroider uses, defaults to spinning
up x workers where x is number of physical CPU cores - 1. That is
probably too much so we set out to find out an optimial count to set for
the `JOBS` env which embroider will use to set the number of
`thread-loader` workers.
I first built an image using the following Dockerfile.
```
FROM discourse/base:release
RUN cd /var/www/discourse && sudo -EH -u discourse bundle exec rake plugin:install_all_official
RUN cd /var/www/discourse && sudo -EH -u discourse bundle exec rake assets:precompile:prereqs
```
I then ran the following command on my M3 Max Macbook Pro that has 14
phyisal CPU cores.
```
for j in 1 2 4 8 14; do echo "JOBS=$j"; time docker run --rm -it -e JOBS=$j test:latest /bin/bash -c "su discourse -c 'cd /var/www/discourse && bundle exec rake assets:precompile:build'"; done
```
These are the results I got:
```
JOBS=1 0.04s user 0.03s system 0% cpu 1:01.92 total
JOBS=2 0.04s user 0.02s system 0% cpu 42.605 total
JOBS=4 0.04s user 0.02s system 0% cpu 37.012 total
JOBS=8 0.04s user 0.02s system 0% cpu 35.199 total
JOBs=14 0.04s user 0.02s system 0% cpu 37.941 total
```
We think JOBS=2 is a good default when the `JOBS` env has not been set.
Anything above just consumes more resources for little benefit.
This will bring significant improvements to install speed & storage requirements. For information on how it may affect you, see https://meta.discourse.org/t/324521
This commit:
- removes the `yarn.lock` and replaces with `pnpm-lock.yaml`
- updates workspaces to pnpm format
- adjusts package dependencies to work with pnpm's stricter resolution strategy
- updates Rails app to load modules from more specific node_modules directories
- adds a `.pnpmfile` which automatically cleans up old yarn-managed `node_modules` directories
- updates various scripts to call `pnpm` instead of `yarn`
- updates patches to use pnpm's native patch system instead of patch-package
- adds a patch for licensee to support pnpm
We were writing theme-transpiler JS files to the filesystem on a per-process basis, and then immediately reading them back in. Plus, there was no cleanup mechanism, so the tmp directory would grow indefinitely.
This commit refactors things so that the `build.js` script outputs the theme-transpiler source to stdout. That way, we can read it directly into the process, and then into mini-racer, without needing to go via the filesystem. No cleanup required!
In production, the theme-transpiler is still cached in a file during `assets:precompile`
We have been seeing `ZLib::BufError` when running the `assets:precompile` rake
task.
```
I, [2024-07-30T05:19:58.807019 #1059] INFO -- : Writing /var/www/discourse/public/assets/scripts/discourse-test-listen-boot-9b14a0fc65c689577e6a428dcfd680205516fe211700a71c7adb5cbcf4df2cc5.js
rake aborted!
Zlib::BufError: buffer error (Zlib::BufError)
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/sprockets-3.7.3/lib/sprockets/cache/file_store.rb💯in `<<'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/sprockets-3.7.3/lib/sprockets/cache/file_store.rb💯in `set'
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/sprockets-3.7.3/lib/sprockets/cache.rb:212:in `set'
```
The hypothesis here is that some thread unsafe issue is causing the
problem since we download the Maxmind databases in a thread and run
decompression operations once the gzip file is downloaded.
In the near term, we plan to move downloading of Maxmind databases out
of the Rake task into a scheduled job so this patch should be considered
a temporary solution.
The trade-off here is that build time will slightly increase since we
are not longer downloading Maxmind databases while precompiling assets
at the same time.
- Use 'cheap-source-map' webpack config on low-memory machines
This results in worse quality sourcemaps in browser dev tools, but it significantly reduces memory use in our webpack build. In approximate local testing it drops from 1100mb to 590mb. This should make the rebuild process on low-memory machines much faster and less likely to trigger OOM errors.
In development, and on higher-memory machines, the higher-quality 'source-map' option is maintained.
- Disable Webpack's built-in `minimize` feature. Embroider already applies Terser after the webpack build is complete. There is no need to double-minimize the output.
- Update ember-cli-progress-ci to print to stderr instead of stdout. For some reason, pups (used by discourse_docker) buffers the stdout of commands and only prints when they are finished. stderr does not have this same limitation, so switching will mean sysadmins can see the progress of the ember build in real-time.
Given the number of variables it's hard to promise exact numbers. But, in my tests on a DO droplet with 1GB RAM (+2GB swap), this reduced the `ember build` portion of a `./launcher rebuild app` from ~50 minutes to ~15 minutes.
This service-worker caching functionality was disabled by default in 1c58395bca, and the setting to re-enable was marked as experimental. Now we are dropping all the related logic.
We were previously using the `EMBER_ENV=production` environment variable, which appears to produce the same output. But, some parts of ember-cli don't seem to support it, which leads to a confusing 'Environment: development' being printed on the console.
This commit adds `-prod` by default, which is the more common way to invoke ember-cli for production builds.
Why this change?
This ENV allows the brotli compression quality to be configurable such
that one can opt for a higher/lower level of compression based on their
preferences.
This commit introduces the scaffolding for us to easily switch between Ember 3.28 and Ember 5 on the `main` branch of Discourse. Unfortunately, there is no built-in system to apply this kind of flagging within yarn / ember-cli. There are projects like `ember-try` which are designed for running against multiple version of a dependency, but they do not allow us to 'lock' dependency/sub-dependency versions, and are therefore unsuitable for our use in production.
Instead, we will be maintaining two root `package.json` files, and two `yarn.lock` files. For ember-3, they remain as-is. For ember5, we use a yarn 'resolution' to override the version for ember-source across the entire yarn workspace.
To allow for easy switching with minimal diff against the repository, `package.json` and `yarn.lock` are symlinks which point to `package-ember3.json` and `yarn-ember3.lock` by default. To switch to Ember 5, we can run `script/switch ember version 5` to update the symlinks to point to `package-ember5.json` and `package-ember3.json` respectively. In production, and when using `bin/ember-cli` for development, the ember version can also be upgraded using the `EMBER_VERSION=5` environment variable.
When making changes to dependencies, these should be made against the default `ember3` versions, and then `script/regen_ember_5_lockfile` should be used to regenerate `yarn-ember5.lock` accordingly. A new 'Ember Version Lockfiles' GitHub workflow will automate this process on Dependabot PRs.
When running a local environment against Ember 5, the two symlink changes will show up as git diffs. To avoid us accidentally committing/pushing that change, another GitHub workflow is introduced which checks the default Ember version and raises an error if it is greater than v3.
Supporting two ember versions simultaneously obviously carries significant overhead, so our aim will be to get themes/plugins updated as quickly as possible, and then drop this flag.
- Remove the wildcard crawler. This was already excluding almost all file types, but the exclude list was missing '.gjs' which meant those files were unnecessarily being hoisted into the `public/` directory during precompile
- Automatically include all ember-cli-generated assets without needing them to be listed. The main motivation for this change is to allow us to start using async imports via Embroider/Webpack. The filenames for those new async bundles will not be known in advance.
- Skips sprockets fingerprinting on Embroider/Webpack chunk JS files. Their filenames already include a fingerprint, and having sprockets change the filenames will cause problems for the async import feature (where filenames are included deep inside js bundles)
This commit also updates our ember-cli build so that it skips building plugin tests in the production environment. This should provide a slight build speed improvement.
* DEV: refactor rake asset precompile tasks
add a separate ember build task that does not depend on rails env
allowing us to compile assets without db+redis connections
rename EMBER_CLI_COMPILE_DONE to SKIP_EMBER_CLI_COMPILE
better semantics in build steps
Until now, we have allowed testing themes in production environments via `/theme-qunit`. This was made possible by hacking the ember-cli build so that it would create the `tests.js` bundle in production. However, this is fundamentally problematic because a number of test-specific things are still optimized out of the Ember build in production mode. It also makes asset compilation significantly slower, and makes it more difficult for us to update our build pipeline (e.g. to introduce Embroider).
This commit removes the ability to run qunit tests in production builds of the JS app when the Embdroider flag is enabled. If a production instance of Discourse exists exclusively for the development of themes (e.g. discourse.theme-creator.io) then they can add `EMBER_ENV: development` to their `app.yml` file. This will build the entire app in development mode, and has a significant performance impact. This must not be used for real production sites.
This commit also refactors many of the request specs into system specs. This means that the tests are guaranteed to have Ember assets built, and is also a better end-to-end test than simply checking for the presence of certain `<script>` tags in the HTML.
Reverts e2705df and re-lands #23187 and #23219.
The issue was incorrect order of execution of Rails' `assets:precompile` task in our own precompilation stack.
Co-authored-by: David Taylor <david@taylorhq.com>
Previously workbox JS was vendored into our git repository, and would be loaded from the `public/javascripts` directory with a 1 day cache lifetime. The main aim of this commit is to add 'cachebuster' to the workbox URL so that the cache lifetime can be increased.
- Remove vendored copies of workbox.
- Use ember-cli/broccoli to collect workbox files from node_modules into assets/workbox-{digest}
- Add assets to sprockets manifest so that they're collected from the ember-cli output directory (and uploaded to s3 when configured)
Some of the sprockets-related changes in this commit are not ideal, but we hope to remove sprockets in the not-too-distant future.
Previously we were forcing node's max-old-space-size to be 2GB. This override was added in a01b1dd6 to avoid issues caused by a lower default node heap_size_limit on machines with less memory.
This commit makes that `max-old-space-size` override more specific so that it only applies to machines with less memory. Other machines will go use Node's defaults.
The override is also lowered to 1GB. This is still high enough for the build to complete, while reducing memory usage.
https://meta.discourse.org/t/245547
Previously the stylesheet cachebusting hash was based on the maximum mtime of files. This works well in development and during in-container updates (e.g. via docker_manager). However, when a fresh docker image is created for each deploy, the file mtimes will change even if the contents has not.
This commit changes the production logic to calculate the cachebuster from the filenames and contents of the relevant assets. This should be consistent across deploys, thereby improving cache hits and improving page load times.
Anyone still using `EMBER_CLI_PROD_ASSETS=0` in development or production will be gracefully switched to Ember CLI. In development, a repeated message will be logged to STDERR.
Similarly, passing `QUNIT_EMBER_CLI=0` to the qunit rake task will now do nothing. A warning will be printed, and ember-cli mode will be used. Note that we've chosen not to fail the task, so that existing plugin/theme CI jobs don't immediately start failing. We may switch to a hard fail in the coming days/weeks.
22a7905f restructured how we load Ember CLI assets in production. Unfortunately, it also broke sourcemaps for those assets. This commit fixes that regression via a couple of changes:
- It adds the necessary `.map` paths to `config.assets.precompile`
- It swaps Sprockets' default `SourcemappingUrlProcessor` with an extended version which maintains relative URLs of maps
Previously, accessing the Rails app directly in development mode would give you assets from our 'legacy' Ember asset pipeline. The only way to run with Ember CLI assets was to run ember-cli as a proxy. This was quite limiting when working on things which are bypassed when using the ember-cli proxy (e.g. changes to `application.html.erb`). Also, since `ember-auto-import` introduced chunking, visiting `/theme-qunit` under Ember CLI was failing to include all necessary chunks.
This commit teaches Sprockets about our Ember CLI assets so that they can be used in development mode, and are automatically collected up under `/public/assets` during `assets:precompile`. As a bonus, this allows us to remove all the custom manifest modification from `assets:precompile`.
The key changes are:
- Introduce a shared `EmberCli.enabled?` helper
- When ember-cli is enabled, add ember-cli `/dist/assets` as the top-priority Rails asset directory
- Have ember-cli output a `chunks.json` manifest, and teach `preload_script` to read it and append the correct chunks to their associated `afterFile`
- Remove most custom ember-cli logic from the `assets:precompile` step. Instead, rely on Rails to take care of pulling the 'precompiled' assets into the `public/assets` directory. Move the 'renaming' logic to runtime, so it can be used in development mode as well.
- Remove fingerprinting from `ember-cli-build`, and allow Rails to take care of things
Long-term, we may want to replace Sprockets with the lighter-weight Propshaft. The changes made in this commit have been made with that long-term goal in mind.
tldr: when you visit the rails app directly, you'll now be served the current ember-cli assets. To keep these up-to-date make sure either `ember serve`, or `ember build --watch` is running. If you really want to load the old non-ember-cli assets, then you should start the server with `EMBER_CLI_PROD_ASSETS=0`. (the legacy asset pipeline will be removed very soon)
This allows text editors to use correct syntax coloring for the heredoc sections.
Heredoc tag names we use:
languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
Example error:
```
/__w/discourse/discourse/lib/tasks/assets.rake:3: warning: already initialized constant EMBER_CLI
/__w/discourse/discourse/lib/tasks/assets.rake:3: warning: previous definition of EMBER_CLI was here
```
The `assets:precompile` rake task loads the full Ruby app, which can consume around 500mb of RAM by itself. Using `exec` to run `ember build` allows us to free up the Ruby memory and make more space for `ember build`
This makes a small improvement to 'cold cache' ember-cli build times, and a large improvement to 'warm cache' build times
The ember-auto-import update means that vendor is now split into multiple files for efficiency. These are named `chunk.*`, and should be included immediately after the `vendor.js` file. This commit also updates the rails app to render script tags for these chunks.
This change was previously merged, and caused memory-related errors on RAM-constrained machines. This was because Webpack 5 switches from multiple worker processes to a single multi-threaded process. This meant that it was hitting node's default heap size limit (~500mb on a 1GB RAM server). Discourse's standard install procedure recommends adding 2GB swap to 1GB-RAM machines, so we can afford to override's Node's default via the `--max-old-space-size` flag.
If no path is supplied, browsers will look for the map on the same path as the JS file itself. This fixes two problems that we see in production:
1. When compiling assets against one CDN, and then re-using them on a site with a different CDN, the sourceMappingUrls would be incorrect and print warnings in the console
2. If both an S3 CDN and an app CDN are configured, we were using the S3 CDN for the JS and the app CDN for the map. This commit will make sure we use the S3 CDN for both.