The self-hosted Github runners have been provisioned, and we can switch
to using them for evaluation.
To prefer Github-hosted runners, you can safely revert this commit.
See: t/123181.
* Updates GitHub Actions
* Switches from `bundler/inline` to an optional group in the `Gemfile` because the previous solution didn't work well with rspec
* Adds the converter framework and tests
* Allows loading private converters (see README)
* Switches from multiple CLI tools to a single CLI
* Makes DB connections reusable and adds a new abstraction for the `IntermediateDB`
* `IntermediateDB` acts as an interface for IPC calls when a converter steps runs in parallel (forks). Only the main process writes to the DB.
* Includes a simple example implementation of a converter for now.
This will bring significant improvements to install speed & storage requirements. For information on how it may affect you, see https://meta.discourse.org/t/324521
This commit:
- removes the `yarn.lock` and replaces with `pnpm-lock.yaml`
- updates workspaces to pnpm format
- adjusts package dependencies to work with pnpm's stricter resolution strategy
- updates Rails app to load modules from more specific node_modules directories
- adds a `.pnpmfile` which automatically cleans up old yarn-managed `node_modules` directories
- updates various scripts to call `pnpm` instead of `yarn`
- updates patches to use pnpm's native patch system instead of patch-package
- adds a patch for licensee to support pnpm
This regressed in b83a2a34a4 because the
Github actions docs doesn't make it clear that `runner.name` is actually
the runner's name plus some unique string appended at the end. Why they
would do that is beyond me.
We cannot just key on `runner.os` the number of CPU cores matter as
well. Therefore, we need to key on `runner.name` instead since each
runner has its own unique OS and CPU cores. Technically, two different
runner with different names can have the same `os` and `cpu cores` but
we don't have that problem now.
QUnit tests are failing in different ways on Chromium in Debian
bookworm. We have no interest in figuring out why as it is not a good
use of our time and the long term plan is to switch to Chrome for Testing
anyway.
Previously "themes frontend" CI job would:
1. pull compatible versions of themes that happened to be in the base image
2. clone all official themes (overriding the compatible versions from 1.)
3. run tests
`discourse-ai` has custom gems which need to be bumped in order to be
compatible with Ruby 3.3. However, its version is pinned so we can't
pull in the commits in which upgrades the gems to be compatible with
Ruby 3.3. Just avoid running the specs on `stable` branch for now until
we release a new stable.
This commit adds a step in our tests workflow on Github actions to update the themes to
use the compatible version when not running aginast the `main` branch.
This is to ensure that we are not running
the tests for themes against an incompatible version of Discourse.
* Moves existing files around. All essential scripts are in `migrations/bin`, and non-essential scripts like benchmarks are in `migrations/scripts`
* Dependabot configuration for migrations-tooling (disabled for now)
* Updates test configuration for migrations-tooling
* Shorter configuration for intermediate DB for now. We will add the rest table by table.
* Adds a couple of benchmark scripts
* RSpec setup especially for migrations-tooling and the first tests
* Adds sorting/formatting to the `generate_schema` script
Why this change?
Bundle cache should be keyed on ruby version as well as the debian
release name. Changes to the debian release can affect the way gems are
installed since gems may link to different versions of binaries.
The version of Chromium we have in our images (120) is relatively
unstable and our system specs break regularly.
This patch makes sure Chrome is used instead for system specs.
Why this change?
We run on different runners depending on the scenario. We should use the
right number of parallel jobs for bundle install based on the number of
CPU cores the runner has.
This is so the CI output on GitHub actions isn't showing
tons and tons of unnecessary log data every time you want
to see the important thing, which is the actual test failure.
Why this change?
The output is too verbose and prevents us from quickly identifying tests
failures. Now that our tests are way more stable and less flaky, we can
drop the documentation format since we do not need it for debugging
purposes that often anymore
Before this commit, we had a yarn package set up in the root directory and also in `app/assets/javascripts`. That meant two `yarn install` calls and two `node_modules` directories. This commit merges them both into the root location, and updates references to node_modules.
A previous attempt can be found at https://github.com/discourse/discourse/pull/21172. This commit re-uses that script to merge the `yarn.lock` files.
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Why this change?
Our tests are more stable these days and there is little to no need for
us to be retrying on PRs which helps to increase confidence in our test
suite since flaky tests are raised earlier.
Why this change?
This regressed in 6e9fbb5bab because we
had a `request.xhr?` check before we decide to block requests. However,
there could not none-xhr requests which we need to block as well at the
end of each system test when `@@block_requests` is true.
This also reverts commit 6437f27f90.
Why this change?
On CI, we have been seeing flaky system tests because ActiveRecord is
unable to checkout a connection. This patch is meant to help us debug
which thread is not returning the connection to the queue.
Example of timeout issue: https://github.com/discourse/discourse/actions/runs/8012541636/job/21888013082
Why this change?
On CI, we have been seeing flaky system tests because ActiveRecord is
unable to checkout a connection. This patch is meant to help us debug
which thread is not returning the connection to the queue.
Why this change?
We have been seeing checkout timeouts happening on CI when using the
default of 5 seconds. This can happen in system tests when the server
has to process many requests using the same database connection.
Therefore, we will double the timeout for now and monitor if stuff
continues to timeout.
Why this change?
I have been investigating transaction related issues with our system
tests and I have a hard time figuring out what is causing the problem.
To help simplify our environment further, we will set the pool size in
the test environment to 1 so that it is impossible for us to be fetching
a different connection between the threads since they all share the
connection pool.
Also set `reaping_frequency` to `0` to ensure we don't reap any
connection ensuring the same connection is always used.
Why this change?
In https://www.postgresql.org/docs/current/non-durability.html, it is
recommended to create unlogged tables to avoid WAL writes which can help
speed at performance at the expense of durability. In the CI env, there is no need for durability at all.
Therefore, we are going to be creating unlogged tables by default.
Co-authored-by: Ted Johansson <ted@discourse.org>
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>