Why this change?
The `tests` workflow runs many jobs. Each job when ran is given a unique
id. Since a job can be re-run, we do not want the test reports to
override each other so we differentiate it further by the `job_id` given
by `${{ github.job }}`.
Why this change?
Pull requests can introduce flaky tests into the mix and we do not want
to be hiing that during the pull request process. While this does mean
builds for PR will be less stable than the `main` branch without
retries, we do not foresee this to be a problem long term since the
monitoring of flaky tests on the `main` branch will mean that the number
of flaky tests will eventually be reduced.
What does this change do?
1. Introduce the `DISCOURSE_TURBO_RSPEC_RETRY_AND_LOG_FLAKY_TESTS` env
variable which will initialize `TurboTest::Runner` with the `retry_and_log_flaky_tests`
kwarg set to true when set.
2. Change the tests workflow run to set `DISCOURSE_TURBO_RSPEC_RETRY_AND_LOG_FLAKY_TESTS` only when
the build type is `backend` or `system` and the `github.ref_name` is
`main`.
It's very unlikely that something will be introduced which works under Ember 5 and not Ember 3. To reduce GitHub actions costs, flakiness, and visual noise, let's cut down the matrix so we're only using Ember 3 for the 'core frontend' job. All others can run under Ember 5.
What motivated this change?
Our builds on Github actions have been extremely flaky mostly due to system tests. This has led to a drop in confidence
in our test suite where our developers tend to assume that a failed job is due to a flaky system test. As a result, we
have had occurrences where changes that resulted in legitimate test failures are merged into the `main` branch because developers
assumed it was a flaky test.
What does this change do?
This change seeks to reduce the flakiness of our builds on Github Actions by automatically re-running RSpec tests once when
they fail. If a failed test passes subsequently in the re-run, we mark the test as flaky by logging it into a file on disk
which is then uploaded as an artifact of the Github workflow run. We understand that automatically re-runs will lead to
lower accuracy of our tests but we accept this as an acceptable trade-off since a fragile build has a much greater impact
on our developers' time. Internally, the Discourse development team will be running a service to fetch the flaky tests
which have been logged for internal monitoring.
How is the change implemented?
1. A `--retry-and-log-flaky-tests` CLI flag is added to the `bin/turbo_rspec` CLI which will then initialize `TurboTests::Runner`
with the `retry_and_log_flaky_tests` kwarg set to `true`.
2. When the `retry_and_log_flaky_tests` kwarg is set to `true` for `TurboTests::Runner`, we will register an additional
formatter `Flaky::FailuresLoggerFormatter` to the `TurboTests::Reporter` in the `TurboTests::Runner#run` method.
The `Flaky::FailuresLoggerFormatter` has a simple job of logging all failed examples to a file on disk when running all the
tests. The details of the failed example which are logged can be found in `TurboTests::Flaky::FailedExample.to_h`.
3. Once all the tests have been run once, we check the result for any failed examples and if there are, we read the file on
disk to fetch the `location_rerun_location` of the failed examples which is then used to run the tests in a new RSpec process.
In the rerun, we configure a `TurboTests::Flaky::FlakyDetectorFormatter` with RSpec which removes all failed examples from the log file on disk since those examples are not flaky tests. Note that if there are too many failed examples on the first run, we will deem the failures to likely not be due to flaky tests and not re-run the test failures. As of writing, the threshold of failed examples is set to 10. If there are more than 10 failed examples, we will not re-run the failures.
This commit introduces the scaffolding for us to easily switch between Ember 3.28 and Ember 5 on the `main` branch of Discourse. Unfortunately, there is no built-in system to apply this kind of flagging within yarn / ember-cli. There are projects like `ember-try` which are designed for running against multiple version of a dependency, but they do not allow us to 'lock' dependency/sub-dependency versions, and are therefore unsuitable for our use in production.
Instead, we will be maintaining two root `package.json` files, and two `yarn.lock` files. For ember-3, they remain as-is. For ember5, we use a yarn 'resolution' to override the version for ember-source across the entire yarn workspace.
To allow for easy switching with minimal diff against the repository, `package.json` and `yarn.lock` are symlinks which point to `package-ember3.json` and `yarn-ember3.lock` by default. To switch to Ember 5, we can run `script/switch ember version 5` to update the symlinks to point to `package-ember5.json` and `package-ember3.json` respectively. In production, and when using `bin/ember-cli` for development, the ember version can also be upgraded using the `EMBER_VERSION=5` environment variable.
When making changes to dependencies, these should be made against the default `ember3` versions, and then `script/regen_ember_5_lockfile` should be used to regenerate `yarn-ember5.lock` accordingly. A new 'Ember Version Lockfiles' GitHub workflow will automate this process on Dependabot PRs.
When running a local environment against Ember 5, the two symlink changes will show up as git diffs. To avoid us accidentally committing/pushing that change, another GitHub workflow is introduced which checks the default Ember version and raises an error if it is greater than v3.
Supporting two ember versions simultaneously obviously carries significant overhead, so our aim will be to get themes/plugins updated as quickly as possible, and then drop this flag.
Why this change?
Plugin gems for official plugins are being installed over and over again
each time we run RSpec and QUnit tests for plugins. In particular, the
rugged gem installed by the discourse-code-review plugin takes
approximately 50-60 seconds to install because it is compiling libgit2.
Why this change?
Right now, the job names are `core system 3.2`, `core frontend 3.2` etc.
The problem here is that 3.2 is very vague. I thought about making the
job names something like `core system (Ruby 3.2)` but then wondered if
there is even value in including that when we are only running with one
ruby version in the matrix all the time. Therefore, I decided to drop
`3.2` from the job names.
Why this change?
As the number of themes which the Discourse team supports officially
grows, we want to ensure that changes made to Discourse core do not
break the plugins. As such, we are adding a step to our Github actions
test job to run the QUnit tests for all official themes.
What does this change do?
This change adds a new job to our tests Github actions workflow to run the QUnit
tests for all official plugins. This is achieved with the following
changes:
1. Update `testem.js` to rely on the `THEME_TEST_PAGES` env variable to set the
`test_page` option when running theme QUnit tests with testem. The
`test_page` option [allows an array to be specified](https://github.com/testem/testem#multiple-test-pages) such that tests for
multiple pages can be run at the same time. We are relying on a ENV variable
because the `testem` CLI does not support passing a list of pages
to the `--test_page` option.
2. Support a `/testem-theme-qunit/:testem_id/theme-qunit` Rails route in the development environment. This
is done because testem prefixes the path with a unique ID to the configured `test_page` URL.
This is problematic for us because we proxy all testem requests to the
Rails server and testem's proxy configuration option does not allow us
to easily rewrite the URL to remove the prefix. Therefore, we configure a proxy in testem to prefix `theme-qunit` requests with
`/testem-theme-qunit` which can then be easily identified by the Rails server and routed accordingly.
3. Update `qunit:test` to support a `THEME_IDS` environment variable
which will allow it to run QUnit tests for multiple themes at the
same time.
4. Support `bin/rake themes:qunit[ids,"<theme_id>|<theme_id>"]` to run
the QUnit tests for multiple themes at the same time.
5. Adds a `themes:qunit_all_official` Rake task which runs the QUnit
tests for all the official themes.
Why this change?
As the number of themes which the Discourse team supports officially
grows, we want to ensure that changes made to Discourse core do not
break the plugins. As such, we are adding a step to our Github actions
test job to run the system tests for all official themes.
What does this change do?
This change adds a step to our Github actions test job to run the system
tests for all official plugins. This is achieved by the introduction of
the `themes:install_all_official` Rake task which installs all the
themes that are officially supported by the Discourse team.
Using restore-keys means we will always use an old cache, and then add more dependencies to it. This leads to the cache growing over time and becoming increasingly slow. Instead, we should rebuild the cache from scratch each time our dependencies change.
Without this change the resulting comparison looks like
```
if [ tests-passed == "tests-passed" ]; then
```
and so it was always failing. This way the resulting base branch name will also be in quotes for the comparison.
Follow up to: #24273
* DEV: Adds a GitHub workflow to check target branch
Adds a GitHub workflow to check that the target branch for PRs in the
discourse-private-mirror repo aren't set to the tests-passed branch.
* Rename workflow
This fixes a similar issue to 8b3eca0 where an Errno::ETXTBSY error was raised because the minio_runner gem was trying to install the binary across multiple processes in rspec. If we just make sure the latest version is installed before the tests run, this shouldn't happen, since MinioRunner.start will not do any further attempts at installation if the latest version is installed.
This workflow runs only for code underneath the `migrations/` directory. The usual test workflow is skipped for migrations because running frontend and backend tests is a waste of time and resources when only migrations are changed.
Discourse core now builds and runs with Embroider! This commit adds
the Embroider-based build pipeline (`USE_EMBROIDER=1`) and start
testing it on CI.
The new pipeline uses Embroider's compat mode + webpack bundler to
build discourse code, and leave everything else (admin, wizard,
markdown-it, plugins, etc) exactly the same using the existing
Broccoli-based build as external bundles (<script> tags), passed
to the build as `extraPublicTress` (which just means they get
placed in the `/public` folder).
At runtime, these "external" bundles are glued back together with
`loader.js`. Specifically, the external bundles are compiled as
AMD modules (just as they were before) and registered with the
global `loader.js` instance. They expect their `import`s (outside
of whatever is included in the bundle) to be already available in
the `loader.js` runtime registry.
In the classic build, _every_ module gets compiled into AMD and
gets added to the `loader.js` runtime registry. In Embroider,
the goal is to do this as little as possible, to give the bundler
more flexibility to optimize modules, or omit them entirely if it
is confident that the module is unused (i.e. tree-shaking).
Even in the most compatible mode, there are cases where Embroider
is confident enough to omit modules in the runtime `loader.js`
registry (notably, "auto-imported" non-addon NPM packages). So we
have to be mindful of that an manage those dependencies ourselves,
as seen in #22703.
In the longer term, we will look into using modern features (such
as `import()`) to express these inter-dependencies.
This will only be behind a flag for a short period of time while we
perform some final testing. Within the next few weeks, we intend
to enable by default and remove the flag.
---------
Co-authored-by: David Taylor <david@taylorhq.com>
This commit adds some system specs to test uploads with
direct to S3 single and multipart uploads via uppy. This
is done with minio as a local S3 replacement. We are doing
this to catch regressions when uppy dependencies need to
be upgraded or we change uppy upload code, since before
this there was no way to know outside manual testing whether
these changes would cause regressions.
Minio's server lifecycle and the installed binaries are managed
by the https://github.com/discourse/minio_runner gem, though the
binaries are already installed on the discourse_test image we run
GitHub CI from.
These tests will only run in CI unless you specifically use the
CI=1 or RUN_S3_SYSTEM_SPECS=1 env vars.
For a history of experimentation here see https://github.com/discourse/discourse/pull/22381
Related PRs:
* https://github.com/discourse/minio_runner/pull/1
* https://github.com/discourse/minio_runner/pull/2
* https://github.com/discourse/minio_runner/pull/3
We can no long user Webdriver - SeleniumHQ/selenium#11066. Bumping selenium-webdriver did the trick, as well as manually setting the user_agent for mobile system specs. Unsure what changed to make this necessary, but it is necessary to get the app to boot in mobile view.
Why this change?
This is abit of a trial and error but we're starting to see selenium
session not created errors on CI. One of the reason for this is that the
system has run out of resources to create a new tab.
This commit reduces the number of parallel test processors in an attempt
to increase the amount of resources available to each test process and
hopefully lead to more stable CI system tests.
This reverts commit 865f7a9852.
The flakiness that we have been seeing and fixing on CI were not related
to system resource problems. Therefore, we can bump this up back to 5.
Why is this change required?
We've been seeing flaky tests due to server errors on CI but are unable
to debug it because we do not log any of the errors. This change gives
us a fighting chance the next time we encounter a server error during
system test runs.
See
https://github.com/discourse/discourse/actions/runs/5459248864/jobs/9935049920?pr=22424
for an example of server errors encountered during system tests.
Using the runtime information, we will be able to more efficiently group
the test files across the test processes hence leading to better
utilization of resources.
Using the runtime information, we will be able to more efficiently group
the test files across the test processes hence leading to better
utilization of resources.
4dd053a69c addressed most of the
instability we were seeing with system tests on CI and locally. Let's
try pushing the number of parallel processes up to squeeze as much time
savings as possible from the runner.
We're running on pretty crappy hardware on Github's CI and this has an
impact on the stability of our system tests on CI. Therefore, we are
bumping `CABPYARA_DEFAULT_MAX_WAIT_TIME` to 10 seconds to account for
the less than ideal hardware we're running the system tests on.
This change trades off speed for stability but speed is already bad on
CI so stability is more important for our case.
## How does this work?
Any time a lint rule is added or changed, you can run `yarn lint:fix` to handle all the auto-fixable situations.
But not all lints are auto-fixable -- for those, lint-to-the-future has tooling to automatically ignore present violations.
An alias has been added for lint-to-the-future to ignore new violations, `yarn lttf:ignore`.
The command will add lint-ignore declarations throughout all the files with present violations, which should then be committed.
An excerpt from lint-to-the-future's [README](https://github.com/mansona/lint-to-the-future#lint-to-the-future-dashboard):
> The point of Lint to the Future is to allow you to progressively update your codebase using new lint rules without overwhelming you with the task. You can easily ignore lint rules using project-based ignores in your config files but that doesn't prevent you from making the same errors in new files.
> We chose to do the ignores on a file basis as it is a perfect balance and it means that the tracking/graphing aspects of Lint to the Future provide you with achievable goals, especially in large codebases.
## How do I view progress?
lint-to-the-future provides graphs of violations-over-time per lint rule in a dashboard format, so we can track how well we're doing at cleaning up the violations.
To view the dashboard locally, run `yarn lint-progress` and visit `http://localhost:8084` (or whatever the port it chose, as it will choose a new port if 8084 is preoccupied)
Also there is a `list` command which shows a JSON object of:
```ts
{
[date: string]: { // yyyy-mm-dd
[pluginName: string]: {
[fileName: string]: string[]; // list of files with violations
}
}
}
```
```bash
yarn lint-to-the-future list --stdout
```
## What about lint-todo?
Lint todo is another system available for both eslint and ember-template-lint that _forces_ folks to "leave things better than they found them" by being transparent / line-specific ignoring of violations.
It was decided that for _this_ project, it made more sense, and would be less disruptive to new contributors to have the ignore declarations explicitly defined in each file (whereas in lint-todo, they are hidden).
To effectively use lint-todo, a whole team needs to agree to the workflow, and in open source, we want "just anyway" to be able to contribute, and throwing surprises at them can deter contributions.
In production, `eager_load=true`. This sometimes leads to boot errors which are not present in dev/test environments. Running `zeitwerk:check` in CI will help us to pick up on any errors early.
This commit also introduces a `DISCOURSE_ZEITWERK_EAGER_LOAD` environment variable to make it easier to toggle the behaviour when developing locally.
This is a combined work of Martin Brennan, Loïc Guitaut, and Joffrey Jaffeux.
---
This commit implements a base service object when working in chat. The documentation is available at https://discourse.github.io/discourse/chat/backend/Chat/Service.html
Generating documentation has been made as part of this commit with a bigger goal in mind of generally making it easier to dive into the chat project.
Working with services generally involves 3 parts:
- The service object itself, which is a series of steps where few of them are specialized (model, transaction, policy)
```ruby
class UpdateAge
include Chat::Service::Base
model :user, :fetch_user
policy :can_see_user
contract
step :update_age
class Contract
attribute :age, :integer
end
def fetch_user(user_id:, **)
User.find_by(id: user_id)
end
def can_see_user(guardian:, **)
guardian.can_see_user(user)
end
def update_age(age:, **)
user.update!(age: age)
end
end
```
- The `with_service` controller helper, handling success and failure of the service within a service and making easy to return proper response to it from the controller
```ruby
def update
with_service(UpdateAge) do
on_success { render_serialized(result.user, BasicUserSerializer, root: "user") }
end
end
```
- Rspec matchers and steps inspector, improving the dev experience while creating specs for a service
```ruby
RSpec.describe(UpdateAge) do
subject(:result) do
described_class.call(guardian: guardian, user_id: user.id, age: age)
end
fab!(:user) { Fabricate(:user) }
fab!(:current_user) { Fabricate(:admin) }
let(:guardian) { Guardian.new(current_user) }
let(:age) { 1 }
it { expect(user.reload.age).to eq(age) }
end
```
Note in case of unexpected failure in your spec, the output will give all the relevant information:
```
1) UpdateAge when no channel_id is given is expected to fail to find a model named 'user'
Failure/Error: it { is_expected.to fail_to_find_a_model(:user) }
Expected model 'foo' (key: 'result.model.user') was not found in the result object.
[1/4] [model] 'user' ❌
[2/4] [policy] 'can_see_user'
[3/4] [contract] 'default'
[4/4] [step] 'update_age'
/Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/update_age.rb:32:in `fetch_user': missing keyword: :user_id (ArgumentError)
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:202:in `instance_exec'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:202:in `call'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:219:in `call'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:417:in `block in run!'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:417:in `each'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:417:in `run!'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:411:in `run'
from <internal:kernel>:90:in `tap'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:302:in `call'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/spec/services/update_age_spec.rb:15:in `block (3 levels) in <main>'
```
This broke because of directory ownership errors during `git ls-files`. This commit fixes the permissions and adds bash flags so that those kind of errors will blow up the step in future.
The `git` version in our discourse_test docker image was recently updated to include a permissions check before running any git commands. For this to pass, the owner of the discourse directory needs to match the user running any git commands.
Under GitHub actions, by default the working directory is created with uid=1000 as the owner. We run all our tests as `root`, so this mismatch causes git to raise the permissions error. We can't switch to run the entire workflow as the `discourse (uid=1000)` user because our discourse_test image is not configured to allow `discourse` access to postgres/redis directories. For now, this commit updates the working directory's owner to match the user running the workflow.
Note this commit also slightly changes internal API: channel instead of getChannel and updateCurrentUserChannelNotificationsSettings instead of updateCurrentUserChatChannelNotificationsSettings.
Also destroyChannel takes a second param which is the name confirmation instead of an optional object containing this confirmation. This is to enforce the fact that it's required.
In the future a top level jsdoc config file could be used instead of the hack tempfile, but while it's only an experiment for chat, it's probably good enough.
This commit introduces the necessary gems and config, but adds all our ruby code directories to the `--ignore-files` list.
Future commits will apply syntax_tree to parts of the codebase, removing the ignore patterns as we go
Our working theory is that system tests on Github run on much less
powerful hardware as compared to running the tests on our work machines.
Hopefully, increasing the wait time now will help reduce some flakes
that we're seeing on Github.
These screenshots are located at paths like:
/__w/discourse/discourse/tmp/capybara/failures_r_spec_example_groups_quoting_chat_message_transcripts_copying_quote_transcripts_with_the_clipboard_quotes_multiple_chat_messages_into_a_topic_134.png
not /tmp/screenshots. This should fix the issue. Also makes plugin system specs
use documentation format and profile.
This commit introduces rails system tests run with chromedriver, selenium,
and headless chrome to our testing toolbox.
We use the `webdrivers` gem and `selenium-webdriver` which is what
the latest Rails uses so the tests run locally and in CI out of the box.
You can use `SELENIUM_VERBOSE_DRIVER_LOGS=1` to show extra
verbose logs of what selenium is doing to communicate with the system
tests.
By default JS logs are verbose so errors from JS are shown when
running system tests, you can disable this with
`SELENIUM_DISABLE_VERBOSE_JS_LOGS=1`
You can use `SELENIUM_HEADLESS=0` to run the system
tests inside a chrome browser instead of headless, which can be useful to debug things
and see what the spec sees. See note above about `bin/ember-cli` to avoid
surprises.
I have modified `bin/turbo_rspec` to exclude `spec/system` by default,
support for parallel system specs is a little shaky right now and we don't
want them slowing down the turbo by default either.
### PageObjects and System Tests
To make querying and inspecting parts of the page easier
and more reusable inbetween system tests, we are using the
concept of [PageObjects](https://www.selenium.dev/documentation/test_practices/encouraged/page_object_models/) in
our system tests. A "Page" here is generally corresponds to
an overarching ember route, e.g. "Topic" for `/t/324345/some-topic`,
and this contains logic for querying components within the topic
such as "Posts".
I have also split "Modals" into their own entity. Further down the
line we may want to explore creating independent "Component"
contexts.
Capybara DSL should be included in each PageObject class,
reference for this can be found at https://rubydoc.info/github/teamcapybara/capybara/master#the-dsl
For system tests, since they are so slow, we want to focus on
the "happy path" and not do every different possible context
and branch check using them. They are meant to be overarching
tests that check a number of things are correct using the full stack
from JS and ember to rails to ruby and then the database.
### CI Setup
Whenever a system spec fails, a screenshot
is taken and a build artifact is produced _after the entire CI run is complete_,
which can be downloaded from the Actions UI in the repo.
Most importantly, a step to build the Ember app using Ember CLI
is needed, otherwise the JS assets cannot be found by capybara:
```
- name: Build Ember CLI
run: bin/ember-cli --build
```
A new `--build` argument has been added to `bin/ember-cli` for this
case, which is not needed locally if you already have the discourse
rails server running via `bin/ember-cli -u` since the whole server is built and
set up by default.
Co-authored-by: David Taylor <david@taylorhq.com>
Both versions are used with `--headless`, so labelling one "Firefox" and the other "Firefox Headless" doesn't really make sense. Evergreen / ESR are better descriptions.
We added `always()` on some steps so that they run even if previous steps fail. That helps give us a picture of all failures in one run, rather than having to re-run the workflow after fixing the first failure.
However, when we explicitly cancel a job, we should skip running these steps. `!cancelled()` is a better substitute for `always()` in this case.