What motivated this change?
Our builds on Github actions have been extremely flaky mostly due to system tests. This has led to a drop in confidence
in our test suite where our developers tend to assume that a failed job is due to a flaky system test. As a result, we
have had occurrences where changes that resulted in legitimate test failures are merged into the `main` branch because developers
assumed it was a flaky test.
What does this change do?
This change seeks to reduce the flakiness of our builds on Github Actions by automatically re-running RSpec tests once when
they fail. If a failed test passes subsequently in the re-run, we mark the test as flaky by logging it into a file on disk
which is then uploaded as an artifact of the Github workflow run. We understand that automatically re-runs will lead to
lower accuracy of our tests but we accept this as an acceptable trade-off since a fragile build has a much greater impact
on our developers' time. Internally, the Discourse development team will be running a service to fetch the flaky tests
which have been logged for internal monitoring.
How is the change implemented?
1. A `--retry-and-log-flaky-tests` CLI flag is added to the `bin/turbo_rspec` CLI which will then initialize `TurboTests::Runner`
with the `retry_and_log_flaky_tests` kwarg set to `true`.
2. When the `retry_and_log_flaky_tests` kwarg is set to `true` for `TurboTests::Runner`, we will register an additional
formatter `Flaky::FailuresLoggerFormatter` to the `TurboTests::Reporter` in the `TurboTests::Runner#run` method.
The `Flaky::FailuresLoggerFormatter` has a simple job of logging all failed examples to a file on disk when running all the
tests. The details of the failed example which are logged can be found in `TurboTests::Flaky::FailedExample.to_h`.
3. Once all the tests have been run once, we check the result for any failed examples and if there are, we read the file on
disk to fetch the `location_rerun_location` of the failed examples which is then used to run the tests in a new RSpec process.
In the rerun, we configure a `TurboTests::Flaky::FlakyDetectorFormatter` with RSpec which removes all failed examples from the log file on disk since those examples are not flaky tests. Note that if there are too many failed examples on the first run, we will deem the failures to likely not be due to flaky tests and not re-run the test failures. As of writing, the threshold of failed examples is set to 10. If there are more than 10 failed examples, we will not re-run the failures.
This commit introduces the scaffolding for us to easily switch between Ember 3.28 and Ember 5 on the `main` branch of Discourse. Unfortunately, there is no built-in system to apply this kind of flagging within yarn / ember-cli. There are projects like `ember-try` which are designed for running against multiple version of a dependency, but they do not allow us to 'lock' dependency/sub-dependency versions, and are therefore unsuitable for our use in production.
Instead, we will be maintaining two root `package.json` files, and two `yarn.lock` files. For ember-3, they remain as-is. For ember5, we use a yarn 'resolution' to override the version for ember-source across the entire yarn workspace.
To allow for easy switching with minimal diff against the repository, `package.json` and `yarn.lock` are symlinks which point to `package-ember3.json` and `yarn-ember3.lock` by default. To switch to Ember 5, we can run `script/switch ember version 5` to update the symlinks to point to `package-ember5.json` and `package-ember3.json` respectively. In production, and when using `bin/ember-cli` for development, the ember version can also be upgraded using the `EMBER_VERSION=5` environment variable.
When making changes to dependencies, these should be made against the default `ember3` versions, and then `script/regen_ember_5_lockfile` should be used to regenerate `yarn-ember5.lock` accordingly. A new 'Ember Version Lockfiles' GitHub workflow will automate this process on Dependabot PRs.
When running a local environment against Ember 5, the two symlink changes will show up as git diffs. To avoid us accidentally committing/pushing that change, another GitHub workflow is introduced which checks the default Ember version and raises an error if it is greater than v3.
Supporting two ember versions simultaneously obviously carries significant overhead, so our aim will be to get themes/plugins updated as quickly as possible, and then drop this flag.
Why this change?
Before this change, running `d/boot_dev` does not run `bundle install`
and `yarn install` unless the `--init` option is specified. However,
this does not make sense because the user will end up having to run
`d/bundle install` and `d/yarn install` manually after if the `--init`
option is not used. We can simplify this by just always running `bundle
install` and `yarn install` whenever `d/boot_dev` is used.
What does this change do?
This change changes the `d/boot_dev` script to always run `bundle
install` and `yarn install`.
We were already running `yarn install` for the app directory before booting ember-cli via `bin/ember-cli`. This commit extends that so that it also runs `yarn install` for the root of the repository. In the long-term we hope to combine these packages, but for now they are separate.
The post-install hook for the root package is also updated to pass through the `-s` flag. Previously, we only passed through the `--frozen-lockfile` flag.
We have been discussing potentially adding a Procfile to help launch the application. As part of the discussions, @tgxworld also mentioned it'd be nice to have a command to launch the app in development.
This allows us to encode how to start the app in code and ship that in the repo, rather than having to rely on external documentation.
Newer installs of Rails come with a bin/dev script for exactly this purpose. This change adds one in retroactively for us.
The script currently runs bin/ember-cli -u, which is the canonical way of starting the app for development according to documentation. I know not everyone uses this, but the point of this PR is to add the script first, then we can iterate on the means of running the app. 🙂
Using the runtime information, we will be able to more efficiently group
the test files across the test processes hence leading to better
utilization of resources.
Using the runtime information, we will be able to more efficiently group
the test files across the test processes hence leading to better
utilization of resources.
Why is this change required?
By default, `RSpec` comes with a `--profile=[COUNT]` option as well but
enabling that option means that the entire test suite needs to be
executed. This does not work so well for `turbo_rspec` which splits our
test files into various "buckets" for the tests to be executed in
multiple processes. Therefore, this commit adds a similar
`--profile=[COUNT]` option to `turbo_rspec` but will only profile the
tests being executed. Examples:
`LOAD_PLUGINS=1 bin/turbo_rspec --profile plugins/*/spec/system`
or
`LOAD_PLUGINS=1 bin/turbo_rspec --profile=20 plugins/*/spec/system`
This commit 57caf08e13 broke
`bin/turbo_rspec` timing recording via `TurboTests::Runner`,
because we changed to using all `spec/*` folders except
`spec/system` as default for the runner, rather than
the old `['spec']` array, which is what `TurboTests::Runner`
was relying on to determine whether to record test run
time with `ParallelTests::RSpec::RuntimeLogger`.
Instead, we can just pass a new `use_runtime_info` boolean to the
runner class and use it when running against the default set of
spec files using `bin/turbo_rspec` and the turbo rspec rake task.
This commit introduces rails system tests run with chromedriver, selenium,
and headless chrome to our testing toolbox.
We use the `webdrivers` gem and `selenium-webdriver` which is what
the latest Rails uses so the tests run locally and in CI out of the box.
You can use `SELENIUM_VERBOSE_DRIVER_LOGS=1` to show extra
verbose logs of what selenium is doing to communicate with the system
tests.
By default JS logs are verbose so errors from JS are shown when
running system tests, you can disable this with
`SELENIUM_DISABLE_VERBOSE_JS_LOGS=1`
You can use `SELENIUM_HEADLESS=0` to run the system
tests inside a chrome browser instead of headless, which can be useful to debug things
and see what the spec sees. See note above about `bin/ember-cli` to avoid
surprises.
I have modified `bin/turbo_rspec` to exclude `spec/system` by default,
support for parallel system specs is a little shaky right now and we don't
want them slowing down the turbo by default either.
### PageObjects and System Tests
To make querying and inspecting parts of the page easier
and more reusable inbetween system tests, we are using the
concept of [PageObjects](https://www.selenium.dev/documentation/test_practices/encouraged/page_object_models/) in
our system tests. A "Page" here is generally corresponds to
an overarching ember route, e.g. "Topic" for `/t/324345/some-topic`,
and this contains logic for querying components within the topic
such as "Posts".
I have also split "Modals" into their own entity. Further down the
line we may want to explore creating independent "Component"
contexts.
Capybara DSL should be included in each PageObject class,
reference for this can be found at https://rubydoc.info/github/teamcapybara/capybara/master#the-dsl
For system tests, since they are so slow, we want to focus on
the "happy path" and not do every different possible context
and branch check using them. They are meant to be overarching
tests that check a number of things are correct using the full stack
from JS and ember to rails to ruby and then the database.
### CI Setup
Whenever a system spec fails, a screenshot
is taken and a build artifact is produced _after the entire CI run is complete_,
which can be downloaded from the Actions UI in the repo.
Most importantly, a step to build the Ember app using Ember CLI
is needed, otherwise the JS assets cannot be found by capybara:
```
- name: Build Ember CLI
run: bin/ember-cli --build
```
A new `--build` argument has been added to `bin/ember-cli` for this
case, which is not needed locally if you already have the discourse
rails server running via `bin/ember-cli -u` since the whole server is built and
set up by default.
Co-authored-by: David Taylor <david@taylorhq.com>
Also:
* Remove an unused method (#fill_email)
* Replace a method that was used just once (#generate_username) with `SecureRandom.alphanumeric`
* Remove an obsolete dev puma `tmp/restart` file logic
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
On new macs, `localhost` resolves to IPV6 of `::1` and unfortunately
unicorn doesn't bind to IPv6 by default.
This seems to be the path of least resistance. By using 127.0.0.1 we
force IPv4 which works great.
Trying to use a local test hostname other than localhost
(e.g. discourse.test )for discourse development was difficult due
the fact that localhost was hardcoded in a few places. This patch
uses existing environment variables to allow a developer to use a
different domain when developing.
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
The `-depth` flag is incorrect on Linux, it does not take an argument
and causes an error and results in no plugins ever being found.
Copied from `man find`:
```
The global options occur after the list of start points, and so are not the same kind of option as -L, for example.
-d A synonym for -depth, for compatibility with FreeBSD, NetBSD, MacOS X and OpenBSD.
-depth Process each directory's contents before the directory itself. The -delete action also implies -depth.
...
-maxdepth levels
Descend at most levels (a non-negative integer) levels of directories below the starting-points. Using -maxdepth 0 means
only apply the tests and actions to the starting-points themselves.
```
Previously we would need to launch unicorn separately this achieves
the same goal by making 2 modifications:
1. If -u is supplied ember-cli binary will launch and monitor ember cli and unicorn
2. We suppress 200 requests to keep console clean (we may consider moving to development rails logs)
Also cleans out output a bit by supplying silent flags to yarn.
Running a development environment using Docker's qemu architecture emulation is currently not possible because `inotify` is not supported: https://github.com/docker/for-mac/issues/5321
- Add `d/ember-cli`, and publish port 4200
- Remove `d/sidekiq`. Sidekiq is now started with the rails server
- Move all `docker exec` logic into a single place, so we have one place to set environment variable pass-throughs
- Use `exec` for all bash scripts, so that return statuses are passed back correctly
- Avoid using `bin/bash -c` unnecessarily, because it makes escaping arguments difficult
Sometimes plugins directories will end up with other symlinks (e.g. inside node_modules folders). This logic does not work with deeply nested symlinks, and they are unlikely to be necessary for the plugin to work. Therefore we should only look for symlinks in the top-level of the `plugins` directory
Some people have noticed that if we change the packages in package.json
that they have to manually run `yarn install` or Discourse won't work.
This adds `yarn install` to the `bin/ember-cli` helper we run. It seems
quite fast if there is nothing to install so it shouldn't hurt to do
this every time we start the server.
This fixes the following error I've been seeing lately in RubyMine:
> Error:Your `bin/bundle` was not generated by Bundler, so this binstub cannot run.
> Replace `bin/bundle` by running `bundle binstubs bundler --force`, then run this command again.
The auto restart logic was sending a USR2 to the parent process without checking what the parent process actually was. In some situations, it might not be the `bin/unicorn` supervisor.
This commit switches to use a global variable for the supervisor PID. This will be much less prone to unexpected behavior.
(inspired by martin-brennan's recent adventure with `tags` exclusion)
**tl;dr: some of it is obsolete, some is dev-env specific and should live in *your* ~/.gitignore.**
---
To enable global gitignore (`~/.gitignore`) run:
```bash
git config --global core.excludesFile '~/.gitignore'
```
Then create that file and add your VIM/VSCode/Sublime/Emacs/Eclipse/RubyMine/JetBrains/macOS/Arch/Windows 95-specific stuff.
---
Reasons for removal:
* `bin` - never really ignored, all the files are checked in
* `.DS_Store`, `._.DS_Store` - OS specific
* `.sass-cache/*`, `/.bundle`, `/bundle/*`, `/cache`, `/logfile`, `!/plugins/discourse-nginx-performance-report`, `MiniProfiler/Ruby/rack-mini-profiler-2.0.1a.gem`, `config/fog_credentials.yml`, `/public/stylesheet-cache/*`, `script/download_db`, `script/refresh_db`, `config/version.rb` - no longer used?
* `/log/*.log` - covered by /log
* `bootsnap-load-path-cache`, `bootsnap-compile-cache` - bootsnap now keeps its ~~trash~~ temporary files in `/tmp`
* `/.project`, `/.buildpath`, `/.byebug_history`, `/.idea`, `discourse.sublime-workspace`, `*~`, `*.swp`, `*.swo`, `*.swm`, `config/multisite1.yml`, `.rvmrc`, `.ruby-version`, `.ruby-gemset`, `.rbenv`, `bundler_stubs/*`, `*.db`, `*.iml`, `*.swn`, `/package-lock.json`, `.vscode`, `.envrc`, `tags` - dev-env specific, see the note at the top of the commit
Using `bundle exec` will slow down server startup by at least 0.5s. `bin/unicorn` has built-in handling of bundler dependencies, so it is better to launch `bin/rails s` or `bin/unicorn` directly.
In Discourse, `rails s` ultimately launches the `bin/unicorn` script. However, the overhead of `rails` launching `bin/rails`, and then in turn `bin/unicorn` can be non-trivial. Therefore it is much better to run `bin/unicorn` directly.
This commit prints a warning message to STDERR when `rails s` is used. Functionality is unchanged.
We really want to encourage all developers to use Ember CLI for local
development and testing. This will display an error page if they are not
with instructions on how to start the local server.
To disable it, you can set `NO_EMBER_CLI=1` as an ENV variable