Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.
This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.
Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
- Run the CSP-nonce-related middlewares on the generated response
- Fix the readonly mode checking to avoid empty strings being passed (the `check_readonly_mode` before_action will not execute in the case of these re-dispatched exceptions)
- Move the BlockRequestsMiddleware cookie-setting to the middleware, so that it is included even for unusual HTML responses like these exceptions
This is to enable :array type attributes for Contract
attributes in services, this is a followup to the move
of services from chat to core here:
cab178a405
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Why this change?
This regressed in 6e9fbb5bab because we
had a `request.xhr?` check before we decide to block requests. However,
there could not none-xhr requests which we need to block as well at the
end of each system test when `@@block_requests` is true.
This also reverts commit 6437f27f90.
Why this change?
This reverts 725561cf4b as it did not
address the root cause of the problem even though it fixed the failing tests we were seeing
when running `bundle exec rspec --tag ~type:multisite --order random:776 spec/system/admin_customize_form_templates_spec.rb spec/system/admin_sidebar_navigation_spec.rb spec/system/admin_site_setting_search_spec.rb spec/system/composer/dont_feed_the_trolls_popup_spec.rb spec/system/composer/review_media_unless_trust_level_spec.rb spec/system/create_account_spec.rb spec/system/editing_sidebar_tags_navigation_spec.rb spec/system/email_change_spec.rb spec/system/emojis/emoji_deny_list_spec.rb spec/system/group_activity_spec.rb spec/system/hashtag_autocomplete_spec.rb spec/system/network_disconnected_spec.rb spec/system/post_menu_spec.rb spec/system/post_small_action_spec.rb spec/system/tags_intersection_spec.rb spec/system/topic_list_focus_spec.rb spec/system/topic_page_spec.rb spec/system/user_page/user_profile_info_panel_spec.rb spec/system/viewing_group_members_spec.rb spec/system/viewing_navigation_menu_preferences_spec.rb`.
The root cause here is that `before_action`s added to a controller is
order dependent. As such, some requests were not setting the cookie
because the `before_action` callback was not even hit as a prior
`before_action` callbacks has raised an error such as the `check_xhr`
`before_action` callback.
To resolve the problem, we need to add the `prepend: true` option in
our monkey patch of `ApplicationController` to ensure that the
`before_action` callback which we have added is always run first.
This change also makes a couple of changes:
1. Improve the response body when a request is blocked by the `BlockRequestsMiddleware` middleware
so that it makes debugging easier.
2. Only set the cookies for non-xhr HTML format requests. Setting it for
other formats is kind of pointless.
Why this change?
We noticed that running `LOAD_PLUGINS=1 rspec --seed=38855 plugins/chat/spec/system/chat_new_message_spec.rb` locally
results in the system tests randomly failing. When we inspected the
request logs closely, we noticed that a `/presence/get` request from a
previous rspec example was being processed when a new rspec example is
already being run. We know it was from the previous rspec example
because inspecting the auth token showed the request using the auth
token of a user from the previous example. However, when a request using
an auth token from a previous example is used it ends up logging out the
same user on the server side because the user id in the cookie is the same
due to the use of `fab!`.
I did some research and there is apparently no way to wait until all
inflight requests by the browser has completed through capybara or
selenium. Therefore, we will add an identifier by attaching a cookie to all non-xhr requests so that
xhr requests which are triggered subsequently will contain the cookie in the request.
In the `BlockRequestsMiddleware` middleware, we will then reject any
requests when the value of the identifier in the cookie does not match the current rspec's example
location.
To see the problem locally, change `Auth::DefaultCurrentUserProvider.find_v1_auth_cookie` to the following:
```
def self.find_v1_auth_cookie(env)
return env[DECRYPTED_AUTH_COOKIE] if env.key?(DECRYPTED_AUTH_COOKIE)
env[DECRYPTED_AUTH_COOKIE] = begin
request = ActionDispatch::Request.new(env)
cookie = request.cookies[TOKEN_COOKIE]
# don't even initialize a cookie jar if we don't have a cookie at all
if cookie&.valid_encoding? && cookie.present?
puts "#{env["REQUEST_PATH"]} #{request.cookie_jar.encrypted[TOKEN_COOKIE]&.with_indifferent_access}"
request.cookie_jar.encrypted[TOKEN_COOKIE]&.with_indifferent_access
end
end
end
```
After which run the following command: `LOAD_PLUGINS=1 rspec --format documentation --seed=38855 plugins/chat/spec/system/chat_new_message_spec.rb`
It takes a few tries but the last spec should fail and you should see something like this:
```
assets/chunk.c16f6ba8b6824baa47ac.d41d8cd9.js {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
/assets/chunk.050148142e1d2dc992dd.d41d8cd9.js {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
/chat/api/channels/527/messages {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
/uploads/default/test_0/optimized/1X/_129430568242d1b7f853bb13ebea28b3f6af4e7_2_512x512.png {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
redirects to existing chat channel
redirects to chat channel if recipients param is missing (PENDING: Temporarily skipped with xit)
with multiple users
/favicon.ico {"token"=>"9a75c114c4d3401509a23d240f0a46d4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591736}
/chat/new-message {"token"=>"9a75c114c4d3401509a23d240f0a46d4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591736}
/presence/get {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
```
Note how the `/presence/get` request is using a token from the previous example.
Co-authored-by: David Taylor <david@taylorhq.com>
This commit introduces the possibility to stream messages. To allow plugins to use streaming this commit also ships a `ChatSDK` library to allow to interact with few parts of discourse chat.
```ruby
ChatSDK::Message.create_with_stream(raw: "test") do |helper|
5.times do |i|
is_streaming = helper.stream(raw: "more #{i}")
next if !is_streaming
sleep 2
end
end
```
This commit also introduces all the frontend parts:
- messages can now be marked as streaming
- when streaming their content will be updated when a new content is appended
- a special UI will be showing (a blinking indicator)
- a cancel button allows the user to stop the streaming, when cancelled `helper.stream(...)` will return `false`, and the plugin can decide exit early
The strict-dynamic CSP directive is supported in all our target browsers, and makes for a much simpler configuration. Instead of allowlisting paths, we use a per-request nonce to authorize `<script>` tags, and then those scripts are allowed to load additional scripts (or add additional inline scripts) without restriction.
This becomes especially useful when admins want to add external scripts like Google Tag Manager, or advertising scripts, which then go on to load a ton of other scripts.
All script tags introduced via themes will automatically have the nonce attribute applied, so it should be zero-effort for theme developers. Plugins *may* need some changes if they are inserting their own script tags.
This commit introduces a strict-dynamic-based CSP behind an experimental `content_security_policy_strict_dynamic` site setting.
Why this change?
We have been debugging flaky system tests and noticed in https://github.com/discourse/discourse/actions/runs/7911902047/job/21596791343?pr=25690
that ActiveRecord connection checkout timeouts are encountered because
the Capybara server thread is processing requests even after
`Capybara.reset_session!` and ActiveRecord's `teardown_fixtures` have already been call.
The theory here is that an inflight request can still hit the Capybara
server even after `Capybara.reset_session!` has been called and end up
eating up an ActiveRecord connection for too long and also messing with
the database outside of a transaction.
What does this change do?
This change adds a `BlockRequestsMiddleware` middleware in the test
environment which is enabled to reject all incoming requests at the end
of each system test and before `Capybara.reset_session!` is called. At
the start of each RSpec test, the middleware is disabled again.
These routes were previously rendered using Rails, and had a fairly fragile 2fa implementation in vanilla-js. This commit refactors the routes to be handled in the Ember app, removes the custom vanilla-js bundles, and leans on our centralized 2fa implementation. It also introduces a set of system specs for the behavior.
We add `Access-Control-Allow-Origin: *` to all asset requests which are requested via a configured CDN. This is particularly important now that we're using browser-native `import()` to load the highlightjs bundle. Unfortunately, user-configurable 'cors_origins' site setting was overriding the wldcard value on CDN assets and causing CORS errors.
This commit updates the logic to give the `*` value precedence, and adds a spec for the situation. It also invalidates the cache of hljs assets (because CDNs will have cached the bad Access-Control-Allow-Origin header).
The rack-cors middleware is also slightly tweaked so that it is always inserted. This makes things easier to test and more consistent.
This value is included when generating static asset URLs. Updating the value will allow site operators to invalidate all asset urls to recover from configuration issues which may have been cached by CDNs/browsers.
Why this change?
As the number of themes which the Discourse team supports officially
grows, we want to ensure that changes made to Discourse core do not
break the plugins. As such, we are adding a step to our Github actions
test job to run the QUnit tests for all official themes.
What does this change do?
This change adds a new job to our tests Github actions workflow to run the QUnit
tests for all official plugins. This is achieved with the following
changes:
1. Update `testem.js` to rely on the `THEME_TEST_PAGES` env variable to set the
`test_page` option when running theme QUnit tests with testem. The
`test_page` option [allows an array to be specified](https://github.com/testem/testem#multiple-test-pages) such that tests for
multiple pages can be run at the same time. We are relying on a ENV variable
because the `testem` CLI does not support passing a list of pages
to the `--test_page` option.
2. Support a `/testem-theme-qunit/:testem_id/theme-qunit` Rails route in the development environment. This
is done because testem prefixes the path with a unique ID to the configured `test_page` URL.
This is problematic for us because we proxy all testem requests to the
Rails server and testem's proxy configuration option does not allow us
to easily rewrite the URL to remove the prefix. Therefore, we configure a proxy in testem to prefix `theme-qunit` requests with
`/testem-theme-qunit` which can then be easily identified by the Rails server and routed accordingly.
3. Update `qunit:test` to support a `THEME_IDS` environment variable
which will allow it to run QUnit tests for multiple themes at the
same time.
4. Support `bin/rake themes:qunit[ids,"<theme_id>|<theme_id>"]` to run
the QUnit tests for multiple themes at the same time.
5. Adds a `themes:qunit_all_official` Rake task which runs the QUnit
tests for all the official themes.
- Remove the wildcard crawler. This was already excluding almost all file types, but the exclude list was missing '.gjs' which meant those files were unnecessarily being hoisted into the `public/` directory during precompile
- Automatically include all ember-cli-generated assets without needing them to be listed. The main motivation for this change is to allow us to start using async imports via Embroider/Webpack. The filenames for those new async bundles will not be known in advance.
- Skips sprockets fingerprinting on Embroider/Webpack chunk JS files. Their filenames already include a fingerprint, and having sprockets change the filenames will cause problems for the async import feature (where filenames are included deep inside js bundles)
This commit also updates our ember-cli build so that it skips building plugin tests in the production environment. This should provide a slight build speed improvement.
Why this change?
This ensures that malicious requests cannot end up causing the logs to
quickly fill up. The default chosen is sufficient for most legitimate
requests to the Discourse application.
When truncation happens, parsing of logs in supported format like
lograge may break down.
Why this change?
This is a follow up to e8f7b62752.
Tracking of GC stats didn't really belong in the `MethodProfiler` class
so we want to extract that concern into its own class.
As part of this PR, the `track_gc_stat_per_request` site setting has
also been renamed to `instrument_gc_stat_per_request`.
Previously workbox JS was vendored into our git repository, and would be loaded from the `public/javascripts` directory with a 1 day cache lifetime. The main aim of this commit is to add 'cachebuster' to the workbox URL so that the cache lifetime can be increased.
- Remove vendored copies of workbox.
- Use ember-cli/broccoli to collect workbox files from node_modules into assets/workbox-{digest}
- Add assets to sprockets manifest so that they're collected from the ember-cli output directory (and uploaded to s3 when configured)
Some of the sprockets-related changes in this commit are not ideal, but we hope to remove sprockets in the not-too-distant future.
Legal topics, such as the Terms of Service and Privacy Policy topics
do not make sense if the entity creating the community is not a company.
These topics will be created and updated only when the company name is
present and deleted when it is not.
- Update welcome topic copy
- Edit the welcome topic automatically when the title or description changes
- Remove “Create your Welcome Topic” banner/CTA
- Add "edit welcome topic" user tip
Group user event webhooks filtered by group fail silently
because the `group_ids` job arg wasn't being passed into the job.
This change add's `group_ids` to the `EmitWebHookEvent` jobs queued for
`user_added_to_group` and `user_removed_from_group` events.
Currently, only user badge grants emit webhook events. This change
extends the `user_badge` webhook to emit user badge revocation events.
A new `user_badge_revoked` event has been introduced instead of relying
on the existing `user_badge_removed` event. `user_badge_removed` emitted
just the `badge_id` and `user_id` which aren't helpful for generating a
meaningful webhook payload for revoked(deleted) user badges.
The new event emits the user badge object.
This feature will allow sites to define which emoji are not allowed. Emoji in this list should be excluded from the set we show in the core emoji picker used in the composer for posts when emoji are enabled. And they should not be allowed to be chosen to be added to messages or as reactions in chat.
This feature prevents denied emoji from appearing in the following scenarios:
- topic title and page title
- private messages (topic title and body)
- inserting emojis into a chat
- reacting to chat messages
- using the emoji picker (composer, user status etc)
- using search within emoji picker
It also takes into account the various ways that emojis can be accessed, such as:
- emoji autocomplete suggestions
- emoji favourites (auto populates when adding to emoji deny list for example)
- emoji inline translations
- emoji skintones (ie. for certain hand gestures)
- Remove no-longer-used `commits-widget` and `site_customizations` paths
- Add `/presence/`
- Move from regex to strings. The unanchored regexes were causing unexpected behaviour (e.g. if a topic had the word `assets` in its slug, it would be skipped). When passed strings, mini-profiler uses `String#starts_with?` for comparison
- Add `Discourse.base_path` to ensure proper functionality in subfolder environments
Mini-profiler causes routes to become uncacheable. When using mini_profiler in production, the lack of cache on these routes can have a noticeable impact on performance/rate-limiting.
Since ad6c028484
in the zeitwork repo which was introduced to discourse/discourse in PR #20253,
the `autoloads` attribute on the loader has been marked `internal`, which means
that it errors if we try to access it directly.
Instead we should access it via the "mangled" version so it is clear
we're accessing an internal property, which is `__autoloads`.
Without this, any time a ruby file is saved the
000-development_reload_warnings.rb initializer will error.
* DEV: Change default bootstrap min users for private sites
Private sites should have a lower min users to escape bootstrap mode.
* reset back to 50 if site is changed to public, added some tests
* fix formatting
* Remove comment
* Move constant declaration
* Update config/initializers/014-track-setting-changes.rb
Shaving a bit of repetition
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
* Remove commented out code
* stree
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
In dev/prod, these are absorbed by unicorn. Most commonly, they occur when a client interrupts a message-bus long-polling request.
Also reverts the EPIPE workaround introduced in 011c9b9973
This was previously disabled because of incompatibility with the ember-cli proxy. This commit fixes that incompatibility, and restores the development behaviour to match production.
There were three issues at play:
1. Our bootstrap-js addon handles the forwarding of most requests in the ember-cli proxy. This is not built to handle streaming responses. Solution: skip our custom request processing for `/message-bus/*` and use ember-cli's default `http-proxy`.
2. The request/response size-limiting middleware (`rawMiddleware`) would apply even to unhandled paths, causing request and response bodies to be buffered. Solution: skip it for any paths which are not handled by our custom addon.
3. Expressjs servers will buffer/compress responses. Solution: add `Cache-Control: no-transform` to message-bus responses. For now I've done this in development only, but it may be useful to add it to message-bus's default headers in future
This patch introduces a cookies rotator as indicated in the Rails
upgrade guide. This allows to migrate from the old SHA1 digest to the
new SHA256 digest.