Followup 2f2da72747
This commit moves topic view tracking from happening
every time a Topic is requested, which is susceptible
to inflating numbers of views from web crawlers, to
our request tracker middleware.
In this new location, topic views are only tracked when
the following headers are sent:
* HTTP_DISCOURSE_TRACK_VIEW - This is sent on every page navigation when
clicking around the ember app. We count these as browser page views
because we know it comes from the AJAX call in our app. The topic ID
is extracted from HTTP_DISCOURSE_TRACK_VIEW_TOPIC_ID
* HTTP_DISCOURSE_DEFERRED_TRACK_VIEW - Sent when MessageBus initializes
after first loading the page to count the initial page load view. The
topic ID is extracted from HTTP_DISCOURSE_DEFERRED_TRACK_VIEW.
This will bring topic views more in line with the change we
made to page views in the referenced commit and result in
more realistic topic view counts.
* UX: Add a rake task to monitor progress for long rebakes
When doing mass rebaking, this task will print progress, speed and
expected time to completion (ETC) on a loop. It is meant for rebakes
that take several hours or days.
It will calculate a 10m moving average over the past 6 hours, and print
ETC accordingly.
It also shows Sidekiq stats in case Sidekiq jobs for rebaking were
enqueued separately, instead of using the rebake_posts tasks in this
file (which are 100% synchronous and do not touch Sidekiq at all).
NOTE: only the currently unbaked count at task start time is considered;
this is useful in live communities with lots of traffic, where new posts
might otherwise change the goal posts continuously.
* Satisfy stree
I am changing many of these to notes or resolving them as is,
most of these I have not actively worked on in years so someone
else can work on them when we get to these areas again.
This commit continues work laid out by ffec8163b0 for the admin config page for the /about page. The last commit set up the user interface, and this one sets up all the wiring needed to make the input fields and save buttons actually work.
Internal topic: t/128544.
* rescue mmdb download failure
If mmdb database fails to download a bootstrap fails. This is a trivial fix for that problem. A more elegant solution might check whether the dataabase was downloaded and provide a helpful error message.
* linting
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
When using the full page search and filtering down to a specific topic, the sort order was overwritten to by by "post_number".
This was confusing because we allow different type of sort order in the full search page.
This fixes it by only sorting by post_number when there's no "global" sort order defined.
Since the "new topic map" uses the search endpoint behind the scene, this also fixes the "most likes" popup.
Context - https://meta.discourse.org/t/searching-order-seems-to-be-broken-when-searching-in-topic/312303
If the `enforce_second_factor_on_external_auth` setting
is disabled and a user logs in with an OAuth method,
they don't automatically get redirected to /preferences/second-factor
on login. However, they can get there manually, and once there
they cannot leave.
This commit fixes the issue and allows them to leave
and also does some refactors to indicate to the client
what login method is used as a followup to
0e1102b332
Followup 96a0781bc1
When sending emails where secure uploads is enabled
and secure_uploads_allow_embed_images_in_emails is
true, we attach the images to the email, and we
do some munging with the final email so the structure
of the MIME parts looks like this:
```
multipart/mixed
multipart/alternative
text/plain
text/html
image/jpeg
image/jpeg
image/png
```
However, we were not specifying the `boundary` of the
`multipart/mixed` main content-type of the email, so
sometimes the email would come through appearing to
have an empty body with the entire thing attached as
one attachment, and some mail parsers considered the
entire email as the "epilogue" and/or "preamble".
This commit fixes the issue by specifying the boundary
in the content-type header per https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html
Adds a checkbox to filter untranslated text strings in the admin UI, behind a hidden and default `false` site setting `admin_allow_filter_untranslated_text`.
* DEV: Upgrade Rails to 7.1
* FIX: Remove references to `Rails.logger.chained`
`Rails.logger.chained` was provided by Logster before Rails 7.1
introduced their broadcast logger. Now all the loggers are added to
`Rails.logger.broadcasts`.
Some code in our initializers was still using `chained` instead of
`broadcasts`.
* DEV: Make parameters optional to all FakeLogger methods
* FIX: Set `override_level` on Logster loggers (#27519)
A followup to f595d599dd
* FIX: Don’t duplicate Rack response
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
This ensures that the theme id is resolved as early as possible in the
request cycle. This is necessary for the custom homepage to skip
preloading the wrong data.
* DEV: Upgrade Rails to 7.1
* FIX: Remove references to `Rails.logger.chained`
`Rails.logger.chained` was provided by Logster before Rails 7.1
introduced their broadcast logger. Now all the loggers are added to
`Rails.logger.broadcasts`.
Some code in our initializers was still using `chained` instead of
`broadcasts`.
* DEV: Make parameters optional to all FakeLogger methods
* FIX: Set `override_level` on Logster loggers (#27519)
A followup to f595d599dd
* FIX: Don’t duplicate Rack response
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
We recently fixed a problem where secure upload images weren't re-attached when sending the activity summary e-mail.
This fix contained a bug that would lead to n copies of the e-mail body being included, n being the number of duplicates. This is because #fix_parts_after_attachments! was called once per attachment, and adding more parts to the multipart e-mail.
This PR fixes that by:
Adding a failing test case for the above.
Moving the looping over multiple posts into #fix_parts_after_attachments! itself.
When a post is cooked the links are extracted and `TopicLink` instances
are created for each of them. These links are used in various places,
including the topic view, user summary page, etc.
In previous commit 48e5d1a, hotlinked images from Oneboxes have been
ignored from the texts, but hotlinked images turned into Lightboxes
were still extracted.
Many site settings can be distructive or have huge side-effects
for a site that the admin may not be aware of when changing it.
This commit introduces a `requires_confirmation` attribute that
can be added to any site setting. When it is true, a confirmation
dialog will open if that setting is changed in the admin UI,
optionally with a custom message that is defined in client.en.yml.
If the admin does not confirm, we reset the setting to its previous
clean value and do not save the new value.
A new admin setting called `enforce_second_factor_on_external_auth`. It allows users to authenticate using external providers even when 2FA is forced with `enforce_second_factor` site setting.
* Revert "FIX: Set `override_level` on Logster loggers (#27519)"
This reverts commit c1b0488c54.
* Revert "DEV: Make parameters optional to all FakeLogger methods"
This reverts commit 3318dad7b4.
* Revert "FIX: Remove references to `Rails.logger.chained`"
This reverts commit f595d599dd.
* Revert "DEV: Upgrade Rails to 7.1"
This reverts commit 081b00391e.
`Rails.logger.chained` was provided by Logster before Rails 7.1
introduced their broadcast logger. Now all the loggers are added to
`Rails.logger.broadcasts`.
Some code in our initializers was still using `chained` instead of
`broadcasts`.
Currently when a cache entry is corrupt, we log the event without doing
anything else. It means the cache is still corrupt, and the proper value
isn’t computed again.
Normally, it’s very rare the cache becomes corrupt, but it can happen
when upgrading Rails for example and the cache format changes. This is
normally handled automatically by Rails but since we’re using a custom
cache class, we have to do it ourselves.
This patch takes the same approach the Rails team did, when a cache
entry is corrupt, we treat it as a miss, recomputing the proper value
and caching it in the new format.
Wasn't quite handling the cases where a closing bracket `]` was used in the value of one of the attributes.
```markdown
[chat quote=user channel="[broken]"]
```
Would not be correctly parsed because we would _greedily_ use the first `]` as the end of the tag even though it might be a valid character when inside proper quotes.
c39a4de139/app/assets/javascripts/discourse-markdown-it/src/features/bbcode-block.js (L62)
Re-wrote the `parseBBCodeTag` to properly handle the following cases
- A closing tag (aka `[/name]`) which are easy since they don't have any attributes
- An old `[quote=...]` format we used that doesn't uses quotes but still has various attributes of the form `key:value`
- All three valid BBCode opening tag formats we support
- `[name]` without any attributes
- `[name=foo]` with a default value
- `[name foo=bar]` with some attributes
Ended up having to fix/rewrite the few bbcode rules that were using the `parseBBCodeTag` function, namely `d-wrap` and `discourse-local-dates`.
While working on this, I think I also found a way to get rid the of shims we had in place so that plugins could use the `parseBBCodeTag` function.
Reference - https://meta.discourse.org/t/having-a-right-bracket-in-a-channel-name-breaks-all-quotes-from-that-channel/308439
This is a duplicate of the `host` field which means we are bloating the
logs unnecessarily.
Just remove without depreciation for now but we are open to properly
depreciating it if others depend on this field.
This commit adds ability to fetch a subset of site settings from the `/admin/site_settings` endpoint so that it can be used in all places where the client app needs access to a subset of the site settings.
Additionally, this commit also introduces a new service class called `UpdateSiteSetting` that encapsulates all the logic that surrounds updating a site setting so that it can be used to update site setting(s) anywhere in the backend. This service comes in handy with, for example, the controller for the flags admin config area which may need to update some site settings related to flags.
Internal topic: t/130713.
This commits introduces the `sidekiq_report_long_running_jobs_minutes`
global setting which allows a site administrator to log a warning in the
Rails log when a Sidekiq job has been running for too long.
The warning is logged with the backtrace of the thread that is
processing the Sidekiq job to make it easier to figure out what a
sidekiq job is stuck on.
When we turn on settings automatically for customers,
we sometimes use `.set_and_log` which will make a staff
action log for the site setting change. This is fine, but
there is no context for customers.
This change allows setting a message with `.set_and_log`, which
will be stored in the `details` column of the staff action log
created, which will show up on `/admin/logs/staff_action_logs`
---------
Co-authored-by: Kelv <kelv@discourse.org>
In #26642 we introduced a change that re-attaches securely uploaded images in the digest e-mail. However, this change assumed that the type argument to the Email::Sender constructor would be a symbol, but when it is coming from the UserEmail job it is a string. This PR fixes that.
This introduces the syntax of
`category:a,b,c` which will search across multiple categories.
Previously there was no way to allow search across a wide selection of
categories.
We are seeing the following error in our logs when Sidekiq is sent a
`USR1` signal in production when logrotate happens:
```
log writing failed. stream closed in another thread
Error encountered while starting Sidekiq: can't be called from trap context\n/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/unicorn-6.1.0/lib/unicorn/util.rb:71:in `reopen'
```
I'm not quite sure where the error is triggered from so I'm improving
the way we log errors.
We were using `autoclose` as the topic status update
when silently closing topics using the bulk
actions (introduced in 0464ddcd9b).
However, this resulted in a message like this showing in
the topic as a small moderator post:
> This topic was automatically closed after X days.
This is not accurate, the topic was bulk closed by someone.
Instead, we can use `closed` as the status, and a more accurate
> Closed on DATE
message is used. `TopicStatusUpdater` needed an additional
option to keep the same "fake read" behaviour as autoclose
so we can keep the same functionality for silently closing
topics in bulk actions.
decorator-transforms (https://github.com/ef4/decorator-transforms) is a modern replacement for babel's plugin-proposal-decorators. It provides a decorator implementation using modern browser features, without needing to enable babel's full suite of class feature transformations. This improves the developer experience and performance.
In local testing with Google's 'tachometer' tool, this reduces Discourse's 'init-to-render' time by around 3-4% (230ms -> 222ms).
It reduces our initial gzip'd JS payloads by 3.2% (2.43MB -> 2.35MB), or 7.5% (14.5MB -> 13.4MB) uncompressed.
This was previously reverted in 97847f6. This version includes a babel transformation which works around the bug in Safari <= 15.
For Cloudflare compatibility issues, check https://meta.discourse.org/t/311390
Not all HTML elements are converted into Markdown. Some are kept as HTML.
Without this fix XML/HTML entities that are formatted as text instead of code are swallowed by Discourse.
This also fixes quotes in the `title` attribute of the `<abbr>` tag.
Previously `HtmlToMarkdown` always converted HTML tables into Markdown tables. That lead to some badly formatted Markdown tables, e.g. when the table contained `rowspan` or `colspan`. This solves the issue by using very basic HTML tables in those cases.
Inline the helper functions, avoid creating and then immediately destructuring arrays, use complete strings instead of string interpolation, Map instead of a pojo.
This commit introduces a hidden `s3_inventory_bucket` site setting which
replaces the `enable_s3_inventory` and `s3_configure_inventory_policy`
site setting.
The reason `enable_s3_inventory` and `s3_configure_inventory_policy`
site settings are removed is because this feature has technically been
broken since it was introduced. When the `enable_s3_inventory` feature
is turned on, the app will because configure a daily inventory policy for the
`s3_upload_bucket` bucket and store the inventories under a prefix in
the bucket. The problem here is that once the inventories are created,
there is nothing cleaning up all these inventories so whoever that has
enabled this feature would have been paying the cost of storing a whole
bunch of inventory files which are never used. Given that we have not
received any complains about inventory files inflating S3 storage costs,
we think that it is very likely that this feature is no longer being
used and we are looking to drop support for this feature in the not too
distance future.
For now, we will still support a hidden `s3_inventory_bucket` site
setting which site administrators can configure via the
`DISCOURSE_S3_INVENTORY_BUCKET` env.
Continued work on moderate flags UI.
In this PR admins are allowed to change the order of flags. The notify user flag is always on top but all other flags can be moved.
Previously "themes frontend" CI job would:
1. pull compatible versions of themes that happened to be in the base image
2. clone all official themes (overriding the compatible versions from 1.)
3. run tests
This commit adds the ability for site administrators to mark users'
passwords as expired. Note that this commit does not add any client side
interface to mark a user's password as expired.
The following changes are introduced in this commit:
1. Adds a `user_passwords` table and `UserPassword` model. While the
`user_passwords` table is currently used to only store expired
passwords, it will be used in the future to store a user's current
password as well.
2. Adds a `UserPasswordExpirer.expire_user_password` method which can
be used from the Rails console to mark a user's password as expired.
3. Updates `SessionsController#create` to check that the user's current
password has not been marked as expired after confirming the
password. If the password is determined to be expired based on the
existence of a `UserPassword` record with the `password_expired_at`
column set, we will not log the user in and will display a password
expired notice. A forgot password email is automatically send out to
the user as well.
This commit introduces the following changes:
1. Introduce the `SignalTrapLogger` singleton which starts a single
thread that polls a queue to log messages with the specified logger.
This thread is necessary becasue most loggers cannot be used inside
the `Signal.trap` context as they rely on mutexes which are not
allowed within the context.
2. Moves the monkey patch in `freedom_patches/unicorn_http_server_patch.rb` to
`config/unicorn.config.rb` which is already monkey patching
`Unicorn::HttpServer`.
3. `Unicorn::HttpServer` will now automatically send a `USR2` signal to
a unicorn worker 2 seconds before the worker is timed out by the
Unicorn master.
4. When a Unicorn worker receives a `USR2` signal, it will now log only
the main thread's backtraces to `Rails.logger`. Previously, it was
`put`ing the backtraces to `STDOUT` which most people wouldn't read.
Logging it via `Rails.logger` will make the backtraces easily
accessible via `/logs`.
This commit introduces the following changes which allows a site
administrator to mark `Upload` records with the `s3_file_missing`
verification status which will result in the `Upload` record being ignored when
`Discourse.store.list_missing_uploads` is ran on a site where S3 uploads
are enabled and `SiteSetting.enable_s3_inventory` is set to `true`.
1. Introduce `s3_file_missing` to `Upload.verification_statuses`
2. Introduce `Upload.mark_invalid_s3_uploads_as_missing` which updates
`Upload#verification_status` of all `Upload` records from `invalid_etag` to `s3_file_missing`.
3. Introduce `rake uploads:mark_invalid_s3_uploads_as_missing` Rake task
which allows a site administrator to change `Upload` records with
`invalid_etag` verification status to the `s3_file_missing`
verificaton_status.
4. Update `S3Inventory` to ignore `Upload` records with the
`s3_file_missing` verification status.
When uploading a video, the composer will now show a thumbnail image in
the composer preview instead of just the video placeholder image.
If `enable_diffhtml_preview` is enabled the video will be rendered in
the composer preview and is playable.
This commit updates all Sidekiq signal handling event logs to go through
Unicorn's logger instead of logging to STDOUT. Going through a proper logger
means the log messages are logged in the format which the logger has configured.
This means we get proper timestamp for the log messages.
Introduced back in 2022 in
e3d495850d,
our new more specific message-id format for inbound and
outbound emails has now been in use for a very long time,
we can remove the support for the old formats:
`topic/:topic_id/:post_id.:random@:host`
`topic/:topic_id@:host`
`topic/:topic_id.:random@:host`
In this PR service objects were moved to Core https://github.com/discourse/discourse/pull/26506
However, ServiceRunner should be moved as well. Mostly for CI to run effortlessly without loading plugins.
This commit moves the logic for crawler rate limits out of the application controller and into the request tracker middleware. The reason for this move is to apply rate limits to all crawler requests instead of just the requests that make it to the application controller. Some requests are served early from the middleware stack without reaching the Rails app for performance reasons (e.g. `AnonymousCache`) which results in crawlers getting 200 responses even though they've reached their limits and should be getting 429 responses.
Internal topic: t/128810.
This commit adds a `DISCOURSE_DUMP_BACKTRACES_ON_UNICORN_WORKER_TIMEOUT`
environment that will allow us to dump all backtraces for all threads of
a Unicorn worker 2 seconds before it times out. In development,
backtraces are dumped to `STDOUT` and in production we will dump it to
`unicorn.stdout.log`.
We want to dump all the backtraces to make it easier to identify the
cause of a Unicorn worker timing out.
This commit updates `S3Inventory#files` to ignore S3 inventory files
which have a `last_modified` timestamp which are not at least 2 days
older than `BackupMetadata.last_restore_date` timestamp.
This check was previously only in `Jobs::EnsureS3UploadsExistence` but
`S3Inventory` can also be used via Rake tasks so this protection needs
to be in `S3Inventory` and not in the scheduled job.
* DEV: allow reply_by_email, visit_link_to_respond strings to be modified by plugins
* DEV: separate visit_link_to_respond and reply_by_email modifiers out
This PR aims to add bulk actions to the user's bookmarks.
After this feature, all users should be able to select multiple bookmarks and perform the actions of "deleting" or "clear reminders"
This fixes the `PrettyText.make_all_links_absolute` to better handle subfolder.
In subfolder, when given the cooked version of a post, links to mentions includes the `Discourse.base_path` prefix. Adding the `Discourse.base_url` was doubling the `Discourse.base_path`.
The issue was hidden behind the specs which was stubbing `Discourse.base_url` instead of relying on `Discourse.base_path`.
This fixes both the "algorithm" used in `PrettyText.make_all_links_absolute` to better handle this case and correct the specs to properly handle subfolder cases.
There are lots of changes in the specs due to a refactoring to use squiggly heredoc strings for easier reading and less escaping.
This commit adds a step in our tests workflow on Github actions to update the themes to
use the compatible version when not running aginast the `main` branch.
This is to ensure that we are not running
the tests for themes against an incompatible version of Discourse.
AuthProvider#enabled_setting=, used primarily by plugins, has been deprecated since version 2.9, in favour of Authenticator#enabled?. This PR confirms we are seeing no more usage and removes the method.
Whenever one creates, updates, or deletes a post, we should keep the `topic.word_count` counter in sync.
Context - https://meta.discourse.org/t/-/308062
This commits updates `FinalDestination#get` to not forward
`Authorization` header on redirects since most HTTP clients I tested like
curl and wget does not it.
This also fixes a recent problem in `DiscourseIpInfo.mmdb_download`
where we will fail to download the databases when both `GlobalSetting.maxmind_account_id` and
`GlobalSetting.maxmind_license_key` has been set. The failure is due to
the bug above where the redirected URL given by MaxMind does not accept
an `Authorization` header.
This commit optimises the database query generated by
`TopicsFilter#filter_categories` when the `-category:*` filter is used.
Previously, the method will add the `topics.category_id NOT IN
(<category ids to be excluded>)` filter to the resulting query. However,
we noticed that the performance of the query degrades as the number of
rows in the `topics` table grow and when the number of category ids to be
excluded is large.
Sample of query we ran on a large database in production to demonstrate
the improvement:
Before:
```
SELECT topics.id FROM topics WHERE topics.category_id NOT IN (83, 136, 149, 143, 153, 165, 161, 123, 155, 163, 144, 134, 69, 135, 158, 141, 151, 160, 131, 133, 89, 104, 150, 147, 132, 145, 108, 146, 122, 100, 128, 154, 95, 102, 140, 139, 88, 91, 87) ORDER BY topics.id DESC LIMIT 5;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=27795.34..27795.34 rows=1 width=4) (actual time=29.317..30.165 rows=5 loops=1)
-> Sort (cost=27795.34..27795.34 rows=1 width=4) (actual time=29.316..30.163 rows=5 loops=1)
Sort Key: id DESC
Sort Method: top-N heapsort Memory: 25kB
-> Gather (cost=1000.10..27795.33 rows=1 width=4) (actual time=0.187..26.132 rows=73478 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on topics (cost=0.10..26795.23 rows=1 width=4) (actual time=0.013..22.252 rows=24493 loops=3)
Filter: (category_id <> ALL ('{83,136,149,143,153,165,161,123,155,163,144,134,69,135,158,141,151,160,131,133,89,104,150,147,132,145,108,146,122,100,128,154,95,102,140,139,88,91,87}'::integer[]))
Rows Removed by Filter: 77276
Planning Time: 0.140 ms
Execution Time: 30.181 ms
```
After:
```
SELECT topics.id FROM topics WHERE NOT EXISTS (
SELECT 1
FROM unnest(array[83, 136, 149, 143, 153, 165, 161, 123, 155, 163, 144, 134, 69, 135, 158, 141, 151, 160, 131, 133, 89, 104, 150, 147, 132, 145, 108, 146, 122, 100, 128, 154, 95, 102, 140, 139, 88, 91, 87]) AS excluded_categories(category_id)
WHERE topics.category_id IS NULL OR excluded_categories.category_id = topics.category_id
) ORDER BY topics.id DESC LIMIT 5 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..13.52 rows=5 width=4) (actual time=0.028..0.110 rows=5 loops=1)
-> Nested Loop Anti Join (cost=0.42..179929.62 rows=68715 width=4) (actual time=0.027..0.109 rows=5 loops=1)
Join Filter: ((topics.category_id IS NULL) OR (excluded_categories.category_id = topics.category_id))
Rows Removed by Join Filter: 239
-> Index Scan Backward using forum_threads_pkey on topics (cost=0.42..108925.71 rows=305301 width=8) (actual time=0.012..0.062 rows=44 loops=1)
-> Function Scan on unnest excluded_categories (cost=0.00..0.39 rows=39 width=4) (actual time=0.000..0.001 rows=6 loops=44)
Planning Time: 0.126 ms
Execution Time: 0.124 ms
(8 rows)
```
This commit changes request method for "categories/search" from GET to
POST to make sure that long filters can be passed to the server. For
example, category selectors with many categories are setting the full
list of selected category IDs to ensure these are filtered out from the
list of choices. This can result in a long URL that exceeds the maximum
length.
This reverts commit 0f4520867b.
This has led to two problems:
1. An incompatibility with Cloudflare's "auto minify" feature. They've deprecated this feature because of incompatibility with modern JS syntax. But unfortunately it will remain enabled on existing properties until 2024-08-05.
2. Discourse fails to boot in Safari 15. This is strange, because Safari does support all the required features in our production JS bundles. Even more strangely, things start working as soon as you open the developer tools. That suggests the cause could be a Safari bug rather than a simple incompatibility.
Reverting while we work out a path forward on both those issues.
This commit switches `DiscourseIpInfo.mmdb_download` to use the
permalinks supplied by MaxMind to download the MaxMind databases as
specified in
https://dev.maxmind.com/geoip/updating-databases#directly-downloading-databases
which states:
```
To directly download databases, follow these steps:
1. In the "Download Links" column, click "Get Permalink(s)" for the desired database.
2. Copy the permalink(s) provided in the modal window.
3. Provide your account ID and your license key using Basic Authentication to authenticate.
```
Previously we are downloading from `https://download.maxmind.com/app/geoip_download` but this is not
documented anyway on MaxMind's docs so this URL can in theory break
in the future without warning. Therefore, we are taking a proactive
approach to download the databases from MaxMind the recommended way
instead of relying on a hidden URL. This old way of downloading the
databases with only a license key will be deprecated in 3.3 and be
removed in 3.4.
In 95a82d608d, we lowered the default for
`Onebox.options.max_download_kb` from 10mb to 2mb for security hardening
purposes. However, this resulted in multiple bug reports where seemingly
nomral URLs stopped being oneboxed. It turns out that lowering
`Onebox.options.max_download_kb` resulted in `Onebox::Helpers::DownloadTooLarge` being raised
more often for more URLs in `Onebox::Helpers.fetch_response` which
`Onebox::Helpers.fetch_html_doc` relies on. When
`Onebox::Helpers::DownloadTooLarge` is raised in
`Onebox::Helpers.fetch_response`, we throw away whatever response body
which we have already downloaded at that point. This is not ideal
because Nokogiri can parse incomplete HTML documents and there is a
really high chance that the incomplete HTML document still contains the
information which we need for oneboxing.
Therefore, this commit updates `Onebox::Helpers.fetch_html_doc` to not
throw away the response body when the size of the response body exceeds
`Onebox.options.max_download_size`. Instead, we just take whatever
response which we have and get Nokogiri to parse it.
This can happen for various reasons including rate limiting and middleware bugs. This should resolve the warning we're seeing in the logs
```
RequestTracker.get_data failed : NoMethodError : undefined method `[]' for nil:NilClass
```
decorator-transforms (https://github.com/ef4/decorator-transforms) is a modern replacement for babel's plugin-proposal-decorators. It provides a decorator implementation using modern browser features, without needing to enable babel's full suite of class feature transformations. This improves the developer experience and performance.
In local testing with Google's 'tachometer' tool, this reduces Discourse's 'init-to-render' time by around 3-4% (230ms -> 222ms).
It reduces our initial gzip'd JS payloads by 3.2% (2.43MB -> 2.35MB), or 7.5% (14.5MB -> 13.4MB) uncompressed.
This keeps coming up in user testing as something
we want to get rid of. The `navigation_menu` setting
has been set to sidebar by default for some time now,
and we are rolling out admin sidebar widely. It just
doesn't make sense to let people turn this off in
the first step of the wizard -- we _want_ people to
use the sidebar.
Whenever a post already failed "lightweight" validations, we skip all the expensive validations (that cooks the post or run SQL queries) so that we reply as soon as possible.
Also skip validating polls when there's no "[/poll]" in the raw.
Internal ref - t/115890
When running `rake uploads:regenerate_missing_optimized`,
a `Discourse::InvalidAccess` will be raised if an SVG
file is being processed as `OptimizedImage.prepend_decoder!`
doesn't support the svg extension. This commit simply copies
the original SVG file as the thumbnail, just like currently
`OptimizedImage.create_for` does.
Ignored columns can only be dropped when its associated post-deploy
migration has been promoted to a regular migration. This is so because
Discourse doesn't rely on a schema file system to setup a brand new
database and thus the column information will be loaded by the
application first before the post-deploy migration runs.
- Use 'cheap-source-map' webpack config on low-memory machines
This results in worse quality sourcemaps in browser dev tools, but it significantly reduces memory use in our webpack build. In approximate local testing it drops from 1100mb to 590mb. This should make the rebuild process on low-memory machines much faster and less likely to trigger OOM errors.
In development, and on higher-memory machines, the higher-quality 'source-map' option is maintained.
- Disable Webpack's built-in `minimize` feature. Embroider already applies Terser after the webpack build is complete. There is no need to double-minimize the output.
- Update ember-cli-progress-ci to print to stderr instead of stdout. For some reason, pups (used by discourse_docker) buffers the stdout of commands and only prints when they are finished. stderr does not have this same limitation, so switching will mean sysadmins can see the progress of the ember build in real-time.
Given the number of variables it's hard to promise exact numbers. But, in my tests on a DO droplet with 1GB RAM (+2GB swap), this reduced the `ember build` portion of a `./launcher rebuild app` from ~50 minutes to ~15 minutes.
This commit adds a `isValidUrl` helper function to the context in
which theme migrations are ran in. This helper function is to make it
easier for theme developers to check if a string is a valid URL or path
when writing theme migrations. This can be helpful in cases when
migrating a string based setting to `type: objects` which contain `type:
string` properties with URL validations enabled.
This commit also introduces the `UrlHelper.is_valid_url?` method
which actually checks that the URL string is of the valid format instead of
only checking if the URL string is parseable which is what `UrlHelper.relaxed_parse` does
and is not sufficient for our needs.
This is a follow up of 5fcb7c262d
It was missing the case where secure uploads is enabled, which creates a copy of the upload no matter what.
So this checks for the original_sha1 of the uploads as well when checking for duplicates.
This commit introduces a few changes as a result of
customer issues with finding why a topic was relisted.
In one case, if a user edited the OP of a topic that was
unlisted and hidden because of too many flags, the topic
would get relisted by directly changing topic.visible,
instead of going via TopicStatusUpdater.
To improve tracking we:
* Introduce a visibility_reason_id to topic which functions
in a similar way to hidden_reason_id on post, this column is
set from the various places we change topic visibility
* Fix Post#unhide! which was directly modifying topic.visible,
instead we use TopicStatusUpdater which sets visibility_reason_id
and also makes a small action post
* Show the reason topic visibility changed when hovering the
unlisted icon in topic status on topic titles
This commit fixes a bug in theme settings migrations where values of `objects` typed theme settings aren't passed to migrations even when there are overriding values for those settings. What causes this bug is that, when creating the hash that contains all the overridden settings and will be passed to migrations, the values of `objects` typed settings are incorrectly retrieved from the `value` column (which is always nil for `objects` type) instead of `json_value`. `objects` settings are different from all other types in that they store their values in the `json_value` column and they need to be special-cased when retrieving their values.
Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.
This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.
Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
This commit fixes a bug in the `themes:update` rake task which resulted
in the ActiveRecord transaction not being rolled back when an error was
encountered. The transaction was first introduced in
7f0682f4f2 which changed a `begin..rescue`
block to `transaction do..rescue`. The problem with that change
prevented the transaction from ever rolling back as the code block
looks something like this:
```
transaction do
begin
update_theme
rescue => e
# surpress error
end
end
```
From the transaction's point of view now, it will never rollback even if
an error was encountered when updating the remote theme because it will
never see the error.
Instead we should have done something like this if we wanted to surpress
the errors encountered while still ensuring that the transaction is
rolled back.
```
begin
transaction do
update_theme
end
rescue => e
# surpress error
end
```
This is essential for us to determine which site is encountering an
error while updating remote themes. We are also including the theme's id
because themes can have the same name.
This service-worker caching functionality was disabled by default in 1c58395bca, and the setting to re-enable was marked as experimental. Now we are dropping all the related logic.
- Run the CSP-nonce-related middlewares on the generated response
- Fix the readonly mode checking to avoid empty strings being passed (the `check_readonly_mode` before_action will not execute in the case of these re-dispatched exceptions)
- Move the BlockRequestsMiddleware cookie-setting to the middleware, so that it is included even for unusual HTML responses like these exceptions
LinkedIn has grandfathered its old OAuth2 provider. This can only be used by existing apps. New apps have to use the new OIDC provider.
This PR adds a linkedin_oidc provider to core. This will exist alongside the discourse-linkedin-auth plugin, which will be kept for those still using the deprecated provider.
For e-mails, secure uploads redacts all secure images, and later uses the access control post to re-attached allowed ones. We pass the ID of this post through the X-Discourse-Post-Id header. As the name suggests, this assumes there's only ever one access control post. This is not true for activity summary e-mails, as they summarize across posts.
This adds a new header, X-Discourse-Post-Ids, which is used the same way as the old header, but also works for the case where an e-mail is associated with multiple posts.
Automatically add `moderators` and `admins` auto groups to specific site settings.
In the new group-based permissions systems, we just want to check the user’s groups since it more accurately reflects reality
Affected settings:
- tag_topic_allowed_groups
- create_tag_allowed_groups
- send_email_messages_allowed_groups
- personal_message_enabled_groups
- here_mention_allowed_groups
- approve_unless_allowed_groups
- approve_new_topics_unless_allowed_groups
- skip_review_media_groups
- email_in_allowed_groups
- create_topic_allowed_groups
- edit_wiki_post_allowed_groups
- edit_post_allowed_groups
- self_wiki_allowed_groups
- flag_post_allowed_groups
- post_links_allowed_groups
- embedded_media_post_allowed_groups
- profile_background_allowed_groups
- user_card_background_allowed_groups
- invite_allowed_groups
- ignore_allowed_groups
- user_api_key_allowed_groups
This is to enable :array type attributes for Contract
attributes in services, this is a followup to the move
of services from chat to core here:
cab178a405
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
This SQL tries to insert as much data as possible into the `user_stats` table by either calculating or by approximating stats based on existing. It also fixes an error in the calculation of `reply_count`which mistakenly contained all posts, not just replies.
This change also disables some steps in the `import:ensure_consistency` rake task by setting the `SKIP_USER_STATS` env variable. Otherwise, the rake task will overwrite the calculated data in the `user_stats` table with inaccurate data. I'm not changing or removing the logic from the rake task yet because other bulk import scripts seem to depend on it.
This modifier allows plugins to alter the outcome of
`should_secure_uploads?` on a Post record, for cases when
plugins need post-attached uploads to always be secure (or
not secure) in specific scenarios.
This commit changes enum typed theme objects property to be optional.
Previously, an enum typed property is always required but we have found
that this might not be ideal so we want to change it.
This method name is a bit confusing; with_secure_uploads implies
it may return a block or something with the uploads of the post,
and has_secure_uploads implies that it's checking whether the post
is linked to any secure uploads.
should_secure_uploads? communicates the true intent of this method --
which is to say whether uploads attached to this post should be
secure or not.
This commit changes `DiscourseIpInfo.mmdb_download` to use `File.join`
instead of `URI.join` when `GlobalSetting.maxmind_mirror_url` has been
configured. This is necessary because `URI.join` does not work the way I
expect it to work when I implemented it previously.
`URI.join("http://www.example.com/mirror", "test.tar.gz") results in
`http://www.example.com/test.tar.gz` instead of our expected
`http://www.exmaple.com/mirror/test.tar.gz`. For our simple use case
here, `File.join` is sufficient.
We will be collecting the logo URL and the site's default locale values along with existing basic details to display the site on the Discourse Discover listing page. It will be included only if the site is opted-in by enabling the "`include_in_discourse_discover`" site setting.
Also, we no longer going to use `about.json` and `site/statistics.json` endpoints retrieve these data. We will be using only the `site/basic-info.json` endpoint.
Doesn't actually seem to be used by any of our formatters, but let's send the proper data anyway for future-proofing. Followup to ff6cb1bc05 and 8098876bfa
When a user is manually deactivated, they should not be deleted by our background job that purges inactive users.
In addition, site settings keywords should accept an array of keywords.
In this PR, all references in the UI to the word "`upgrade`" are changed to "`update`". This is to differentiate the update process in self-hosted sites from the plan "upgrade" process in hosted sites.
Follow-up to the PR: https://github.com/discourse/docker_manager/pull/208
Why this change?
This allows downloading the MaxMind databases from a mirror in cases
where downloading directly from MaxMind's API endpoint is problematic
due to API limits.
Why this change?
We currently support `GlobalSetting.refresh_maxmind_db_during_precompile_days` which
should cache the maxmind databases on disk for the configured number of
days before it downloads the databases from maxmind again via the API.
This was previously added to help us avoid hitting the API rate limit from maxmind.
However, there was a bug in the `copy_maxmind` when we copied the latest
downloaded database to the cache directory. In particular, `FileUtils.cp` was called with
`preserve: true` which would preserve the modified time of the file
being copied. This is problematic because download the database from
maxmind on 2 April 2024 can give us a file with an mtime of 29 March
2024. If `GlobalSetting.refresh_maxmind_db_during_precompile_days` is
set to `2` for example, the cache will never be used since we will
think that the file has been downloaded for more than 2 days in our
checks.
What is the fix here?
While we want to preserve the owner and group of the file, we do not
want to preserve the modified time and hence we will call
`FileUtils.touch` when copying the file.
Why this change?
For a schema like this:
```
schema = {
name: "section",
properties: {
category_property: {
type: "categories",
required: true,
},
},
}
```
When the value of the property is set to an empty array, we are not
raising an error which we should because the property is marked as
required.
Why this change?
This is a follow-up to 86b2e3a.
Basically, we want to allow people to select more than 1 group as well.
What does this change do?
1. Change `type: group` to `type: groups` and support `min` and `max`
validations for `type: groups`.
2. Fix the `<SchemaThemeSetting::Types::Groups>` component to support the
`min` and `max` validations and switch it to use the `<GroupChooser>` component
instead of the `<ComboBoxComponent>` component which previously only supported
selecting a single group.
Previously the problem check registry simply looked at the subclasses of ProblemCheck. This was causing some confusion in environments where eager loading is not enabled, as the registry would appear empty as a result of the classes never being referenced (and thus never loaded.)
This PR changes the approach to a more explicit one. I followed other implementations (bookmarkable and hashtag autocomplete.) As a bonus, this now has a neat plugin entry point as well.
When a topic fails to be created due to the user not having permission to add tags to the topic, the error message that you get is `There was an error tagging the topic` which is very generic and doesn't explain where/what the problem is. This commit adds a clearer error message for this scenario.
The `TopicCreator` class has a `skip_validations` option that can force-create a topic without performing permission checks or validation rules. However, at the moment it doesn't skip validations that are related to tags, so topics that are created by the system or by some scrip can still fail if they use tags. This commit makes the `TopicCreator` class skip all tags-related checks if the `skip_validations` is specified.
Internal topic: t/124280.
Followup 0bbca318f2,
rather than making developers provide the plugin path
name (which may not always be the same depending on
dir names and git cloning etc) we can infer the plugin
dir from the caller in plugin_file_from_fixtures
Why this change?
This is a follow-up to 86b2e3aa3e.
Basically, we want to allow people to select more than 1 category as well.
What does this change do?
1. Change `type: category` to `type: categories` and support `min` and `max`
validations for `type: categories`.
2. Fix the `<SchemaThemeSetting::Types::Categories>` component to support the
`min` and `max` validations and switch it to use the `<CategorySelector>` component
instead of the `<CategoryChooser>` component which only supports selecting one category.
At the moment, all topic `?page=` views are served with exactly identical page titles. If you search for something which is mentioned many times in the same Discourse topic, this makes for some very hard-to-understand search results! All the result titles are exactly the same, with no indication of why there are multiple results showing.
This commit adds a `- Page #` suffix to the titles in this situation. This lines up with our existing strategy for topic-list pagination.
When crawlers visit a post-specific URL like `/t/-/{topic-id}/{post-number}`, we use the canonical to direct them to the appropriate crawler-optimised paginated view (e.g. `?page=3`).
However, analysis of google results shows that the post-specific URLs are still being included in the index. Google doesn't tell us exactly why this is happening. However, as a general rule, 'A large portion of the duplicate page's content should be present on the canonical version'.
In our previous implementation, this wasn't 100% true all the time. That's because a request for a post-specific URL would include posts 'surrounding' that post, and won't exactly conform to the page boundaries which are used in the canonical version of the page. Essentially: in some cases, the content of the post-specific pages would include many posts which were not present on the canonical paginated version.
This commit aims to resolve that problem by simplifying the implementation. Instead of rendering posts surrounding the target post_number, we will only render the target post, and include a link to 'show post in topic'. With this new implementation, 100% of the post-specific page content will be present on the canonical paginated version, which will hopefully mean google reduces their indexing of the non-canonical post-specific pages.
To generate letter avatars, we’re currently using the ImageMagick suite
and we’re using the Helvetica font family. However, that font isn’t
shipped anymore in the latest stable version of Debian (Bookworm).
Instead it seems to have been replaced by the Nimbus font. The rendering
is extremely similar (not to say it’s the same thing) so it shouldn’t be
noticeable.
That change is necessary for us to upgrade our docker images to Debian
Bookworm.
This is so the CI output on GitHub actions isn't showing
tons and tons of unnecessary log data every time you want
to see the important thing, which is the actual test failure.
We were previously using the `EMBER_ENV=production` environment variable, which appears to produce the same output. But, some parts of ember-cli don't seem to support it, which leads to a confusing 'Environment: development' being printed on the console.
This commit adds `-prod` by default, which is the more common way to invoke ember-cli for production builds.
Why this change?
While working on the tag selector for the theme object editor, I
realised that there is an extremely high possibility that users might want to select
more than one tag. By supporting the ability to select more than one
tag, it also means that we get support for a single tag for free as
well.
What does this change do?
1. Change `type: tag` to `type: tags` and support `min` and `max`
validations for `type: tags`.
2. Fix the `<SchemaThemeSetting::Types::Tags>` component to support the
`min` and `max` validations
Why this change?
Fortunately or unfortunately in Discourse core, we mainly use `Tag#name`
to look up tags and not its id. This assumption is built into the
frontend as well so we need to use the tag's name instead of the id
here.
Followup 3094f32ff5,
this fixes an issue with the logic in this commit where
we were returning false if any of the conditionals here
were false, regardless of the type of `obj`, where we should
have only done this if `obj` was a `PostAction`, which lead
us to return false in cases where we were checking if the
user could edit their own post as anon.
When categories are loaded by the frontend, the parent category is
looked up by ID and the `parentCategory` is set with the result. If the
categories returned are not in order, the parent category may miss.
Why this change?
This ENV allows the brotli compression quality to be configurable such
that one can opt for a higher/lower level of compression based on their
preferences.
This commit changes the API for registering the plugin config
page nav configuration from a server-side to a JS one;
there is no need for it to be server-side.
It also makes some changes to allow for 2 different ways of displaying
navigation for plugin pages, depending on complexity:
* TOP - This is the best mode for simple plugins without a lot of different
custom configuration pages, and it reuses the grey horizontal nav bar
already used for admins.
* SIDEBAR - This is better for more complex plugins; likely this won't
be used in the near future, but it's readily available if needed
There is a new AdminPluginConfigNavManager service too to manage which
plugin the admin is actively viewing, otherwise we would have trouble
hiding the main plugin nav for admins when viewing a single plugin.
For some identity providers, 10 minutes isn't much time for a user to complete authentication/registration on the identity provider. Increasing the default to 30 minutes should help in those situations. The nonce is still tied to a single browser session, so there is no material impact on security.
This commit makes it so the site settings filter controls and
the list of settings input editors themselves can be used elsewhere
in the admin UI outside of /admin/site_settings
This allows us to provide more targeted groups of settings in different
UI areas where it makes sense to provide them, such as on plugin pages.
You could open a single page for a plugin where you can see information
about that plugin, change settings, and configure it with custom UIs
in the one place.
In future we will do this in "config areas" for other parts of the
admin UI.
This commit moves the generation of category background CSS from the
server side to the client side. This simplifies the server side code
because it does not need to check which categories are visible to the
current user.
Since we are introducing new ways to search in Discourse, like the AI
semantic search using embeddings, posts can be part of a search result
list without having any search data.
Since the code path already handles this, we only need to add a safety
check when accessing the post_search_data.
In AdminDashboardData we have a bunch of problem checks implemented as methods on that class. This PR absolves it of the responsibility by promoting each of those checks to a first class ProblemCheck. This way each of them can have their own priority and arbitrary functionality can be isolated in its own class.
Think "extract class" refactoring over and over. Since they were all moved we can also get rid of the @@problem_syms class variable which was basically the old version of the registry now replaced by ProblemCheck.realtime.
In addition AdminDashboardData::Problem value object has been entirely replaced with the new ProblemCheck::Problem (with compatible API).
Lastly, I added some RSpec matchers to simplify testing of problem checks and provide helpful error messages when assertions fail.