This commit introduces the following changes which allows a site
administrator to mark `Upload` records with the `s3_file_missing`
verification status which will result in the `Upload` record being ignored when
`Discourse.store.list_missing_uploads` is ran on a site where S3 uploads
are enabled and `SiteSetting.enable_s3_inventory` is set to `true`.
1. Introduce `s3_file_missing` to `Upload.verification_statuses`
2. Introduce `Upload.mark_invalid_s3_uploads_as_missing` which updates
`Upload#verification_status` of all `Upload` records from `invalid_etag` to `s3_file_missing`.
3. Introduce `rake uploads:mark_invalid_s3_uploads_as_missing` Rake task
which allows a site administrator to change `Upload` records with
`invalid_etag` verification status to the `s3_file_missing`
verificaton_status.
4. Update `S3Inventory` to ignore `Upload` records with the
`s3_file_missing` verification status.
When uploading a video, the composer will now show a thumbnail image in
the composer preview instead of just the video placeholder image.
If `enable_diffhtml_preview` is enabled the video will be rendered in
the composer preview and is playable.
This commit updates all Sidekiq signal handling event logs to go through
Unicorn's logger instead of logging to STDOUT. Going through a proper logger
means the log messages are logged in the format which the logger has configured.
This means we get proper timestamp for the log messages.
Introduced back in 2022 in
e3d495850d,
our new more specific message-id format for inbound and
outbound emails has now been in use for a very long time,
we can remove the support for the old formats:
`topic/:topic_id/:post_id.:random@:host`
`topic/:topic_id@:host`
`topic/:topic_id.:random@:host`
In this PR service objects were moved to Core https://github.com/discourse/discourse/pull/26506
However, ServiceRunner should be moved as well. Mostly for CI to run effortlessly without loading plugins.
This commit moves the logic for crawler rate limits out of the application controller and into the request tracker middleware. The reason for this move is to apply rate limits to all crawler requests instead of just the requests that make it to the application controller. Some requests are served early from the middleware stack without reaching the Rails app for performance reasons (e.g. `AnonymousCache`) which results in crawlers getting 200 responses even though they've reached their limits and should be getting 429 responses.
Internal topic: t/128810.
This commit adds a `DISCOURSE_DUMP_BACKTRACES_ON_UNICORN_WORKER_TIMEOUT`
environment that will allow us to dump all backtraces for all threads of
a Unicorn worker 2 seconds before it times out. In development,
backtraces are dumped to `STDOUT` and in production we will dump it to
`unicorn.stdout.log`.
We want to dump all the backtraces to make it easier to identify the
cause of a Unicorn worker timing out.
This commit updates `S3Inventory#files` to ignore S3 inventory files
which have a `last_modified` timestamp which are not at least 2 days
older than `BackupMetadata.last_restore_date` timestamp.
This check was previously only in `Jobs::EnsureS3UploadsExistence` but
`S3Inventory` can also be used via Rake tasks so this protection needs
to be in `S3Inventory` and not in the scheduled job.
* DEV: allow reply_by_email, visit_link_to_respond strings to be modified by plugins
* DEV: separate visit_link_to_respond and reply_by_email modifiers out
This PR aims to add bulk actions to the user's bookmarks.
After this feature, all users should be able to select multiple bookmarks and perform the actions of "deleting" or "clear reminders"
This fixes the `PrettyText.make_all_links_absolute` to better handle subfolder.
In subfolder, when given the cooked version of a post, links to mentions includes the `Discourse.base_path` prefix. Adding the `Discourse.base_url` was doubling the `Discourse.base_path`.
The issue was hidden behind the specs which was stubbing `Discourse.base_url` instead of relying on `Discourse.base_path`.
This fixes both the "algorithm" used in `PrettyText.make_all_links_absolute` to better handle this case and correct the specs to properly handle subfolder cases.
There are lots of changes in the specs due to a refactoring to use squiggly heredoc strings for easier reading and less escaping.
This commit adds a step in our tests workflow on Github actions to update the themes to
use the compatible version when not running aginast the `main` branch.
This is to ensure that we are not running
the tests for themes against an incompatible version of Discourse.
AuthProvider#enabled_setting=, used primarily by plugins, has been deprecated since version 2.9, in favour of Authenticator#enabled?. This PR confirms we are seeing no more usage and removes the method.
Whenever one creates, updates, or deletes a post, we should keep the `topic.word_count` counter in sync.
Context - https://meta.discourse.org/t/-/308062
This commits updates `FinalDestination#get` to not forward
`Authorization` header on redirects since most HTTP clients I tested like
curl and wget does not it.
This also fixes a recent problem in `DiscourseIpInfo.mmdb_download`
where we will fail to download the databases when both `GlobalSetting.maxmind_account_id` and
`GlobalSetting.maxmind_license_key` has been set. The failure is due to
the bug above where the redirected URL given by MaxMind does not accept
an `Authorization` header.
This commit optimises the database query generated by
`TopicsFilter#filter_categories` when the `-category:*` filter is used.
Previously, the method will add the `topics.category_id NOT IN
(<category ids to be excluded>)` filter to the resulting query. However,
we noticed that the performance of the query degrades as the number of
rows in the `topics` table grow and when the number of category ids to be
excluded is large.
Sample of query we ran on a large database in production to demonstrate
the improvement:
Before:
```
SELECT topics.id FROM topics WHERE topics.category_id NOT IN (83, 136, 149, 143, 153, 165, 161, 123, 155, 163, 144, 134, 69, 135, 158, 141, 151, 160, 131, 133, 89, 104, 150, 147, 132, 145, 108, 146, 122, 100, 128, 154, 95, 102, 140, 139, 88, 91, 87) ORDER BY topics.id DESC LIMIT 5;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=27795.34..27795.34 rows=1 width=4) (actual time=29.317..30.165 rows=5 loops=1)
-> Sort (cost=27795.34..27795.34 rows=1 width=4) (actual time=29.316..30.163 rows=5 loops=1)
Sort Key: id DESC
Sort Method: top-N heapsort Memory: 25kB
-> Gather (cost=1000.10..27795.33 rows=1 width=4) (actual time=0.187..26.132 rows=73478 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on topics (cost=0.10..26795.23 rows=1 width=4) (actual time=0.013..22.252 rows=24493 loops=3)
Filter: (category_id <> ALL ('{83,136,149,143,153,165,161,123,155,163,144,134,69,135,158,141,151,160,131,133,89,104,150,147,132,145,108,146,122,100,128,154,95,102,140,139,88,91,87}'::integer[]))
Rows Removed by Filter: 77276
Planning Time: 0.140 ms
Execution Time: 30.181 ms
```
After:
```
SELECT topics.id FROM topics WHERE NOT EXISTS (
SELECT 1
FROM unnest(array[83, 136, 149, 143, 153, 165, 161, 123, 155, 163, 144, 134, 69, 135, 158, 141, 151, 160, 131, 133, 89, 104, 150, 147, 132, 145, 108, 146, 122, 100, 128, 154, 95, 102, 140, 139, 88, 91, 87]) AS excluded_categories(category_id)
WHERE topics.category_id IS NULL OR excluded_categories.category_id = topics.category_id
) ORDER BY topics.id DESC LIMIT 5 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..13.52 rows=5 width=4) (actual time=0.028..0.110 rows=5 loops=1)
-> Nested Loop Anti Join (cost=0.42..179929.62 rows=68715 width=4) (actual time=0.027..0.109 rows=5 loops=1)
Join Filter: ((topics.category_id IS NULL) OR (excluded_categories.category_id = topics.category_id))
Rows Removed by Join Filter: 239
-> Index Scan Backward using forum_threads_pkey on topics (cost=0.42..108925.71 rows=305301 width=8) (actual time=0.012..0.062 rows=44 loops=1)
-> Function Scan on unnest excluded_categories (cost=0.00..0.39 rows=39 width=4) (actual time=0.000..0.001 rows=6 loops=44)
Planning Time: 0.126 ms
Execution Time: 0.124 ms
(8 rows)
```
This commit changes request method for "categories/search" from GET to
POST to make sure that long filters can be passed to the server. For
example, category selectors with many categories are setting the full
list of selected category IDs to ensure these are filtered out from the
list of choices. This can result in a long URL that exceeds the maximum
length.
This reverts commit 0f4520867b.
This has led to two problems:
1. An incompatibility with Cloudflare's "auto minify" feature. They've deprecated this feature because of incompatibility with modern JS syntax. But unfortunately it will remain enabled on existing properties until 2024-08-05.
2. Discourse fails to boot in Safari 15. This is strange, because Safari does support all the required features in our production JS bundles. Even more strangely, things start working as soon as you open the developer tools. That suggests the cause could be a Safari bug rather than a simple incompatibility.
Reverting while we work out a path forward on both those issues.
This commit switches `DiscourseIpInfo.mmdb_download` to use the
permalinks supplied by MaxMind to download the MaxMind databases as
specified in
https://dev.maxmind.com/geoip/updating-databases#directly-downloading-databases
which states:
```
To directly download databases, follow these steps:
1. In the "Download Links" column, click "Get Permalink(s)" for the desired database.
2. Copy the permalink(s) provided in the modal window.
3. Provide your account ID and your license key using Basic Authentication to authenticate.
```
Previously we are downloading from `https://download.maxmind.com/app/geoip_download` but this is not
documented anyway on MaxMind's docs so this URL can in theory break
in the future without warning. Therefore, we are taking a proactive
approach to download the databases from MaxMind the recommended way
instead of relying on a hidden URL. This old way of downloading the
databases with only a license key will be deprecated in 3.3 and be
removed in 3.4.
In 95a82d608d, we lowered the default for
`Onebox.options.max_download_kb` from 10mb to 2mb for security hardening
purposes. However, this resulted in multiple bug reports where seemingly
nomral URLs stopped being oneboxed. It turns out that lowering
`Onebox.options.max_download_kb` resulted in `Onebox::Helpers::DownloadTooLarge` being raised
more often for more URLs in `Onebox::Helpers.fetch_response` which
`Onebox::Helpers.fetch_html_doc` relies on. When
`Onebox::Helpers::DownloadTooLarge` is raised in
`Onebox::Helpers.fetch_response`, we throw away whatever response body
which we have already downloaded at that point. This is not ideal
because Nokogiri can parse incomplete HTML documents and there is a
really high chance that the incomplete HTML document still contains the
information which we need for oneboxing.
Therefore, this commit updates `Onebox::Helpers.fetch_html_doc` to not
throw away the response body when the size of the response body exceeds
`Onebox.options.max_download_size`. Instead, we just take whatever
response which we have and get Nokogiri to parse it.
This can happen for various reasons including rate limiting and middleware bugs. This should resolve the warning we're seeing in the logs
```
RequestTracker.get_data failed : NoMethodError : undefined method `[]' for nil:NilClass
```
decorator-transforms (https://github.com/ef4/decorator-transforms) is a modern replacement for babel's plugin-proposal-decorators. It provides a decorator implementation using modern browser features, without needing to enable babel's full suite of class feature transformations. This improves the developer experience and performance.
In local testing with Google's 'tachometer' tool, this reduces Discourse's 'init-to-render' time by around 3-4% (230ms -> 222ms).
It reduces our initial gzip'd JS payloads by 3.2% (2.43MB -> 2.35MB), or 7.5% (14.5MB -> 13.4MB) uncompressed.
This keeps coming up in user testing as something
we want to get rid of. The `navigation_menu` setting
has been set to sidebar by default for some time now,
and we are rolling out admin sidebar widely. It just
doesn't make sense to let people turn this off in
the first step of the wizard -- we _want_ people to
use the sidebar.
Whenever a post already failed "lightweight" validations, we skip all the expensive validations (that cooks the post or run SQL queries) so that we reply as soon as possible.
Also skip validating polls when there's no "[/poll]" in the raw.
Internal ref - t/115890
When running `rake uploads:regenerate_missing_optimized`,
a `Discourse::InvalidAccess` will be raised if an SVG
file is being processed as `OptimizedImage.prepend_decoder!`
doesn't support the svg extension. This commit simply copies
the original SVG file as the thumbnail, just like currently
`OptimizedImage.create_for` does.
Ignored columns can only be dropped when its associated post-deploy
migration has been promoted to a regular migration. This is so because
Discourse doesn't rely on a schema file system to setup a brand new
database and thus the column information will be loaded by the
application first before the post-deploy migration runs.
- Use 'cheap-source-map' webpack config on low-memory machines
This results in worse quality sourcemaps in browser dev tools, but it significantly reduces memory use in our webpack build. In approximate local testing it drops from 1100mb to 590mb. This should make the rebuild process on low-memory machines much faster and less likely to trigger OOM errors.
In development, and on higher-memory machines, the higher-quality 'source-map' option is maintained.
- Disable Webpack's built-in `minimize` feature. Embroider already applies Terser after the webpack build is complete. There is no need to double-minimize the output.
- Update ember-cli-progress-ci to print to stderr instead of stdout. For some reason, pups (used by discourse_docker) buffers the stdout of commands and only prints when they are finished. stderr does not have this same limitation, so switching will mean sysadmins can see the progress of the ember build in real-time.
Given the number of variables it's hard to promise exact numbers. But, in my tests on a DO droplet with 1GB RAM (+2GB swap), this reduced the `ember build` portion of a `./launcher rebuild app` from ~50 minutes to ~15 minutes.
This commit adds a `isValidUrl` helper function to the context in
which theme migrations are ran in. This helper function is to make it
easier for theme developers to check if a string is a valid URL or path
when writing theme migrations. This can be helpful in cases when
migrating a string based setting to `type: objects` which contain `type:
string` properties with URL validations enabled.
This commit also introduces the `UrlHelper.is_valid_url?` method
which actually checks that the URL string is of the valid format instead of
only checking if the URL string is parseable which is what `UrlHelper.relaxed_parse` does
and is not sufficient for our needs.