Previously we considered 'upload rows without etags' to be exempt from the check. This is bad, because older/migrated sites might not have etags on all their uploads. We should consider rows without etags to be broken, since we can't check them against the inventory.
This also removes the `by_users` scope. We need all uploads to be working, even ones created by the system user.
This adds an option to "delete on owner reply" to bookmarks. If you select this option in the modal, then reply to the topic the bookmark is in, the bookmark will be deleted on reply.
This PR also changes the checkboxes for these additional bookmark options to an Integer column in the DB with a combobox to select the option you want.
The use cases are:
* Sometimes I will bookmark the topics to read it later. In this case we definitely don’t need to keep the bookmark after I replied to it.
* Sometimes I will read the topic in mobile and I will prefer to reply in PC later. Or I may have to do some research before reply. So I will bookmark it for reply later.
* FEATURE: Allow List for PMs
This feature adds a new user setting that is disabled by default that
allows them to specify a list of users that are allowed to send them
private messages. This way they don't have to maintain a large list of
users they don't want to here from and instead just list the people they
know they do want. Staff will still always be able to send messages to
the user.
* Update PR based on feedback
Some definitions rely on others, in particular the c/cpp/c-like ones,
and we were appending the bundle of all files in the folder.
Instead for testing I've limited us to just three definitions. This has
the benefit of being a lot smaller to download/parse in test mode too.
It seems there was a discrepancy in that background images were attached
to the full slug category class: `category-:slug-:id` and our body class
only had `category-:slug`.
This fix adds support for both formats.
* Remove unneeded bookmark name index.
* Change bookmark search query to use post_search_data. This allows searching on topic title and post content
* Tweak the style/layout of the bookmark list so the search looks better and the whole page fits better on mobile.
* Added scopes UI
* Create scopes when creating a new API key
* Show scopes on the API key show route
* Apply scopes on API requests
* Extend scopes from plugins
* Add missing scopes. A mapping can be associated with multiple controller actions
* Only send scopes if the use global key option is disabled. Use the discourse plugin registry to add new scopes
* Add not null validations and index for api_key_id
* Annotate model
* DEV: Move default mappings to ApiKeyScope
* Remove unused attribute and improve UI for existing keys
* Support multiple parameters separated by a comma
Follow up to d8c796bc4.
Note that his change increases query time by around 40% in the following
benchmark against `dev.discourse.org` but this is a tradeoff that has to be taken so that relevance
search is accurate.
```
require 'benchmark/ips'
Benchmark.ips do |x|
x.config(time: 10, warmup: 2)
x.report("current aggregate search query") do
DB.exec <<~SQL
SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT topics.id, min(posts.post_number) post_number FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN (
SELECT categories.id WHERE categories.search_priority = 1
)
) AND ((categories.id IS NULL) OR (NOT categories.read_restricted)) GROUP BY topics.id ORDER BY MAX((
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''postgres'':*ABCD'),
1|32
) *
(
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)
) DESC, topics.bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number;
SQL
end
x.report("current aggregate search query with proper ranking") do
DB.exec <<~SQL
SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT subquery.topic_id id, (ARRAY_AGG(subquery.post_number ORDER BY rank DESC, bumped_at DESC))[1] post_number, MAX(subquery.rank) rank, MAX(subquery.bumped_at) bumped_at FROM (SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id", (
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''postgres'':*ABCD'),
1|32
) *
(
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)
rank, topics.bumped_at bumped_at FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN (
SELECT categories.id WHERE categories.search_priority = 1
)
) AND ((categories.id IS NULL) OR (NOT categories.read_restricted))) subquery GROUP BY subquery.topic_id ORDER BY rank DESC, bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number;
SQL
end
x.compare!
end
```
```
Warming up --------------------------------------
current aggregate search query
1.000 i/100ms
current aggregate search query with proper ranking
1.000 i/100ms
Calculating -------------------------------------
current aggregate search query
18.040 (± 0.0%) i/s - 181.000 in 10.035241s
current aggregate search query with proper ranking
12.992 (± 0.0%) i/s - 130.000 in 10.007214s
Comparison:
current aggregate search query: 18.0 i/s
current aggregate search query with proper ranking: 13.0 i/s - 1.39x (± 0.00) slower
```
Indexing query strings in URLS produces inconsistent results in PG and
pollutes the search data for really little gain.
The following seems to work as expected...
```
discourse_development=# SELECT TO_TSVECTOR('https://www.discourse.org?test=2&test2=3');
to_tsvector
------------------------------------------------------
'2':3 '3':5 'test':2 'test2':4 'www.discourse.org':1
```
However, once a path is present
```
discourse_development=# SELECT TO_TSVECTOR('https://www.discourse.org/latest?test=2&test2=3');
to_tsvector
----------------------------------------------------------------------------------------------
'/latest?test=2&test2=3':3 'www.discourse.org':2 'www.discourse.org/latest?test=2&test2=3':1
```
The lexeme contains both the path and the query string.
Previously, we would only take either the `MIN` or `MAX` for
`post_number` during aggregation meaning that the ranking is not
considered.
```
require 'benchmark/ips'
Benchmark.ips do |x|
x.config(time: 10, warmup: 2)
x.report("current aggregate search query") do
DB.exec <<~SQL
SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT topics.id, min(posts.post_number) post_number FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN (
SELECT categories.id WHERE categories.search_priority = 1
)
) AND ((categories.id IS NULL) OR (NOT categories.read_restricted)) GROUP BY topics.id ORDER BY MAX((
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''postgres'':*ABCD'),
1|32
) *
(
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)
) DESC, topics.bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number;
SQL
end
x.report("current aggregate search query with proper ranking") do
DB.exec <<~SQL
SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT subquery.topic_id id, (ARRAY_AGG(subquery.post_number))[1] post_number, MAX(subquery.rank) rank, MAX(subquery.bumped_at) bumped_at FROM (SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id", (
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''postgres'':*ABCD'),
1|32
) *
(
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)
rank, topics.bumped_at bumped_at FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN (
SELECT categories.id WHERE categories.search_priority = 1
)
) AND ((categories.id IS NULL) OR (NOT categories.read_restricted))) subquery GROUP BY subquery.topic_id ORDER BY rank DESC, bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number;
SQL
end
x.compare!
end
```
```
Warming up --------------------------------------
current aggregate search query
1.000 i/100ms
current aggregate search query with proper ranking
1.000 i/100ms
Calculating -------------------------------------
current aggregate search query
17.726 (± 0.0%) i/s - 178.000 in 10.045107s
current aggregate search query with proper ranking
17.802 (± 0.0%) i/s - 178.000 in 10.002230s
Comparison:
current aggregate search query with proper ranking: 17.8 i/s
current aggregate search query: 17.7 i/s - 1.00x (± 0.00) slower
```
On large topics, the cost of sending the entire post ID list back over to the database is signficant. Just have the DB recalculate the list of visible posts instead.
Category and tag hashtags used to be handled differently even though
most of the code was very similar. This design was the root cause of
multiple issues related to hashtags.
This commit reduces the number of requests (just one and debounced
better), removes the use of CSS classes which marked resolved hashtags,
simplifies a lot of the code as there is a single source of truth and
previous race condition fixes are now useless.
It also includes a very minor security fix which let unauthorized users
to guess hidden tags.
It's a little awkward to test constants by re-assigning them so
I've added a new parameter to `Discourse.find_compatible_resource`
which can be used by tests.
Instead of loading all of the user bookmarks using all the post IDs in a topic, load all the bookmarks for a user using the topic ID. This eliminates a costly WHERE ID IN query.
* strip out the href and xlink:href attributes from use element that
are _not_ anchors in svgs which can be used for XSS
* adding the content-disposition: attachment ensures that
uploaded SVGs cannot be opened and executed using the XSS exploit.
svgs embedded using an img tag do not suffer from the same exploit
Before this change:
- first full page load would get category defaults defined un cateory settings
- a navigation to a topic and then back to categories list would reset defaut to the ones defined in discovery/topics
Adds a new rake task `plugin:checkout_compatible_all` and
`plugin:checkout_compatible[plugin-name]` that check out compatible plugin
versions.
Supports a .discourse-compatibility file in the root of plugins and themes that
list out a plugin's compatibility with certain discourse versions:
eg: .discourse-compatibility
```
2.5.0.beta6: some-git-hash
2.4.4.beta4: some-git-tag
2.2.0: git-reference
```
This ensures older Discourse installs are able to find and install older
versions of plugins without intervention, through the manifest only.
It iterates through the versions in descending order. If the current Discourse
version matches an item in the manifest, it checks out the listed plugin target.
If the Discourse version is greater than an item in the manifest, it checks out
the next highest version listed in the manifest.
If no versions match, it makes no change.
This is a very expensive process, and it should only be required in exceptional circumstances. It is possible to run a similar recovery using `rake uploads:recover` (5284d41a8e/lib/upload_recovery.rb (L135-L184))
`OptimizedImage#filesize` calls `Discourse.store.download` with an OptimizedImage as an argument. It would in turn attempt to call `#original_filename` and `#secure?` on that object. Both would fail as these methods do not exist on OptimizedImage, only on Upload. We didn't know about these issues because:
1. `#calculate_filesize` is not called often, because the filesize is saved on OptimizedImage creation, so it's used mostly for manual filesize recalculation
2. we were using `rescue nil` which swallows all errors
`Nokogiri::HTML.fragment` is a huge hack (a comment in the source code
admits this). The current behavior of `Email::Styles` is to try to
emulate `fragment` using nokogumbo, but it misses some edge cases. In
particular, meta tags in a email template don't make it through to the
final email.
Instead of treating the provided HTML as an indeterminate fragment, this
commit makes `Email::Styles` treat the HTML as a complete document. This
means that the generated HTML for an email will now always contain top
level structure (a doctype, html, head and body tags).
This new behavior is behind a hidden site setting for now and defaults
off.
Previously, while generating the topic page's canoncial url we used the current post number. It will create invalid canonical path if the topic has whsiper posts. Now we only taking the visible posts for current page index calculation.
The previous fix (f43c0a5d85) wasn't working for images that were already uploaded.
The "metadata" (eg. 'for_*' and 'secure' attributes) were not added to existing uploads.
Also used 'Upload.get_from_url' is the admin/site_setting controller to properly retrieve
an upload from its URL.
Fixed the Upload::URL_REGEX to use the \h (hexadecimal) for the SHA
Follow-up-to: f43c0a5d85
* Change S3Helper::DOWNLOAD_URL_EXPIRES_AFTER_SECONDS to 5 minutes, which controls presigned URL expiry and secure-media route cache time.
* This is done because of the composer preview refreshing while typing causes a lot of requests sent to our server because of the short URL expiry. If this ends up being not enough we can always increase the time or explore other avenues (e.g. GitHub has a 7 day validity for secure URLs)
In 91c89df6, I fixed the onebox to support local topics with a slug-less URL.
This commit fixes all the other spots (search, topic links and user badges) where we look up for a local topic.
Follow-up-to: 91c89df6
* FIX: Correct version comparison logic when comparing stable to beta
For example, version 1.3.0 should be considered higher than 1.3.0.beta3. So `Discourse.has_needed_version?('1.3.0', '1.3.0.beta3')` should return true
* Switch to use Gem::Version to compare versions
Added a small helper class to for seed data because we need to add the
same filter to multisite:migrate as we have in db:migrate. Having this
filter in both places means we can get rid of the SKIP_SEED flag.
The logic of adding additional search results does not seem to be
needed anymore.
It appears to be a relic of an old implementation.
This saves an entire search query for every search made.
When rebaking a post we were invalidating _regular_ oneboxes but not inline oneboxes.
DEV: also renamed 'InlineOneboxer.purge' to 'InlineOneboxer.invalidate' to keep
the API consistent with 'Oneboxer.invalidate'
When linking to a topic in the same Discourse, we try to onebox the link to show the title
and other various information depending on whether it's a "standard" or "inline" onebox.
However, we were not properly detecting links to topics that had no slugs (eg. https://meta.discourse.org/t/1234).
This is causing issues when purging old users, if they are set up in the
exact condition where they will be demoted into another group, but also
do not have a primary email.
A future-dated migration was accidently introduced by me in 45c399f0. This was removed in b9762afc, but other migrations had already been generated based on its incorrect date. This commit removes the offending data in the schema_migrations table, and corrects the version in the published_pages migration.
This commit also adds a check to db:migrate which raises an error when invalid migration timestamps are used.
When plugin is hooking into TopicView joining other tables, it may fail because `created_at` is potentially available on 2 tables. Therefore we should explicitly define which `created_at` we want.
The `EXECUTE FUNCTION` syntax for `CREATE TRIGGER` statements was introduced in PostgreSQL 11. We need to replace `EXECUTE FUNCTION` with `EXECUTE PROCEDURE` in order to be able to restore backups created with PG12 on PG10.
Fixed bugs, added specs, extracted the upload downsizing code to a class, added support for non-S3 setups, changed it so that images aren't downloaded twice.
This code has been tested on production and successfully resized ~180k uploads.
Includes:
* DEV: Extract upload downsizing logic
* DEV: Add support for non-S3 uploads
* DEV: Process only images uploaded by users
* FIX: Incorrect usage of `count` and `exist?` typo
* DEV: Spec S3 image downsizing
* DEV: Avoid downloading images twice
* DEV: Update filesizes earlier in the process
* DEV: Return false on invalid upload
* FIX: Download images that currently above the limit (If the image size limit is decreased, then there was no way to resize those images that now fall outside the allowed size range)
* Update script/downsize_uploads.rb (Co-authored-by: Régis Hanol <regis@hanol.fr>)
FIX: prevent re-flagging when we have reviewed flags before
Fixes an edge case where a review can be reflagged when:
User flags as inappropriate.
Moderator rejects the flag.
Another user re-flags the post as spam.
Before, anyone was able to re-flag as inappropriate despite it being flagged
previously. With this, users are unable to re-flag for the same reason
regardless of reviewable status.
FEATURE: new rake task to update first_post_created_at column
The not-equal operator (`<>`) in PostgreSQL does not compare values
with NULL. We should instead use `IS DISTINCT FROM` when comparing
values with NULL.
Since this is rare, we don't want to check for
`Discourse.pg_readonly_mode?` on every request since we have to reach
for Redis. Instead, just rescue the error here.
Allow limiting the number of migrations to do at once, both to do migrations that
have impact limited to multiple off-peak usage hours to reduce user impact from
a migration, and to allow tests that do only a very small number for test
purposes. ("Give me a ping, Vasili. One ping only, please.")
Looks like some html elements like `aside` and `section` will throw an error
when checking if they are inline or not. The commit simply handles
```
Job exception: undefined method `inline?' for nil:NilClass
```
and adds a test for it.
Moves the most important checks into a linter. It gets executed by Lefthook as well as the docker rake task and Github actions. Doing those checks in rspec takes too long and it produces errors when the discourse:test Docker image contains old, invalid locale files.
* DEV: Move `Discourse.getURL` and related functions to a module
* DEV: Remove `Discourse.getURL` and `Discourse.getURLWithCDN`
* FIX: `get-url` is required for server side code
* DEV: Deprecate `BaseUri` too.
In some restricted setups all JS payloads need tight control.
This setting bans admins from making changes to JS on the site and
requires all themes be whitelisted to be used.
There are edge cases we still need to work through in this mode
hence this is still not supported in production and experimental.
Use an example like this to enable:
`DISCOURSE_WHITELISTED_THEME_REPOS="https://repo.com/repo.git,https://repo.com/repo2.git"`
By default this feature is not enabled and no changes are made.
One exception is that default theme id was missing a security check
this was added for correctness.
* DEV: To be pedantic, there is more than EMBER in there now
* DEV: Use less globals. Have `Discourse` start in an initializer
* DEV: Remove another global
Previously the pull hotlinked images job was skipped after system edits. This ensured that we never had an infinite loop of system-edit/pull-hotlinked/system-edit/pull-hotlinked etc.
A side effect was that edits made by system for any other reason (e.g. API, removing full quotes) would prevent pulling hotlinked images. This commit removes the system edit check, and replaces it with another method to avoid an infinite job scheduling loop.
* DEV: new S3 backup layout
Currently, with $S3_BACKUP_BUCKET of "bucket/backups", multisite backups
end up in "bucket/backups/backups/dbname/" and single-site will be in
"bucket/backups/".
Both _should_ be in "bucket/backups/dbname/"
- remove MULTISITE_PREFIX,
- always include dbname,
- method to move to the new prefix
- job to call the method
* SPEC: add tests for `VacateLegacyPrefixBackups` onceoff job.
Co-authored-by: Vinoth Kannan <vinothkannan@vinkas.com>
Fixes a regression in
e8fb9d4066
which caused a bug where you couldn't send a message to a group that
contained an Uppercase letter. Added a test case for this.
Bug report: https://meta.discourse.org/t/-/152999
Previously we had a partial fix in place where non human users
were not allowed draft sequences, this left edges around where non
human users asked for drafts yet had none.
For example system could already have a few drafts in place.
This also removes and extensibility point we added that is not in use
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)
* We now have a site setting "topic_excerpt_maxlength" that is used when the OP is created or revised to generate a topic excerpt.
* However, posts created before this setting was introduced cannot benefit from this change unless they are revised, and if the topic excerpt length setting is changed that situation is also not covererd.
* This PR makes a change to rebake! to update the topic excerpt IF the post is the OP.
In some cases, between Discourse forums the hostname of a URL could match if they are hosting S3 files on the same bucket but the S3 bucket path might not. So e.g. https://testbucket.somesite.com/testpath/some/file/url.png vs https://testbucket.somesite.com/prodpath/some/file/url.png. So has_been_uploaded? was returning true for the second URL, even though it may have been uploaded on a different Discourse forum.
This is a very rare case but must be accounted for, because this impacts UrlHelper.is_local which mistakenly thinks the file has already been downloaded and thus allows the URL to be cooked, where we want to return the full URL to be downloaded using PullHotlinkedImages.
* the post_actions table has no FK to users, so if a user has been
deleted we may end up with dangling post_action records, which then
interferes with the bookmarks migration because bookmarks DO have
an FK to users
PG already handles English stop words, the list in cppjieba is
bigger than the list PG uses, which in turn causes confusion cause
words such as "volume" are stripped using cppijieba stop word list
We will follow up with another commit here to apply the Chinese
word stopwords, but for now to eliminate the confusion we are
skipping applying the stopword list when the dictionary in PG is
in English.
* DEV: Add framework for filtered plugin registers
Plugins often need to add values to a list, and we need to filter those lists at runtime to ignore values from disabled plugins. This commit provides a re-usable way to do that, which should make it easier to add new registers in future, and also reduce repeated code.
Follow-up commits will migrate existing registers to use this new system
* DEV: Migrate user and group custom field APIs to plugin registry
This gives us a consistent system for checking plugin enabled state, so we are repeating less logic. API changes are backwards compatible
Prior to this change we would never clear memory from contexts and
rely on V8 reacting to pressure
This could lead to bloating of PrettyText and Transpiler contexts
This optimisations ensures that we will clear memory 2 seconds after
the last eval on the context
Previously we would raise a warning in the logs if downloading
a file (from s3) takes longer than 60 seconds.
At scale this happens reasonably frequently.
1. Raised the duration to 3 minutes
2. Pulled the resizing mutex out of the downloading mutex
so we have less and clearer error logs
* DEV: Standardize table sorting verbiage
This commit creates a common component that tables can use to make their
headers sortable. This commit also standardizes on using `desc` as the
default and passing in the `asc=true` flag to adjust the sorting
direction.
* Add deprecation warnings
Adds deprecation warnings if using previous params and maintains
backwards compatibility. Set the default sort value for group members to
be asc.
* switch group requests to use common table-header-toggle
* update fixture
* PERF: Dematerialize topic_reply_count
It's only ever used for trust level promotions that run daily, or compared to 0. We don't need to track it on every post creation.
* UX: Add symbol in TL3 report if topic reply count is capped
* DEV: Drop user_stats.topic_reply_count column
Co-authored-by: Mark VanLandingham <markvanlan@gmail.com>
Co-authored-by: Robin Ward <robin.ward@gmail.com>
Co-authored-by: Mark VanLandingham <markvanlan@gmail.com>
Use a helper method to simplify creating a new register. Previously this would require creating lots of different methods manually, and adding every register to the clear/reset functions
This commit prevents a 500 error from occurring if someone is trying to
setup their discourse instance as a sso provider and they don't pass in
a `return_sso_url` in their payload.
This refactors default_current_user_provider in a few ways:
- Introduce a generic `api_parameter_allowed?` method which checks for whitelisted routes/formats
- Only read the api_key parameter on allowed routes. It is now completely ignored on other routes (previously it would raise a 403)
- Start reading user_api_key parameter on allowed routes
- Refactor tests as end-end integration tests
A plugin API for PARAMETER_API_PATTERNS will be added soon
For pages that do not specify canonical URL we will default to `https://SITENAME/PATH`.
This ensures that if a URL is crawled on the CDN the search ranking will transfer to the main site.
Additionally we whitelist the `?page` param
Adds a new rake task to auto generate a constants.js file with the
constants present. This makes migrating to Ember CLI easier, but also
slightly speeds up asset compilation by having to do less work.
If the constants change you need to run:
`rake javascripts:update_constants`
There were two constants here, `INLINE_ONEBOX_LOADING_CSS_CLASS` and
`INLINE_ONEBOX_CSS_CLASS` that were both longer than the strings they
were DRYing up: `inline-onebox-loading` and `inline-onebox`
I normally appreciate constants, but in this case it meant that we had
a lot of JS imports resulting in many more lines of code (and CPU cycles
spent figuring them out.)
It also meant we had an `.erb` file and had to invoke Ruby to create the
JS file, which meant the app was harder to port to Ember CLI.
I removed the constants. It's less DRY but faster and simpler, and
arguably the loss of DRYness is not significant as you can still search
for the `inline-onebox-loading` and `inline-onebox` strings easily if
you are refactoring.
We now show an options gear icon next to the bookmark name.
When expanded we show the "delete bookmark when reminder sent" option. The value of this checkbox is saved in local storage for the user.
If this is ticked, when a reminder is sent for the bookmark the bookmark itself is deleted. This is so people can use the reminder functionality by itself.
Also remove the blue alert reminder section from the "Edit Bookmark" modal as it just added clutter, because the user can already see they had a reminder set:
Adds a default false boolean column `delete_when_reminder_sent` to bookmarks.
Locale files get precompiled after deployment and they contained translations from the `default_locale`. That's especially bad in multisites, because the initial `default_locale` is `en_US`. Sites where the `default_locale` isn't `en_US` could see missing translations. The same thing could happen when users are allowed to chose a different locale.
This change simplifies the logic by not using the `default_locale` in the locale chain. It always falls back to `en` in case of missing translations.
We were sharing `Discourse` both as an application object and a
namespace which complicated things for Ember CLI. This patch
moves raw templates into `__DISCOURSE_RAW_TEMPLATES` and adds
a couple helper methods to create/remove them.
This introduces new APIs for obtaining optimized thumbnails for topics. There are a few building blocks required for this:
- Introduces new `image_upload_id` columns on the `posts` and `topics` table. This replaces the old `image_url` column, which means that thumbnails are now restricted to uploads. Hotlinked thumbnails are no longer possible. In normal use (with pull_hotlinked_images enabled), this has no noticeable impact
- A migration attempts to match existing urls to upload records. If a match cannot be found then the posts will be queued for rebake
- Optimized thumbnails are generated during post_process_cooked. If thumbnails are missing when serializing a topic list, then a sidekiq job is queued
- Topic lists and topics now include a `thumbnails` key, which includes all the available images:
```
"thumbnails": [
{
"max_width": null,
"max_height": null,
"url": "//example.com/original-image.png",
"width": 1380,
"height": 1840
},
{
"max_width": 1024,
"max_height": 1024,
"url": "//example.com/optimized-image.png",
"width": 768,
"height": 1024
}
]
```
- Themes can request additional thumbnail sizes by using a modifier in their `about.json` file:
```
"modifiers": {
"topic_thumbnail_sizes": [
[200, 200],
[800, 800]
],
...
```
Remember that these are generated asynchronously, so your theme should include logic to fallback to other available thumbnails if your requested size has not yet been generated
- Two new raw plugin outlets are introduced, to improve the customisability of the topic list. `topic-list-before-columns` and `topic-list-before-link`
The PR https://github.com/rails/rails/pull/36612 changes the raised exception
if the error message includes the target database name.
Since the error message contains the hostname, this could be triggered when
the hostname contains the database name.
* When hovering over the bookmark icon for a post, show the name of the bookmark at the end of the tooltip _if_ it has been set.
* Order bookmarks by `updated_at DESC` in the user list and show that instead of created at.
Recently, we added feature that we are sending `/muted` to users who muted specific topic just before `/latest` so the client knows to ignore those messages - https://github.com/discourse/discourse/pull/9482
Same `/muted` message should be included when the post is edited
* Remove Handlebars.SafeString usage
* DEV: Support for `import Handlebars from 'handlebars'`;
* FIX: Sprockets was broken when `node_modules` was present
By default the old version of sprockets looks for application.js
anywhere, including in a node_modules folder if this exists
(which it will when we move to Ember CLI.)
Some providers don't implement the Expect: 100-continue support,
which results in a mismatch in the object signature.
With this settings, users can disable the header and use such providers.
TLDR; this commit vastly improves how whitespaces are handled when converting from HTML to Markdown.
It also adds support for converting HTML <tables> to markdown tables.
The previous 'remove_whitespaces!' method was traversing the whole HTML tree and used a heuristic to remove
leading and trailing whitespaces whenever it was appropriate (ie. mostly before and after HTML block elements)
It was a good idea, but it was very limited and leaded to bad conversion when the html had leading whitespaces on several lines for example.
One such example can be found [here](https://meta.discourse.org/t/86782).
For various reasons, most of the whitespaces in a HTML file is ignored when the page is being displayed in a browser.
The rules that the browsers follow are the [CSS' White Space Processing Rules](https://www.w3.org/TR/css-text-3/#white-space-rules).
They can be quite complicated when you take into account RTL languages and other various tidbits but they boils down to the following:
- Collapse whitespaces down to one space (0x20) inside an inline context (ie. nodes/tags that are being displaying on the same line)
- Remove any leading/trailing whitespaces inside an inline context
One quick & dirty way of getting this 90% solved would be to do 'HTML.gsub!(/[[:space:]]+/, " ")'.
We would also need to hoist <pre> elements in order to not mess with their whitespaces.
Unfortunately, this solution let some whitespaces creep around HTML tags which leads to more '.strip!' calls than I can bear.
I decided to "emulate" the browser's handling of whitespaces and came up with a solution in 4 parts
1. remove_not_allowed!
The HtmlToMarkdown library is recursively "visiting" all the nodes in the HTML in order to convert them to Markdown.
All the nodes that aren't handled by the library (eg. <script>, <style> or any non-textual HTML tags) are "swallowed".
In order to reduce the number of nodes visited, the method 'remove_not_allowed!' will automatically delete all the nodes
that have no "visitor" (eg. a 'visit_<tag>' method) defined.
2. remove_hidden!
Similar purpose as the previous method (eg. reducing number of nodes visited), there's no point trying to convert something that is hidden.
The 'remove_hidden!' method removes any nodes that was hidden using the "hidden" HTML attribute, some CSS or with a width or height equal to 0.
3. hoist_line_breaks!
The 'hoist_line_breaks!' method is there to handle <br> tags. I know those tiny <br> don't do much but they can be quite annoying.
The <br> tags are inline elements but they visually work like a block element (ie. they create a new line).
If you have the following HTML "<i>Foo<br>Bar</i>", it ends up visually similar to "<i>Foo</i><br><i>Bar</i>".
The latter being much more easy to process than the former, so that's what this method is doing.
The "hoist_line_breaks" will hoist <br> tags out of inline tags until their parent is a block element.
4. remove_whitespaces!
The "remove_whitespaces!" is where all the whitespace removal is happening. It's broken down into 4 methods as well
- remove_whitespaces!
- is_inline?
- collapse_spaces!
- remove_trailing_space!
The 'remove_whitespace!' method is recursively walking the HTML tree (skipping <pre> tags).
If a node has any children, they will be chunked into groups of inline elements vs block elements.
For each chunks of inline elements, it will call the "collapse_space!" and "remove_trailing_space!" methods.
For each chunks of block elements, it will call "remote_whitespace!" to keep walking the HTML tree recursively.
The "is_inline?" method determines whether a node is part of a inline context.
A node is inline iif it's a text node or it's an inline tag, but not <br>, and all its children are also inline.
The "collapse_spaces!" method will collapse any kind of (white) space into a single space (" ") character, even accros tags.
For example, if we have " Foo \n<i> Bar </i>\t42", it will return "Foo <i>Bar </i>42".
Finally, the "remove_trailing_space!" method is there to remove any trailing space that might creep in at the end of the inline chunk.
This solution is not 100% bullet-proof.
It does not support RTL languages at all and has some caveats that I felt were not worth the work to get properly fixed.
FIX: better detection of hidden elements when converting HTML to Markdown
FIX: take into account the 'allowed_href_schemes' site setting when converting HTML <a> to Markdown
FIX: added support for 'mailto:' scheme when converting <a> from HTML to Markdown
FIX: added support for <img> dimensions when converting from HTML to Markdown
FIX: added support for <dl>, <dd> and <dt> when converting from HTML to Markdown
FIX: added support for multilines emphases, strongs and strikes when converting from HTML to Markdown
FIX: added support for <acronym> when converting from HTML to Markdown
DEV: remove unused 'sanitize' gem
Wow, did you just read all that?! Congratz, here's a cookie: 🍪.
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.
For files with `# frozen_string_literal: true`
```
puts %w{a b}[0].frozen?
=> true
puts "hi".frozen?
=> true
puts "a #{1} b".frozen?
=> true
puts ("a " + "b").frozen?
=> false
puts (-("a " + "b")).frozen?
=> true
```
For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
This is to help with the migration to Ember CLI. In the current running
version of Discourse everything should be the same as before, just with
a few extra files that are not used. However, using Ember CLI this can
be installed as an Ember addon.
Co-Authored-By: Jarek Radosz <jradosz@gmail.com>
rebuilding user_actions is not something that should be done.
Plugins such as solved and assigned extend it, there are tons of
little rules that were not captured in `user_actions:rebuild`
Make sure the topic_user.bookmarked column is set correctly when user bookmarks/unbookmarks any post in a topic. For example, you bookmarked a post in the topic that was not the OP, the bookmark icon in the topic list would not be shown. Same if deleting a bookmark for the last bookmarked post in a topic, the bookmark icon in the topic list would not be removed.
Previously this was only setting it to true if bookmarking the OP/topic, which was not correct -- we want to show the icon on the topic list if any post is bookmarked.
Also set to false if unbookmarking the last bookmarked post in the topic.
Also in this PR is a migration to correct any out of sync topic_user.bookmarked columns, based on the new logic.
* When copying the markdown for an image between posts, we were not adding the srcset and data-small-image attributes which are done by calling optimize_image! in cooked post processor
* Refactored the code which was confusing in its current state (the consider_for_reuse method was super confusing) and fixed the issue
* FEATURE: don't display new/unread notification for muted topics
Currently, even if user mute topic, when a new reply to that topic arrives, the user will get "See 1 new or updated topic" message. After clicking on that link, nothing is visible (because the topic is muted)
To solve that problem, we will send background message to all users who recently muted that topic that update is coming and they can ignore the next message about that topic.
DO does not implement tagging support for S3 objects. Removing our default
empty tag fixes compatibility.
The expire_missing_assets rake task can't be used with that service still,
but this patch allows normal operation.