Commit Graph

311 Commits

Author SHA1 Message Date
Osama Sayegh 7bd3986b21
FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131)
We have a couple of site setting, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.

When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt` which leaves the site owners no options if a crawler is crawling the site too aggressively.

This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged in user, Discourse will check the User Agent string and if it contains one of the values of the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting) and the rest of requests made within the same interval will get a 429 response. 

The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR since it could rate limit lots if not all of anonymous traffic if the setting is not used appropriately. So to protect against this scenario, we've added a couple of new validations to the setting when it's changed:

1) each value added to setting must 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
2021-11-30 12:55:25 +03:00
Natalie Tay 4c46c7e334
DEV: Remove xlink hrefs (#15059) 2021-11-25 15:22:43 +11:00
David Taylor 13fdc979a8
DEV: Improve multisite testing (#14884)
This commit adds the RailsMultisite middleware in test mode when Rails.configuration.multisite is true. This allows for much more realistic integration testing. The `multisite_spec.rb` file is rewritten to avoid needing to simulate a middleware stack.
2021-11-11 16:44:58 +00:00
Martin Brennan fc98d1edfa
DEV: Improve s3:ensure_cors_rules logging (#14832) 2021-11-08 11:44:12 +10:00
Martin Brennan 9a72a0945f
FIX: Ensure CORS rules exist for S3 using rake task (#14802)
This commit introduces a new s3:ensure_cors_rules rake task
that is run as a prerequisite to s3:upload_assets. This rake
task calls out to the S3CorsRulesets class to ensure that
the 3 relevant sets of CORS rules are applied, depending on
site settings:

* assets
* direct S3 backups
* direct S3 uploads

This works for both Global S3 settings and Database S3 settings
(the latter set directly via SiteSetting).

As it is, only one rule can be applied, which is generally
the assets rule as it is called first. This commit changes
the ensure_cors! method to be able to apply new rules as
well as the existing ones.

This commit also slightly changes the existing rules to cover
direct S3 uploads via uppy, especially multipart, which requires
some more headers.
2021-11-08 09:16:38 +10:00
jbrw aec125b617
FIX: Display Instagram Oneboxes in an iframe (#14789)
We are no longer able to display the image returned by Instagram directly within a Discourse site (either in the composer, or within a cooked post within a topic), so:

- Display an image placeholder in the composer preview
- A cooked post should use an iframe to display the Instagram 'embed' content
2021-11-02 14:34:51 -04:00
Joffrey JAFFEUX b18c01e3c6
DEV: prevents flakky spec when deleting plugin (#14701)
Not reseting the registry could lead to assets still being registered for example.

This flakky spec was reprdocible with this call: `bundle exec rspec --seed 9472 spec/components/discourse_plugin_registry_spec.rb spec/components/svg_sprite/svg_sprite_spec.rb`

Which would trigger the following error:

```
Failures:

  1) DiscoursePluginRegistry#register_asset registers vendored_core_pretty_text properly
     Failure/Error: expect(registry.javascripts.count).to eq(0)

       expected: 0
            got: 1

       (compared using ==)
     # ./spec/components/discourse_plugin_registry_spec.rb:248:in `block (3 levels) in <top (required)>'
     # ./spec/rails_helper.rb:280:in `block (2 levels) in <top (required)>'
     # /Users/joffreyjaffeux/.gem/ruby/2.7.3/gems/webmock-3.14.0/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
```
2021-10-25 10:24:21 +02:00
Yasuo Honda 9a083a550c FIX: `BackupRestore::DatabaseRestorer` failures with Ruby 3
Implemented workaround suggested at
https://github.com/freerange/mocha/issues/445#issuecomment-644944003
2021-10-12 17:25:51 -04:00
Alan Guo Xiang Tan 34cebfd867
FIX: Exclude PMs that user sent to themselves. (#14496)
Regression from 016efeadf6

Follow-up to 016efeadf6
2021-10-04 11:55:35 +08:00
Alan Guo Xiang Tan 9d5da2b383
PERF: Revert all inboxes from messages route. (#14445)
The all inboxes was introduced in
016efeadf6 but we decided to roll it back
for performance reasons. The main performance challenge here is that PG
has to basically loop through all the PMs that a user is allowed to view
before being able to order by `Topic#bumped_at`. The all inboxes was not
planned as part of the new/unread filter so we've decided not to tackle
the performance issue for the upcoming release.

Follow-up to 016efeadf6
2021-09-28 11:58:04 +08:00
Martin Brennan dba6a5eabf
FEATURE: Humanize file size error messages (#14398)
The file size error messages for max_image_size_kb and
max_attachment_size_kb are shown to the user in the KB
format, regardless of how large the limit is. Since we
are going to support uploading much larger files soon,
this KB-based limit soon becomes unfriendly to the end
user.

For example, if the max attachment size is set to 512000
KB, this is what the user sees:

> Sorry, the file you are trying to upload is too big (maximum
size is 512000KB)

This makes the user do math. In almost all file explorers that
a regular user would be familiar width, the file size is shown
in a format based on the maximum increment (e.g. KB, MB, GB).

This commit changes the behaviour to output a humanized file size
instead of the raw KB. For the above example, it would now say:

> Sorry, the file you are trying to upload is too big (maximum
size is 512 MB)

This humanization also handles decimals, e.g. 1536KB = 1.5 MB
2021-09-22 07:59:45 +10:00
Martin Brennan 27699648ef
FEATURE: Go to last unread for topic-level bookmark links (#14396)
Instead of going to the OP of the topic for topic-level bookmarks
(which are bookmarks where for_topic is true) when clicking on the
bookmark in the quick access menu or on the user bookmark list,
this commit takes the user to the last unread post in
the topic instead. This should be generally more useful than landing
on the unchanging OP.

To make this work nicely, I needed to add the last_read_post_number to
the BookmarkQuery based on the TopicUser association. It should not add
too much extra weight to the query, because it is limited to the user
that we are fetching bookmarks for.

Also fixed an issue where the bookmark serializer highest_post_number was
not taking into account whether the user was staff, which is when we
should use highest_staff_post_number instead.
2021-09-21 13:49:56 +10:00
Alan Guo Xiang Tan 7a8b5cdd5c
DEV: Improve tests coverage when listing private messages. (#14385)
This is in response to the security incident published in
https://github.com/discourse/discourse/security/advisories/GHSA-vm3x-w6jm-j9vv.

The security incident highlighted a gap in our test suite so we're
adding more test cases to ensure that personal and group messages do not
leak between users in the future.
2021-09-21 10:39:59 +08:00
Martin Brennan 0c42a1e5f3
FEATURE: Topic-level bookmarks (#14353)
Allows creating a bookmark with the `for_topic` flag introduced in d1d2298a4c set to true. This happens when clicking on the Bookmark button in the topic footer when no other posts are bookmarked. In a later PR, when clicking on these topic-level bookmarks the user will be taken to the last unread post in the topic, not the OP. Only the OP can have a topic level bookmark, and users can also make a post-level bookmark on the OP of the topic.

I had to do some pretty heavy refactors because most of the bookmark code in the JS topics controller was centred around instances of Post JS models, but the topic level bookmark is not centred around a post. Some refactors were just for readability as well.

Also removes some missed reminderType code from the purge in 41e19adb0d
2021-09-21 08:45:47 +10:00
Martin Brennan 41e19adb0d
DEV: Ignore reminder_type for bookmarks (#14349)
We don't actually use the reminder_type for bookmarks anywhere;
we are just storing it. It has no bearing on the UI. It used
to be relevant with the at_desktop bookmark reminders (see
fa572d3a7a)

This commit marks the column as readonly, ignores it, and removes
the index, and it will be dropped in a later PR. Some plugins
are relying on reminder_type partially so some stubs have been
left in place to avoid errors.
2021-09-16 09:56:54 +10:00
Alan Guo Xiang Tan ddb458343d
PERF: Improve query performance all inbox private messages. (#14304)
First reported in https://meta.discourse.org/t/-/202482/19

There are two optimizations being applied here:

1. Fetch a user's group ids in a seperate query instead of including it
   as a sub-query. When I tried a subquery, the query plan becomes very
inefficient.

1. Join against the `topic_allowed_users` and `topic_allowed_groups`
   table instead of doing an IN against a subquery where we UNION the
`topic_id`s from the two tables. From my profiling, this enables PG to
do a backwards index scan on the `index_topics_on_timestamps_private`
index.

This commit fixes a bug where listing all messages was incorrectly
excluding topics if a topic has been archived by a group even if the
user did not belong to the group.

This commit also fixes another bug where dismissing private messages
selectively was subjected to the default limit of 30.
2021-09-15 10:29:42 +08:00
Martin Brennan 22208836c5
DEV: Ignore bookmarks.topic_id column and remove references to it in code (#14289)
We don't need no stinkin' denormalization! This commit ignores
the topic_id column on bookmarks, to be deleted at a later date.
We don't really need this column and it's better to rely on the
post.topic_id as the canonical topic_id for bookmarks, then we
don't need to remember to update both columns if the bookmarked
post moves to another topic.
2021-09-15 10:16:54 +10:00
Jean 34ff7bfeeb
FEATURE: Hide suspended users from site-wide search to regular users (#14245) 2021-09-06 09:59:35 -04:00
Martin Brennan e43a8af3bd
FIX: Do not send emails to mailing_list_mode subscribers for PMs (#14159)
This bug was introduced by f66007ec83.

In PostJobsEnqueuer we previously did not fire the after_post_create
event and after_topic_create event for private message topics. This was
changed in the above commit in order to publish message bus messages
for topic tracking state updates. Unfortunately this caused the
NotifyMailingListSubscribers job to be enqueued for all posts including
private messages, and admins and the users involved in the PMs got
emailed the contents of the PMs if they had mailing list mode enabled.

Luckily the impact of this was mitigated by a Guardian#can_see? check
for each mailing list mode user in the NotifyMailingListSubscribers job.
We never want to notify mailing list mode subscribers for private messages
so an early return has been added there, plus the logic in PostJobsEnqueuer
has been fixed, and tests have been added to that class where there were
none before.
2021-08-26 15:16:35 +10:00
Alan Guo Xiang Tan d13716286c
FIX: Unread group PMs should use `GroupUser#first_unread_pm_at`. (#14075)
This bug was causing unread PMs for groups to appear inaccurate.
2021-08-18 11:23:28 +08:00
Chema Balsas 745b99edbf TEST: Adds test for urls with url-encoded section hash 2021-08-12 10:43:50 -04:00
Chema Balsas 6b8ee4d5ef TEST: Adds test for urls with section hash 2021-08-12 10:43:50 -04:00
Alan Guo Xiang Tan 0bf27242ec FIX: Group inbox new filter not accounting for dismissed topics.
Follow-up to 2c046cc670
2021-08-05 16:53:12 +08:00
Alan Guo Xiang Tan 2c046cc670 FEATURE: Dismiss new and unread for PM inboxes. 2021-08-05 12:56:15 +08:00
Bianca Nenciu e2c415457c
FEATURE: Attach backup log as upload (#13849)
Discourse automatically sends a private message after backup or
restore finished. The private message used to contain the log inline
even when it was very long. A very long log can create issues because
the length of the post will be over the maximum allowed length of a
post. When that happens, Discourse will try to create an upload with
the logs. If that fails, it will trim the log and inline it.
2021-08-03 20:06:50 +03:00
Bianca Nenciu fbf7627c8e
FIX: Make search work with sub-sub-categories (#13901)
Searching in a category looked only one level down, ignoring the site
setting max_category_nesting. The user interface did not support the
third level of categories and did not display them in the "Categorized"
input of the advanced search options.
2021-08-02 14:04:13 +03:00
Alan Guo Xiang Tan 016efeadf6
FEATURE: New and Unread messages for user personal messages. (#13603)
* FEATURE: New and Unread messages for user personal messages.

Co-authored-by: awesomerobot <kris.aubuchon@discourse.org>
2021-08-02 12:41:41 +08:00
jbrw 2f28ba318c
FEATURE: Onebox can match engines based on the content_type (#13876)
* FEATURE: Onebox can match engines based on the content_type

`FinalDestination` now returns the `content_type` of a resolved URL.

`Oneboxer` passes this value to `Onebox` itself. Onebox engines can now specify a `matches_content_type` regex of content_types that the engine can handle, regardless of the URL.

`ImageOnebox` will match URLs with a content type of `image/png`, `jpg`, `gif`, `bmp`, `tif`, etc.

This will allow images that exist at a URL without a file type extension to be correctly rendered, assuming a valid `content_type` is returned.
2021-07-30 13:36:30 -04:00
Joffrey JAFFEUX 5eb6e9281a
FIX: manually adds frowning_face_with_open_mouth for apple (#13528) 2021-07-21 23:27:20 +02:00
Michael Brown 76a11e6dc9 DEV: fix test (missed a reference to master) 2021-07-19 12:47:45 -04:00
Michael Brown aa12d12c0b discourse/discourse change from 'master' to 'main': update fixture data 2021-07-19 11:46:15 -04:00
David Taylor 8b89787426
SECURITY: Sanitize YouTube Onebox data (#13748)
CVE-2021-32764
2021-07-15 19:31:50 +01:00
jbrw a64aea38b7
FIX: Don’t use `user_generated` images as avatar images in Oneboxed Twitter content (#13712)
By default, Twitter will return the URL for the avatar image of the tweet poster as the `og:image` value.

However, if the `user_generated` attribute is true, we should not use this as the avatar URL as this will be an URL of an image in the tweet itself (e.g., an image belonging to a tweeted news story).
2021-07-13 14:54:28 -04:00
Jarek Radosz 48b92d8897
DEV: Isolate multisite specs (#13634)
Mixing multisite and standard specs can lead to issues (e.g. when using `fab!`)
Disabled the (upcoming https://github.com/discourse/rubocop-discourse/pull/11) rubocop rule for two files that have thoroughly tangled both types of specs.
2021-07-07 18:57:42 +02:00
Penar Musaraj 35110f6681
FIX: Set CSP base-uri to `self` (#13654) 2021-07-07 09:43:48 -04:00
Arpit Jalan 05bdbd9f97
SECURITY: Onebox canonical links bypassing FinalDestination checks (#13605) 2021-07-01 20:09:29 +05:30
Arpit Jalan b63c9febe8
FIX: ignore canonical link to localhost (#13577) 2021-06-30 13:55:17 +05:30
Krzysztof Kotlarek a6363170e9 FIX: flaky search-spec
More precise expectations for search spec
2021-06-29 10:06:44 +08:00
Jarek Radosz 046a875222
DEV: Improve `script/downsize_uploads.rb` (#13508)
* Only shrink images that are used in Posts and no other models
* Don't save the upload if the size is the same
2021-06-24 00:09:40 +02:00
David Taylor d2c5165052 FIX: Check all migrations for dropped columns/tables during restore
Previously only post-deploy migrations were being checked for DROPPED_(COLUMNS|TABLES) constants
2021-06-23 17:43:38 +01:00
Alan Guo Xiang Tan 44aa46ca05 Code review comments. 2021-06-21 11:06:58 +08:00
Alan Guo Xiang Tan 8e3691d537 PERF: Eager load Theme associations in Stylesheet Manager.
Before this change, calling `StyleSheet::Manager.stylesheet_details`
for the first time resulted in multiple queries to the database. This is
because the code was modelled in a way where each `Theme` was loaded
from the database one at a time.

This PR restructures the code such that it allows us to load all the
theme records in a single query. It also allows us to eager load the
required associations upfront. In order to achieve this, I removed the
support of loading multiple themes per request. It was initially added
to support user selectable theme components but the feature was never
completed and abandoned because it wasn't a feature that we thought was
worth building.
2021-06-21 11:06:58 +08:00
jbrw fbfd1fd80b
FIX: Allow SVG uploads if dimensions are a fraction of a unit (#13409)
* FIX: Allow SVG uploads if dimensions are a fraction of a unit

`UploadCreator` counts the number of pixels in an file to determine if it is valid. `pixels` is calculated by multiplying the width and height of the image, as determined by FastImage.

SVG files can have their width/height expressed in a variety of different units of measurement. For example, ‘px’, ‘in’, ‘cm’, ‘mm’, ‘pt’, ‘pc’, etc are all valid within SVG files. If an image has a width of `0.5in`, FastImage may interpret this as being a width of `0`, meaning it will report the `size` as being `0`.

However, we don’t need to concern ourselves with the number of ‘pixels’ in a SVG files, as that is irrelevant for this file format, so we can skip over the check for `pixels == 0` when processing this file type.

* DEV: Speed up getting SVG dimensions

The `-ping` flag prevents the entire image from being rasterized before a result is returned. See:

https://imagemagick.org/script/command-line-options.php#ping
2021-06-17 15:56:11 -04:00
Jarek Radosz e36377d9ab
DEV: Don't user before(:all)/after(:all) (#13389)
Leaking state and non-obvious order (before :all runs *before* RailsHelper.test_setup) are not worth it.
A replacement PR for #13370. Fixes some flaky specs, e.g.
```
bin/rspec './spec/components/freedom_patches/translate_accelerator_spec.rb[1:3]' './spec/jobs/clean_up_user_export_topics_spec.rb[1:1]' --tag ~type:multisite --seed 35994
```

Also included:
* DEV: No need for locale reset (we do it anyway in rails_helper in `test_setup`)
2021-06-15 17:25:06 +02:00
Roman Rizzi fa57316a4e
FIX: Validate upload is still valid after calling the "before_upload_creation" event (#13091)
Since we use the event to perform additional validations on the file, we should check if it added any errors to the upload before saving it. This change makes the UploadCreator more consistent since we no longer have to rely on exceptions.
2021-06-15 10:10:03 -03:00
Penar Musaraj 6f76479054
FEATURE: Add upgrade-insecure-requests to CSP when force_https is enabled (#13348)
If force_https is enabled all resource (including markdown preview and so on) will be accessed using HTTPS

If for any reason you attempt to link to non HTTPS reachable content content may appear broken
2021-06-10 10:53:10 +10:00
Penar Musaraj 8336e732d3
DEV: Add manifest-src to CSP (#13319)
Defaults to `manifest-src: 'self'` and allows plugins/themes to extend it.
2021-06-08 09:32:31 -04:00
Penar Musaraj f90c4bd6a1
DEV: Allow plugins to extend frame-ancestors (#13316) 2021-06-07 14:59:15 -04:00
jbrw 09bc95d46b
FIX: Quoting Oneboxed content should exclude formatting (#13296)
* FIX: Quoting Oneboxed content should exclude formatting

When a post is quoted that includes Oneboxed content, we should not include the formatting generated by the Onebox. Rather, we should attempt to collapse the link referenced by the Onebox to a single line text link.

* DEV: fix tests
2021-06-07 13:03:53 -04:00
Sam 435c4817cb
FEATURE: enable tagging by default (#13175)
Over the years we have found that a few communities never discovered tags.

Instead of having them default off we now have them default on, ensuring
that everyone finds out about them.

Co-authored-by: Dan Ungureanu <dan@ungureanu.me>
2021-06-07 18:07:46 +03:00