* Chinese segmenetation will continue to rely on cppjieba
* Japanese segmentation will use our port of TinySegmenter
* Korean currently does not rely on segmentation which was dropped in c677877e4f
* SiteSetting.search_tokenize_chinese_japanese_korean has been split
into SiteSetting.search_tokenize_chinese and
SiteSetting.search_tokenize_japanese respectively
- Limit bulk re-invite to 1 time per day
- Move bulk invite by csv behind a site setting (hidden by default)
- Bump invite expiry from 30 -> 90 days
## Updates to rate_limiter
When limiting reinvites I found that **staff** are never limited in any way. So I updated the **rate_limiter** model to allow for a few things:
- add an optional param of `staff_limit`, which (when included and passed values, and the user passes `.staff?`) will override the default `max` & `secs` values and apply them to the user.
- in the case you **do** pass values to `staff_limit` but the user **does not** pass `staff?` the standard `max` & `secs` values will be applied to the user.
This should give us enough flexibility to
1. continue to apply a strict rate limit to a standard user
2. but also apply a secondary (less strict) limit to staff
Non-staff users are not allowed to see whisper so this change prevents
non-staff user from seeing a like count that does not make sense to
them. In the future, we might consider adding another like count column
for staff user.
Follow-up to 4492718864
When creating a direct message to a group with group SMTP
set up, and adding another person to that message in the OP,
we send an email to the second person in the OP via the group_smtp
job. This in turn creates an IncomingEmail record to guard against
IMAP double sync.
The issue with this was that this IncomingEmail (which is essentialy
a placeholder/dummy one) was having its Message-ID used as the canonical
References Message-ID for subsequent emails sent out to user_private_message
recipients (such as members of the group), causing threading issues in
the mail client. The canonical <topic/ID@HOST> format should be used
instead for these cases.
This commit fixes the issue by only using the IncomingEmail for the
OP's Message-ID if the OP was created via our handle_mail email receiver
pipeline. It does not make sense to use it in other cases.
When the record is not saved, we should display a proper message.
One potential reason can be plugins for example discourse-calendar is specifying that only first post can contain event
* FIX: Remove svg icons from webmanifest shortcuts
While SVGs are valid in the webmanifest, Chromium has not implemented
support for it in this specific manifest member.
Revert when https://bugs.chromium.org/p/chromium/issues/detail?id=1091612
lands.
* fix test
In an earlier PR, we decided that we only want to block a domain if
the blocked domain in the SiteSetting is the final destination (/t/59305). That
PR used `FinalDestination#get`. `resolve` however is used several places
but blocks domains along the redirect chain when certain options are provided.
This commit changes the default options for `resolve` to not do that. Existing
users of `FinalDestination#resolve` are
- `Oneboxer#external_onebox`
- our onebox helper `fetch_html_doc`, which is used in amazon, standard embed
and youtube
- these folks already go through `Oneboxer#external_onebox` which already
blocks correctly
Sometimes plugins need to have additional data or options available
when rendering custom markdown features/rules that are not available
on the default opts.discourse object. These additional options should
be namespaced to the plugin adding them.
```
Site.markdown_additional_options["chat"] = { limited_pretty_text_markdown_rules: [] }
```
These are passed down to markdown rules on opts.discourse.additionalOptions.
The main motivation for adding this is the chat plugin, which currently stores
chat_pretty_text_features and chat_pretty_text_markdown_rules on
the Site object via additions to the serializer, and the Site object is
not accessible to import via markdown rules (either through
Site.current() or through container.lookup). So, to have this working
for both front + backend code, we need to attach these additional options
from the Site object onto the markdown options object.
When staff visits the user profile of another user, the `email` field
in the model is empty. In this case, staff cannot send the reset email
password because nothing is passed in the `login` field.
This commit changes the behavior for staff users to allow resetting
password by username instead.
This commit fixes a bug where we our `HTMLScrubber` was only searching
for emoji img tags which contains only the "emoji" class. However, our emoji image tags
may contain more than just the "emoji" class like "only-emoji" when an
emoji exists by itself on a single line.
If a model class calls preload_custom_fields twice then
we have to clear this otherwise the fields are cached inside the
already existing proxy and no new ones are added, so when we check
for custom_fields[KEY] an error is likely to occur
In the unlikely, but possible, scenario where a user has no email_tokens, and has an invite record for their email address, login would fail. This commit fixes the `Invite` `user_doesnt_already_exist` validation so that it only applies to new invites, or when changing the email address.
This regressed in d8fe0f4199 (based on `git bisect`)
The UI used to request a password reset by username when the user was
logged in. This did not work when hide_email_already_taken site setting
was enabled, which disables the lookup-by-username functionality.
This commit also introduces a check to ensure that the parameter is an
email when hide_email_already_taken is enabled as the single allowed
type is email (no usernames are allowed).
Our previous implementation used a simple `blocked_domain_array.include?(hostname)`
so some values were not matching. Additionally, in some configurations like ours, we'd used
"cat.*.dog.com" with the assumption we'd support globbing.
This change implicitly allows globbing by blocking "http://a.b.com" if "b.com" is a blocked
domain but does not actively do anything for "*".
An upcoming change might include frontend validation for values that can be inserted.
* FIX: Tag watching for everyone tag groups
Tags in tag groups that have permissions set to everyone were not able
to be saved correctly. A user on their preferences page would mark the
tags that they wanted to save, but the watched_tags in the response
would be empty. This did not apply to admins, just regular users. Even
though the watched tags were being saved in the db, the user serializer
response was filtering them out. When a user refreshed their preferences
pages it would show zero watched tags.
This appears to be a regression introduced by:
0f598ca51e
The issue that needed to be fixed is that we don't track the "everyone"
group (which has an id of 0) in the group_users table. This is because
everyone has access to it, so why fill a row for every single user, that
would be a lot. The fix was to update the query to include tag groups
that had permissions set to the "everyone" group (group_id 0).
I also added another check to the existing spec for updating
watched tags for tags that aren't in a tag group so that it checks the
response body. I then added a new spec which updates watched tags for
tags in a tag group which has permissions set to everyone.
* Resolve failing tests
Improve SQL query syntax for including the "everyone" group with the id
of 0.
This commit also fixes a few failing tests that were introduced. It
turns out that the Fabrication of the Tag Group Permissions was faulty.
What happens when creating the tag groups without any permissions is
that it sets the permission to "everyone". If we then follow up with
fabricating a tag group permission on the tag group instead of having a
single permission it will have 2 (everyone + the group specified)! We
don't want this. To fix it I removed the fabrication of tag group
permissions and just set the permissions directly when creating the tag
group.
* Use response.parsed_body instead of JSON.parse
* FIX: Mark invites flash messages as HTML safe.
This change should be safe as all user inputs included in the errors are sanitized before sending it back to the client.
Context: https://meta.discourse.org/t/html-tags-are-explicit-after-latest-update/214220
* If somebody adds a new error message that includes user input and doesn't sanitize it, using html-safe suddenly becomes unsafe again. As an extra layer of protection, we make the client sanitize the error message received from the backend.
* Escape user input instead of sanitizing
* FEATURE: Export topics to markdown
The route `/raw/TOPIC_ID` will now export whole topics (paginated to 100
posts) in a markdown format.
See https://meta.discourse.org/t/-/152185/12
If the SiteSetting `allowed_onebox_iframes` contains a value of `*`, it will use the values of `all_iframe_origins` during the Oneboxing process. If `all_iframe_origins` itself contains a value of `*`, `origins_to_regexes` will try to return a "catch-all" regex.
Other code assumes `origins_to_regexes`will return an array, so this change ensures the `*` case will return an array containing only the catch-all regex.
1. `html_doc.css('.Box.md')` always returns a truthy value (e.g. `[]`) so the second branch of the if-elsif never ran
2. `node&.css('text()')` was invalid code that would raise an error
3. Matching on h3 elements is no longer correct with the current html structure returned by GitHub
* FIX: Pass category and tag IDs to the emit webhook event job.
Like webhooks won't fire when they're scoped to specific categories or tags because we're not passing the data to the job that emits it.
* Update config/initializers/012-web_hook_events.rb
Co-authored-by: Dan Ungureanu <dan@ungureanu.me>
Co-authored-by: Dan Ungureanu <dan@ungureanu.me>
This reverts commit 2c7906999a.
The changes break some things in local development (putting JS files
into minified files, not allowing debugger, and others)
This reverts commit ea84a82f77.
This is causing problems with `/theme-qunit` on legacy, non-ember-cli production sites. Reverting while we work on a fix
This is quite complex as it means that in production we have to build
Ember CLI test files and allow them to be used by our Rails application.
There is a fair bit of glue we can remove in the future once we move to
Ember CLI completely.
An admin could search for all screened ip addresses in a block by
using wildcards. 192.168.* returned all IPs in range 192.168.0.0/16.
This feature allows admins to search for a single IP address in all
screened IP blocks. 192.168.0.1 returns all IP blocks that match it,
for example 192.168.0.0/16.
* FEATURE: Remove roll up button for screened IPs
* FIX: Match more specific screened IP address first
The new warnings cover more cases and more accurate. Most of the
warnings will be visible only to staff members because otherwise they
would leak information about user's preferences.
This commit extends the options which can be passed to
`PrettyText.markdown` so that which Markdown-it rules and Discourse
Markdown plugins to be used when rendering a text can be customizable.
Currently, this extension is mainly used by plugins.
The warnings on git 2.28+ are:
```
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
```
Also:
* Remove an unused method (#fill_email)
* Replace a method that was used just once (#generate_username) with `SecureRandom.alphanumeric`
* Remove an obsolete dev puma `tmp/restart` file logic
Adding a spec for documenting the delete post API endpoint for our api
docs. As part of this added detailed info for the `force_destroy`
parameter for permanently deleting a post.
It is too close to release of 2.8 for incomplete
feature shenanigans. Ignores and drops the columns and drops
the trigger/function introduced in
e21c640a3c.
Will pick this feature back up post-release.
We are planning on attaching bookmarks to more and
more other models, so it makes sense to make a polymorphic
relationship to handle this. This commit adds the new
columns and backfills them in the bookmark table, and
makes sure that any new bookmark changes fill in the columns
via DB triggers.
This way we can gradually change the frontend and backend
to use these new columns, and eventually delete the
old post_id and for_topic columns in `bookmarks`.
Tests fail in Ruby 3.0 and later due to separation of positional and
keyword arguments. RSpec treats the hash at the end of include_examples
as keyword arguments when it should be passed as a positional argument.
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
The rake task deleted here was added back in Feb 2020
when bookmarks were first converted from PostAction
records, it is no longer needed. The ignored columns
were removed in ed83d7573e.
This commit adds a check that runs regularly as per
2d68e5d942 which tests the
credentials of groups with SMTP or IMAP enabled. If any issues
are found with those credentials a high priority problem is added to the
admin dashboard.
This commit also formats the admin dashboard differently if
there are high priority problems, bringing them to the top of
the list and highlighting them.
The problem will be cleared if the issue is fixed before the next
problem check, or if the group's settings are updated with a valid
credential.
You can add callbacks that get called before updating an already consolidated notification or creating a consolidated one.
Instances of this rule can add callbacks to access the old notifications about to be destroyed or the consolidated one and add additional data inside the data hash versus having to execute extra queries when adding this logic inside the `set_mutations` block.
I plan to use this in an upcoming discourse-reactions PR, where I want to like a post without notifying the user, so I can instead create a reaction notification.
Additionally, we decouple the a11y attributes from the icon itself, which will let us extend the widget's icon without losing them.
This allows authenticators to instruct the Auth::Result to override attributes without using the general site settings. This provides an easy migration path for auth plugins which offer their own "overrides email", "overrides username" or "overrides name" settings. With this new api, they can set `overrides_*` on the result object, and the attribute will be overriden regardless of the general site setting.
ManagedAuthenticator is updated to use this new API. Plugins which consume ManagedAuthenticator will instantly take advantage of this change.
The previous default aspect ratio for cropping tall images was a little
too strict and was cutting off images. This new setting should allow for
a larger range of image sizes before cropping them.
This commit adds API documentation for the new upload
endpoints related to direct + multipart external uploads.
Also included is a rake task which watches the files in
the spec/requests/api directory and calls a script file
(spec/regenerate_swagger_docs) whenever one changes. This
script runs rake rswag:specs:swaggerize and then copies
the openapi.yml file over to the discourse_api_docs repo
directory, and hits a script there to convert the YML to
JSON so the API docs are refreshed while the server is
still running. This makes the loop of making a doc change
and seeing it in the local server much faster.
The rake task is rake autospec:swagger
* FEATURE: hide_email_address_taken forces use of email in forgot password form
This strengthens this site setting which is meant to be used to harden sites
that are experiencing abuse on forgot password routes.
Previously we would only deny letting people know if forgot password worked on not
New change also bans usage of username for forgot password when enabled
This commit introduces scheduled problem checks for the admin dashboard, which are long running or otherwise cumbersome problem checks that will be run every 10 minutes rather than every time the dashboard is loaded. If these scheduled checks add a problem, the problem will remain until it is cleared or until the scheduled job runs again.
An example of a check that should be scheduled is validating credentials against an external provider.
This commit also introduces the concept of a `priority` to the problems generated by `AdminDashboardData` and the scheduled checks. This is `low` by default, and can be set to `high`, but this commit does not change any part of the UI with this information, only adds a CSS class.
I will be making a follow up PR to check group SMTP credentials.
When attempting to Onebox a page if there is no `meta property="og:description"` tag but there is a `meta name="description"` tag, Onebox should try to use that value.
Discourse sent only translation overrides for the current language to the client instead of sending overrides from fallback locales as well. This especially impacted en_GB -> en since most overrides would be done in English instead of English (UK).
This also adds lots of tests for previously untested code.
There's a small caveat: The client currently doesn't handle fallback locales for MessageFormat strings. That is why overrides for those strings always have a higher priority than regular translations. So, as an example, the lookup order for MessageFormat strings in German is:
1. override for de
2. override for en
3. value from de
4. value from en
As an example, the lookup order for German was:
1. override for de
2. override for en
3. value from de
4. value from en
After this change the lookup order is the same as on the client:
1. override for de
2. value from de
3. override for en
4. value from en
see /t/16381
Some time ago, we made this fix to external authentication – https://github.com/discourse/discourse/pull/13706. We didn't address Discourse Connect (https://meta.discourse.org/t/discourseconnect-official-single-sign-on-for-discourse-sso/13045) at that moment, so I wanted to fix it for Discourse Connect as well.
Turned out though that Discourse Connect doesn't contain this problem and already handles staged users correctly. This PR adds tests that confirm it. Also, I've extracted two functions in Discourse Connect implementation along the way and decided to merge this refactoring too (the refactoring is supported with tests).
A plugin API that allows customizing existing topic-backed static pages, like:
faq, tos, privacy (see: StaticController) The block passed to this
method has to return a SiteSetting name that contains a topic id.
```
add_topic_static_page("faq") do |controller|
current_user&.locale == "pl" ? "polish_faq_topic_id" : "faq_topic_id"
end
```
You can also add new pages in a plugin, but remember to add a route,
for example:
```
get "contact" => "static#show", id: "contact"
```
OmniAuth test mode is disabled by default, so that we can integration-test the omniauth strategies. Sometimes, we manually enable test mode for specific specs. This commit ensures that test_mode is always disabled again after each spec.
If a theme name contained a double-quote, this problem could lead to invalid/unexpected HTML in the `<head>`
Note that this is not considered a security issue because themes can only be installed/named by administrators, and themes/administrators already have the ability to run arbitrary javascript.
We can fake redis transactions so that `fab!` works for redis and PG
data, but it's too slow to be used indiscriminately. Instead, you can
opt into it with the `use_redis_snapshotting` helper.
Insofar as snapshotting allows us to `fab!` more things, it provides a
speedup.
This is a fix to address blurry onebox favicon images if the site you
are linking to happens to have a favicon.ico file that contains multiple
images.
This fix detects of we are trying to create an upload for a favicon.ico
file. We then convert it to a png and not a jpeg like we were doing. We
want a png because it will preserve transparency, otherwise if we
convert it to a jpeg we lose that and it looks bad on dark themed sites.
This fix also addresses the fact that .ico files can include multiple
images. The blurry images we were producing was caused by the
ImageMagick `-flatten` option when the .ico file had multiple images
which then squishes them all together. So for .ico files we are no
longer flattening them and instead we are grabbing the last image in the
.ico bundle and converting that single image to a png.
We previously used ConsolidateNotifications with a threshold of 1 to re-use an existing notification and bump it to the top instead of creating a new one. It produces some jumpiness in the user notification list, and it relies on updating the `created_at` attribute, which is a bit hacky.
As a better alternative, we're introducing a new plan that deletes all the previous versions of the notification, then creates a new one.
We send the reminder using the GroupMessage class, which supports removing previous messages. We can't match them by raw because they could mention different moderators. Also, I had to change the subject to remove dynamically generated values, which is necessary for finding them.
This commit introduces a new site setting "google_oauth2_hd_groups". If enabled, group information will be fetched from Google during authentication, and stored in the Discourse database. These 'associated groups' can be connected to a Discourse group via the "Membership" tab of the group preferences UI.
The majority of the implementation is generic, so we will be able to add support to more authentication methods in the near future.
https://meta.discourse.org/t/managing-group-membership-via-authentication/175950
We don't need it anymore. Actually, I removed using of it on the client side a long time ago, when I was working on improving blank page syndrome on user activity pages (see https://github.com/discourse/discourse/pull/14311).
This PR also removes some old resource strings that we don't use anymore. We have new strings for blank pages.
Since 3b13f1146b the email threading
in mail clients has been broken, because the random suffix meant
that the References header would always be different for non-group
SMTP email notifications sent out.
This commit fixes the issue by always using the "canonical" topic
reference ID inside the References header in the format:
topic/TOPIC_ID@HOST
Which was the old format. We also add the References header to
notifications sent for the first post arriving, so the threading
works for subsequent emails. The Message-ID header is still random
as per the previous change.
Some tests don't pass when this is elevated. They should be fixed,
since, at some point, we may create enough uploads during tests that
they fail naturally.
Currently we display pending posts in topics (both for author and staff
members) but the feature is only enabled when there’s an enabled global site
setting related to moderation.
This patch allows to have the same behavior for a site where there’s
nothing enabled globally but where a moderated category exists. So when
browsing a topic of a moderated category, the presence of pending posts
will be checked whereas nothing will happen in a normal category.
Currently the Message-IDs we send out for outbound email
are not unique; for a post they look like:
topic/TOPIC_ID/POST_ID@HOST
And for a topic they look like:
topic/TOPIC_ID@HOST
This commit changes the outbound Message-IDs to also have
a random suffix before the host, so the new format is
like this:
topic/TOPIC_ID/POST_ID.RANDOM_SUFFIX@HOST
Or:
topic/TOPIC_ID.RANDOM_SUFFIX@HOST
This should help with email deliverability. This change
is backwards-compatible, the old Message-ID format will
still be recognized in the mail receiver flow, so people
will still be able to reply using Message-IDs, In-Reply-To,
and References headers that have already been sent.
This commit also refactors Message-ID related logic
to a central location, and adds judicious amounts of
tests and documentation.
Documenting the `/u/:username:/emails.json` endpoint.
Also removing some email fields from user api responses because they
aren't actually included in the response unless you are querying
yourself.
When redirecting to login, we store a destination_url cookie, which the user is then redirected to after login. We never want the user to be redirected to a JSON URL. Instead, we should return a 403 in these situations.
This should also be much less confusing for API consumers - a 403 is a better representation than a 302.
Related to: 20f736aa11.
`auto_update` is true by default at the database level, but it doesn't make sense for `auto_update` to be true on themes that are not imported from a Git repository.
Under some conditions, these varied responses could lead to cache poisoning, hence the 'security' label.
Previously the Rails application would serve JSON data in place of HTML whenever Ember CLI requested an `application.html.erb`-rendered page. This commit removes that logic, and instead parses the HTML out of the standard response. This means that Rails doesn't need to customize its response for Ember CLI.
* DEV: Specify bookmarks order
It's better to order by id than to have a semi-random order. Fixes a flaky test:
```
1) TopicView with a few sample posts #bookmarks gets the first post bookmark reminder at for the user
59
Failure/Error: expect(first[:post_id]).to eq(bookmark1.post_id)
60
61
expected: 1901
62
got: 1902
63
64
(compared using ==)
65
# ./spec/components/topic_view_spec.rb:420:in `block (4 levels) in <main>'
66
# ./spec/rails_helper.rb:284:in `block (2 levels) in <top (required)>'
67
# ./vendor/bundle/ruby/2.7.0/gems/webmock-3.14.0/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
68
```
* Change test
* Revert "DEV: Specify bookmarks order"
This reverts commit 1f50026231.
* REFACTOR: Improve support for consolidating notifications.
Before this commit, we didn't have a single way of consolidating notifications. For notifications like group summaries, we manually removed old ones before creating a new one. On the other hand, we used an after_create callback for likes and group membership requests, which caused unnecessary work, as we need to delete the record we created to replace it with a consolidated one.
We now have all the consolidation rules centralized in a single place: the consolidation planner class. Other parts of the app looking to create a consolidable notification can do so by calling Notification#consolidate_or_save!, instead of the default Notification#create! method.
Finally, we added two more rules: one for re-using existing group summaries and another for deleting duplicated dashboard problems PMs notifications when the user is tracking the moderator's inbox. Setting the threshold to one forces the planner to apply this rule every time.
I plan to add plugin support for adding custom rules in another PR to keep this one relatively small.
* DEV: Introduces a plugin API for consolidating notifications.
This commit removes the `Notification#filter_by_consolidation_data` scope since plugins could have to define their criteria. The Plan class now receives two blocks, one to query for an already consolidated notification, which we'll try to update, and another to query for existing ones to consolidate.
It also receives a consolidation window, which accepts an ActiveSupport::Duration object, and filter notifications created since that value.
Recently, the wrong new behavior appeared – we started to suggest to invited users usernames like "user1".
To reproduce:
1. Create an invitation with default settings, do not restrict it to email
2. Copy an invitation link and follow it in incognito mode
See username already filled, with eg “user1”. See screenshot. Should be empty.
This bug was very likely introduced by my recent changes to UserNameSuggester.
We have a couple of site setting, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.
When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt` which leaves the site owners no options if a crawler is crawling the site too aggressively.
This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged in user, Discourse will check the User Agent string and if it contains one of the values of the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting) and the rest of requests made within the same interval will get a 429 response.
The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR since it could rate limit lots if not all of anonymous traffic if the setting is not used appropriately. So to protect against this scenario, we've added a couple of new validations to the setting when it's changed:
1) each value added to setting must 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
Currently when a user creates posts that are moderated (for whatever
reason), a popup is displayed saying the post needs approval and the
total number of the user’s pending posts. But then this piece of
information is kind of lost and there is nowhere for the user to know
what are their pending posts or how many there are.
This patch solves this issue by adding a new “Pending” section to the
user’s activity page when there are some pending posts to display. When
there are none, then the “Pending” section isn’t displayed at all.
* FEATURE: Optionally send a 'noindex' header in non-canonical responses
This will be used in a SEO experiment.
Co-authored-by: David Taylor <david@taylorhq.com>
This commit adds token_hash and scopes columns to email_tokens table.
token_hash is a replacement for the token column to avoid storing email
tokens in plaintext as it can pose a security risk. The new scope column
ensures that email tokens cannot be used to perform a different action
than the one intended.
To sum up, this commit:
* Adds token_hash and scope to email_tokens
* Reuses code that schedules critical_user_email
* Refactors EmailToken.confirm and EmailToken.atomic_confirm methods
* Periodically cleans old, unconfirmed or expired email tokens
When this setting is turned on, it will check that normalized emails
are unique. Normalized emails are emails without any dots or plus
aliases.
This setting can be used to block use of aliases of the same email
address.
Use @here to mention all users that were allowed to topic directly or
through group, who liked topics or read the topic. Only first 10 users
will be notified.
Allow current user to keep existent tags when adding or removing a tag.
For example, a user could not remove a tag from a topic if the topic
had another tag that was restricted to a different category.
The users all shared the same `User#last_seen_at` column so depending on
how the database returned the records, the user that we're interested in
may be excluded from the update query.
Follow-up to 8226ab1099
In b8c8909a9d, we introduced a regression
where users may have had their `UserStat.first_unread_pm_at` set
incorrectly. This commit introduces a migration to reset `UserStat.first_unread_pm_at` back to
`User#created_at`.
Follow-up to b8c8909a9d.
Similar to site settings, adds support for `refresh` option to theme settings.
```yaml
super_feature_enabled:
type: bool
default: false
refresh: true
```
We are pushing /notification-alert/#{user_id} and /notification/#{user_id}
messages to MessageBus from both PostAlerter and User#publish_notification_state.
This can cause memory issues on large sites with many users. This commit
stems the bleeding by only sending these alert messages if the user
in question has been seen in the last 30 days, which eliminates a large
chunk of users on some sites.
When rendering the markdown code blocks we replace the
offending characters in the output string with spans highlighting a textual
representation of the character, along with a title attribute with
information about why the character was highlighted.
The list of characters stripped by this fix, which are the bidirectional
characters considered relevant, are:
U+202A
U+202B
U+202C
U+202D
U+202E
U+2066
U+2067
U+2068
U+2069
Previously, incorrect reply counts are displayed in the "top categories" section of the user summary page since we included the `moderator_action` and `small_action` post types.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
This commit refactors the direct external upload routes (get presigned
put, complete external, create/abort/complete multipart) into a
helper which is then included in both BackupController and the
UploadController. This is done so UploadController doesn't need
strange backup logic added to it, and so each controller implementing
this helper can do their own validation/error handling nicely.
This is a follow up to e4350bb966
Currently, Discourse rate limits all incoming requests by the IP address they
originate from regardless of the user making the request. This can be
frustrating if there are multiple users using Discourse simultaneously while
sharing the same IP address (e.g. employees in an office).
This commit implements a new feature to make Discourse apply rate limits by
user id rather than IP address for users at or higher than the configured trust
level (1 is the default).
For example, let's say a Discourse instance is configured to allow 200 requests
per minute per IP address, and we have 10 users at trust level 4 using
Discourse simultaneously from the same IP address. Before this feature, the 10
users could only make a total of 200 requests per minute before they got rate
limited. But with the new feature, each user is allowed to make 200 requests
per minute because the rate limits are applied on user id rather than the IP
address.
The minimum trust level for applying user-id-based rate limits can be
configured by the `skip_per_ip_rate_limit_trust_level` global setting. The
default is 1, but it can be changed by either adding the
`DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the
desired value to your `app.yml`, or changing the setting's value in the
`discourse.conf` file.
Requests made with API keys are still rate limited by IP address and the
relevant global settings that control API keys rate limits.
Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters
string that Discourse used to lookup the current user from the database and the
cookie contained no additional information about the user. However, we had to
change the cookie content in this commit so we could identify the user from the
cookie without making a database query before the rate limits logic and avoid
introducing a bottleneck on busy sites.
Besides the 32 characters auth token, the cookie now includes the user id,
trust level and the cookie's generation date, and we encrypt/sign the cookie to
prevent tampering.
Internal ticket number: t54739.
When 31035010af
was done it failed to take into account the case where the smtp_enabled
site setting was true, but the topic had no allowed groups / no
incoming email record, which caused errors for topics even with
nothing to do with group SMTP.
Uppy adds the file name as the "name" parameter in the
payload by default, which means that for things like the
emoji uploader which have a name param used by the controller,
that param will be passed as the file name. We already use
the existing file name if the name param is null, so this
commit just does further cleanup of the name param, removing
the extension if it is a filename so we don't end up with
emoji names like blah_png.
When there are multiple groups on a topic, we were selecting
the first from the topic allowed groups to act as the sender
email address when sending group SMTP replies via PostAlerter.
However, this was not ordered, and since there is no created_at
column on TopicAllowedGroup we cannot order this nicely, which
caused just a random group to be used (based on whatever postgres
decided it felt like that morning).
This commit changes the group used for SMTP sending to be the
group using the email_username of the to address of the first
incoming email for the topic, if there are more than one allowed
groups on the topic. Otherwise it just uses the only SMTP enabled
group.
Sometimes, a user may have a malformed email such as
`test@test.com<mailto:test@test.com` their email address,
and as a topic participant will be included as a CC email
when sending a GroupSmtpEmail. This causes the CC parsing to
fail and further down the line in Email::Sender the code
to check the CC addresses expects an array but gets a string
instead because of the parse failure.
Instead, we can just check if the CC addresses are valid
and drop them if they are not in the GroupSmtpEmail job.
This only affects multisite Discourse instances (where multiple forums are served from a single application server). The vast majority of self-hosted Discourse forums do not fall into this category.
On affected instances, this vulnerability could allow encrypted session cookies to be re-used between sites served by the same application instance.
Uppy and Resumable slice up their chunks differently, which causes a difference
in this algorithm. Let's take a 131.6MB file (137951695 bytes) with a 5MB (5242880 bytes)
chunk size. For resumable, there are 26 chunks, and uppy there are 27. This is
controlled by forceChunkSize in resumable which is false by default. The final
chunk size is 6879695 (chunk size + remainder) whereas in uppy it is 1636815 (just remainder).
This means that the current condition of uploaded_file_size + current_chunk_size >= total_size
is hit twice by uppy, because it uses a more correct number of chunks. This
can be solved for both uppy and resumable by checking the _previous_ chunk
number * chunk_size as the uploaded_file_size.
An example of what is happening before that change, using the current
chunk number to calculate uploaded_file_size.
chunk 26: resumable: uploaded_file_size (26 * 5242880) + current_chunk_size (6879695) = 143194575 >= total_size (137951695) ? YES
chunk 26: uppy: uploaded_file_size (26 * 5242880) + current_chunk_size (5242880) = 141557760 >= total_size (137951695) ? YES
chunk 27: uppy: uploaded_file_size (27 * 5242880) + current_chunk_size (1636815) = 143194575 >= total_size (137951695) ? YES
An example of what this looks like after the change, using the previous
chunk number to calculate uploaded_file_size:
chunk 26: resumable: uploaded_file_size (25 * 5242880) + current_chunk_size (6879695) = 137951695 >= total_size (137951695) ? YES
chunk 26: uppy: uploaded_file_size (25 * 5242880) + current_chunk_size (5242880) = 136314880 >= total_size (137951695) ? NO
chunk 27: uppy: uploaded_file_size (26 * 5242880) + current_chunk_size (1636815) = 137951695 >= total_size (137951695) ? YES
Same issue as 28b00dc6fc, the
Mocha::ExpectationError inherits from Exception instead
of StandardError so RspecErrorTracker does not show the
actual failed expectation in request specs, the status of
the response is just 500 with no further detail.
This commit adds the RailsMultisite middleware in test mode when Rails.configuration.multisite is true. This allows for much more realistic integration testing. The `multisite_spec.rb` file is rewritten to avoid needing to simulate a middleware stack.
This PR introduces a new `enable_experimental_backup_uploads` site setting (default false and hidden), which when enabled alongside `enable_direct_s3_uploads` will allow for direct S3 multipart uploads of backup .tar.gz files.
To make multipart external uploads work with both the S3BackupStore and the S3Store, I've had to move several methods out of S3Store and into S3Helper, including:
* presigned_url
* create_multipart
* abort_multipart
* complete_multipart
* presign_multipart_part
* list_multipart_parts
Then, S3Store and S3BackupStore either delegate directly to S3Helper or have their own special methods to call S3Helper for these methods. FileStore.temporary_upload_path has also removed its dependence on upload_path, and can now be used interchangeably between the stores. A similar change was made in the frontend as well, moving the multipart related JS code out of ComposerUppyUpload and into a mixin of its own, so it can also be used by UppyUploadMixin.
Some changes to ExternalUploadManager had to be made here as well. The backup direct uploads do not need an Upload record made for them in the database, so they can be moved to their final S3 resting place when completing the multipart upload.
This changeset is not perfect; it introduces some special cases in UploadController to handle backups that was previously in BackupController, because UploadController is where the multipart routes are located. A subsequent pull request will pull these routes into a module or some other sharing pattern, along with hooks, so the backup controller and the upload controller (and any future controllers that may need them) can include these routes in a nicer way.
We are only using list_multipart_parts right now in the
uploads controller for multipart uploads to check if the
upload exists; thus we don't need up to 1000 parts.
Also adding a note for future explorers that list_multipart_parts
only gets 1000 parts max, and adding params for max parts
and starting parts.
Support for invites alongside DiscourseConnect was added in 355d51af. This commit fixes the guardian method so that the bulk invite button functionality also works.
* FIX: Preserve field types when updating revision
When a post was edited quickly twice by the same user, the old post
revision was updated with the newest changes. To check if the change
was reverted (i.e. rename topic A to B and then back to A) a comparison
of the initial value and last value is performed. If the check passes
then the intermediary value is dismissed and only the initial value and
the last ones are preserved. Otherwise, the modification is dismissed
because the field returned to its initial value.
This used to work well for most fields, but failed for "tags" because
the field is an array and the values were transformed to strings to
perform the comparison.
* FIX: Reset last_editor_id if revision is reverted
If a post was revised and then the same revision was reverted,
last_editor_id was still set to the ID of the user who last edited the
post. This was a problem because the same person could then edit the
same post again and because it was the same user and same post, the
system attempted to update the last one (that did not exist anymore).
This commit introduces a new s3:ensure_cors_rules rake task
that is run as a prerequisite to s3:upload_assets. This rake
task calls out to the S3CorsRulesets class to ensure that
the 3 relevant sets of CORS rules are applied, depending on
site settings:
* assets
* direct S3 backups
* direct S3 uploads
This works for both Global S3 settings and Database S3 settings
(the latter set directly via SiteSetting).
As it is, only one rule can be applied, which is generally
the assets rule as it is called first. This commit changes
the ensure_cors! method to be able to apply new rules as
well as the existing ones.
This commit also slightly changes the existing rules to cover
direct S3 uploads via uppy, especially multipart, which requires
some more headers.
FinalDestination's follow_canonical mode used for embedded topics should work when canonical URLs are relative, as specified in [RFC 6596](https://datatracker.ietf.org/doc/html/rfc6596)
Previously, suppressed category topics are included in the digest emails if the user visited that topic before and the `TopicUser` record is created with any notification level except 'muted'.
We are no longer able to display the image returned by Instagram directly within a Discourse site (either in the composer, or within a cooked post within a topic), so:
- Display an image placeholder in the composer preview
- A cooked post should use an iframe to display the Instagram 'embed' content
* DEV: Output webmock errors in request specs
In request specs, if you had not properly mocked an external
HTTP call, you would end up with a 500 error with no further
information instead of your expected response code, with an
rspec output like this:
```
Failures:
1) UploadsController#generate_presigned_put when the store is external generates a presigned URL and creates an external upload stub
Failure/Error: expect(response.status).to eq(200)
expected: 200
got: 500
(compared using ==)
# ./spec/requests/uploads_controller_spec.rb:727:in `block (4 levels) in <top (required)>'
# ./spec/rails_helper.rb:280:in `block (2 levels) in <top (required)>'
```
This is not helpful at all when you want to find what you actually
failed to mock, which is shown straight away in non-request specs.
This commit introduces a rescue_from block in the application
controller to log this error, so we have a much nicer output that
helps the developer find the issue:
```
Failures:
1) UploadsController#generate_presigned_put when the store is external generates a presigned URL and creates an external upload stub
Failure/Error: expect(response.status).to eq(200)
expected: 200
got: 500
(compared using ==)
# ./spec/requests/uploads_controller_spec.rb:727:in `block (4 levels) in <top (required)>'
# ./spec/rails_helper.rb:280:in `block (2 levels) in <top (required)>'
# ------------------
# --- Caused by: ---
# WebMock::NetConnectNotAllowedError:
# Real HTTP connections are disabled. Unregistered request: GET https://s3-upload-bucket.s3.us-west-1.amazonaws.com/?cors with headers {'Accept'=>'*/*', 'Accept-Encoding'=>'', 'Authorization'=>'AWS4-HMAC-SHA256 Credential=some key/20211101/us-west-1/s3/aws4_request, SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date, Signature=test', 'Host'=>'s3-upload-bucket.s3.us-west-1.amazonaws.com', 'User-Agent'=>'aws-sdk-ruby3/3.121.2 ruby/2.7.1 x86_64-linux aws-sdk-s3/1.96.1', 'X-Amz-Content-Sha256'=>'test', 'X-Amz-Date'=>'20211101T035113Z'}
#
# You can stub this request with the following snippet:
#
# stub_request(:get, "https://s3-upload-bucket.s3.us-west-1.amazonaws.com/?cors").
# with(
# headers: {
# 'Accept'=>'*/*',
# 'Accept-Encoding'=>'',
# 'Authorization'=>'AWS4-HMAC-SHA256 Credential=some key/20211101/us-west-1/s3/aws4_request, SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date, Signature=test',
# 'Host'=>'s3-upload-bucket.s3.us-west-1.amazonaws.com',
# 'User-Agent'=>'aws-sdk-ruby3/3.121.2 ruby/2.7.1 x86_64-linux aws-sdk-s3/1.96.1',
# 'X-Amz-Content-Sha256'=>'test',
# 'X-Amz-Date'=>'20211101T035113Z'
# }).
# to_return(status: 200, body: "", headers: {})
#
# registered request stubs:
#
# stub_request(:head, "https://s3-upload-bucket.s3.us-west-1.amazonaws.com/")
#
# ============================================================
```
* DEV: Require webmock in application controller if rails.env.test
* DEV: Rescue from StandardError and NetConnectNotAllowedError
The `白名单` term becomes `名单 白名单` after it is processed by
cppjieba in :query mode. However, `白名单` is not tokenized as such by cppjieba when it
appears in a string of text. Therefore, this may lead to failed matches as
the search data generated while indexing may not contain all of the
terms generated by :query mode. We've decided to maintain parity for now
such that both indexing and querying uses the same :mix mode. This may
lead to less accurate search but our plan is to properly support CJK
search in the future.