Some time ago, we made this fix to external authentication – https://github.com/discourse/discourse/pull/13706. We didn't address Discourse Connect (https://meta.discourse.org/t/discourseconnect-official-single-sign-on-for-discourse-sso/13045) at that moment, so I wanted to fix it for Discourse Connect as well.
Turned out though that Discourse Connect doesn't contain this problem and already handles staged users correctly. This PR adds tests that confirm it. Also, I've extracted two functions in Discourse Connect implementation along the way and decided to merge this refactoring too (the refactoring is supported with tests).
A plugin API that allows customizing existing topic-backed static pages, like:
faq, tos, privacy (see: StaticController) The block passed to this
method has to return a SiteSetting name that contains a topic id.
```
add_topic_static_page("faq") do |controller|
current_user&.locale == "pl" ? "polish_faq_topic_id" : "faq_topic_id"
end
```
You can also add new pages in a plugin, but remember to add a route,
for example:
```
get "contact" => "static#show", id: "contact"
```
We can fake redis transactions so that `fab!` works for redis and PG
data, but it's too slow to be used indiscriminately. Instead, you can
opt into it with the `use_redis_snapshotting` helper.
Insofar as snapshotting allows us to `fab!` more things, it provides a
speedup.
This commit introduces a new site setting "google_oauth2_hd_groups". If enabled, group information will be fetched from Google during authentication, and stored in the Discourse database. These 'associated groups' can be connected to a Discourse group via the "Membership" tab of the group preferences UI.
The majority of the implementation is generic, so we will be able to add support to more authentication methods in the near future.
https://meta.discourse.org/t/managing-group-membership-via-authentication/175950
We don't need it anymore. Actually, I removed using of it on the client side a long time ago, when I was working on improving blank page syndrome on user activity pages (see https://github.com/discourse/discourse/pull/14311).
This PR also removes some old resource strings that we don't use anymore. We have new strings for blank pages.
Documenting the `/u/:username:/emails.json` endpoint.
Also removing some email fields from user api responses because they
aren't actually included in the response unless you are querying
yourself.
When redirecting to login, we store a destination_url cookie, which the user is then redirected to after login. We never want the user to be redirected to a JSON URL. Instead, we should return a 403 in these situations.
This should also be much less confusing for API consumers - a 403 is a better representation than a 302.
Related to: 20f736aa11.
`auto_update` is true by default at the database level, but it doesn't make sense for `auto_update` to be true on themes that are not imported from a Git repository.
Under some conditions, these varied responses could lead to cache poisoning, hence the 'security' label.
Previously the Rails application would serve JSON data in place of HTML whenever Ember CLI requested an `application.html.erb`-rendered page. This commit removes that logic, and instead parses the HTML out of the standard response. This means that Rails doesn't need to customize its response for Ember CLI.
Recently, the wrong new behavior appeared – we started to suggest to invited users usernames like "user1".
To reproduce:
1. Create an invitation with default settings, do not restrict it to email
2. Copy an invitation link and follow it in incognito mode
See username already filled, with eg “user1”. See screenshot. Should be empty.
This bug was very likely introduced by my recent changes to UserNameSuggester.
We have a couple of site setting, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.
When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt` which leaves the site owners no options if a crawler is crawling the site too aggressively.
This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged in user, Discourse will check the User Agent string and if it contains one of the values of the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting) and the rest of requests made within the same interval will get a 429 response.
The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR since it could rate limit lots if not all of anonymous traffic if the setting is not used appropriately. So to protect against this scenario, we've added a couple of new validations to the setting when it's changed:
1) each value added to setting must 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
Currently when a user creates posts that are moderated (for whatever
reason), a popup is displayed saying the post needs approval and the
total number of the user’s pending posts. But then this piece of
information is kind of lost and there is nowhere for the user to know
what are their pending posts or how many there are.
This patch solves this issue by adding a new “Pending” section to the
user’s activity page when there are some pending posts to display. When
there are none, then the “Pending” section isn’t displayed at all.
* FEATURE: Optionally send a 'noindex' header in non-canonical responses
This will be used in a SEO experiment.
Co-authored-by: David Taylor <david@taylorhq.com>
This commit adds token_hash and scopes columns to email_tokens table.
token_hash is a replacement for the token column to avoid storing email
tokens in plaintext as it can pose a security risk. The new scope column
ensures that email tokens cannot be used to perform a different action
than the one intended.
To sum up, this commit:
* Adds token_hash and scope to email_tokens
* Reuses code that schedules critical_user_email
* Refactors EmailToken.confirm and EmailToken.atomic_confirm methods
* Periodically cleans old, unconfirmed or expired email tokens
This commit refactors the direct external upload routes (get presigned
put, complete external, create/abort/complete multipart) into a
helper which is then included in both BackupController and the
UploadController. This is done so UploadController doesn't need
strange backup logic added to it, and so each controller implementing
this helper can do their own validation/error handling nicely.
This is a follow up to e4350bb966
Currently, Discourse rate limits all incoming requests by the IP address they
originate from regardless of the user making the request. This can be
frustrating if there are multiple users using Discourse simultaneously while
sharing the same IP address (e.g. employees in an office).
This commit implements a new feature to make Discourse apply rate limits by
user id rather than IP address for users at or higher than the configured trust
level (1 is the default).
For example, let's say a Discourse instance is configured to allow 200 requests
per minute per IP address, and we have 10 users at trust level 4 using
Discourse simultaneously from the same IP address. Before this feature, the 10
users could only make a total of 200 requests per minute before they got rate
limited. But with the new feature, each user is allowed to make 200 requests
per minute because the rate limits are applied on user id rather than the IP
address.
The minimum trust level for applying user-id-based rate limits can be
configured by the `skip_per_ip_rate_limit_trust_level` global setting. The
default is 1, but it can be changed by either adding the
`DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the
desired value to your `app.yml`, or changing the setting's value in the
`discourse.conf` file.
Requests made with API keys are still rate limited by IP address and the
relevant global settings that control API keys rate limits.
Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters
string that Discourse used to lookup the current user from the database and the
cookie contained no additional information about the user. However, we had to
change the cookie content in this commit so we could identify the user from the
cookie without making a database query before the rate limits logic and avoid
introducing a bottleneck on busy sites.
Besides the 32 characters auth token, the cookie now includes the user id,
trust level and the cookie's generation date, and we encrypt/sign the cookie to
prevent tampering.
Internal ticket number: t54739.
Uppy adds the file name as the "name" parameter in the
payload by default, which means that for things like the
emoji uploader which have a name param used by the controller,
that param will be passed as the file name. We already use
the existing file name if the name param is null, so this
commit just does further cleanup of the name param, removing
the extension if it is a filename so we don't end up with
emoji names like blah_png.
Uppy and Resumable slice up their chunks differently, which causes a difference
in this algorithm. Let's take a 131.6MB file (137951695 bytes) with a 5MB (5242880 bytes)
chunk size. For resumable, there are 26 chunks, and uppy there are 27. This is
controlled by forceChunkSize in resumable which is false by default. The final
chunk size is 6879695 (chunk size + remainder) whereas in uppy it is 1636815 (just remainder).
This means that the current condition of uploaded_file_size + current_chunk_size >= total_size
is hit twice by uppy, because it uses a more correct number of chunks. This
can be solved for both uppy and resumable by checking the _previous_ chunk
number * chunk_size as the uploaded_file_size.
An example of what is happening before that change, using the current
chunk number to calculate uploaded_file_size.
chunk 26: resumable: uploaded_file_size (26 * 5242880) + current_chunk_size (6879695) = 143194575 >= total_size (137951695) ? YES
chunk 26: uppy: uploaded_file_size (26 * 5242880) + current_chunk_size (5242880) = 141557760 >= total_size (137951695) ? YES
chunk 27: uppy: uploaded_file_size (27 * 5242880) + current_chunk_size (1636815) = 143194575 >= total_size (137951695) ? YES
An example of what this looks like after the change, using the previous
chunk number to calculate uploaded_file_size:
chunk 26: resumable: uploaded_file_size (25 * 5242880) + current_chunk_size (6879695) = 137951695 >= total_size (137951695) ? YES
chunk 26: uppy: uploaded_file_size (25 * 5242880) + current_chunk_size (5242880) = 136314880 >= total_size (137951695) ? NO
chunk 27: uppy: uploaded_file_size (26 * 5242880) + current_chunk_size (1636815) = 137951695 >= total_size (137951695) ? YES
This PR introduces a new `enable_experimental_backup_uploads` site setting (default false and hidden), which when enabled alongside `enable_direct_s3_uploads` will allow for direct S3 multipart uploads of backup .tar.gz files.
To make multipart external uploads work with both the S3BackupStore and the S3Store, I've had to move several methods out of S3Store and into S3Helper, including:
* presigned_url
* create_multipart
* abort_multipart
* complete_multipart
* presign_multipart_part
* list_multipart_parts
Then, S3Store and S3BackupStore either delegate directly to S3Helper or have their own special methods to call S3Helper for these methods. FileStore.temporary_upload_path has also removed its dependence on upload_path, and can now be used interchangeably between the stores. A similar change was made in the frontend as well, moving the multipart related JS code out of ComposerUppyUpload and into a mixin of its own, so it can also be used by UppyUploadMixin.
Some changes to ExternalUploadManager had to be made here as well. The backup direct uploads do not need an Upload record made for them in the database, so they can be moved to their final S3 resting place when completing the multipart upload.
This changeset is not perfect; it introduces some special cases in UploadController to handle backups that was previously in BackupController, because UploadController is where the multipart routes are located. A subsequent pull request will pull these routes into a module or some other sharing pattern, along with hooks, so the backup controller and the upload controller (and any future controllers that may need them) can include these routes in a nicer way.
We are only using list_multipart_parts right now in the
uploads controller for multipart uploads to check if the
upload exists; thus we don't need up to 1000 parts.
Also adding a note for future explorers that list_multipart_parts
only gets 1000 parts max, and adding params for max parts
and starting parts.
Support for invites alongside DiscourseConnect was added in 355d51af. This commit fixes the guardian method so that the bulk invite button functionality also works.
* FIX: allowed_theme_ids should not be persisted in GlobalSettings
It was observed that the memoized value of `GlobalSetting.allowed_theme_ids` would be persisted across requests, which could lead to unpredictable/undesired behaviours in a multisite environment.
This change moves that logic out of GlobalSettings so that the returned theme IDs are correct for the current site.
Uses get_set_cache, which ultimately uses DistributedCache, which will take care of multisite issues for us.
By default, Rails only includes the Vary:Accept header in responses when the Accept: header is included in the request. This means that proxies/browsers may cache a response to a request with a missing Accept header, and then later serve that cached version for a request which **does** supply the Accept header. This can lead to some very unexpected behavior in browsers.
This commit adds the Vary:Accept header for all requests, even if the Accept header is not present in the request. If a format parameter (e.g. `.json` suffix) is included in the path, then the Accept header is still omitted. (The format parameter takes precedence over any Accept: header, so the response is no longer varies based on the Accept header)