We have a couple of site setting, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.
When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt` which leaves the site owners no options if a crawler is crawling the site too aggressively.
This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged in user, Discourse will check the User Agent string and if it contains one of the values of the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting) and the rest of requests made within the same interval will get a 429 response.
The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR since it could rate limit lots if not all of anonymous traffic if the setting is not used appropriately. So to protect against this scenario, we've added a couple of new validations to the setting when it's changed:
1) each value added to setting must 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
This is a big change to change over to using the uppy
upload mixin in the composer by default. This gets rid
of the temporary composer-editor-uppy component, as well
as removing the old ComposerUpload mixin and copying over
any missing functions that were not yet implemented by
ComposerUploadUppy. This has been working well on our
hosting for some time now and has led us to several
bug fixes.
This commit also deletes the old plugin API for adding
preprocessors for the uploads. The accepted method of doing
this now is via an uppy preprocessor plugin, which we have
several examples of in the core codebase.
Leaving the `enable_experimental_composer_uploader` site setting
intact for now because some plugins still rely on it, this
will be removed at a later date.
One step closer to ending the jQuery file uploader saga...
Currently when a user creates posts that are moderated (for whatever
reason), a popup is displayed saying the post needs approval and the
total number of the user’s pending posts. But then this piece of
information is kind of lost and there is nowhere for the user to know
what are their pending posts or how many there are.
This patch solves this issue by adding a new “Pending” section to the
user’s activity page when there are some pending posts to display. When
there are none, then the “Pending” section isn’t displayed at all.
* FEATURE: Optionally send a 'noindex' header in non-canonical responses
This will be used in a SEO experiment.
Co-authored-by: David Taylor <david@taylorhq.com>
When this setting is turned on, it will check that normalized emails
are unique. Normalized emails are emails without any dots or plus
aliases.
This setting can be used to block use of aliases of the same email
address.
Use @here to mention all users that were allowed to topic directly or
through group, who liked topics or read the topic. Only first 10 users
will be notified.
When rendering the markdown code blocks we replace the
offending characters in the output string with spans highlighting a textual
representation of the character, along with a title attribute with
information about why the character was highlighted.
The list of characters stripped by this fix, which are the bidirectional
characters considered relevant, are:
U+202A
U+202B
U+202C
U+202D
U+202E
U+2066
U+2067
U+2068
U+2069
* FEATURE: Always show advanced invite options
The UI is more simple and more efficient than how it was when the
advanced options toggle was introduced. It does not make sense to keep
it anymore.
* UX: Minor copy edits
* UX: Merge expire invite controls
There were two controls in the create invite modal. One was a static
text that displayed how much time is left until the invite expires. The
other one was a datetime selector that set the time the invite expires.
This commit merges the two controls in a single one: staff users will
continue to see the datetime selector without the static text and
regular users will only see the static text because they cannot set
when the invite expires.
* UX: Remove invite link
It should only be visible after the invite was created.
This commit refactors the direct external upload routes (get presigned
put, complete external, create/abort/complete multipart) into a
helper which is then included in both BackupController and the
UploadController. This is done so UploadController doesn't need
strange backup logic added to it, and so each controller implementing
this helper can do their own validation/error handling nicely.
This is a follow up to e4350bb966
Currently, Discourse rate limits all incoming requests by the IP address they
originate from regardless of the user making the request. This can be
frustrating if there are multiple users using Discourse simultaneously while
sharing the same IP address (e.g. employees in an office).
This commit implements a new feature to make Discourse apply rate limits by
user id rather than IP address for users at or higher than the configured trust
level (1 is the default).
For example, let's say a Discourse instance is configured to allow 200 requests
per minute per IP address, and we have 10 users at trust level 4 using
Discourse simultaneously from the same IP address. Before this feature, the 10
users could only make a total of 200 requests per minute before they got rate
limited. But with the new feature, each user is allowed to make 200 requests
per minute because the rate limits are applied on user id rather than the IP
address.
The minimum trust level for applying user-id-based rate limits can be
configured by the `skip_per_ip_rate_limit_trust_level` global setting. The
default is 1, but it can be changed by either adding the
`DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the
desired value to your `app.yml`, or changing the setting's value in the
`discourse.conf` file.
Requests made with API keys are still rate limited by IP address and the
relevant global settings that control API keys rate limits.
Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters
string that Discourse used to lookup the current user from the database and the
cookie contained no additional information about the user. However, we had to
change the cookie content in this commit so we could identify the user from the
cookie without making a database query before the rate limits logic and avoid
introducing a bottleneck on busy sites.
Besides the 32 characters auth token, the cookie now includes the user id,
trust level and the cookie's generation date, and we encrypt/sign the cookie to
prevent tampering.
Internal ticket number: t54739.
* FEATURE: display warning when sharing a topic in a restricted category
If a topic belongs to a category that is not readable by everyone, display a text warning of "Only visible to members of groups: [group_a], [group_b]"
* DEV: Adding a new category means we need to bump this value
* DEV: pass category to showModal
It makes much more sense for these to be GlobalSettings, since, in
multisite clusters, only the default site's settings would be respected.
Co-authored-by: David Taylor <david@taylorhq.com>
This commit adds the RailsMultisite middleware in test mode when Rails.configuration.multisite is true. This allows for much more realistic integration testing. The `multisite_spec.rb` file is rewritten to avoid needing to simulate a middleware stack.
This PR introduces a new `enable_experimental_backup_uploads` site setting (default false and hidden), which when enabled alongside `enable_direct_s3_uploads` will allow for direct S3 multipart uploads of backup .tar.gz files.
To make multipart external uploads work with both the S3BackupStore and the S3Store, I've had to move several methods out of S3Store and into S3Helper, including:
* presigned_url
* create_multipart
* abort_multipart
* complete_multipart
* presign_multipart_part
* list_multipart_parts
Then, S3Store and S3BackupStore either delegate directly to S3Helper or have their own special methods to call S3Helper for these methods. FileStore.temporary_upload_path has also removed its dependence on upload_path, and can now be used interchangeably between the stores. A similar change was made in the frontend as well, moving the multipart related JS code out of ComposerUppyUpload and into a mixin of its own, so it can also be used by UppyUploadMixin.
Some changes to ExternalUploadManager had to be made here as well. The backup direct uploads do not need an Upload record made for them in the database, so they can be moved to their final S3 resting place when completing the multipart upload.
This changeset is not perfect; it introduces some special cases in UploadController to handle backups that was previously in BackupController, because UploadController is where the multipart routes are located. A subsequent pull request will pull these routes into a module or some other sharing pattern, along with hooks, so the backup controller and the upload controller (and any future controllers that may need them) can include these routes in a nicer way.
* FIX: small copy fix for embedding.sample
* FIX: improve copy for site_description and short_site_description
As part of the setup wizard, improve the description of these two
strings to add context on where they will be used, so that it is
clearer how to write each one.
Meta discussion: https://meta.discourse.org/t/unclear-double-question-in-setup-wizard/208344/
This commit changes the emoji uploader to use the UppyUploadMixin,
and makes some minor changes to the emoji uploader (tightening the
copy for drag and drop and adding a percentage for the upload).
Since no other uppy upload mixin components have needed to upload
multiple files so far, this necessitated adding a tracker for the
in progress uploads so we know when to reset the uploader once all
uploads are complete.
At the moment, the emoji uploader cannot be used for direct S3 uploads
because the admin emoji controller creates other records and does other
magic with the emojis. At some point we need to factor this kind of thing
into the ExternalUploadManager.transform! action to complete external
uploads.
Trying to use a local test hostname other than localhost
(e.g. discourse.test )for discourse development was difficult due
the fact that localhost was hardcoded in a few places. This patch
uses existing environment variables to allow a developer to use a
different domain when developing.
Signed-off-by: Ryan Lerch <rlerch@redhat.com>
We are no longer able to display the image returned by Instagram directly within a Discourse site (either in the composer, or within a cooked post within a topic), so:
- Display an image placeholder in the composer preview
- A cooked post should use an iframe to display the Instagram 'embed' content
* DEV: Sanitize HTML admin inputs
This PR adds on-save HTML sanitization for:
Client site settings
translation overrides
badges descriptions
user fields descriptions
I used Rails's SafeListSanitizer, which [accepts the following HTML tags and attributes](018cf54073/lib/rails/html/sanitizer.rb (L108))
* Make sure that the sanitization logic doesn't corrupt settings with special characters
This commit groups `auth_overrides_*`, `discourse_connect_*` and `discourse_connect_provider_*` settings separately, rather than interspersing them.
There will be no functional change. This only affects the order in which they're shown in the admin panel