Commit Graph

379 Commits

Author SHA1 Message Date
David Taylor 1fa7a87f86
SECURITY: Remove ember-cli specific response from application routes (#15155)
Under some conditions, these varied responses could lead to cache poisoning, hence the 'security' label.

Previously the Rails application would serve JSON data in place of HTML whenever Ember CLI requested an `application.html.erb`-rendered page. This commit removes that logic, and instead parses the HTML out of the standard response. This means that Rails doesn't need to customize its response for Ember CLI.
2021-12-01 16:10:40 +00:00
Osama Sayegh 7bd3986b21
FEATURE: Replace `Crawl-delay` directive with proper rate limiting (#15131)
We have a couple of site setting, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.

When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt` which leaves the site owners no options if a crawler is crawling the site too aggressively.

This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged in user, Discourse will check the User Agent string and if it contains one of the values of the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting) and the rest of requests made within the same interval will get a 429 response. 

The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR since it could rate limit lots if not all of anonymous traffic if the setting is not used appropriately. So to protect against this scenario, we've added a couple of new validations to the setting when it's changed:

1) each value added to setting must 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
2021-11-30 12:55:25 +03:00
Rafael dos Santos Silva 5647819de4
FEATURE: Send a 'noindex' header in non-canonical responses (#15026)
* FEATURE: Optionally send a 'noindex' header in non-canonical responses

This will be used in a SEO experiment.

Co-authored-by: David Taylor <david@taylorhq.com>
2021-11-25 16:58:39 -03:00
Osama Sayegh b86127ad12
FEATURE: Apply rate limits per user instead of IP for trusted users (#14706)
Currently, Discourse rate limits all incoming requests by the IP address they
originate from regardless of the user making the request. This can be
frustrating if there are multiple users using Discourse simultaneously while
sharing the same IP address (e.g. employees in an office).

This commit implements a new feature to make Discourse apply rate limits by
user id rather than IP address for users at or higher than the configured trust
level (1 is the default).

For example, let's say a Discourse instance is configured to allow 200 requests
per minute per IP address, and we have 10 users at trust level 4 using
Discourse simultaneously from the same IP address. Before this feature, the 10
users could only make a total of 200 requests per minute before they got rate
limited. But with the new feature, each user is allowed to make 200 requests
per minute because the rate limits are applied on user id rather than the IP
address.

The minimum trust level for applying user-id-based rate limits can be
configured by the `skip_per_ip_rate_limit_trust_level` global setting. The
default is 1, but it can be changed by either adding the
`DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the
desired value to your `app.yml`, or changing the setting's value in the
`discourse.conf` file.

Requests made with API keys are still rate limited by IP address and the
relevant global settings that control API keys rate limits.

Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters
string that Discourse used to lookup the current user from the database and the
cookie contained no additional information about the user. However, we had to
change the cookie content in this commit so we could identify the user from the
cookie without making a database query before the rate limits logic and avoid
introducing a bottleneck on busy sites.

Besides the 32 characters auth token, the cookie now includes the user id,
trust level and the cookie's generation date, and we encrypt/sign the cookie to
prevent tampering.

Internal ticket number: t54739.
2021-11-17 23:27:30 +03:00
Akshay Birajdar 6b5e8be25a Support parsing array in #param_to_integer_list
Co-authored-by: Akshay Birajdar <akshay.birajdar@coupa.com>
2021-11-16 10:27:00 -05:00
David Taylor 9ac6f1d3bb
FIX: Include the Vary:Accept header on all Accept-based responses (#14647)
By default, Rails only includes the Vary:Accept header in responses when the Accept: header is included in the request. This means that proxies/browsers may cache a response to a request with a missing Accept header, and then later serve that cached version for a request which **does** supply the Accept header. This can lead to some very unexpected behavior in browsers.

This commit adds the Vary:Accept header for all requests, even if the Accept header is not present in the request. If a format parameter (e.g. `.json` suffix) is included in the path, then the Accept header is still omitted. (The format parameter takes precedence over any Accept: header, so the response is no longer varies based on the Accept header)
2021-10-25 12:53:50 +01:00
Alan Guo Xiang Tan 1937474e84
PERF: Avoid additional database query when viewing own user. (#14239) 2021-09-06 10:38:07 +08:00
Mark VanLandingham 7fc3d7bdde
DEV: Plugin API to add directory columns (#13440) 2021-06-22 13:00:04 -05:00
Kane York c72bf1d732 FEATURE: Improvement to history stack handling on server errors
The exception page is shown before Ember can actually figure out what the final destination URL we're going to is.
This means that the new page is not present in the history stack, so if we attempt to use the history stack to go back, we will actually navigate back by two steps.
By instead forcing a navigation to the current URL, we achieve the goal of going "back" with no history mucking.

Unfortunately, the actual URL that was attempted is not available. Additionally, this only works for the on-screen back button and not the browser back.

Additionally, several modernizations of the exception page code were made.
2021-06-21 11:09:23 -07:00
Alan Guo Xiang Tan 8e3691d537 PERF: Eager load Theme associations in Stylesheet Manager.
Before this change, calling `StyleSheet::Manager.stylesheet_details`
for the first time resulted in multiple queries to the database. This is
because the code was modelled in a way where each `Theme` was loaded
from the database one at a time.

This PR restructures the code such that it allows us to load all the
theme records in a single query. It also allows us to eager load the
required associations upfront. In order to achieve this, I removed the
support of loading multiple themes per request. It was initially added
to support user selectable theme components but the feature was never
completed and abandoned because it wasn't a feature that we thought was
worth building.
2021-06-21 11:06:58 +08:00
Mark VanLandingham 95b51669ad
DEV: Revert 3 commits for plugin API to add directory columns (#13423) 2021-06-17 12:37:37 -05:00
Mark VanLandingham 0c42a29dc4
DEV: Plugin API to allow creation of directory columns with item query (#13402)
The first thing we needed here was an enum rather than a boolean to determine how a directory_column was created. Now we have `automatic`, `user_field` and `plugin` directory columns.

This plugin API is assuming that the plugin has added a migration to a column to the `directory_items` table.

This was created to be initially used by discourse-solved. PR with API usage - https://github.com/discourse/discourse-solved/pull/137/
2021-06-17 09:06:18 -05:00
Robin Ward 651b8a23b8
FIX: Ember CLI was losing some preloaded data (#13406)
The `bootstrap.json` contains most preloaded information but some routes
provide extra information, such as invites.

This fixes the issue by having the preload request pass on the preloaded
data from the source page, which is then merged with the bootstrap's
preloaded data for the final HTML payload.
2021-06-16 13:45:02 -04:00
Mark VanLandingham 0cba4d73c1
FEATURE: Add user custom fields to user directory (#13238) 2021-06-07 12:34:01 -05:00
Martin Brennan e15c86e8c5
DEV: Topic tracking state improvements (#13218)
I merged this PR in yesterday, finally thinking this was done https://github.com/discourse/discourse/pull/12958 but then a wild performance regression occurred. These are the problem methods:

1aa20bd681/app/serializers/topic_tracking_state_serializer.rb (L13-L21)

Turns out date comparison is super expensive on the backend _as well as_ the frontend.

The fix was to just move the `treat_as_new_topic_start_date` into the SQL query rather than using the slower `UserOption#treat_as_new_topic_start_date` method in ruby. After this change, 1% of the total time is spent with the `created_in_new_period` comparison instead of ~20%.

----

History:

Original PR which had to be reverted **https://github.com/discourse/discourse/pull/12555**. See the description there for what this PR is achieving, plus below.

The issue with the original PR is addressed in 92ef54f402

If you went to the `x unread` link for a tag Chrome would freeze up and possibly crash, or eventually unfreeze after nearly 10 mins. Other routes for unread/new were similarly slow. From profiling the issue was the `sync` function of `topic-tracking-state.js`, which calls down to `isNew` which in turn calls `moment`, a change I had made in the PR above. The time it takes locally with ~1400 topics in the tracking state is 2.3 seconds.

To solve this issue, I have moved these calculations for "created in new period" and "unread not too old" into the tracking state serializer.

When I was looking at the profiler I also noticed this issue which was just compounding the problem. Every time we modify topic tracking state we recalculate the sidebar tracking/everything/tag counts. However this calls `forEachTracked` and `countTags` which can be quite expensive as they go through the whole tracking state (and were also calling the removed moment functions).

I added some logs and this was being called 30 times when navigating to a new /unread route because  `sync` is being called from `build-topic-route` (one for each topic loaded due to pagination). So I just added a debounce here and it makes things even faster.

Finally, I changed topic tracking state to use a Map so our counts of the state keys is faster (Maps have .size whereas objects you have to do Object.keys(obj) which is O(n).)

<!-- NOTE: All pull requests should have tests (rspec in Ruby, qunit in JavaScript). If your code does not include test coverage, please include an explanation of why it was omitted. -->
2021-06-02 09:06:29 +10:00
Osama Sayegh b81b24dea2
Revert "DEV: Topic tracking state improvements (#12958)" (#13209)
This reverts commit 002c676344.

Perf regression, we will redo it.
2021-05-31 17:47:42 +10:00
Martin Brennan 002c676344
DEV: Topic tracking state improvements (#12958)
Original PR which had to be reverted **https://github.com/discourse/discourse/pull/12555**. See the description there for what this PR is achieving, plus below.

The issue with the original PR is addressed in 92ef54f402

If you went to the `x unread` link for a tag Chrome would freeze up and possibly crash, or eventually unfreeze after nearly 10 mins. Other routes for unread/new were similarly slow. From profiling the issue was the `sync` function of `topic-tracking-state.js`, which calls down to `isNew` which in turn calls `moment`, a change I had made in the PR above. The time it takes locally with ~1400 topics in the tracking state is 2.3 seconds.

To solve this issue, I have moved these calculations for "created in new period" and "unread not too old" into the tracking state serializer.

When I was looking at the profiler I also noticed this issue which was just compounding the problem. Every time we modify topic tracking state we recalculate the sidebar tracking/everything/tag counts. However this calls `forEachTracked` and `countTags` which can be quite expensive as they go through the whole tracking state (and were also calling the removed moment functions).

I added some logs and this was being called 30 times when navigating to a new /unread route because  `sync` is being called from `build-topic-route` (one for each topic loaded due to pagination). So I just added a debounce here and it makes things even faster.

Finally, I changed topic tracking state to use a Map so our counts of the state keys is faster (Maps have .size whereas objects you have to do Object.keys(obj) which is O(n).)
2021-05-31 09:22:28 +10:00
Josh Soref 59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Joffrey JAFFEUX 81bf581aa9
DEV: rescues site setting missing exception (#13022)
This will allow to correctly catch it client side and display a correct error.
2021-05-11 10:36:57 +02:00
Osama Sayegh 6f8413fd85
DEV: Don't force Ember CLI for proxied requests made by Ember CLI (#12909) 2021-04-30 13:27:35 +03:00
Robin Ward 51f872f13a
DEV: Require Ember CLI to be used in development mode (#12738)
We really want to encourage all developers to use Ember CLI for local
development and testing. This will display an error page if they are not
with instructions on how to start the local server.

To disable it, you can set `NO_EMBER_CLI=1` as an ENV variable
2021-04-29 14:13:36 -04:00
Robin Ward e3b1d1a718
DEV: Improve Ember CLI's bootstrap logic (#12792)
* DEV: Give a nicer error when `--proxy` argument is missing

* DEV: Improve Ember CLI's bootstrap logic

Instead of having Ember CLI know which URLs to proxy or not, have it try
the URL with a special header `HTTP_X_DISCOURSE_EMBER_CLI`. If present,
and Discourse thinks we should bootstrap the application, it will
instead stop rendering and return a HTTP HEAD with a response header
telling Ember CLI to bootstrap.

In other words, any time Rails would otherwise serve up the HTML for the
Ember app, it stops and says "no, you do it."

* DEV: Support asset filters by path using a new options object

Without this, Ember CLI's bootstrap would not get the assets it wants
because the path it was requesting was different than the browser path.
This adds an optional request header to fix it.

So far this is only used by the styleguide.
2021-04-23 10:24:42 -04:00
Osama Sayegh cd24eff5d9
FEATURE: Introduce theme/component QUnit tests (take 2) (#12661)
This commit allows themes and theme components to have QUnit tests. To add tests to your theme/component, create a top-level directory in your theme and name it `test`, and Discourse will save all the files in that directory (and its sub-directories) as "tests files" in the database. While tests files/directories are not required to be organized in a specific way, we recommend that you follow Discourse core's tests [structure](https://github.com/discourse/discourse/tree/master/app/assets/javascripts/discourse/tests).

Writing theme tests should be identical to writing plugins or core tests; all the `import` statements and APIs that you see in core (or plugins) to define/setup tests should just work in themes.

You do need a working Discourse install to run theme tests, and you have 2 ways to run theme tests:

* In the browser at the `/qunit` route. `/qunit` will run tests of all active themes/components as well as core and plugins. The `/qunit` now accepts a `theme_name` or `theme_url` params that you can use to run tests of a specific theme/component like so: `/qunit?theme_name=<your_theme_name>`.

* In the command line using the `themes:qunit` rake task. This take is meant to run tests of a single theme/component so you need to provide it with a theme name or URL like so: `bundle exec rake themes:qunit[name=<theme_name>]` or `bundle exec rake themes:qunit[url=<theme_url>]`.

There are some refactors to how Discourse processes JavaScript that comes with themes/components, and these refactors may break your JS customizations; see https://meta.discourse.org/t/upcoming-core-changes-that-may-break-some-themes-components-april-12/186252?u=osama for details on how you can check if your themes/components are affected and what you need to do to fix them.

This commit also improves theme error handling in Discourse. We will now be able to catch errors that occur when theme initializers are run and prevent them from breaking the site and other themes/components.
2021-04-12 15:02:58 +03:00
Jarek Radosz 6ff888bd2c
DEV: Retry-after header values should be strings (#12475)
Fixes `Rack::Lint::LintError: a header value must be a String, but the value of 'Retry-After' is a Integer`. (see: 14a236b4f0/lib/rack/lint.rb (L676))

I found it when I got flooded by those warning a while back in a test-related accident 😉 (ember CLI tests were hitting a local rails server at a fast rate)
2021-03-23 20:32:36 +01:00
Dan Ungureanu 4e46732346
FEATURE: Implement browser update in crawler view (#12448)
browser-update script does not work correctly in some very old browsers
because the contents of <noscript> is not accessible in JavaScript.
For these browsers, the server can display the crawler page and add the
browser update notice.

Simply loading the browser-update script in the crawler view is not a
solution because that means all crawlers will also see it.
2021-03-22 19:41:42 +02:00
David Taylor 821bb1e8cb
FEATURE: Rename 'Discourse SSO' to DiscourseConnect (#11978)
The 'Discourse SSO' protocol is being rebranded to DiscourseConnect. This should help to reduce confusion when 'SSO' is used in the generic sense.

This commit aims to:
- Rename `sso_` site settings. DiscourseConnect specific ones are prefixed `discourse_connect_`. Generic settings are prefixed `auth_`
- Add (server-side-only) backwards compatibility for the old setting names, with deprecation notices
- Copy `site_settings` database records to the new names
- Rename relevant translation keys
- Update relevant translations

This commit does **not** aim to:
- Rename any Ruby classes or methods. This might be done in a future commit
- Change any URLs. This would break existing integrations
- Make any changes to the protocol. This would break existing integrations
- Change any functionality. Further normalization across DiscourseConnect and other auth methods will be done separately

The risks are:
- There is no backwards compatibility for site settings on the client-side. Accessing auth-related site settings in Javascript is fairly rare, and an error on the client side would not be security-critical.
- If a plugin is monkey-patching parts of the auth process, changes to locale keys could cause broken error messages. This should also be unlikely. The old site setting names remain functional, so security-related overrides will remain working.

A follow-up commit will be made with a post-deploy migration to delete the old `site_settings` rows.
2021-02-08 10:04:33 +00:00
Martin Brennan e58f9f7a55
DEV: Move logic for rate limiting user second factor to one place (#11941)
This moves all the rate limiting for user second factor (based on `params[:second_factor_token]` existing) to the one place, which rate limits by IP and also by username if a user is found.
2021-02-04 09:03:30 +10:00
Vinoth Kannan a5923ad603
DEV: apply allow origin response header for CDN requests. (#11893)
Currently, it creates a CORS error while accessing those static files.
2021-01-29 07:44:49 +05:30
Dan Ungureanu 7be556fc19
FIX: Ensure 'tr' is called on a string. (#11853)
It depends on the route, but sometimes 'id' parameter can contain a
slug-like value and sometimes it is just an ID. This should work in
both cases.
2021-01-27 10:43:33 +02:00
David Taylor 8b33e2f73d
FIX: Include locale in cache key for not_found_topics (#11406)
This ensures that users are only served cached content in their own language. This commit also refactors to make use of the `Discourse.cache` framework rather than direct redis access
2020-12-07 12:24:18 +00:00
Dan Ungureanu 123107c28f
UX: Add group name to error message (#11333)
The group name used to be part of the error message, but was removed
in a past commit.
2020-11-24 13:06:52 +02:00
David Taylor a7adf30357
FEATURE: Allow /u/by-external to work for all managed authenticators (#11168)
Previously, `/u/by-external/{id}` would only work for 'Discourse SSO' systems. This commit adds a new 'provider' parameter to the URL: `/u/by-external/{provider}/{id}`

This is compatible with all auth methods which have migrated to the 'ManagedAuthenticator' pattern. That includes all core providers, and also popular plugins such as discourse-oauth2-basic and discourse-openid-connect.

The new route is admin-only, since some authenticators use sensitive information like email addresses as the external id.
2020-11-10 10:41:46 +00:00
Vinoth Kannan af4938baf1
Revert "DEV: enable cors to all cdn get requests from workbox. (#10684)" (#11076)
This reverts commit e3de45359f.

We need to improve out strategy by adding a cache breaker with this change ... some assets on CDNs and clients may have incorrect CORS headers which can cause stuff to break.
2020-10-30 16:05:35 +11:00
Vinoth Kannan e3de45359f
DEV: enable cors to all cdn get requests from workbox. (#10685)
Now all external requests from the service worker will be in CORS mode without credentials.
2020-10-28 23:36:19 +05:30
Osama Sayegh a04c300495
DEV: Add optional ENV variables for MiniProfiler snapshots transporter (#10985) 2020-10-21 19:37:28 +03:00
Daniel Waterworth 721ee36425
Replace `base_uri` with `base_path` (#10879)
DEV: Replace instances of Discourse.base_uri with Discourse.base_path

This is clearer because the base_uri is actually just a path prefix. This continues the work started in 555f467.
2020-10-09 12:51:24 +01:00
Krzysztof Kotlarek 5cf411c3ae
FIX: move hp request from /users to /token (#10795)
`hp` is a valid username and we should not prevent users from registering it.
2020-10-02 09:01:40 +10:00
Gerhard Schlager 9d4009b0e8
FIX: Use correct locale for error messages (#10776)
Error messages for exceeded rate limits and invalid parameters always used the English locale instead of the default locale or the current user's locale.
2020-09-29 21:42:45 +02:00
David Taylor f1d64bbbe5
FEATURE: Add a site setting to control automatic auth redirect (#10732)
This allows administrators to stop automatic redirect to an external authenticator. It only takes effect when there is a single authentication method, and the site is login_required
2020-09-24 17:06:07 +01:00
Osama Sayegh a92d88747e
DEV: Add ENV variable for enabling MiniProfiler snapshots (#10690)
* DEV: Add ENV variable for enabling MiniProfiler snapshots

* MiniProfiler is not loaded in test env
2020-09-17 18:18:35 +03:00
Penar Musaraj e6dbb4fcf5
DEV: Live refresh all themes when watching stylesheets (#10337) 2020-07-30 19:03:24 -04:00
Dan Ungureanu 5e2e374c72
DEV: Fix build
Follow-up to bd3c0dd59f.
2020-07-30 13:10:16 +03:00
David Taylor f25fa83b6d
FIX: Ensure correct locale is set during RenderEmpty responses
Follow-up to bcb0e623
2020-07-28 22:20:38 +01:00
David Taylor c09b5807f3
FIX: Include resolved locale in anonymous cache key (#10289)
This only applies when set_locale_from_accept_language_header is enabled
2020-07-22 18:00:07 +01:00
David Taylor bcb0e62363
FIX: Make set_locale an around_action to avoid leaking between requests (#10282) 2020-07-22 17:30:26 +01:00
Guo Xiang Tan 62ad473716
FIX: Preload readonly mode attribute seperately.
There are two problems I'm trying to tackle here.

1. The site json is cached for anonymous users so readonly mode can be
cached for up to 30 minutes which makes it confusing.

2. We've already checked for readonly mode in the controller so having
to check for readonly mode again in `SiteSerializer` is adding an extra
Redis query on every request.
2020-06-12 09:54:05 +08:00
Gerhard Schlager 8c6a42c589 FIX: Redirects containing Unicode usernames didn't work 2020-06-08 10:26:29 +02:00
Sam Saffron 57a3d4e0d2
FEATURE: whitelist theme repo mode (experimental)
In some restricted setups all JS payloads need tight control.

This setting bans admins from making changes to JS on the site and
requires all themes be whitelisted to be used.

There are edge cases we still need to work through in this mode
hence this is still not supported in production and experimental.

Use an example like this to enable:

`DISCOURSE_WHITELISTED_THEME_REPOS="https://repo.com/repo.git,https://repo.com/repo2.git"`

By default this feature is not enabled and no changes are made.

One exception is that default theme id was missing a security check
this was added for correctness.
2020-06-03 13:19:57 +10:00
Dan Ungureanu 570b12a903
FEATURE: Show a detailed 404 page for private topics (#9894) 2020-05-27 20:10:01 +03:00
Artem Vasiliev 12544c02c1
FIX: add X-Robots-Tag header for check_xhr-covered GET actions, too (#9868)
* FIX: add X-Robots-Tag header for check_xhr-covered GET actions, too

see https://meta.discourse.org/t/missing-x-robots-tag/152593/3 for context

* test: a spec making sure X-Robots-Tag header is present when needed

/groups path responds to anonymous requests and doesn't skip `check_xhr` method, so we can use it here.
2020-05-27 11:57:05 -04:00