When the server gets overloaded and lots of requests start queuing server
will attempt to shed load by returning 429 errors on background requests.
The client can flag a request as background by setting the header:
`Discourse-Background` to `true`
Out-of-the-box we shed load when the queue time goes above 0.5 seconds.
The only request we shed at the moment is the request to load up a new post
when someone posts to a topic.
We can extend this as we go with a more general pattern on the client.
Previous to this change, rate limiting would "break" the post stream which
would make suggested topics vanish and users would have to scroll the page
to see more posts in the topic.
Server needs this protection for cases where tons of clients are navigated
to a topic and a new post is made. This can lead to a self inflicted denial
of service if enough clients are viewing the topic.
Due to the internal security design of Discourse it is hard for a large
number of clients to share a channel where we would pass the full post body
via the message bus.
It also renames (and deprecates) triggerNewPostInStream to triggerNewPostsInStream
This allows us to load a batch of new posts cleanly, so the controller can
keep track of a backlog
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
- Define the CSP based on the requested domain / scheme (respecting force_https)
- Update EnforceHostname middleware to allow secondary domains, add specs
- Add URL scheme to anon cache key so that CSP headers are cached correctly
Many security scanners ship invalid mime types, this ensures we return
a very cheap response to the clients and do not log anything.
Previous attempt still re-dispatched the request to get proper error page
but in this specific case we want no error page.
Non UTF-8 user_agent requests were bypassing logging due to PG always
wanting UTF-8 strings.
This adds some conversion to ensure we are always dealing with UTF-8
Previously our custom exception handler was unable to handle situations
where an invalid mime type was sent, resulting in a warning log
This ensures we pretend a request is HTML for the purpose of rendering
the error page if an invalid mime type from a scanner is shipped to the app
This commit introduces 2 features:
1. DISCOURSE_COMPRESS_ANON_CACHE (true|false, default false): this allows
you to optionally compress the anon cache body entries in Redis, can be
useful for high load sites with Redis that lives on a separate server to
to webs
2. DISCOURSE_ANON_CACHE_STORE_THRESHOLD (default 2), only pop entries into
redis if we observe them more than N times. This avoids situations where
a crawler can walk a big pile of topics and store them all in Redis never
to be used. Our default anon cache time for topics is only 60 seconds. Anon
cache is in place to avoid the "slashdot" effect where a single topic is
hit by 100s of people in one minute.
This allows custom plugins such as prometheus exporter to log how many
requests are stored in the anon cache vs used by the anon cache.
This metric allows us to fine tune cache behaviors
This reverts commit 39c31a3d76.
Sorry about this, we have decided againse supporting 0-RTT directly in
core, this can be supported with similar hacks to this commit in a
plugin.
That said, we recommend against using a 0-RTT proxy for the Discourse
app due to inherit risk of replay attacks.
This displays more useful messages for the most common issues we see:
- CSRF (when the user switches browser)
- Invalid IAT (when the server clock is wrong)
- OAuth::Unauthorized for OAuth1 providers, when the credentials are incorrect
This commit also stops earlier for disabled authenticators. Now we stop at the request phase, rather than the callback phase.
The message_bus performs a fair amount of work prior to hijacking requests
this change ensures that if there is a situation where the server is flooded
message_bus will inform client to back off for 30 seconds + random(120 secs)
This back-off is ultra cheap and happens very early in the middleware.
It corrects a situation where a flood to message bus could cause the app
to become unresponsive
MessageBus update is here to ensure message_bus gem properly respects
Retry-After header and status 429.
Under normal state this code should never trigger, to disable raise the
value of DISCOURSE_REJECT_MESSAGE_BUS_QUEUE_SECONDS, default is to tell
message bus to go away if we are queueing for 100ms or longer
This adds support for DISCOURSE_ENABLE_PERFORMANCE_HTTP_HEADERS
when set to `true` this will turn on performance related headers
```text
X-Redis-Calls: 10 # number of redis calls
X-Redis-Time: 1.02 # redis time in seconds
X-Sql-Commands: 102 # number of SQL commands
X-Sql-Time: 1.02 # duration in SQL in seconds
X-Queue-Time: 1.01 # time the request sat in queue (depends on NGINX)
```
To get queue time NGINX must provide: HTTP_X_REQUEST_START
We do not recommend you enable this without thinking, it exposes information
about what your page is doing, usually you would only enable this if you
intend to strip off the headers further down the stream in a proxy
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
Since 5bfe051e, Discourse user agents are marked as non-crawlers (to avoid accidental blacklisting). This makes sure pageviews for these agents are tracked as crawler hits.
By default, this does nothing. Two environment variables are available:
- `DISCOURSE_LOG_SIDEKIQ`
Set to `"1"` to enable logging. This will log all completed jobs to `log/rails/sidekiq.log`, along with various db/redis/network statistics. This is useful to track down poorly performing jobs.
- `DISCOURSE_LOG_SIDEKIQ_INTERVAL`
(seconds) Check running jobs periodically, and log their current duration. They will appear in the logs with `status:pending`. This is useful to track down jobs which take a long time, then crash sidekiq before completing.
This avoids require dependency on method_profiler and anon cache.
It means that if there is any change to these files the reloader will not pick it up.
Previously the reloader was picking up the anon cache twice causing it to double load on boot.
This caused warnings.
Long term my plan is to give up on require dependency and instead use:
https://github.com/Shopify/autoload_reloader
Previously the 'reconnect' process was a bit magic - IF you were already logged into discourse, and followed the auth flow, your account would be reconnected and you would be 'logged in again'.
Now, we explicitly check for a reconnect=true parameter when the flow is started, store it in the session, and then only follow the reconnect logic if that variable is present. Setting this parameter also skips the 'logged in again' step, which means reconnect now works with 2fa enabled.
Also added some helpful functionality for plugin developers:
- Raises RuntimeException if the auth provider has been registered too late
- Logs use of deprecated parameters
* Phase 0 for user-selectable theme components
- Drops `key` column from the `themes` table
- Drops `theme_key` column from the `user_options` table
- Adds `theme_ids` (array of ints default []) column to the `user_options` table and migrates data from `theme_key` to the new column.
- Removes the `default_theme_key` site setting and adds `default_theme_id` instead.
- Replaces `theme_key` cookie with a new one called `theme_ids`
- no longer need Theme.settings_for_client
This refinement of previous fix moves the crawler blocking into
anonymous cache
This ensures we never poison the cache incorrectly when blocking crawlers
If "logged in" is being forced anonymous on certain routes, trigger
the protection for any requests that spend 50ms queueing
This means that ...
1. You need to trip it by having 3 requests take longer than 1 second in 10 second interval
2. Once tripped, if your route is still spending 50m queueuing it will continue to be protected
This means that site will continue to function with almost no delays while it is scaling up to handle the new load
If a particular path is being hit extremely hard by logged on users,
revert to anonymous cached view.
This will only come into effect if 3 requests queue for longer than 2 seconds
on a *single* path.
This can happen if a URL is shared with the entire forum base and everyone
is logged on
Detailed request loggers can be used to gather rich timing info
from all requests (which in turn can be forwarded to monitoring solution)
Middleware::RequestTracker.detailed_request_logger(->|env, data| do
# do stuff with env and data
end
This feature introduces the concept of themes. Themes are an evolution
of site customizations.
Themes introduce two very big conceptual changes:
- A theme may include other "child themes", children can include grand
children and so on.
- A theme may specify a color scheme
The change does away with the idea of "enabled" color schemes.
It also adds a bunch of big niceties like
- You can source a theme from a git repo
- History for themes is much improved
- You can only have a single enabled theme. Themes can be selected by
users, if you opt for it.
On a technical level this change comes with a whole bunch of goodies
- All CSS is now compiled using a custom pipeline that uses libsass
see /lib/stylesheet
- There is a single pipeline for css compilation (in the past we used
one for customizations and another one for the rest of the app
- The stylesheet pipeline is now divorced of sprockets, there is no
reliance on sprockets for CSS bundling
- CSS is generated with source maps everywhere (including themes) this
makes debugging much easier
- Our "live reloader" is smarter and avoid a flash of unstyled content
we run a file watcher in "puma" in dev so you no longer need to run
rake autospec to watch for CSS changes
FIX: keep the "uploading..." indicator until the server replies via the MessageBus
FIX: text was disapearing when uploading an avatar
PERF: always use a region for S3 (defaults to 'us-east-1')
FEATURE: ApplyCDN middleware when using S3
FIX: use the same pattern to store files on S3 and locally
PERF: keep a local cache of uploads when generating thumbnails
FEATURE: migrate_to_s3 rake task