Basically, say you had already downloaded a certain image from a certain URL
using pull_hotlinked_images and the onebox. The upload would be stored
by its sha as an upload record. Whenever you linked to the same URL again
in a post (e.g. in our case an og:image on review.discourse) we would
would reuse the original upload record because of the sha1.
However when you turned on secure media this could cause problems as
the first post that uses that upload after secure media is enabled
will set the access control post for the upload to the new post.
Then if the post is deleted every single onebox/link to that same image
URL will fail forever with 403 as the secure-media-uploads URL fails
if the access control post has been deleted.
To fix this when cooking posts and pulling hotlinked images, we only
allow using an original upload by URL if its access control post
matches the current post, and if the original_sha1 is filled in,
meaning it was uploaded AFTER secure media was enabled. otherwise
we just redownload the media again to be safe, as the URL will always
be new then.
The new search modifier `in:all` can be used to include both public and personal messages in the same search.
Co-authored-by: adam j hartz <hz@mit.edu>
* DEV: Add a fake Mutex that for concurrency testing with Fibers
* DEV: Support running in sleep order in concurrency tests
* FIX: A separate FallbackHandler should be used for each redis pair
This commit refactors the FallbackHandler and Connector:
* There were two different ways to determine whether the redis master
was up. There is now one way and it is the responsibility of the
new RedisStatus class.
* A background thread would be created whenever `verify_master` was
called unless the thread already existed. The thread would
periodically check the status of the redis master. However, checking
that a thread is `alive?` is an ineffective way of determining
whether it will continue to check the redis master in the future
since the thread may be in the process of winding down.
Now, this thread is created when the recorded master status goes from
up to down. Since this thread runs the only part of the code that is
able to bring the recorded status up again, we ensure that only one
thread is probing the redis master at a time and that there is always
a thread probing redis master when it is recorded as being down.
* Each time the status of the redis master was checked periodically, it
would spawn a new thread and immediately join on it. I assume this
happened to isolate the check from the current execution, but since
the join rethrows exceptions in the parent thread, this was not
effective.
* The logic for falling back was spread over the FallbackHandler and
the Connector. The connector is now a dumb object that delegates
responsibility for determining the status of redis to the
FallbackHandler.
* Previously, failing to connect to a master redis instance when it was
not recorded as down would raise an exception. Now, this exception is
passed to `Discourse.warn_exception` and the connection is made to
the slave.
This commit introduces the FallbackHandlers singleton:
* It is responsible for holding the set of FallbackHandlers.
* It adds callbacks to the fallback handlers for when a redis master
comes up or goes down. Main redis and message bus redis may exist on
different or the same redis hosts and so these callbacks may all
exist on the same FallbackHandler or on separate ones.
These objects are tested using fake concurrency provided by the
Concurrency module:
* An `around(:each)` hook is used to cause each test to run inside a
Scenario so that the test body, mocking cleanup and `after(:each)`
callbacks are run in a different Fiber.
* Therefore, holting the execution of the Execution abruptly (so that
the fibers aren't run to completion), prevents the mocking cleaning
and `after(:each)` callbacks from running. I have tried to prevent
this by recovering from all exceptions during an Execution.
* FIX: Create frozen copies of passed in config where possible
* FIX: extract start_reset method and remove method used by tests
Co-authored-by: Daniel Waterworth <me@danielwaterworth.com>
Add TopicUploadSecurityManager to handle post moves. When a post moves around or a topic changes between categories and public/private message status the uploads connected to posts in the topic need to have their secure status updated, depending on the security context the topic now lives in.
When we were pulling hotlinked images for oneboxes in the CookedPostProcessor, we were using the direct S3 URL, which returned a 403 error and thus did not set widths and heights of the images. We now cook the URL first based on whether the upload is secure before handing off to FastImage.
Let's say post #2 quotes post number #1. If a user decides to quote the
quote in post #2, it should keep the information of post #1
("user_1, post: 1, topic: X"), instead of replacing with current post
info ("user_2, post: 2, topic: X").
* enqueue spam/dmarc failing emails instead of hiding
* add translations for dmarc/spam enqueued reasons
* unescape quote
* if email_in_authserv_id is blank return gray for all emails
### General Changes and Duplication
* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file.
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.
### Viewing Secure Media
* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`
### Removed
We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.
* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
This ensures that the user object is created fresh for each example.
This is required for this particular spec as we can not risk having a stale
object, which can lead to a flaky spec.
People rarely want to have their avatars show up as the preview image on social media platforms. Instead, we should fall back to the site opengraph image.
It used to check how many quotes were inside a post, without taking
considering that some quotes can contain other quotes. This commit
selects only top level quotes.
I had to use XPath because I could not find an equivalent CSS
selector.
Meta thread: https://meta.discourse.org/t/sending-a-pm-with-the-following-title-causes-an-error/135654/3
We had an issue where if someone sent a PM with crazy
characters that are stripped and we end up with only
a number, the topic redirect errored because the slug was
a number. so instead we return the default as well if
the slug is a number after prettification
The ROTP gem is only used in a very small amount of places in the app, we don't need to globally require it.
Also set the Addressable gem to not have a specific version range, as it has not been a problem yet.
Some slight refactoring of UserSecondFactor here too to use SecondFactorManager to avoid code repetition
This is a slight workaround which helps somewhat now but is pending a larger
fix.
When this spec ran in parallel mode uploads could start cross talking and
an upload you expect to be there may vanish.
This works around the issue by making the upload unique every time it is
created
It also folds up an expensive test into the main one.
* DEV: Add API to alter uploads Markdown
* DEV: Extract data attributes from image / download Markdown
For example '[test|attachment|hello=world]' will generate an 'a' element
with a data attribute: 'data-hello=world'.
This commit also makes MarkdownIt to transform '|attachment' into
'class="attachment"'. This transformation used to be a part of the
process which resolves short URLs (i.e. upload://).
* DEV: Export imageNameFromFileName
Non UTF-8 user_agent requests were bypassing logging due to PG always
wanting UTF-8 strings.
This adds some conversion to ensure we are always dealing with UTF-8
This fixes the following issues:
* The link element on the lightbox which pops open the lightbox was linking to the S3 URL with a private ACL instead of the secure media URL for the image
* Change to use `@post.with_secure_media?` in `CookedPostProcessor` for URL cooking, as in some cases, like when a post is edited and an upload is added, `upload.secure?` can be false which resulted in `srcset` URLs not being cooked correctly to secure media upload urls.
This feature adds the ability to define synonyms for tags, and the ability to merge one tag into another while keeping it as a synonym. For example, tags named "js" and "java-script" can be synonyms of "javascript". When searching and creating topics using synonyms, they will be mapped to the base tag.
Along with this change is a new UI found on each tag's page (for example, `/tags/javascript`) where more information about the tag can be shown. It will list the synonyms, which categories it's restricted to (if any), and which tag groups it belongs to (if tag group names are public on the `/tags` page by enabling the "tags listed by group" setting). Staff users will be able to manage tags in this UI, merge tags, and add/remove synonyms.
This bug was causing some unusual behavior when the last post is filtered (e.g. from an ignored user). In some situations this would cause suggested topics to be omitted from the payload.
The next_page specs have been updated to remove most of the stubs
* FEATURE: Ability to add components to all themes
This is the first and functional step from that topic https://dev.discourse.org/t/adding-a-theme-component-is-too-much-work/15398/16
The idea here is that when a new component is added, the user can easily assign it to all themes (parents).
To achieve that, I needed to change a site-setting component to accept `setDefaultValues` action and `setDefaultValuesLabel` translated label.
Also, I needed to add `allowAny` option to disable that for theme selector.
I also refactored backend to accept both parent and child ids with one method to avoid duplication (Renamed `add_child_theme!` to more general `add_relative_theme!`)
* FIX: Improvement after code review
* FIX: Improvement after code review2
* FIX: use mapBy and filterBy directly
We already cache failed onebox URL requests client-side, we now want to cache this on the server-side for extra protection. failed onebox previews will be cached for 1 hour, and any more requests for that URL will fail with a 404 status. Forcing a rebake via the Rebake HTML action will delete the failed URL cache (like how the oneboxer preview cache is deleted).
If a user has more than 60 active sessions, the oldest sessions will be terminated automatically. This protects performance when logging in and when loading the list of recently used devices.
This is a bottom up rewrite of Discourse cache to support faster performance
and a limited surface area.
ActiveSupport::Cache::Store accepts many options we do not use, this partial
implementation only picks the bits out that we do use and want to support.
Additionally params are named which avoids typos such as "expires_at" vs "expires_in"
This also moves a few spots in Discourse to use Discourse.cache over setex
Performance of setex and Discourse.cache.write is similar.
Discourse.cache is a more consistent method to use and offers clean fallback
if you are skipping redis
This is part of a larger change that both optimizes Discoruse.cache and omits
use of setex on $redis in favor of consistently using discourse cache
Bench does reveal that use of Rails.cache and Discourse.cache is 1.25x slower
than redis.setex / get so a re-implementation will follow prior to porting
PG 12 changes internals in a subtle way, time jitter is noticed in a few new
spots (which is normal) and default ordering is a bit different which is meant
to be random anyway.
There was an issue on dev where when uploading secure media, the href of the media was correctly being replaced in the CookedPostProcessor, but the srcset urls were not being replaced correctly. This is because UrlHelper.cook_url was returning the asset host URL for the media for secure media instead of returning early with the proxied secure proxy url.
In non-login-required sites, we prevent secure uploads already used in PMs from being used in public topics.
In login_required sites, secure uploads should be reusable in any topic, PM or not.
Previously our custom exception handler was unable to handle situations
where an invalid mime type was sent, resulting in a warning log
This ensures we pretend a request is HTML for the purpose of rendering
the error page if an invalid mime type from a scanner is shipped to the app
* Fix an issue where if an edit was made to a post with a reason provided, and then another edit was made with no reason, the original edit reason got wiped out
* We now always make a post revision (even with ninja edits) if an edit reason has been provided and it is different from the current edit reason
Co-Authored-By: Sam <sam.saffron@gmail.com>
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access.
A few notes:
- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
Previously people were not consistent about mocking which left internals in
a fragile state when running subfolder specs.
This introduces a simple helper `set_subfolder` which you can use to set
the subfolder for the spec. It takes care of proper configuration of subfolder
and teardown.
```
# usage
set_subfolder "/my_amazing_subfolder"
```
You should no longer stub base_uri or global_settings
This version hash is used for the filename, and so browsers/CDNs cache based on it. Previously the version hash was based only on the list of requested icons. This can cause issues in a couple of situations, most commonly when developing themes with custom icons:
- A requested icon does not exist, and then later is added to the theme. The bundle output changes, but the hash did not
- The SVG content of an icon changes, but the name of the icon does not. The bundle output changes, but the hash did not
* When viewing a tag, the search widget will now show a checkbox to scope the search by tag, which will limit search results to that tag on desktop and mobile
This method had grown into a monster. Its query had bugs
that I couldn't fix, and new features would be hard to add.
Also I don't understand how it all works anymore...
Replace it with common table expressions that can be queried
to generate the results we need, instead of subtracting
results using lots of "NOT IN" clauses.
Fixed are bugs with tag schemas that use combinations of
tag groups, parent tags, and one-tag-per-topic restrictions.
For example: https://meta.discourse.org/t/130991/6
Previously we were always hard-coding expiry, this allows the secure session
to correctly handle custom expiry times
Also adds a ttl method for looking up time to live
Instead of enabling `suppress_from_latest` setting on many categories now we can enable `mute_all_categories_by_default` site setting. Then users should opt-in to categories for them to appear in the latest and categories pages.
This makes it easy to run multiple commands with the same keyword arguments. The main use is for using `chdir` across multiple commands. The `Dir.chdir` method is not concurrency safe because it switches the working directory of the entire process.
- Allow revoking keys without deleting them
- Auto-revoke keys after a period of no use (default 6 months)
- Allow multiple keys per user
- Allow attaching a description to each key, for easier auditing
- Log changes to keys in the staff action log
- Move all key management to one place, and improve the UI
This is a follow-up to the new feature that allows a category to
require a certain number of tags from a tag group. The tag input will
shows results from the required group if none have been chosen yet.
Once a require tag is selected, the tag input will include other
results as usual. Staff users can ignore this restriction, so the input
behaviour is unchanged for them.
* use image alt as a fallback when there's no title
* update spec
we used to check that the overlay information is added when the image has a titie. This adds 2 more scenarios. One where an image has both a title and an alt, in which case the title should be used and alt ignored.
The other is when there's only an alt, it should then be used to generate the overlay
Meta thread: https://meta.discourse.org/t/cant-dismiss-unread-if-last-post-is-an-assign-or-whisper/131823/7
* when sending a whisper, the highest_staff_post_number is set
in the next_post_number method for a Topic, but the
highest_post_number is left alone. this leaves a situation
where highest_staff_post_number is > highest_post_number
* when TopicsBulkAction#dismiss_posts was run, it was only setting the topic_user
highest_seen_post_number using the highest_post_number from the topic, so if
the user was staff and the last post in a topic was a whisper
their highest seen number was not set, and the topic stayed unread
Found through testing that the bug wasn't to do with Assign/Unassign as they do not affect the post numbers, only whispering does.
In a category's settings, the Tags tab has two new fields to
specify the number of tags that must be added to a topic
from a tag group. When creating a new topic, an error will be
shown to the user if the requirement isn't met.
This was not causing any known issue, because the system user ID is always the same across all sites. However, we should cache this on a per-site basis to be safe.
This ensures we only update last_posted_at which is user facing for non messages
and non whispers.
We still update this date for secure categories, we do not revert it for
deleted posts.
Adds the settings:
raw_email_max_length, raw_rejected_email_max_length, delete_rejected_email_after_days.
These settings control retention of the "raw" emails logs.
raw_email_max_length ensures that if we get incoming email that is huge we will truncate it removing uploads from the raw log.
raw_rejected_email_max_length introduces an even more aggressive truncation for rejected incoming mail.
delete_rejected_email_after_days controls how many days we will keep rejected emails for (default 90)
* FEATURE: Site setting/ui to allow users to set their primary group
* prettier and remove logic from account template
* added 1 to 43 to make web_hook_user_serializer_spec pass
While editing the first post it does't bumped the topic when the new post revision created. Because we wrongly assumed that the hidden tags are changed even when no tags are updated.
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:
pluck_first, equivalent to limit(1).pluck(*columns).first
and pluck_first! which, like other finder methods, raises an exception
when no record is found
Trying to truncate encoded slugs will mean that we have to keep the URL
valid, which can be tricky as you have to be aware of multibyte
characters.
Since we already have upper bounds for the title, the slug won't grow
for more than title*6 in the worst case. The slug column in the topic
table can store that just fine.
Added a test to ensure that a generated slug is a valid URL too, so we
don't introduce regressions in the future.
Previously the 'local_cdn_url' method didn't returned the correct cdn url. So we written few incorrect spec tests too.\n\nf92a6f7ac5228342177bf089d269e2f69a69e2f5
When an admin changes the site setting slug_generation_method to
encoded, we weren't really encoding the slug, but just allowing non-ascii
characters in the slug (unicode).
That brings problems when a user posts a link to topic without the slug, as
our topic controller tries to redirect the user to the correct URL that contains
the slug with unicode characters. Having unicode in the Location header in a
response is a RFC violation and some browsers end up in a redirection loop.
Bug report: https://meta.discourse.org/t/-/125371?u=falco
This commit also checks if a site uses encoded slugs and clear all saved slugs
in the db so they can be regenerated using an onceoff job.
Using popups is becoming increasingly rare. Full page redirects are already used on mobile, and for some providers. This commit removes all logic related to popup authentication, leaving only the full page redirect method.
For more info, see https://meta.discourse.org/t/do-we-need-popups-for-login/127988
* FEATURE: Adds an extra protection layer when decompressing files.
* Rename exporter/importer to zip importer. Update old locale
* Added a new composite class to decompress a file with multiple strategies
* Set max file size inside a site setting
* Ensure that file is deleted after compression
* Sanitize path and files before compressing/decompressing
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
This means that TL0 users can message groups with "Who can message this
group?" set to "Everyone".
It also means that members of a group with "Who can message this
group?" set to "members, moderators and admins" can also message the
group, even when their trust level is below min_trust_to_send_messages.
* Adjustments to pass specs on Rails 6.0.0
* Use classic autoloader instead of Zeitwerk
* Update Rails 6.0.0 deprecated methods
* Rails 6.0.0 not allowing column with integer name
* Drop freedom_patches/rails6.rb
* Default value for trigger_transactional_callbacks? is true
* Bump rspec-rails version to 4.0.0.beta2
* FEATURE: Add tl2 threshold for editing new posts
* Adds a new setting and for tl2 editing posts (30 days same as old value)
* Sets the tl0/tl1 editing period as 1 day
* FIX: Spec uses wrong setting
* Fix site setting on guardian spec
* FIX: post editing period specs
* Avoid shared examples
* Use update_columns to avoid callbacks on user during tests
This commit introduces 2 features:
1. DISCOURSE_COMPRESS_ANON_CACHE (true|false, default false): this allows
you to optionally compress the anon cache body entries in Redis, can be
useful for high load sites with Redis that lives on a separate server to
to webs
2. DISCOURSE_ANON_CACHE_STORE_THRESHOLD (default 2), only pop entries into
redis if we observe them more than N times. This avoids situations where
a crawler can walk a big pile of topics and store them all in Redis never
to be used. Our default anon cache time for topics is only 60 seconds. Anon
cache is in place to avoid the "slashdot" effect where a single topic is
hit by 100s of people in one minute.
Start tracking the date an api key was last used. This has already been
the case for user_api_keys.
This information can provide us with the ability to automatically expire
unused api keys after N days.
This allows custom plugins such as prometheus exporter to log how many
requests are stored in the anon cache vs used by the anon cache.
This metric allows us to fine tune cache behaviors
* Revert "Revert "FEATURE: Publish read state on group messages. (#7989) [Undo revert] (#8024)""
This reverts commit 36425eb9f0.
* Fix: Show who read only if the attribute is enabled
* PERF: Precalculate the last post readed by a group member
* Use book-reader icon instear of far-eye
* FIX: update topic groups correctly
* DEV: Tidy up read indicator update on write
This is a very long standing bug we had, if a plugin attempted to amend a
serializer core was not "correcting" the situation for all descendant classes
this often only showed up in production cause production eager loads serializers
prior to plugins amending them.
This is a critical fix for various plugins
* Reenable: "FEATURE: Publish read state on group messages. (#7989)"
This reverts commit 67f5cc1ce8.
* FIX: Read indicator only appears when the group setting is enabled
* Enable or disable read state based on group attribute
* When read state needs to be published, the minimum unread count is calculated in the topic query. This way, we can know if someone reads the last post
* The option can be enabled/disabled from the UI
* The read indicator will live-updated using message bus
* Show read indicator on every post
* The read indicator now shows read count and can be expanded to see user avatars
* Read count gets updated everytime someone reads a message
* Simplify topic-list read indicator logic
* Unsubscribe from message bus on willDestroyElement, removed unnecesarry values from post-menu, and added a comment to explain where does minimum_unread_count comes from
There are 5 visibility levels (similar to group visibility)
public (default)
logged-in users
members only
staff
owners
Admins & group owners always have visibility to group members.
When a user liked, unliked and liked again the same post, the poster
would receive a notification such as "X and X liked ...". This happened
because PostActionNotifier.post_action_created was called twice.
All posts created by the user are counted unless they are deleted,
belong to a PM sent between a non-human user and the user or belong
to a PM created by the user which doesn't have any other recipients.
It also makes the guardian prevent self-deletes when SSO is enabled.
- Client-side censoring fixed for non-chrome browsers. (Regular expression rewritten to avoid lookback)
- Regex generation is now done on the server, to reduce repeated logic, and make it easier to extend in plugins
- Censor tests are moved to ruby, to ensure everything works end-to-end
- If "watched words regular expressions" is enabled, warn the admin when the generated regex is invalid
This feature is off by default and can can be configured with the `email_total_attachment_size_limit_kb` site setting.
Co-authored-by: Maja Komel <maja.komel@gmail.com>
* FEATURE: Add search operator to see all direct messages from a user
* Only show message if related messages >= 5
* Make "all messages" the hyperlink
* Review
If a post arrives via email but must be reviewed, we now show an
icon that can be clicked to view the raw contents of the email.
This is useful if Discourse's email parser is acting odd and the user
reviewing the post wants to know what the original contents were before
approving/rejecting the post.
* Revert "Revert "FEATURE: admin/user exports are compressed using the zip format (#7784)""
This reverts commit f89bd55576.
* Replace .tar.zip with .zip
* FEATURE: admin/user exports are compressed using the zip format
* Update translations. Theme exporter now exports .zip file. Theme importer supports .zip and .gz files
* Fix controller test, updated locale and skip saving the csv export to disk
Context: https://meta.discourse.org/t/121589
This new setting option lets group owners message/mention large groups
without granting that privilege to all members.
Groups can now be marked as visible to "logged on users". All automatic groups (except `everyone`) are now visible to "logged on users", previously they were marked as public but suppressed in the group page for non-staff.
If a database exception is raised ActiveRecord will always rollback
even if caught.
Instead we build the query in manual SQL and DO NOTHING when there's a
conflict. If we detect nothing was done, perform an update.
The behaviour of #TERM in search has been amended
1. We try category or subcategory slugs
2. We try tags
3. We try tag-groups
The term `hello #my-group` will search for all posts tagged with any of
the tags in the tag group `My Group`
Future work may be introducing a slug cache here or caching it in the table
but the assumption is that the number of tag groups will not be huge
Adds a second factor landing page that centralizes a user's second factor configuration.
This contains both TOTP and Backup, and also allows multiple TOTP tokens to be registered and organized by a name. Access to this page is authenticated via password, and cached for 30 minutes via a secure session.
This can cause unbound CPU usage in some cases, and excessive logging in other cases. This commit moves redis readonly information into the local process, but maintains the DistributedCache for postgres readonly state.
* Support private uploads in S3
* Use localStore for local avatars
* Add job to update private upload ACL on S3
* Test multisite paths
* update ACL for private uploads in migrate_to_s3 task