Recently, we added feature that we are sending `/muted` to users who muted specific topic just before `/latest` so the client knows to ignore those messages - https://github.com/discourse/discourse/pull/9482
Same `/muted` message should be included when the post is edited
TLDR; this commit vastly improves how whitespaces are handled when converting from HTML to Markdown.
It also adds support for converting HTML <tables> to markdown tables.
The previous 'remove_whitespaces!' method was traversing the whole HTML tree and used a heuristic to remove
leading and trailing whitespaces whenever it was appropriate (ie. mostly before and after HTML block elements)
It was a good idea, but it was very limited and leaded to bad conversion when the html had leading whitespaces on several lines for example.
One such example can be found [here](https://meta.discourse.org/t/86782).
For various reasons, most of the whitespaces in a HTML file is ignored when the page is being displayed in a browser.
The rules that the browsers follow are the [CSS' White Space Processing Rules](https://www.w3.org/TR/css-text-3/#white-space-rules).
They can be quite complicated when you take into account RTL languages and other various tidbits but they boils down to the following:
- Collapse whitespaces down to one space (0x20) inside an inline context (ie. nodes/tags that are being displaying on the same line)
- Remove any leading/trailing whitespaces inside an inline context
One quick & dirty way of getting this 90% solved would be to do 'HTML.gsub!(/[[:space:]]+/, " ")'.
We would also need to hoist <pre> elements in order to not mess with their whitespaces.
Unfortunately, this solution let some whitespaces creep around HTML tags which leads to more '.strip!' calls than I can bear.
I decided to "emulate" the browser's handling of whitespaces and came up with a solution in 4 parts
1. remove_not_allowed!
The HtmlToMarkdown library is recursively "visiting" all the nodes in the HTML in order to convert them to Markdown.
All the nodes that aren't handled by the library (eg. <script>, <style> or any non-textual HTML tags) are "swallowed".
In order to reduce the number of nodes visited, the method 'remove_not_allowed!' will automatically delete all the nodes
that have no "visitor" (eg. a 'visit_<tag>' method) defined.
2. remove_hidden!
Similar purpose as the previous method (eg. reducing number of nodes visited), there's no point trying to convert something that is hidden.
The 'remove_hidden!' method removes any nodes that was hidden using the "hidden" HTML attribute, some CSS or with a width or height equal to 0.
3. hoist_line_breaks!
The 'hoist_line_breaks!' method is there to handle <br> tags. I know those tiny <br> don't do much but they can be quite annoying.
The <br> tags are inline elements but they visually work like a block element (ie. they create a new line).
If you have the following HTML "<i>Foo<br>Bar</i>", it ends up visually similar to "<i>Foo</i><br><i>Bar</i>".
The latter being much more easy to process than the former, so that's what this method is doing.
The "hoist_line_breaks" will hoist <br> tags out of inline tags until their parent is a block element.
4. remove_whitespaces!
The "remove_whitespaces!" is where all the whitespace removal is happening. It's broken down into 4 methods as well
- remove_whitespaces!
- is_inline?
- collapse_spaces!
- remove_trailing_space!
The 'remove_whitespace!' method is recursively walking the HTML tree (skipping <pre> tags).
If a node has any children, they will be chunked into groups of inline elements vs block elements.
For each chunks of inline elements, it will call the "collapse_space!" and "remove_trailing_space!" methods.
For each chunks of block elements, it will call "remote_whitespace!" to keep walking the HTML tree recursively.
The "is_inline?" method determines whether a node is part of a inline context.
A node is inline iif it's a text node or it's an inline tag, but not <br>, and all its children are also inline.
The "collapse_spaces!" method will collapse any kind of (white) space into a single space (" ") character, even accros tags.
For example, if we have " Foo \n<i> Bar </i>\t42", it will return "Foo <i>Bar </i>42".
Finally, the "remove_trailing_space!" method is there to remove any trailing space that might creep in at the end of the inline chunk.
This solution is not 100% bullet-proof.
It does not support RTL languages at all and has some caveats that I felt were not worth the work to get properly fixed.
FIX: better detection of hidden elements when converting HTML to Markdown
FIX: take into account the 'allowed_href_schemes' site setting when converting HTML <a> to Markdown
FIX: added support for 'mailto:' scheme when converting <a> from HTML to Markdown
FIX: added support for <img> dimensions when converting from HTML to Markdown
FIX: added support for <dl>, <dd> and <dt> when converting from HTML to Markdown
FIX: added support for multilines emphases, strongs and strikes when converting from HTML to Markdown
FIX: added support for <acronym> when converting from HTML to Markdown
DEV: remove unused 'sanitize' gem
Wow, did you just read all that?! Congratz, here's a cookie: 🍪.
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.
For files with `# frozen_string_literal: true`
```
puts %w{a b}[0].frozen?
=> true
puts "hi".frozen?
=> true
puts "a #{1} b".frozen?
=> true
puts ("a " + "b").frozen?
=> false
puts (-("a " + "b")).frozen?
=> true
```
For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
Trigger an event for plugins to consume when a user session is refreshed.
This allows external auth to be notified about account activity, and be
able to take action such as use oauth refresh tokens to keep oauth
tokens valid.
If the feature is enabled, staff members can construct a URL and publish a
topic for others to browse without the regular Discourse chrome.
This is useful if you want to use Discourse like a CMS and publish
topics as articles, which can then be embedded into other systems.
* DEPRECATION: Remove support for api creds in query params
This commit removes support for api credentials in query params except
for a few whitelisted routes like rss/json feeds and the handle_mail
route.
Several tests were written to valid these changes, but the bulk of the
spec changes are just switching them over to use header based auth so
that they will pass without changing what they were actually testing.
Original commit that notified admins this change was coming was created
over 3 months ago: 2db2003187
* fix tests
* Also allow iCalendar feeds
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
* FIX: guardian always got user but sometimes it is anonymous
```
def initialize(user = nil, request = nil)
@user = user.presence || AnonymousUser.new
@request = request
end
```
AnonymouseUser defines `blank?` method
```
class AnonymousUser
def blank?
true
end
...
end
```
so if we would use @user.present? it would be correct, however, just @user is always true
* FEATURE: add setting `auto_approve_email_domains` to auto approve users
This commit adds a new site setting `auto_approve_email_domains` to
auto approve users based on their email address domain.
Note that if a domain already exists in `email_domains_whitelist` then
`auto_approve_email_domains` needs to be duplicated there as well,
since users won’t be able to register with email address that is
not allowed in `email_domains_whitelist`.
* Update config/locales/server.en.yml
Co-Authored-By: Robin Ward <robin.ward@gmail.com>
* FIX: Perform crop using user-specified image sizes
It used to resize the images to max width and height first and then
perform the crop operation. This is wrong because it ignored the user
specified image sizes from the Markdown.
* DEV: Use real images in test
Previously we would consider a user "present" and "last seen" if the
browser window was visible.
This has many edge cases, you could be considered present and around for
days just by having a window open and no screensaver on.
Instead we now also check that you either clicked, transitioned around app
or scrolled the page in the last minute in combination with window
visibility
This will lead to more reliable notifications via email and reduce load of
message bus for cases where a user walks away from the terminal
Previous to this change slugs for leaves in 3 level nestings would not work
Our UX picks only the last two levels
This also makes the results consistent for slugs as it enforces order.
- Define the CSP based on the requested domain / scheme (respecting force_https)
- Update EnforceHostname middleware to allow secondary domains, add specs
- Add URL scheme to anon cache key so that CSP headers are cached correctly
New `duration` attribute is introduced for the `set_or_create_timer` method in the commit aad12822b7 for "based on last post" and "auto delete replies" topic timers.
Two behaviors in the mail gem collide:
1. Attachments are added as extra parts at the top level,
2. When there are both text and html parts, the content type is set to
'multipart/alternative'.
Since attachments aren't alternative renderings, for emails that contain
attachments and both html and text parts, some coercing is necessary.
* This PR implements the scheduling and notification system for bookmark reminders. Every 5 minutes a schedule runs to check any reminders that need to be sent before now, limited to **300** reminders at a time. Any leftover reminders will be sent in the next run. This is to avoid having to deal with fickle sidekiq and reminders in the far-flung future, which would necessitate having a background job anyway to clean up any missing `enqueue_at` reminders.
* If a reminder is sent its `reminder_at` time is cleared and the `reminder_last_sent_at` time is filled in. Notifications are only user-level notifications for now.
* All JavaScript and frontend code related to displaying the bookmark reminder notification is contained here. The reminder functionality is now re-enabled in the bookmark modal as well.
* This PR also implements the "Remind me next time I am at my desktop" bookmark reminder functionality. When the user is on a mobile device they are able to select this option. When they choose this option we set a key in Redis saying they have a pending at desktop reminder. The next time they change devices we check if the new device is desktop, and if it is we send reminders using a DistributedMutex. There is also a job to ensure consistency of these reminders in Redis (in case Redis drops the ball) and the at desktop reminders expire after 20 days.
* Also in this PR is a fix to delete all Bookmarks for a user via `UserDestroyer`
There are three modifiers:
- serialize_topic_excerpts (boolean)
- csp_extensions (array of strings)
- svg_icons (array of strings)
When multiple themes are active, the values will be combined. The combination method varies based on the setting. CSP/SVG arrays will be combined. serialize_topic_excerpts will use `Enumerable#any`.
PostMover passes to PostCreator a `created_at` that is a `ActiveSupport::WithTimeZone` instance (and also `is_a? Time`). Previously it was always being passed through `Time.zone.parse` so it would lose sub-second information. Now, it takes `Time` input as-is, while still parsing other types.
This is because the TOTP gem identifies as a colon as an addressable
protocol. The solution for now is to remove the colon in the issuer
name.
Changing the issuer changes the token values, but now it was completely
broken for colons so this should not be breaking anyone new.
A follow-up correction to this change https://github.com/discourse/discourse/pull/9001.
When admin changes staff email still enforce old email confirm. Only allow auto-confirm of a new email by admin IF the target user is not also an admin. If an admin gets locked out of their email the site admin can use the rails console to solve the issue in a pinch.
When admin changes a user's email from the preferences page of that user:
* The user will not be sent an email to confirm that their
email is changing. They will be sent a reset password email
so they can set the password for their account at the new
email address.
* The user will still be sent an email to their old email to inform
them that it was changed.
* Admin and staff users still need to follow the same old + new
confirm process, as do users changing their own email.
If a group mention could be notified on preview it was given an `<a>`
tag with the `.notify` class. When cooked it would display differently.
This patch makes the server side cooking match the client preview.
After adding a tag as a synonym of another tag,
both tags will have the wrong topic counts. It's
corrected within 12 hours by the EnsureDbConsistency
job. This fix ensures the topic counts are updated
much sooner.
This is not used in core or official plugins, and has been printing a deprecation notice since v2.3.0beta4. All OpenID 2.0 code and dependencies have been dropped. The user_open_ids table remains for now, in case anyone has missed the deprecation notice, and needs to migrate their data.
Context at https://meta.discourse.org/t/-/113249
This commit removes logic about spoilers because it should live inside
of the discourse-spoiler-alert plugin.
This PR:
https://github.com/discourse/discourse-spoiler-alert/pull/38
also completely removes spoilers from excerpts in order to keep them
from leaking in topic previews and notifications.
Some auth providers (e.g. Auth0 with default configuration) send the email address in the name field. In Discourse, the name field is made public, so this commit adds a safeguard to prevent emails being made public.
For some reasons, we have two ways of associating "custom fields" to a new topic:
using 'meta_data' and 'custom_fields'.
However, if we were to provide both arguments, the 'meta_data' would be overwritten
by any 'custom_fields' provided.
This commit ensures we can use both and merges the 'custom_fields' with the 'meta_data'.
This fix ensures that the site setting `post_edit_time_limit` does not
bypass the limit of the site setting `min_trust_to_edit_post`. This
prevents a bug where users that did not meet the minimum trust level to
edit could edit the title of topics.
When we change upload's sha1 (e.g. when resizing images) it won't match the data in the most recent S3 inventory index. With this change the uploads that have been updated since the inventory has been generated are ignored.
When FinalDestination is given a URL it encodes it before doing anything else. however S3 presigned URLs should not be messed with in any way otherwise we can end up with 400 errors when downloading the URL e.g.
<Error><Code>InvalidToken</Code><Message>The provided token is malformed or otherwise invalid.</Message>
The signature of presigned URLs is very important and is automatically generated and should be preserved.
For example /t/ URLs were being replaced if they contained secure-media-uploads so if you made a topic called "Secure Media Uploads Are Cool" the View Topic link in the user notifications would be stripped out.
Refactored code so this secure URL detection happens in one place.
Previously if somehow a user created a blank markdown document using tag
tricks (eg `<p></p><p></p><p></p><p></p><p></p><p></p>`) and so on, we would
completely strip the document down to blank on post process due to onebox
hack.
Needs a followup cause I am still unclear about the reason for empty p stripping
and it can cause some unclear cases when we re-cook posts.
Basically, say you had already downloaded a certain image from a certain URL
using pull_hotlinked_images and the onebox. The upload would be stored
by its sha as an upload record. Whenever you linked to the same URL again
in a post (e.g. in our case an og:image on review.discourse) we would
would reuse the original upload record because of the sha1.
However when you turned on secure media this could cause problems as
the first post that uses that upload after secure media is enabled
will set the access control post for the upload to the new post.
Then if the post is deleted every single onebox/link to that same image
URL will fail forever with 403 as the secure-media-uploads URL fails
if the access control post has been deleted.
To fix this when cooking posts and pulling hotlinked images, we only
allow using an original upload by URL if its access control post
matches the current post, and if the original_sha1 is filled in,
meaning it was uploaded AFTER secure media was enabled. otherwise
we just redownload the media again to be safe, as the URL will always
be new then.
The new search modifier `in:all` can be used to include both public and personal messages in the same search.
Co-authored-by: adam j hartz <hz@mit.edu>
* DEV: Add a fake Mutex that for concurrency testing with Fibers
* DEV: Support running in sleep order in concurrency tests
* FIX: A separate FallbackHandler should be used for each redis pair
This commit refactors the FallbackHandler and Connector:
* There were two different ways to determine whether the redis master
was up. There is now one way and it is the responsibility of the
new RedisStatus class.
* A background thread would be created whenever `verify_master` was
called unless the thread already existed. The thread would
periodically check the status of the redis master. However, checking
that a thread is `alive?` is an ineffective way of determining
whether it will continue to check the redis master in the future
since the thread may be in the process of winding down.
Now, this thread is created when the recorded master status goes from
up to down. Since this thread runs the only part of the code that is
able to bring the recorded status up again, we ensure that only one
thread is probing the redis master at a time and that there is always
a thread probing redis master when it is recorded as being down.
* Each time the status of the redis master was checked periodically, it
would spawn a new thread and immediately join on it. I assume this
happened to isolate the check from the current execution, but since
the join rethrows exceptions in the parent thread, this was not
effective.
* The logic for falling back was spread over the FallbackHandler and
the Connector. The connector is now a dumb object that delegates
responsibility for determining the status of redis to the
FallbackHandler.
* Previously, failing to connect to a master redis instance when it was
not recorded as down would raise an exception. Now, this exception is
passed to `Discourse.warn_exception` and the connection is made to
the slave.
This commit introduces the FallbackHandlers singleton:
* It is responsible for holding the set of FallbackHandlers.
* It adds callbacks to the fallback handlers for when a redis master
comes up or goes down. Main redis and message bus redis may exist on
different or the same redis hosts and so these callbacks may all
exist on the same FallbackHandler or on separate ones.
These objects are tested using fake concurrency provided by the
Concurrency module:
* An `around(:each)` hook is used to cause each test to run inside a
Scenario so that the test body, mocking cleanup and `after(:each)`
callbacks are run in a different Fiber.
* Therefore, holting the execution of the Execution abruptly (so that
the fibers aren't run to completion), prevents the mocking cleaning
and `after(:each)` callbacks from running. I have tried to prevent
this by recovering from all exceptions during an Execution.
* FIX: Create frozen copies of passed in config where possible
* FIX: extract start_reset method and remove method used by tests
Co-authored-by: Daniel Waterworth <me@danielwaterworth.com>
Add TopicUploadSecurityManager to handle post moves. When a post moves around or a topic changes between categories and public/private message status the uploads connected to posts in the topic need to have their secure status updated, depending on the security context the topic now lives in.
When we were pulling hotlinked images for oneboxes in the CookedPostProcessor, we were using the direct S3 URL, which returned a 403 error and thus did not set widths and heights of the images. We now cook the URL first based on whether the upload is secure before handing off to FastImage.
Let's say post #2 quotes post number #1. If a user decides to quote the
quote in post #2, it should keep the information of post #1
("user_1, post: 1, topic: X"), instead of replacing with current post
info ("user_2, post: 2, topic: X").
* enqueue spam/dmarc failing emails instead of hiding
* add translations for dmarc/spam enqueued reasons
* unescape quote
* if email_in_authserv_id is blank return gray for all emails
### General Changes and Duplication
* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file.
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.
### Viewing Secure Media
* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`
### Removed
We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.
* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
This ensures that the user object is created fresh for each example.
This is required for this particular spec as we can not risk having a stale
object, which can lead to a flaky spec.
People rarely want to have their avatars show up as the preview image on social media platforms. Instead, we should fall back to the site opengraph image.
It used to check how many quotes were inside a post, without taking
considering that some quotes can contain other quotes. This commit
selects only top level quotes.
I had to use XPath because I could not find an equivalent CSS
selector.
Meta thread: https://meta.discourse.org/t/sending-a-pm-with-the-following-title-causes-an-error/135654/3
We had an issue where if someone sent a PM with crazy
characters that are stripped and we end up with only
a number, the topic redirect errored because the slug was
a number. so instead we return the default as well if
the slug is a number after prettification
The ROTP gem is only used in a very small amount of places in the app, we don't need to globally require it.
Also set the Addressable gem to not have a specific version range, as it has not been a problem yet.
Some slight refactoring of UserSecondFactor here too to use SecondFactorManager to avoid code repetition
This is a slight workaround which helps somewhat now but is pending a larger
fix.
When this spec ran in parallel mode uploads could start cross talking and
an upload you expect to be there may vanish.
This works around the issue by making the upload unique every time it is
created
It also folds up an expensive test into the main one.
* DEV: Add API to alter uploads Markdown
* DEV: Extract data attributes from image / download Markdown
For example '[test|attachment|hello=world]' will generate an 'a' element
with a data attribute: 'data-hello=world'.
This commit also makes MarkdownIt to transform '|attachment' into
'class="attachment"'. This transformation used to be a part of the
process which resolves short URLs (i.e. upload://).
* DEV: Export imageNameFromFileName
Non UTF-8 user_agent requests were bypassing logging due to PG always
wanting UTF-8 strings.
This adds some conversion to ensure we are always dealing with UTF-8
This fixes the following issues:
* The link element on the lightbox which pops open the lightbox was linking to the S3 URL with a private ACL instead of the secure media URL for the image
* Change to use `@post.with_secure_media?` in `CookedPostProcessor` for URL cooking, as in some cases, like when a post is edited and an upload is added, `upload.secure?` can be false which resulted in `srcset` URLs not being cooked correctly to secure media upload urls.
This feature adds the ability to define synonyms for tags, and the ability to merge one tag into another while keeping it as a synonym. For example, tags named "js" and "java-script" can be synonyms of "javascript". When searching and creating topics using synonyms, they will be mapped to the base tag.
Along with this change is a new UI found on each tag's page (for example, `/tags/javascript`) where more information about the tag can be shown. It will list the synonyms, which categories it's restricted to (if any), and which tag groups it belongs to (if tag group names are public on the `/tags` page by enabling the "tags listed by group" setting). Staff users will be able to manage tags in this UI, merge tags, and add/remove synonyms.
This bug was causing some unusual behavior when the last post is filtered (e.g. from an ignored user). In some situations this would cause suggested topics to be omitted from the payload.
The next_page specs have been updated to remove most of the stubs
* FEATURE: Ability to add components to all themes
This is the first and functional step from that topic https://dev.discourse.org/t/adding-a-theme-component-is-too-much-work/15398/16
The idea here is that when a new component is added, the user can easily assign it to all themes (parents).
To achieve that, I needed to change a site-setting component to accept `setDefaultValues` action and `setDefaultValuesLabel` translated label.
Also, I needed to add `allowAny` option to disable that for theme selector.
I also refactored backend to accept both parent and child ids with one method to avoid duplication (Renamed `add_child_theme!` to more general `add_relative_theme!`)
* FIX: Improvement after code review
* FIX: Improvement after code review2
* FIX: use mapBy and filterBy directly
We already cache failed onebox URL requests client-side, we now want to cache this on the server-side for extra protection. failed onebox previews will be cached for 1 hour, and any more requests for that URL will fail with a 404 status. Forcing a rebake via the Rebake HTML action will delete the failed URL cache (like how the oneboxer preview cache is deleted).
If a user has more than 60 active sessions, the oldest sessions will be terminated automatically. This protects performance when logging in and when loading the list of recently used devices.
This is a bottom up rewrite of Discourse cache to support faster performance
and a limited surface area.
ActiveSupport::Cache::Store accepts many options we do not use, this partial
implementation only picks the bits out that we do use and want to support.
Additionally params are named which avoids typos such as "expires_at" vs "expires_in"
This also moves a few spots in Discourse to use Discourse.cache over setex
Performance of setex and Discourse.cache.write is similar.
Discourse.cache is a more consistent method to use and offers clean fallback
if you are skipping redis
This is part of a larger change that both optimizes Discoruse.cache and omits
use of setex on $redis in favor of consistently using discourse cache
Bench does reveal that use of Rails.cache and Discourse.cache is 1.25x slower
than redis.setex / get so a re-implementation will follow prior to porting
PG 12 changes internals in a subtle way, time jitter is noticed in a few new
spots (which is normal) and default ordering is a bit different which is meant
to be random anyway.
There was an issue on dev where when uploading secure media, the href of the media was correctly being replaced in the CookedPostProcessor, but the srcset urls were not being replaced correctly. This is because UrlHelper.cook_url was returning the asset host URL for the media for secure media instead of returning early with the proxied secure proxy url.
In non-login-required sites, we prevent secure uploads already used in PMs from being used in public topics.
In login_required sites, secure uploads should be reusable in any topic, PM or not.
Previously our custom exception handler was unable to handle situations
where an invalid mime type was sent, resulting in a warning log
This ensures we pretend a request is HTML for the purpose of rendering
the error page if an invalid mime type from a scanner is shipped to the app
* Fix an issue where if an edit was made to a post with a reason provided, and then another edit was made with no reason, the original edit reason got wiped out
* We now always make a post revision (even with ninja edits) if an edit reason has been provided and it is different from the current edit reason
Co-Authored-By: Sam <sam.saffron@gmail.com>
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access.
A few notes:
- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
Previously people were not consistent about mocking which left internals in
a fragile state when running subfolder specs.
This introduces a simple helper `set_subfolder` which you can use to set
the subfolder for the spec. It takes care of proper configuration of subfolder
and teardown.
```
# usage
set_subfolder "/my_amazing_subfolder"
```
You should no longer stub base_uri or global_settings
This version hash is used for the filename, and so browsers/CDNs cache based on it. Previously the version hash was based only on the list of requested icons. This can cause issues in a couple of situations, most commonly when developing themes with custom icons:
- A requested icon does not exist, and then later is added to the theme. The bundle output changes, but the hash did not
- The SVG content of an icon changes, but the name of the icon does not. The bundle output changes, but the hash did not
* When viewing a tag, the search widget will now show a checkbox to scope the search by tag, which will limit search results to that tag on desktop and mobile
This method had grown into a monster. Its query had bugs
that I couldn't fix, and new features would be hard to add.
Also I don't understand how it all works anymore...
Replace it with common table expressions that can be queried
to generate the results we need, instead of subtracting
results using lots of "NOT IN" clauses.
Fixed are bugs with tag schemas that use combinations of
tag groups, parent tags, and one-tag-per-topic restrictions.
For example: https://meta.discourse.org/t/130991/6
Previously we were always hard-coding expiry, this allows the secure session
to correctly handle custom expiry times
Also adds a ttl method for looking up time to live
Instead of enabling `suppress_from_latest` setting on many categories now we can enable `mute_all_categories_by_default` site setting. Then users should opt-in to categories for them to appear in the latest and categories pages.
This makes it easy to run multiple commands with the same keyword arguments. The main use is for using `chdir` across multiple commands. The `Dir.chdir` method is not concurrency safe because it switches the working directory of the entire process.
- Allow revoking keys without deleting them
- Auto-revoke keys after a period of no use (default 6 months)
- Allow multiple keys per user
- Allow attaching a description to each key, for easier auditing
- Log changes to keys in the staff action log
- Move all key management to one place, and improve the UI
This is a follow-up to the new feature that allows a category to
require a certain number of tags from a tag group. The tag input will
shows results from the required group if none have been chosen yet.
Once a require tag is selected, the tag input will include other
results as usual. Staff users can ignore this restriction, so the input
behaviour is unchanged for them.
* use image alt as a fallback when there's no title
* update spec
we used to check that the overlay information is added when the image has a titie. This adds 2 more scenarios. One where an image has both a title and an alt, in which case the title should be used and alt ignored.
The other is when there's only an alt, it should then be used to generate the overlay
Meta thread: https://meta.discourse.org/t/cant-dismiss-unread-if-last-post-is-an-assign-or-whisper/131823/7
* when sending a whisper, the highest_staff_post_number is set
in the next_post_number method for a Topic, but the
highest_post_number is left alone. this leaves a situation
where highest_staff_post_number is > highest_post_number
* when TopicsBulkAction#dismiss_posts was run, it was only setting the topic_user
highest_seen_post_number using the highest_post_number from the topic, so if
the user was staff and the last post in a topic was a whisper
their highest seen number was not set, and the topic stayed unread
Found through testing that the bug wasn't to do with Assign/Unassign as they do not affect the post numbers, only whispering does.
In a category's settings, the Tags tab has two new fields to
specify the number of tags that must be added to a topic
from a tag group. When creating a new topic, an error will be
shown to the user if the requirement isn't met.
This was not causing any known issue, because the system user ID is always the same across all sites. However, we should cache this on a per-site basis to be safe.
This ensures we only update last_posted_at which is user facing for non messages
and non whispers.
We still update this date for secure categories, we do not revert it for
deleted posts.
Adds the settings:
raw_email_max_length, raw_rejected_email_max_length, delete_rejected_email_after_days.
These settings control retention of the "raw" emails logs.
raw_email_max_length ensures that if we get incoming email that is huge we will truncate it removing uploads from the raw log.
raw_rejected_email_max_length introduces an even more aggressive truncation for rejected incoming mail.
delete_rejected_email_after_days controls how many days we will keep rejected emails for (default 90)
* FEATURE: Site setting/ui to allow users to set their primary group
* prettier and remove logic from account template
* added 1 to 43 to make web_hook_user_serializer_spec pass
While editing the first post it does't bumped the topic when the new post revision created. Because we wrongly assumed that the hidden tags are changed even when no tags are updated.
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:
pluck_first, equivalent to limit(1).pluck(*columns).first
and pluck_first! which, like other finder methods, raises an exception
when no record is found
Trying to truncate encoded slugs will mean that we have to keep the URL
valid, which can be tricky as you have to be aware of multibyte
characters.
Since we already have upper bounds for the title, the slug won't grow
for more than title*6 in the worst case. The slug column in the topic
table can store that just fine.
Added a test to ensure that a generated slug is a valid URL too, so we
don't introduce regressions in the future.
Previously the 'local_cdn_url' method didn't returned the correct cdn url. So we written few incorrect spec tests too.\n\nf92a6f7ac5228342177bf089d269e2f69a69e2f5
When an admin changes the site setting slug_generation_method to
encoded, we weren't really encoding the slug, but just allowing non-ascii
characters in the slug (unicode).
That brings problems when a user posts a link to topic without the slug, as
our topic controller tries to redirect the user to the correct URL that contains
the slug with unicode characters. Having unicode in the Location header in a
response is a RFC violation and some browsers end up in a redirection loop.
Bug report: https://meta.discourse.org/t/-/125371?u=falco
This commit also checks if a site uses encoded slugs and clear all saved slugs
in the db so they can be regenerated using an onceoff job.
Using popups is becoming increasingly rare. Full page redirects are already used on mobile, and for some providers. This commit removes all logic related to popup authentication, leaving only the full page redirect method.
For more info, see https://meta.discourse.org/t/do-we-need-popups-for-login/127988
* FEATURE: Adds an extra protection layer when decompressing files.
* Rename exporter/importer to zip importer. Update old locale
* Added a new composite class to decompress a file with multiple strategies
* Set max file size inside a site setting
* Ensure that file is deleted after compression
* Sanitize path and files before compressing/decompressing
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
This means that TL0 users can message groups with "Who can message this
group?" set to "Everyone".
It also means that members of a group with "Who can message this
group?" set to "members, moderators and admins" can also message the
group, even when their trust level is below min_trust_to_send_messages.
* Adjustments to pass specs on Rails 6.0.0
* Use classic autoloader instead of Zeitwerk
* Update Rails 6.0.0 deprecated methods
* Rails 6.0.0 not allowing column with integer name
* Drop freedom_patches/rails6.rb
* Default value for trigger_transactional_callbacks? is true
* Bump rspec-rails version to 4.0.0.beta2
* FEATURE: Add tl2 threshold for editing new posts
* Adds a new setting and for tl2 editing posts (30 days same as old value)
* Sets the tl0/tl1 editing period as 1 day
* FIX: Spec uses wrong setting
* Fix site setting on guardian spec
* FIX: post editing period specs
* Avoid shared examples
* Use update_columns to avoid callbacks on user during tests
This commit introduces 2 features:
1. DISCOURSE_COMPRESS_ANON_CACHE (true|false, default false): this allows
you to optionally compress the anon cache body entries in Redis, can be
useful for high load sites with Redis that lives on a separate server to
to webs
2. DISCOURSE_ANON_CACHE_STORE_THRESHOLD (default 2), only pop entries into
redis if we observe them more than N times. This avoids situations where
a crawler can walk a big pile of topics and store them all in Redis never
to be used. Our default anon cache time for topics is only 60 seconds. Anon
cache is in place to avoid the "slashdot" effect where a single topic is
hit by 100s of people in one minute.
Start tracking the date an api key was last used. This has already been
the case for user_api_keys.
This information can provide us with the ability to automatically expire
unused api keys after N days.
This allows custom plugins such as prometheus exporter to log how many
requests are stored in the anon cache vs used by the anon cache.
This metric allows us to fine tune cache behaviors
* Revert "Revert "FEATURE: Publish read state on group messages. (#7989) [Undo revert] (#8024)""
This reverts commit 36425eb9f0.
* Fix: Show who read only if the attribute is enabled
* PERF: Precalculate the last post readed by a group member
* Use book-reader icon instear of far-eye
* FIX: update topic groups correctly
* DEV: Tidy up read indicator update on write
This is a very long standing bug we had, if a plugin attempted to amend a
serializer core was not "correcting" the situation for all descendant classes
this often only showed up in production cause production eager loads serializers
prior to plugins amending them.
This is a critical fix for various plugins
* Reenable: "FEATURE: Publish read state on group messages. (#7989)"
This reverts commit 67f5cc1ce8.
* FIX: Read indicator only appears when the group setting is enabled
* Enable or disable read state based on group attribute
* When read state needs to be published, the minimum unread count is calculated in the topic query. This way, we can know if someone reads the last post
* The option can be enabled/disabled from the UI
* The read indicator will live-updated using message bus
* Show read indicator on every post
* The read indicator now shows read count and can be expanded to see user avatars
* Read count gets updated everytime someone reads a message
* Simplify topic-list read indicator logic
* Unsubscribe from message bus on willDestroyElement, removed unnecesarry values from post-menu, and added a comment to explain where does minimum_unread_count comes from
There are 5 visibility levels (similar to group visibility)
public (default)
logged-in users
members only
staff
owners
Admins & group owners always have visibility to group members.
When a user liked, unliked and liked again the same post, the poster
would receive a notification such as "X and X liked ...". This happened
because PostActionNotifier.post_action_created was called twice.
All posts created by the user are counted unless they are deleted,
belong to a PM sent between a non-human user and the user or belong
to a PM created by the user which doesn't have any other recipients.
It also makes the guardian prevent self-deletes when SSO is enabled.
- Client-side censoring fixed for non-chrome browsers. (Regular expression rewritten to avoid lookback)
- Regex generation is now done on the server, to reduce repeated logic, and make it easier to extend in plugins
- Censor tests are moved to ruby, to ensure everything works end-to-end
- If "watched words regular expressions" is enabled, warn the admin when the generated regex is invalid
This feature is off by default and can can be configured with the `email_total_attachment_size_limit_kb` site setting.
Co-authored-by: Maja Komel <maja.komel@gmail.com>
* FEATURE: Add search operator to see all direct messages from a user
* Only show message if related messages >= 5
* Make "all messages" the hyperlink
* Review
If a post arrives via email but must be reviewed, we now show an
icon that can be clicked to view the raw contents of the email.
This is useful if Discourse's email parser is acting odd and the user
reviewing the post wants to know what the original contents were before
approving/rejecting the post.
* Revert "Revert "FEATURE: admin/user exports are compressed using the zip format (#7784)""
This reverts commit f89bd55576.
* Replace .tar.zip with .zip
* FEATURE: admin/user exports are compressed using the zip format
* Update translations. Theme exporter now exports .zip file. Theme importer supports .zip and .gz files
* Fix controller test, updated locale and skip saving the csv export to disk
Context: https://meta.discourse.org/t/121589
This new setting option lets group owners message/mention large groups
without granting that privilege to all members.
Groups can now be marked as visible to "logged on users". All automatic groups (except `everyone`) are now visible to "logged on users", previously they were marked as public but suppressed in the group page for non-staff.
If a database exception is raised ActiveRecord will always rollback
even if caught.
Instead we build the query in manual SQL and DO NOTHING when there's a
conflict. If we detect nothing was done, perform an update.
The behaviour of #TERM in search has been amended
1. We try category or subcategory slugs
2. We try tags
3. We try tag-groups
The term `hello #my-group` will search for all posts tagged with any of
the tags in the tag group `My Group`
Future work may be introducing a slug cache here or caching it in the table
but the assumption is that the number of tag groups will not be huge
Adds a second factor landing page that centralizes a user's second factor configuration.
This contains both TOTP and Backup, and also allows multiple TOTP tokens to be registered and organized by a name. Access to this page is authenticated via password, and cached for 30 minutes via a secure session.
This can cause unbound CPU usage in some cases, and excessive logging in other cases. This commit moves redis readonly information into the local process, but maintains the DistributedCache for postgres readonly state.
* Support private uploads in S3
* Use localStore for local avatars
* Add job to update private upload ACL on S3
* Test multisite paths
* update ACL for private uploads in migrate_to_s3 task
This adds support for DISCOURSE_ENABLE_PERFORMANCE_HTTP_HEADERS
when set to `true` this will turn on performance related headers
```text
X-Redis-Calls: 10 # number of redis calls
X-Redis-Time: 1.02 # redis time in seconds
X-Sql-Commands: 102 # number of SQL commands
X-Sql-Time: 1.02 # duration in SQL in seconds
X-Queue-Time: 1.01 # time the request sat in queue (depends on NGINX)
```
To get queue time NGINX must provide: HTTP_X_REQUEST_START
We do not recommend you enable this without thinking, it exposes information
about what your page is doing, usually you would only enable this if you
intend to strip off the headers further down the stream in a proxy
Previously as soon as any override was defined we would regress to the slow
path for locale lookups. Additionally if `raise: true` was specified which
rails likes to add in views we would bypass the cache
The new design manages to use the fast path for many more cases
SSO uses a special param to username suggester that whitelists a username
due to previous work we amended our lookup logic and started ignoring this
whitelist.
The fix ensures we always respect it, and also improves on the original
implementation that forgot to normalize the username.
This also corrects FileHelper.download so it supports "follow_redirect"
correctly (it used to always follow 1 redirect) and adds a `validate_url`
param that will bypass all uri validation if set to false (default is true)
Net::HTTP always returns ASCII-8BIT encoding. File.read auto-detects the encoding. This leads to an encoding inconsistency between a fresh download, and a cached download. This commit ensures all downloaded files are treated equally, by always returning the cached version from the filesystem, even during initial download.
One symptom of this problem is during theme exports: https://meta.discourse.org/t/116907
Related ruby ticket: https://bugs.ruby-lang.org/issues/2567
Previously username suggester would give up after 100 attempts at getting
a username and fallback to random string.
This amends the logic so we do all the work of figuring out a good username
in SQL and avoids a large amount of queries in cases where a lot of usernames
were used up.
This corrects an issue on sites with large numbers of anon users
Use the cooked version of the post and the quote to compare their content in
order to take into account the "typographer" option of the markdown pipeline.
We were blocking user registrations with same username and password,
but allowing usernames to be changed to be same as password later.
Also disallow names to be the same as password.
* English shouldn't fallback to any other locale
* Calculate fallback for default locale if it isn't English (useful for en_US)
* Reuse the fallback locale list when outputting translations to JavaScript
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
The instagram onebox sometimes surrounds the image with an `<a>` tag, which was breaking the aspect ratio logic, and therefore causing posts to change height on load.
Since 5bfe051e, Discourse user agents are marked as non-crawlers (to avoid accidental blacklisting). This makes sure pageviews for these agents are tracked as crawler hits.
We found score hard to understand. It is still there behind the scenes
for sorting purposes, but it is no longer shown.
You can now filter by minimum priority (low, med, high) instead of
score.