In some cases, between Discourse forums the hostname of a URL could match if they are hosting S3 files on the same bucket but the S3 bucket path might not. So e.g. https://testbucket.somesite.com/testpath/some/file/url.png vs https://testbucket.somesite.com/prodpath/some/file/url.png. So has_been_uploaded? was returning true for the second URL, even though it may have been uploaded on a different Discourse forum.
This is a very rare case but must be accounted for, because this impacts UrlHelper.is_local which mistakenly thinks the file has already been downloaded and thus allows the URL to be cooked, where we want to return the full URL to be downloaded using PullHotlinkedImages.
* the post_actions table has no FK to users, so if a user has been
deleted we may end up with dangling post_action records, which then
interferes with the bookmarks migration because bookmarks DO have
an FK to users
PG already handles English stop words, the list in cppjieba is
bigger than the list PG uses, which in turn causes confusion cause
words such as "volume" are stripped using cppijieba stop word list
We will follow up with another commit here to apply the Chinese
word stopwords, but for now to eliminate the confusion we are
skipping applying the stopword list when the dictionary in PG is
in English.
* DEV: Add framework for filtered plugin registers
Plugins often need to add values to a list, and we need to filter those lists at runtime to ignore values from disabled plugins. This commit provides a re-usable way to do that, which should make it easier to add new registers in future, and also reduce repeated code.
Follow-up commits will migrate existing registers to use this new system
* DEV: Migrate user and group custom field APIs to plugin registry
This gives us a consistent system for checking plugin enabled state, so we are repeating less logic. API changes are backwards compatible
Prior to this change we would never clear memory from contexts and
rely on V8 reacting to pressure
This could lead to bloating of PrettyText and Transpiler contexts
This optimisations ensures that we will clear memory 2 seconds after
the last eval on the context
Previously we would raise a warning in the logs if downloading
a file (from s3) takes longer than 60 seconds.
At scale this happens reasonably frequently.
1. Raised the duration to 3 minutes
2. Pulled the resizing mutex out of the downloading mutex
so we have less and clearer error logs
* DEV: Standardize table sorting verbiage
This commit creates a common component that tables can use to make their
headers sortable. This commit also standardizes on using `desc` as the
default and passing in the `asc=true` flag to adjust the sorting
direction.
* Add deprecation warnings
Adds deprecation warnings if using previous params and maintains
backwards compatibility. Set the default sort value for group members to
be asc.
* switch group requests to use common table-header-toggle
* update fixture
* PERF: Dematerialize topic_reply_count
It's only ever used for trust level promotions that run daily, or compared to 0. We don't need to track it on every post creation.
* UX: Add symbol in TL3 report if topic reply count is capped
* DEV: Drop user_stats.topic_reply_count column
Co-authored-by: Mark VanLandingham <markvanlan@gmail.com>
Co-authored-by: Robin Ward <robin.ward@gmail.com>
Co-authored-by: Mark VanLandingham <markvanlan@gmail.com>
Use a helper method to simplify creating a new register. Previously this would require creating lots of different methods manually, and adding every register to the clear/reset functions
This commit prevents a 500 error from occurring if someone is trying to
setup their discourse instance as a sso provider and they don't pass in
a `return_sso_url` in their payload.
This refactors default_current_user_provider in a few ways:
- Introduce a generic `api_parameter_allowed?` method which checks for whitelisted routes/formats
- Only read the api_key parameter on allowed routes. It is now completely ignored on other routes (previously it would raise a 403)
- Start reading user_api_key parameter on allowed routes
- Refactor tests as end-end integration tests
A plugin API for PARAMETER_API_PATTERNS will be added soon
For pages that do not specify canonical URL we will default to `https://SITENAME/PATH`.
This ensures that if a URL is crawled on the CDN the search ranking will transfer to the main site.
Additionally we whitelist the `?page` param
Adds a new rake task to auto generate a constants.js file with the
constants present. This makes migrating to Ember CLI easier, but also
slightly speeds up asset compilation by having to do less work.
If the constants change you need to run:
`rake javascripts:update_constants`
There were two constants here, `INLINE_ONEBOX_LOADING_CSS_CLASS` and
`INLINE_ONEBOX_CSS_CLASS` that were both longer than the strings they
were DRYing up: `inline-onebox-loading` and `inline-onebox`
I normally appreciate constants, but in this case it meant that we had
a lot of JS imports resulting in many more lines of code (and CPU cycles
spent figuring them out.)
It also meant we had an `.erb` file and had to invoke Ruby to create the
JS file, which meant the app was harder to port to Ember CLI.
I removed the constants. It's less DRY but faster and simpler, and
arguably the loss of DRYness is not significant as you can still search
for the `inline-onebox-loading` and `inline-onebox` strings easily if
you are refactoring.
We now show an options gear icon next to the bookmark name.
When expanded we show the "delete bookmark when reminder sent" option. The value of this checkbox is saved in local storage for the user.
If this is ticked, when a reminder is sent for the bookmark the bookmark itself is deleted. This is so people can use the reminder functionality by itself.
Also remove the blue alert reminder section from the "Edit Bookmark" modal as it just added clutter, because the user can already see they had a reminder set:
Adds a default false boolean column `delete_when_reminder_sent` to bookmarks.
Locale files get precompiled after deployment and they contained translations from the `default_locale`. That's especially bad in multisites, because the initial `default_locale` is `en_US`. Sites where the `default_locale` isn't `en_US` could see missing translations. The same thing could happen when users are allowed to chose a different locale.
This change simplifies the logic by not using the `default_locale` in the locale chain. It always falls back to `en` in case of missing translations.
We were sharing `Discourse` both as an application object and a
namespace which complicated things for Ember CLI. This patch
moves raw templates into `__DISCOURSE_RAW_TEMPLATES` and adds
a couple helper methods to create/remove them.
This introduces new APIs for obtaining optimized thumbnails for topics. There are a few building blocks required for this:
- Introduces new `image_upload_id` columns on the `posts` and `topics` table. This replaces the old `image_url` column, which means that thumbnails are now restricted to uploads. Hotlinked thumbnails are no longer possible. In normal use (with pull_hotlinked_images enabled), this has no noticeable impact
- A migration attempts to match existing urls to upload records. If a match cannot be found then the posts will be queued for rebake
- Optimized thumbnails are generated during post_process_cooked. If thumbnails are missing when serializing a topic list, then a sidekiq job is queued
- Topic lists and topics now include a `thumbnails` key, which includes all the available images:
```
"thumbnails": [
{
"max_width": null,
"max_height": null,
"url": "//example.com/original-image.png",
"width": 1380,
"height": 1840
},
{
"max_width": 1024,
"max_height": 1024,
"url": "//example.com/optimized-image.png",
"width": 768,
"height": 1024
}
]
```
- Themes can request additional thumbnail sizes by using a modifier in their `about.json` file:
```
"modifiers": {
"topic_thumbnail_sizes": [
[200, 200],
[800, 800]
],
...
```
Remember that these are generated asynchronously, so your theme should include logic to fallback to other available thumbnails if your requested size has not yet been generated
- Two new raw plugin outlets are introduced, to improve the customisability of the topic list. `topic-list-before-columns` and `topic-list-before-link`
The PR https://github.com/rails/rails/pull/36612 changes the raised exception
if the error message includes the target database name.
Since the error message contains the hostname, this could be triggered when
the hostname contains the database name.
* When hovering over the bookmark icon for a post, show the name of the bookmark at the end of the tooltip _if_ it has been set.
* Order bookmarks by `updated_at DESC` in the user list and show that instead of created at.
Recently, we added feature that we are sending `/muted` to users who muted specific topic just before `/latest` so the client knows to ignore those messages - https://github.com/discourse/discourse/pull/9482
Same `/muted` message should be included when the post is edited
* Remove Handlebars.SafeString usage
* DEV: Support for `import Handlebars from 'handlebars'`;
* FIX: Sprockets was broken when `node_modules` was present
By default the old version of sprockets looks for application.js
anywhere, including in a node_modules folder if this exists
(which it will when we move to Ember CLI.)
Some providers don't implement the Expect: 100-continue support,
which results in a mismatch in the object signature.
With this settings, users can disable the header and use such providers.
TLDR; this commit vastly improves how whitespaces are handled when converting from HTML to Markdown.
It also adds support for converting HTML <tables> to markdown tables.
The previous 'remove_whitespaces!' method was traversing the whole HTML tree and used a heuristic to remove
leading and trailing whitespaces whenever it was appropriate (ie. mostly before and after HTML block elements)
It was a good idea, but it was very limited and leaded to bad conversion when the html had leading whitespaces on several lines for example.
One such example can be found [here](https://meta.discourse.org/t/86782).
For various reasons, most of the whitespaces in a HTML file is ignored when the page is being displayed in a browser.
The rules that the browsers follow are the [CSS' White Space Processing Rules](https://www.w3.org/TR/css-text-3/#white-space-rules).
They can be quite complicated when you take into account RTL languages and other various tidbits but they boils down to the following:
- Collapse whitespaces down to one space (0x20) inside an inline context (ie. nodes/tags that are being displaying on the same line)
- Remove any leading/trailing whitespaces inside an inline context
One quick & dirty way of getting this 90% solved would be to do 'HTML.gsub!(/[[:space:]]+/, " ")'.
We would also need to hoist <pre> elements in order to not mess with their whitespaces.
Unfortunately, this solution let some whitespaces creep around HTML tags which leads to more '.strip!' calls than I can bear.
I decided to "emulate" the browser's handling of whitespaces and came up with a solution in 4 parts
1. remove_not_allowed!
The HtmlToMarkdown library is recursively "visiting" all the nodes in the HTML in order to convert them to Markdown.
All the nodes that aren't handled by the library (eg. <script>, <style> or any non-textual HTML tags) are "swallowed".
In order to reduce the number of nodes visited, the method 'remove_not_allowed!' will automatically delete all the nodes
that have no "visitor" (eg. a 'visit_<tag>' method) defined.
2. remove_hidden!
Similar purpose as the previous method (eg. reducing number of nodes visited), there's no point trying to convert something that is hidden.
The 'remove_hidden!' method removes any nodes that was hidden using the "hidden" HTML attribute, some CSS or with a width or height equal to 0.
3. hoist_line_breaks!
The 'hoist_line_breaks!' method is there to handle <br> tags. I know those tiny <br> don't do much but they can be quite annoying.
The <br> tags are inline elements but they visually work like a block element (ie. they create a new line).
If you have the following HTML "<i>Foo<br>Bar</i>", it ends up visually similar to "<i>Foo</i><br><i>Bar</i>".
The latter being much more easy to process than the former, so that's what this method is doing.
The "hoist_line_breaks" will hoist <br> tags out of inline tags until their parent is a block element.
4. remove_whitespaces!
The "remove_whitespaces!" is where all the whitespace removal is happening. It's broken down into 4 methods as well
- remove_whitespaces!
- is_inline?
- collapse_spaces!
- remove_trailing_space!
The 'remove_whitespace!' method is recursively walking the HTML tree (skipping <pre> tags).
If a node has any children, they will be chunked into groups of inline elements vs block elements.
For each chunks of inline elements, it will call the "collapse_space!" and "remove_trailing_space!" methods.
For each chunks of block elements, it will call "remote_whitespace!" to keep walking the HTML tree recursively.
The "is_inline?" method determines whether a node is part of a inline context.
A node is inline iif it's a text node or it's an inline tag, but not <br>, and all its children are also inline.
The "collapse_spaces!" method will collapse any kind of (white) space into a single space (" ") character, even accros tags.
For example, if we have " Foo \n<i> Bar </i>\t42", it will return "Foo <i>Bar </i>42".
Finally, the "remove_trailing_space!" method is there to remove any trailing space that might creep in at the end of the inline chunk.
This solution is not 100% bullet-proof.
It does not support RTL languages at all and has some caveats that I felt were not worth the work to get properly fixed.
FIX: better detection of hidden elements when converting HTML to Markdown
FIX: take into account the 'allowed_href_schemes' site setting when converting HTML <a> to Markdown
FIX: added support for 'mailto:' scheme when converting <a> from HTML to Markdown
FIX: added support for <img> dimensions when converting from HTML to Markdown
FIX: added support for <dl>, <dd> and <dt> when converting from HTML to Markdown
FIX: added support for multilines emphases, strongs and strikes when converting from HTML to Markdown
FIX: added support for <acronym> when converting from HTML to Markdown
DEV: remove unused 'sanitize' gem
Wow, did you just read all that?! Congratz, here's a cookie: 🍪.
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.
For files with `# frozen_string_literal: true`
```
puts %w{a b}[0].frozen?
=> true
puts "hi".frozen?
=> true
puts "a #{1} b".frozen?
=> true
puts ("a " + "b").frozen?
=> false
puts (-("a " + "b")).frozen?
=> true
```
For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
This is to help with the migration to Ember CLI. In the current running
version of Discourse everything should be the same as before, just with
a few extra files that are not used. However, using Ember CLI this can
be installed as an Ember addon.
Co-Authored-By: Jarek Radosz <jradosz@gmail.com>
rebuilding user_actions is not something that should be done.
Plugins such as solved and assigned extend it, there are tons of
little rules that were not captured in `user_actions:rebuild`
Make sure the topic_user.bookmarked column is set correctly when user bookmarks/unbookmarks any post in a topic. For example, you bookmarked a post in the topic that was not the OP, the bookmark icon in the topic list would not be shown. Same if deleting a bookmark for the last bookmarked post in a topic, the bookmark icon in the topic list would not be removed.
Previously this was only setting it to true if bookmarking the OP/topic, which was not correct -- we want to show the icon on the topic list if any post is bookmarked.
Also set to false if unbookmarking the last bookmarked post in the topic.
Also in this PR is a migration to correct any out of sync topic_user.bookmarked columns, based on the new logic.
* When copying the markdown for an image between posts, we were not adding the srcset and data-small-image attributes which are done by calling optimize_image! in cooked post processor
* Refactored the code which was confusing in its current state (the consider_for_reuse method was super confusing) and fixed the issue
* FEATURE: don't display new/unread notification for muted topics
Currently, even if user mute topic, when a new reply to that topic arrives, the user will get "See 1 new or updated topic" message. After clicking on that link, nothing is visible (because the topic is muted)
To solve that problem, we will send background message to all users who recently muted that topic that update is coming and they can ignore the next message about that topic.
DO does not implement tagging support for S3 objects. Removing our default
empty tag fixes compatibility.
The expire_missing_assets rake task can't be used with that service still,
but this patch allows normal operation.
The main thrust of this PR is to take all the conditional checks based on the `enable_bookmarks_with_reminders` away and only keep the code from the `true` path, making bookmarks with reminders the core bookmarks feature. There is also a migration to create `Bookmark` records out of `PostAction` bookmarks for a site.
### Summary
* Remove logic based on whether enable_bookmarks_with_reminders is true. This site setting is now obsolete, the old bookmark functionality is being removed. Retain the setting and set the value to `true` in a migration.
* Use the code from the rake task to create a database migration that creates bookmarks from post actions.
* Change the bookmark report to read from the new table.
* Get rid of old endpoints for bookmarks
* Link to the new bookmarks list from the user summary page
Otherwise we are trying to update the reminder type with a string which often evaluates to 0 (At Desktop) which causes reminders to come through early.
* DEV: Use `render_json_error` (Adds specs for Admin::GroupsController)
* DEV: Use a specific error on blank category slug (Fixes a `render_json_error` warning)
* DEV: Use a specific error on reviewable claim conflict (Fixes a `render_json_error` warning)
* DEV: Use specific errors in Admin::UsersController (Fixes `render_json_error` warnings)
* FIX: PublishedPages error responses
* FIX: TopicsController error responses (There was an issue of two separate `Topic` instances for the same record. This makes sure there's only one up-to-date instance.)
There is now an explicit "Delete Bookmark" button in the edit modal. A confirmation is shown before deleting.
Along with this, when the bookmarked post icon is clicked the modal is now shown instead of just deleting the bookmark. Also, the "Delete Bookmark" button from the user bookmark list now confirms the action.
Add a `d d` shortcut in the modal to delete the bookmark.
The `uplaods:migrate_to_s3` rake task should always use the environment variables, because you usually don't want to break your site's uploads during the migration. But restoring a backup should work with site settings as well as environment variables, otherwise you can't restore uploads to S3 from the web interface.
Users can now edit the bookmark name and reminder time from their list of bookmarks.
We use "Custom" for the date and time in the modal because if the user set a reminder for "tomorrow" then edit the reminder "tomorrow", the definition of what "tomorrow" is has changed.
Unicorn uses the USR1 to indicate that log files should be reopened. This commit implements the same functionality for our forked sidekiq workers:
- USR1 is intercepted in the unicorn master, and re-issued to all child processes
- USR1 is trapped in the sidekiq processes, and `Unicorn::Util.reopen_logs` is used to re-open log files
* Count user summary bookmarks from new Bookmark table if bookmarks with reminders enabled
* Update topic user bookmarked column when new topic bookmark changed
* Make in:bookmarks search work with new bookmarks
* Fix batch inserts for bookmark rake task (and thus migration). We were only inserting one bookmark at a time, completely defeating the purpose of batching!
* Show the correct bookmark with clock icon when topic-level bookmark reminder time is set and show the time of the reminder in the title on hover.
* Add a new bookmark lib and reminder time formatting function to show time with today/tomorrow shorthand for readability. E.g. tomorrow at 8:00am instead of Apr 16 2020 at 8:00am. This only applies to today + tomorrow, future dates are still treated the same.
Trigger an event for plugins to consume when a user session is refreshed.
This allows external auth to be notified about account activity, and be
able to take action such as use oauth refresh tokens to keep oauth
tokens valid.
This reverts commit 8b46f14744.
It corrects the reason for the revert:
We rely on SafeMigrate existing cause we call it from migrations,
Zeitwerk will autoload it.
Instead of previous pattern we explicitly bypass all the hacks in
production mode.
We need to disable SafeMigrate cause it is not thread safe.
A thread safe implementation is possible but not worth the effort,
we catch the issues in dev and test.
This adds support for a new piece of metadata to your plugin.rb
files. If you add:
```
transpile_js: true
```
Then Discourse will support transpilation of assets in your
`assets/javascripts` directory. Previously they had to be named
`.js.es6` but now regular `.js` will work.
Note this is opt-in because some plugins currently have `.js` files in
app/assets that are not meant to be transpiled.
Going forward all plugins should migrate to this setting as they are
comfortable able to do so.
SafeMigrate outputs text when we detect attempts to unsafely drop tables
and columns
It is unfortunately not thread safe
This is not needed in production as we would have already caught it by then
in our test suite.
Previously we were migrating multisites serially, this is extremely slow
especially when 200 dbs are involved.
The new implementation defaults to running 20 migrations concurrently, leading
to a 20x speedup.
We also amended it so errors are printed out last, something that makes
debugging failures easier.
This is code specific to Discourse cause we integrate SeedFu with our
migrations and can not include this in the multisite gem.
If a migration performs no changes it should not output stuff.
Previously we would output information about seeds which was very noisy.
On multisite this was particularly bad
If the feature is enabled, staff members can construct a URL and publish a
topic for others to browse without the regular Discourse chrome.
This is useful if you want to use Discourse like a CMS and publish
topics as articles, which can then be embedded into other systems.
* DEPRECATION: Remove support for api creds in query params
This commit removes support for api credentials in query params except
for a few whitelisted routes like rss/json feeds and the handle_mail
route.
Several tests were written to valid these changes, but the bulk of the
spec changes are just switching them over to use header based auth so
that they will pass without changing what they were actually testing.
Original commit that notified admins this change was coming was created
over 3 months ago: 2db2003187
* fix tests
* Also allow iCalendar feeds
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
* FIX: guardian always got user but sometimes it is anonymous
```
def initialize(user = nil, request = nil)
@user = user.presence || AnonymousUser.new
@request = request
end
```
AnonymouseUser defines `blank?` method
```
class AnonymousUser
def blank?
true
end
...
end
```
so if we would use @user.present? it would be correct, however, just @user is always true
This is so users with huge amount of bookmarks do not have to wait a long time to see results.
* Add a bookmark list and list serializer to server-side to be able to handle paging and load more URL
* Use load-more component to load more bookmark items, 20 at a time in user activity
* Change the way current user is loaded for bookmark ember models because it was breaking/losing resolvedTimezone when loading more items
* FEATURE: add setting `auto_approve_email_domains` to auto approve users
This commit adds a new site setting `auto_approve_email_domains` to
auto approve users based on their email address domain.
Note that if a domain already exists in `email_domains_whitelist` then
`auto_approve_email_domains` needs to be duplicated there as well,
since users won’t be able to register with email address that is
not allowed in `email_domains_whitelist`.
* Update config/locales/server.en.yml
Co-Authored-By: Robin Ward <robin.ward@gmail.com>
Usage:
```
{{d-button icon="times" label="foo.bar" isLoading=true}}
```
Note that a button loading without an icon will shrink text size to prevent button to jump in size.
A button while loading is disabled.
* FIX: Perform crop using user-specified image sizes
It used to resize the images to max width and height first and then
perform the crop operation. This is wrong because it ignored the user
specified image sizes from the Markdown.
* DEV: Use real images in test
Previously we would consider a user "present" and "last seen" if the
browser window was visible.
This has many edge cases, you could be considered present and around for
days just by having a window open and no screensaver on.
Instead we now also check that you either clicked, transitioned around app
or scrolled the page in the last minute in combination with window
visibility
This will lead to more reliable notifications via email and reduce load of
message bus for cases where a user walks away from the terminal
Get rid of harmful each loop over uploads to update. Instead we put all the unique access control posts for the uploads into a map for fast access (vs using the slow .find through array) and look up the post when it is needed when looping through the uploads in batches.
On a Discourse instance with ~93k uploads, a simplified version of the old method takes > 1 minute, and a simplified version of the new method takes ~18s and uses a lot less memory.
If the “secure media” site setting is enabled then ALL files uploaded to Discourse (images, video, audio, pdf, txt, zip etc. etc.) will follow the secure media rules. The “prevent anons from downloading files” setting will no longer have any bearing on upload security. Basically, the feature will more appropriately be called “secure uploads” instead of “secure media”.
This is being done because there are communities out there that would like all attachments and media to be secure based on category rules but still allow anonymous users to download attachments in public places, which is not possible in the current arrangement.
* FIX: Do not use original filename to extract the original filename
Prefer extracting filename from the destination path, which is build
using extracted image information.
* UX: Show better error images
Previous to this change slugs for leaves in 3 level nestings would not work
Our UX picks only the last two levels
This also makes the results consistent for slugs as it enforces order.
- Define the CSP based on the requested domain / scheme (respecting force_https)
- Update EnforceHostname middleware to allow secondary domains, add specs
- Add URL scheme to anon cache key so that CSP headers are cached correctly
This is mostly useful while developing a plugin, to avoid manual actions of deleting tables and schema_migrations rows.
Usage:
bundle exec rake plugin:migrate:down[discourse-calendar]
New `duration` attribute is introduced for the `set_or_create_timer` method in the commit aad12822b7 for "based on last post" and "auto delete replies" topic timers.
* Preload custom fields for BookmarkQuery and add preload callback. Copy TopicQuery preload methodology to allow plugins to preload data for the BookmarkQuery. This fixes assigned plugin custom fields N1
* Include topic tags in initial query to avoid tags N1
Related: discourse/discourse-assign#63
* DEV: Replace User.unstage and User#unstage API with User#unstage!
Quoting @SamSaffron:
> User.unstage mixes concerns of both unstaging users and updating params which is fragile/surprising.
> u.unstage destroys notifications and raises a user_unstaged event prior to the user becoming unstaged and the user object being saved.
User#unstage! no longer updates user attributes and saves the object before triggering the `user_unstaged` event.
* Update one more spec
* Assign attributes after unstaging
* Improve the bookmark mobile on modal so it doesn't go all the way to the edge and the custom datetime input is easier to use
* Improve the rake task for syncing so it does not error for topics that no longer exist and batches 2000 inserts at a time, clearing the array each time