Indexing query strings in URLS produces inconsistent results in PG and
pollutes the search data for really little gain.
The following seems to work as expected...
```
discourse_development=# SELECT TO_TSVECTOR('https://www.discourse.org?test=2&test2=3');
to_tsvector
------------------------------------------------------
'2':3 '3':5 'test':2 'test2':4 'www.discourse.org':1
```
However, once a path is present
```
discourse_development=# SELECT TO_TSVECTOR('https://www.discourse.org/latest?test=2&test2=3');
to_tsvector
----------------------------------------------------------------------------------------------
'/latest?test=2&test2=3':3 'www.discourse.org':2 'www.discourse.org/latest?test=2&test2=3':1
```
The lexeme contains both the path and the query string.
```
discourse_development=# SELECT alias, lexemes FROM TS_DEBUG('www.discourse.org');
alias | lexemes
-------+---------------------
host | {www.discourse.org}
discourse_development=# SELECT TO_TSVECTOR('www.discourse.org');
to_tsvector
-----------------------
'www.discourse.org':1
```
Given the above lexeme, we will inject additional lexeme by splitting
the host on `.`. The actual tsvector stored will look something like
```
tsvector
---------------------------------------
'discourse':1 'discourse.org':1 'org':1 'www':1 'www.discourse.org':1
```
This is causing issues when purging old users, if they are set up in the
exact condition where they will be demoted into another group, but also
do not have a primary email.
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)
* PERF: Dematerialize topic_reply_count
It's only ever used for trust level promotions that run daily, or compared to 0. We don't need to track it on every post creation.
* UX: Add symbol in TL3 report if topic reply count is capped
* DEV: Drop user_stats.topic_reply_count column
This ensures that at a minimum you are notified once a day of
repeat edits by the same user.
Long term we may consider winding this down to say 1 hour or
making it configurable.
Due to a refactor in e90f9e5cc4 we stopped notifying on edits if
a user liked a post and then edited.
The like could have happened a long time ago so this gets extra
confusing.
This change makes the suppression more deliberate. We only want
to suppress quite/link/mention if the user already got a reply
notification.
We can expand this suppression if it is not enough.
It's possible to cause a 500 error by putting in weird characters in the
input field for updating a users website on their profile.
Normal invalid input like not including the domain extension is already
handled by the user_profile model validation. This fix ensures a server
error doesn't occur for weird input characters.
It previously failed to match URLs with characters other than `[a-zA-z0-9\.\/:-]`. This meant that `PullHotlinkedImages` would sometimes download an external image and then never use it in any posts.
* FIX: FlagSockpuppets should not flag a post if a post of that user was already rejected by staff
* Update spec/services/flag_sockpuppets_spec.rb
Co-Authored-By: Robin Ward <robin.ward@gmail.com>
Co-authored-by: Robin Ward <robin.ward@gmail.com>
This fix ensures that if a staged user is linked to or quoted they won't
be emailed about it.
A staged user could email into a category, and another user could quote
them inside of a completely different category and we don't want a
staged user to receive an email for this.
Bug report:
https://meta.discourse.org/t/-/145202/9
For example given a custom badge with SQL:
```
SELECT 1
-- I am a comment
```
You end up with
```
FROM (SELECT 1
-- I am a comment) q
```
This fix adds newlines so you end up with the now-valid:
```
FROM (
SELECT 1
-- I am a comment
) q
```
New `duration` attribute is introduced for the `set_or_create_timer` method in the commit aad12822b7 for "based on last post" and "auto delete replies" topic timers.
* This PR implements the scheduling and notification system for bookmark reminders. Every 5 minutes a schedule runs to check any reminders that need to be sent before now, limited to **300** reminders at a time. Any leftover reminders will be sent in the next run. This is to avoid having to deal with fickle sidekiq and reminders in the far-flung future, which would necessitate having a background job anyway to clean up any missing `enqueue_at` reminders.
* If a reminder is sent its `reminder_at` time is cleared and the `reminder_last_sent_at` time is filled in. Notifications are only user-level notifications for now.
* All JavaScript and frontend code related to displaying the bookmark reminder notification is contained here. The reminder functionality is now re-enabled in the bookmark modal as well.
* This PR also implements the "Remind me next time I am at my desktop" bookmark reminder functionality. When the user is on a mobile device they are able to select this option. When they choose this option we set a key in Redis saying they have a pending at desktop reminder. The next time they change devices we check if the new device is desktop, and if it is we send reminders using a DistributedMutex. There is also a job to ensure consistency of these reminders in Redis (in case Redis drops the ball) and the at desktop reminders expire after 20 days.
* Also in this PR is a fix to delete all Bookmarks for a user via `UserDestroyer`
* FIX: when unread reply notification exists don't create new
From time to time, the user is creating a reply post and then they want to add additional details. They edit an existing post and for example, add a quote from a previous one.
In that situation, if the user to whom reply was directed to already have the unread notification, we should not create the new one.
That behaviour was mentioned here: https://meta.discourse.org/t/reply-then-edit-to-add-quote-notification-redundancy/138358
* FIX: dont create new notification if already exists
The commit: 75069ff179
allows users to remove their primary group, but this introduced a bug
where if you were to edit any other profile info like location or
website which is a form on a separate page then the flair dropdown,
would cause the selected flair to be removed.
This fix ensures that if the `primary_group_id` parameter is missing
from the update payload it does not remove the existing
`primary_group_id`. It will only remove the `primary_group_id` if it is
present in the payload and empty.
This is not used in core or official plugins, and has been printing a deprecation notice since v2.3.0beta4. All OpenID 2.0 code and dependencies have been dropped. The user_open_ids table remains for now, in case anyone has missed the deprecation notice, and needs to migrate their data.
Context at https://meta.discourse.org/t/-/113249
* FEATURE: Replace existing badge owners when using the bulk award feature
* Use ActiveRecord to sanitize title update query, Change replace checkbox text
Co-Authored-By: Robin Ward <robin.ward@gmail.com>
Co-authored-by: Robin Ward <robin.ward@gmail.com>
group membership and `CategoryUser` notification level should be
respected to determine whether to notify staged users about activity in
private categories, instead of only ever generating notifications for staged
users' own topics (which has been the behaviour since
0c4ac2a7bc)
Let's say post #2 quotes post number #1. If a user decides to quote the
quote in post #2, it should keep the information of post #1
("user_1, post: 1, topic: X"), instead of replacing with current post
info ("user_2, post: 2, topic: X").
This fix allows a user to remove their currently assigned primary group
if the Site Setting `user selected primary groups` is enabled.
Before this fix, if a user selected "none" for their primary group it
would silently fail and never be updated.
This is required for people using apps with custom protocols. We still verify the entire URL (including protocol) against the site setting value.
Refactored wildcard_url_checker so that it always returns a boolean, rather than sometimes returning a regex match.
* FEATURE: Ability to add components to all themes
This is the first and functional step from that topic https://dev.discourse.org/t/adding-a-theme-component-is-too-much-work/15398/16
The idea here is that when a new component is added, the user can easily assign it to all themes (parents).
To achieve that, I needed to change a site-setting component to accept `setDefaultValues` action and `setDefaultValuesLabel` translated label.
Also, I needed to add `allowAny` option to disable that for theme selector.
I also refactored backend to accept both parent and child ids with one method to avoid duplication (Renamed `add_child_theme!` to more general `add_relative_theme!`)
* FIX: Improvement after code review
* FIX: Improvement after code review2
* FIX: use mapBy and filterBy directly
Previously people were not consistent about mocking which left internals in
a fragile state when running subfolder specs.
This introduces a simple helper `set_subfolder` which you can use to set
the subfolder for the spec. It takes care of proper configuration of subfolder
and teardown.
```
# usage
set_subfolder "/my_amazing_subfolder"
```
You should no longer stub base_uri or global_settings
* Fix user title logic when badge name customized
* Fix an issue where a user's title was not considered a badge granted title when the user used a badge for their title and the badge name was customized. this affected the effectiveness of revoke_ungranted_titles! which only operates on badge_granted_titles.
* When a user's title is set now it is considered a badge_granted_title if the badge name OR the badge custom name from TranslationOverride is the same as the title
* When a user's badge is revoked we now also revoke their title if the user's title matches the badge name OR the badge custom name from TranslationOverride
* Add a user history log when the title is revoked to remove confusion about why titles are revoked
* Add granted_title_badge_id to user_profile, now when we set badge_granted_title on a user profile when updating a user's title based on a badge, we also remember which badge matched the title
* When badge name (or custom text) changes update titles of users in a background job
* When the name of a badge changes, or in the case of system badges when their custom translation text changes, then we need to update the title of all corresponding users who have a badge_granted_title and matching granted_title_badge_id. In the case of system badges we need to first get the proper badge ID based on the translation key e.g. badges.regular.name
* Add migration to backfill all granted_title_badge_ids for both normal badge name titles and titles using custom badge text.
Issue was mentioned in this [meta topic](https://meta.discourse.org/t/send-a-notification-to-watching-users-when-adding-tag/125314)
It is working well when category is changed because NotifyCategoryChange job already got that code:
```
if post&.topic&.visible?
post_alerter = PostAlerter.new
post_alerter.notify_post_users(post, User.where(id: args[:notified_user_ids]))
post_alerter.notify_first_post_watchers(post, post_alerter.category_watchers(post.topic))
end
```
For NotifyTagChange job notify post users were missing so it worked only when your notification was set to `watching first post`
- Allow revoking keys without deleting them
- Auto-revoke keys after a period of no use (default 6 months)
- Allow multiple keys per user
- Allow attaching a description to each key, for easier auditing
- Log changes to keys in the staff action log
- Move all key management to one place, and improve the UI
This is a major change to draft internals. Previously there were quite a
few cases where the draft system would say "draft saved", when in fact
we just skipped saving.
This commit ensures the draft system deals with draft ownership handover in
a predictable way.
For example:
- Window 1 editing draft
- Window 2 editing same draft at the same time
Previously we would allow window 1 and 2 to just fight on the same draft
each window overwriting the same draft over an over.
This commit introduces an ownership concept where either window 1 or 2 win
and user is prompted on the loser window to reload screen to correct the issue
This also corrects edge cases where a user could have multiple browser windows
open and posts in 1 window, later to post in the second window. Previously
drafts would break in the second window, this corrects it.
* FEATURE: Site setting/ui to allow users to set their primary group
* prettier and remove logic from account template
* added 1 to 43 to make web_hook_user_serializer_spec pass
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:
pluck_first, equivalent to limit(1).pluck(*columns).first
and pluck_first! which, like other finder methods, raises an exception
when no record is found
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
- Client-side censoring fixed for non-chrome browsers. (Regular expression rewritten to avoid lookback)
- Regex generation is now done on the server, to reduce repeated logic, and make it easier to extend in plugins
- Censor tests are moved to ruby, to ensure everything works end-to-end
- If "watched words regular expressions" is enabled, warn the admin when the generated regex is invalid
* REFACTOR: Rename SiteSetting.disable_edit_notifications to disable_system_edit_notifications
- The older name could cause some confusion because the setting does not disable all edit notifications, only system ones.
* FIX: Add frozen_string_literal: true in the migration
* DEV: Deprecate 'disable_edit_notifications'
This feature adds the ability to customize the HTML part of all emails using a custom HTML template and optionally some CSS to style it. The CSS will be parsed and converted into inline styles because CSS is poorly supported by email clients. When writing the custom HTML and CSS, be aware of what email clients support. Keep customizations very simple.
Customizations can be added and edited in Admin > Customize > Email Style.
Since the summary email is already heavily styled, there is a setting to disable custom styles for summary emails called "apply custom styles to digest" found in Admin > Settings > Email.
As part of this work, RTL locales are now rendered correctly for all emails.
This commit contains 3 features:
- FEATURE: Allow downloading watched words
This introduces a button that allows admins to download watched words per action in a `.txt` file.
- FEATURE: Allow clearing watched words in bulk
This adds a "Clear All" button that clears all deleted words per action (e.g. block, flag etc.)
- FEATURE: List all blocked words contained in the post when it's blocked
When a post is rejected because it contains one or more blocked words, the error message now lists all the blocked words contained in the post.
-------
This also changes the format of the file for importing watched words from `.csv` to `.txt` so it becomes inconsistent with the extension of the file when watched words are exported.
Created a rake task for destroying multiple categories along with any
subcategories and topics the belong to those categories.
Also created a rake task for listing all of your categories.
Refactored existing destroy rake tasks to use new logging method, that
allows for puts output in the console but prevents it from showing in
the specs.
Context: https://meta.discourse.org/t/121589
This new setting option lets group owners message/mention large groups
without granting that privilege to all members.
This can cause unbound CPU usage in some cases, and excessive logging in other cases. This commit moves redis readonly information into the local process, but maintains the DistributedCache for postgres readonly state.
The site settings beginning with "topic views heat" and "topic post like
heat" are set to defaults when installing Discourse, but there has not
been a process or guidance for updating these values based on
community activity.
This feature will update them once a month. The low, medium, and
high settings will be based on the minimums of the 45th, 25th, and
10th percentile topics respectively, so that 45% of topics will have
some "heat".
Disable automatic changes with the automatic_topic_heat_values setting.
Previous to this fix is a post had the test www.test.com/abc it would fail
to index.
This also simplifies the rules to avoid full url parsing which can be
expensive
Previously we used custom fields to denote a user was anonymous, this was
risky in that custom fields are prone to race conditions and are not
properly dedicated, missing constraints and so on.
The new table `anonymous_users` is properly protected. There is only one
possible shadow account per user, which is enforced using a constraint.
Every anonymous user will have a unique row in the new table.
* Introduced fab!, a helper that creates database state for a group
It's almost identical to let_it_be, except:
1. It creates a new object for each test by default,
2. You can disable it using PREFABRICATION=0
`Upload#url` is more likely and can change from time to time. When it
does changes, we don't want to have to look through multiple tables to
ensure that the URLs are all up to date. Instead, we simply associate
uploads properly to `UserProfile` so that it does not have to replicate
the URLs in the table.
This change both speeds up specs (less strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.
Overall we will be migrating everything to use #frozen_string_literal: true
it will take a while, but this is the first and safest move in this direction
This commit fixes the follow quality issue with `PostSearchData#raw_data`:
1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.
`Post#cooked`:
```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```
2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.
From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.
`Post#cooked`
```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```
In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
* This is causing certain posts to appear in searches incorrectly as `PostSearchData#raw_data` contains the outdated title, category name and tag names.
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by
* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
It is not a setting, and only relevant in specs. The new API is:
```
Jobs.run_later! # jobs will be thrown on the queue
Jobs.run_immediately! # jobs will run right away, avoid the queue
```