The new `prioritize_exact_search_match` can be used to force the search
algorithm to prioritize exact term matches in title when ranking results.
This is scoped narrowly to titles for cases such as a topic titled:
"organisation chart" and a search of "org chart".
If we scoped this wider, all discussion about "org chart" would float to
the top and leave a very common title de-prioritized.
This is a hidden site setting and it has some performance impact due
to double ranking.
That said, performance impact is somewhat mitigated cause ranking on
title alone is a very cheap operation.
* FEATURE: allow restricting duplication in search index
This introduces the site setting `max_duplicate_search_index_terms`.
Using this number we limit the amount of duplication in our search index.
This allows us to more correctly weight title searches, so bloated posts
don't unfairly bump to the top of search results.
This feature is completely disabled by default and behind a site setting
We will experiment with it first. Note entire search index must be rebuilt
for it to take effect.
---------
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
Many users seems surprised by prefix matching in search leading to
unexpected results.
Over the years we always would return results starting with a search term
and not expect exact matches.
Meaning a search for `abra` would find `abracadabra`
This introduces the Site Setting `enable_search_prefix_matching` which
defaults to true. (behavior unchanged)
We plan to experiment on select sites with exact matches to see if the
results are less surprising
* FIX: Ensure soft-deleted topics can be deleted
The topic was not found during the deletion process because it was
deleted and `@post.topic` was nil.
* DEV: Use @topic instead of finding the topic every time
Under some situations, we would inadvertently return a public (unauthenticated) result to an authenticated API request. This commit adds the `Api-Key` header to our anonymous cache bypass logic.
When checking whether an existing upload should be secure
based on upload references, do not count deleted posts, since
there is still a reference attached to them. This can lead to
issues where e.g. an upload is used for a post then later on
a custom emoji.
Currently, `Tag#topic_count` is a count of all regular topics regardless of whether the topic is in a read restricted category or not. As a result, any users can technically poll a sensitive tag to determine if a new topic is created in a category which the user has not excess to. We classify this as a minor leak in sensitive information.
The following changes are introduced in this commit:
1. Introduce `Tag#public_topic_count` which only count topics which have been tagged with a given tag in public categories.
2. Rename `Tag#topic_count` to `Tag#staff_topic_count` which counts the same way as `Tag#topic_count`. In other words, it counts all topics tagged with a given tag regardless of the category the topic is in. The rename is also done so that we indicate that this column contains sensitive information.
3. Change all previous spots which relied on `Topic#topic_count` to rely on `Tag.topic_column_count(guardian)` which will return the right "topic count" column to use based on the current scope.
4. Introduce `SiteSetting.include_secure_categories_in_tag_counts` site setting to allow site administrators to always display the tag topics count using `Tag#staff_topic_count` instead.
This fixes a longstanding issue for sites with the
secure_uploads setting enabled. What would happen is a scenario
like this, since we did not check all places an upload could be
linked to whenever we used UploadSecurity to check whether an
upload should be secure:
* Upload is created and used for site setting, set to secure: false
since site setting uploads should not be secure. Let's say favicon
* Favicon for the site is used inside a post in a private category,
e.g. via a Onebox
* We changed the secure status for the upload to true, since it's been
used in a private category and we don't check if it's originator
was a public place
* The site favicon breaks :'(
This was a source of constant consternation. Now, when an upload is _not_
being created, and we are checking if an existing upload should be
secure, we now check to see what the first record in the UploadReference
table is for that upload. If it's something public like a site setting,
then we will never change the upload to `secure`.
This commit fixes the following issue:
* User creates a post
* Akismet or some other thing like requiring posts to be approved puts
the post in the review queue, deleting it
* Admin approves the post
* Email is never sent to mailing list mode subscribers
We intentionally do not enqueue this for every single post when
recovering a topic (i.e. recovering the first post) since the topics
could have a lot of posts with emails already sent, and we don't want
to clog sidekiq with thousands of notify jobs.
TL4 users can already list and unlist topics, but they can't see
the unlisted topics. This change brings this to par by allowing
TL4 users to also see unlisted topics.
Added in c2013865d7,
this migration was supposed to only turn off the hashtag
setting for existing sites (since that was the old default)
but its doing it for new ones too because we run all migrations
on new sites.
Instead, we should only run this if the first migration was
only just created, meaning its a new site.
When the thread is aborted, an exception is raised before the `start` of a job is set, and therefore raises an exception in the `ensure` block. This commit checks that `start` exists, and also adds `abort_on_exception=true` so that this issue would have caused test failures.
The `enable_new_notifications_menu` site setting allows sites that have
`navigation_menu` set to `legacy` to use the redesigned notifications
menu before switching to the new sidebar navigation menu.
This will give us some aggregate stats on the defer queue performance.
It is limited to 100 entries (for safety) which is stored in an LRU cache.
Scheduler::Defer.stats can then be used to get an array that denotes:
- number of runs and completions (queued, finished)
- error count (errors)
- total duration (duration)
We can look later at exposing these metrics to gain visibility on the reason
the defer queue is clogged.
Currently when generating a onebox for Discourse topics, some important
context is missing such as categories and tags.
This patch addresses this issue by introducing a new onebox engine
dedicated to display this information when available. Indeed to get this
new information, categories and tags are exposed in the topic metadata
as opengraph tags.
Fixes the support for kwargs in `DiscourseEvent.trigger()` on Ruby 3, e.g.
```rb
DiscourseEvent.trigger(:before_system_message_sent, message_type: type, recipient: @recipient, post_creator_args: post_creator_args, params: method_params)
```
Fixes https://github.com/discourse/discourse-local-site-contacts
If a secure upload's access_control_post was trashed, and an anon user
tried to look at that upload, they would get a 500 error rather than
the correct 403 because of an error inside the PostGuardian logic.
This commit does a couple of things:
1. Changes the limit of tags to include a subject for a
notification email to the `max_tags_per_topic` setting
instead of the arbitrary 3 limit
2. Adds both an X-Discourse-Tags and X-Discourse-Category
custom header to outbound emails containing the tags
and category from the subject, so people on mail clients
that allow advanced filtering (i.e. not Gmail) can filter
mail by tags and category, which is useful for mailing
list mode users
c.f. https://meta.discourse.org/t/headers-for-email-notifications-so-that-gmail-users-can-filter-on-tags/249982/17
This commit fixes an issue where the chat message bookmarks
did not respect the user's `bookmark_auto_delete_preference`
which they select in their user preference page.
Also, it changes the default for that value to "keep bookmark and clear reminder"
rather than "never", which ends up leaving a lot of expired bookmark
reminders around which are a pain to clean up.
This commit changes the default return value of `Auth::ManagedAuthenticator#primary_email_verified?` to false. We're changing the default to force developers to think about email verification when building a new authentication method. All existing authenticators (in core and official plugins) have been updated to explicitly define the `primary_email_verified?` method in their subclass of `Auth::ManagedAuthenticator` (example commit 65f57a4d05).
Internal topic: t/82084.
# Context
When a topic is reviewable by a group we give those group moderators some admin abilities including the ability to delete a topic.
# Problem
There are two main problems:
1. Currently when a group moderator deletes a topic they are redirected to root (not the same for staff)
2. Viewing the categories deleted topics (`c/foo/1/?status=deleted`) does not display the deleted topic to the group moderator (not the same for staff).
# Fix
If the `deleted_by` user is part a group that matches the `reviewable_by_group` on a topic then don't redirect. This is the default interaction for staff to give them the ability to do things like restore the topic in case it was accidentally deleted.
To render the deleted topics as expected for the group moderator I am utilizing [the guardian scope of `guardian.can_see_deleted_topics?` for said category](https://github.com/discourse/discourse/pull/19618/files#diff-288e61b8bacdb29d9c2e05b42da6837b0036dcf1867332d977ca7c5e74a44297R802-R803)
There is an issue where chat message processing breaks due to
unhandles `SocketError` exceptions originating in the SSRF check,
specifically in `FinalDestination::Resolver`.
This change gives `FinalDestination::SSRFDetector` a new error class
to wrap the `SocketError` in, and haves the `RetrieveTitle` class
handle that error gracefully.
A few specs in `dashboard_controller_spec.rb` set some state in redis but don't clean it up afterwards which causes other specs to fail when they're ran after `dashboard_controller_spec.rb`.
Related commit: 18467d4.
We were adding to the resolver's work queue before setting up the `@lookup` and `@parent` information. That could lead to the lookup being performed on the wrong (or `nil`) hostname. This also lead to some flakiness in specs.
This test occasionally fails in CI. I haven't been able to reproduce the issue locally. This logging will print some extra information when the assertion fails.
* DEV: Fix png optimization test flakyness
Update fixture with oxipng 7
This test broke when the pngoptimizer got better so the pre-optimized png in the fixtures was compressed further on upload creation, breaking the expected size.
* UX: Wizard Step Enhancements
- Remove illustrations
- Add Emoji graphic to top of steps
- Add description below step title
- Move point of contact to last step
* Move step count to header, plus some button navigation tweaks
* add remaining emoji to step headers
* fix button logic on steps
* Update Point of Contact
* remove automated messages field
* adjust styling for counter, title, and emoji
* Update wording for logos
* Fix tests
* fix prettier
* fix specs
* set same with for steps except for styling screen
* use sentence case; remove duplicate copy under your organization fields
* fix missing buttons on small screens
* add spacing to buttons; adjust font weight to labels
* adjust styling for community logo step; use sentence case for button
* update copy for point of contact text helper
* use sentence case for field labels
* fix ui tests
* use btn-back class to fix ui tests
* reduce bottom margin for toggle fields
* clean up
Co-authored-by: Ella <ella.estigoy@gmail.com>
Previously, restricted category chat channel was available for all groups - even `readonly`. From now on, only user who belong to group with `create_post` or `full` permissions can access that chat channel.
* DEV: Remove enable_whispers site setting
Whispers are enabled as long as there is at least one group allowed to
whisper, see whispers_allowed_groups site setting.
* DEV: Always enable whispers for admins if at least one group is allowed.
We were changing the user's user_option.bookmark_auto_delete_preference
to whatever they changed it to in the bookmark modal to use as default
for future bookmarks. However this was leading to a lot of confusion
since if you wanted to set it for one bookmark you had to remember to
change it back on the next one.
This commit removes that automatic functionality, and instead moves
the bookmark auto delete preference to User Preferences > Interface
in an explicit dropdown.
This commit adds a new notification that gets sent to admins when the site gets new features after an upgrade/deploy. Clicking on the notification takes the admin to the admin dashboard at `/admin` where they can see the new features under the "New Features" section.
Internal topic: t/87166.
Depending on the current state of things, sometimes the homepage style
wouldn't update because we were incorrectly blocking updates the
`desktop_category_page_style` site setting if the first item in the top
menu was 'categories'.
Added a test case to handle this situation.
See https://meta.discourse.org/t/248354
The guardian is useful for plugins to determine if the callback should
do anything. A common use case is to not do anything in the callback if
the user is anonymous.
* FEATURE: Add chat and sidebar toggles to the setup wizard
- Fix css alighnment
- Add Enable Chat Toggle
- Add Enable Sidebar Toggle
* Check for the chat plugin
* Account for new sidebar step
* update chat and sidebar description
* UI: add checkmark as a visual indicator that it is enabled
* use new navigation_memu site setting for enabling the sidebar
* fix tests
* Add tests
* Update lib/wizard/step_updater.rb
Use HEADER_DROPDOWN instead of LEGACY
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* Fix spec. Use HEADER_DROPDOWN instead of LEGACY
Co-authored-by: Ella <ella.estigoy@gmail.com>
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
The tsquery used for searching is generated using both functions from
Ruby and Postgresql (for example, unaccent function). Depending on the
term used, it generated an invalid tsquery. For example "can’t"
generated "''can''t''" instead of "''can''''t''".
* FIX: Use Category.secured(guardian) for hashtag datasource
Follow up to comments in #19219, changing the category
hashtag datasource to use Category.secured(guardian) instead
of Site.new(guardian).categories here since the latter does
more work for not much benefit, and the query time is the
same. Also eliminates some Hash -> Model back and forth
busywork. Add some more specs too.
* FIX: Server-side hashtag lookup cooking user loading
When we were using the PrettyText.options.currentUser
and parsing back and forth with JSON for the hashtag
lookups server-side, we had a bug where the user's
secure categories were not loaded since we never actually
loaded a User model from the database, only parsed it
from JSON.
This commit fixes the issue by instead using the
PretyText.options.userId and looking up the user directly
from the database when calling hashtag_lookup via the
PrettyText::Helpers code when cooking server-side. Added
the missing spec to check for this as well.
If configured, this will be used for static JS assets which are stored on S3. This can be useful if you want to use different CDN providers/configuration for Uploads and JS
This new site setting replaces the
`enable_experimental_sidebar_hamburger` and `enable_sidebar` site
settings as the sidebar feature exits the experimental phase.
Note that we're replacing this without depreciation since the previous
site setting was considered experimental.
Internal Ref: /t/86563
When looking up hashtags which were conflicting (e.g.
management::tag and management) where the user did
not have permission for one of them, we ended up returning
the one they did have permission to (e.g. the tag) twice
because of the way the lookup fallback code worked. This
fixes the issue, and another related one where the
::type was not added to the found item's .ref, and
so the hashtag replacement on the client was not working
correctly.
Update failing spec which previously used non-staff user to create
hidden posts.
Also add new spec for non-staff use cases to prevent future
regressions.
In some cases (e.g. user notification emails) we
are passing an excerpted/stripped version of the
post HTML to Email::Styles, at which point the
<span> elements surrounding the hashtag text have
been stripped. This caused an error when trying to
remove that element to replace the text.
Instead we can just remove all elements inside
a.hashtag-cooked and replace with the raw #hashtag
text which will work in more cases.
The centralization helps in reducing code duplication in our code base
and more importantly, centralizing logic for guardian checks into a
single spot.
This is closer to git's redirect following behaviour. We prevented git
following redirects when we clone in order to prevent SSRF attacks.
Follow-up-to: 291bbc4fb9
When doing local oneboxes we sometimes want to allow
SVGs in the final preview HTML. The main case currently
is for the new cooked hashtags, which include an SVG
icon.
SVGs will be included in local oneboxes via `ExcerptParser` _only_
if they have the d-icon class, and if the caller for `post.excerpt`
specifies the `keep_svg: true` option.
Adds stats for API and user API requests similar to regular page views.
This comes with a new report to visualize API requests per day like the
consolidated page views one.
This commit introduce a new API for registering callbacks, which we'll execute when a user gets destroyed, and the `delete_posts` opt is true. The chat plugin registers one callback and queues a job to destroy every message from that user in batches.
Previously we would unconditionally fetch all images via HTTP to grab
original sizing from cooked post processor in 2 different spots.
This was wasteful as we already calculate and cache this info in upload records.
This also simplifies some specs and reduces use of mocks.
The `decodedMap` prop comes from https://github.com/terser/terser/pull/1190
> This also exposes a new `decodedMap` property on the result object. Decoded maps are free to create (it's a shallow clone of the `GenMapping` instance), and passing them to `@jridgewell/trace-mapping` is copy-free. With Babel [recently](https://github.com/babel/babel/pull/14497) adding a `decodedMap` field, a dev could pass from the Babel transpilation to Terser without any added memory use for sourcemaps.
Raw paths like `/test/path` are not supported natively in the CSP. This commit prepends the site's base URL to these paths. This allows plugins to add 'local' assets to the CSP without needing to hardcode the site's hostname.
In this PR, we're making sure when a theme upload which is used in the theme's CSS is missing it won't break the stylesheet precompilation process. See also: 6ebd2cecda
Adds the description as a title="" attribute on the hashtag
autocomplete search items for tags, categories, and channels.
These descriptions can be seen by the user since they are
able to see the results that are returned by the search via
Guardian checks.
This fix changes the hashtag-raw hashtags, which are
the ones that do not actually match anything, back
to the old style which does not look like mentions.
This will be used by plugins to handle the client side of their custom
post validations without having to overwrite the whole composer save
action as it was done in other plugins.
Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
This commit fleshes out and adds functionality for the new `#hashtag` search and
lookup system, still hidden behind the `enable_experimental_hashtag_autocomplete`
feature flag.
**Serverside**
We have two plugin API registration methods that are used to define data sources
(`register_hashtag_data_source`) and hashtag result type priorities depending on
the context (`register_hashtag_type_in_context`). Reading the comments in plugin.rb
should make it clear what these are doing. Reading the `HashtagAutocompleteService`
in full will likely help a lot as well.
Each data source is responsible for providing its own **lookup** and **search**
method that returns hashtag results based on the arguments provided. For example,
the category hashtag data source has to take into account parent categories and
how they relate, and each data source has to define their own icon to use for the
hashtag, and so on.
The `Site` serializer has two new attributes that source data from `HashtagAutocompleteService`.
There is `hashtag_icons` that is just a simple array of all the different icons that
can be used for allowlisting in our markdown pipeline, and there is `hashtag_context_configurations`
that is used to store the type priority orders for each registered context.
When sending emails, we cannot render the SVG icons for hashtags, so
we need to change the HTML hashtags to the normal `#hashtag` text.
**Markdown**
The `hashtag-autocomplete.js` file is where I have added the new `hashtag-autocomplete`
markdown rule, and like all of our rules this is used to cook the raw text on both the clientside
and on the serverside using MiniRacer. Only on the server side do we actually reach out to
the database with the `hashtagLookup` function, on the clientside we just render a plainer
version of the hashtag HTML. Only in the composer preview do we do further lookups based
on this.
This rule is the first one (that I can find) that uses the `currentUser` based on a passed
in `user_id` for guardian checks in markdown rendering code. This is the `last_editor_id`
for both the post and chat message. In some cases we need to cook without a user present,
so the `Discourse.system_user` is used in this case.
**Chat Channels**
This also contains the changes required for chat so that chat channels can be used
as a data source for hashtag searches and lookups. This data source will only be
used when `enable_experimental_hashtag_autocomplete` is `true`, so we don't have
to worry about channel results suddenly turning up.
------
**Known Rough Edges**
- Onebox excerpts will not render the icon svg/use tags, I plan to address that in a follow up PR
- Selecting a hashtag + pressing the Quote button will result in weird behaviour, I plan to address that in a follow up PR
- Mixed hashtag contexts for hashtags without a type suffix will not work correctly, e.g. #ux which is both a category and a channel slug will resolve to a category when used inside a post or within a [chat] transcript in that post. Users can get around this manually by adding the correct suffix, for example ::channel. We may get to this at some point in future
- Icons will not show for the hashtags in emails since SVG support is so terrible in email (this is not likely to be resolved, but still noting for posterity)
- Additional refinements and review fixes wil
The hidden site setting `suppress_secured_categories_from_admin` will
suppress visibility of categories without explicit access from admins
in a few key areas (category drop downs and topic lists)
It is not intended to be a security wall since admins can amend any site
setting. Instead it is feature that allows hiding the categories from the
UI.
Admins will still be able to see topics in categories without explicit
access using direct URLs or flags.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* FEATURE: Default Composer Category Site Setting
- Create the default_composer_category site setting
- Replace general_category_id logic for auto selecting the composer
category
- Prevent Uncategorized from being selected if not allowed
- Add default_composer_category option to seeded categories
- Create a migration to populate the default_composer_category site
setting if there is a general_category_id populated
- Added some tests
* Add missing translation for the new site setting
* fix some js tests
* Just check that the header value is null
Currently, moderators are able to set primary group for users
irrespective of the of the `moderators_manage_categories_and_groups` site
setting value.
This change updates Guardian implementation to honour it.
Previously the stylesheet cachebusting hash was based on the maximum mtime of files. This works well in development and during in-container updates (e.g. via docker_manager). However, when a fresh docker image is created for each deploy, the file mtimes will change even if the contents has not.
This commit changes the production logic to calculate the cachebuster from the filenames and contents of the relevant assets. This should be consistent across deploys, thereby improving cache hits and improving page load times.
Since the system user is a regular user, it can have its
`allow_private_messages` user option turned off, which
with our current `can_send_private_message?(Discourse.system_user)`
check inside the CurrentUserSerializer, will prevent any
user from sending messages in the UI if the system user is not
accepting PMs.
This commit adds a new `can_send_private_messages?` method to
the Guardian, which can be used in serializers and not depend
on the system user. When the user actually sends a message
we still rely on the old `can_send_private_message?(target)`
call to see if they are allowed to send the message to the target.
The new method is just to say they can "generally" send
private messages.