Commit Graph

10305 Commits

Author SHA1 Message Date
Sam c5345d0e54
FEATURE: prioritize_exact_search_title_match hidden setting (#20089)
The new `prioritize_exact_search_match` can be used to force the search
algorithm to prioritize exact term matches in title when ranking results.

This is scoped narrowly to titles for cases such as a topic titled:

"organisation chart" and a search of "org chart".

If we scoped this wider, all discussion about "org chart" would float to
the top and leave a very common title de-prioritized.

This is a hidden site setting and it has some performance impact due
to double ranking.

That said, performance impact is somewhat mitigated cause ranking on
title alone is a very cheap operation.
2023-01-31 16:34:01 +11:00
Alan Guo Xiang Tan f31f0b70f8
SECURITY: Hide PM count for tags by default (#20061)
Currently `Topic#pm_topic_count` is a count of all personal messages tagged for a given tag. As a result, any user with access to PM tags can poll a sensitive tag to determine if a new personal message has been created using that tag even if the user does not have access to the personal message. We classify this as a minor leak in sensitive information.

With this commit, `Topic#pm_topic_count` is hidden from users by default unless the `display_personal_messages_tag_counts` site setting is enabled.
2023-01-31 12:08:23 +08:00
Sam 07679888c8
FEATURE: allow restricting duplication in search index (#20062)
* FEATURE: allow restricting duplication in search index

This introduces the site setting `max_duplicate_search_index_terms`.
Using this number we limit the amount of duplication in our search index.

This allows us to more correctly weight title searches, so bloated posts
don't unfairly bump to the top of search results.

This feature is completely disabled by default and behind a site setting

We will experiment with it first. Note entire search index must be rebuilt
for it to take effect.


---------

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2023-01-31 12:41:31 +11:00
Alan Guo Xiang Tan c5c72a74b7
DEV: Fix flaky test due to a lack of deterministic ordering (#20087) 2023-01-31 08:57:34 +08:00
Alan Guo Xiang Tan 6934edd97c
DEV: Add hidden site setting to configure search ranking weights (#20086)
This site setting is mostly experimental at this point.
2023-01-31 08:57:13 +08:00
Sam 5d669d8aa2
Revert "FEATURE: hidden site setting to disable search prefix matching (#20058)" (#20073)
This reverts commit 64f7b97d08.

Too many side effects for this setting, we have decided to remove it
2023-01-31 07:39:23 +08:00
David Taylor 4d12bdfdcb
DEV: Fix user_status_controller_spec flakiness (#20083)
In some situations, these HTTP calls would cause some cache to warmup and send a `/distributed_hash` message-bus message. We can avoid tracking those by passing a specific channel name to `track_publish`.
2023-01-30 22:42:47 +00:00
Joffrey JAFFEUX a4c32e3970
DEV: attempts to fix flakey spec (#20075) 2023-01-30 21:47:44 +01:00
Joffrey JAFFEUX 137f28e0d6
DEV: skip spec failing on CI (#20077) 2023-01-30 21:47:31 +01:00
David Taylor fa7f8d8e1b
DEV: Update user-status test to assert message-bus channels (#20068)
This test appears to be flaky. This assertion should help us track down the reason.
2023-01-30 13:54:44 +00:00
David Taylor 79bea9464c
PERF: Move user-tips and narrative to per-user messagebus channels (#19773)
Using a shared channel with per-message permissions means that every client is updated with the channel's 'last_id', even if there are no messages available to them. Per-user channel names avoid this problem - the last_id will only be incremented when there is a message for the given user.
2023-01-30 11:48:09 +00:00
Bianca Nenciu 23a74ecf8f
FIX: Truncate existing user status to 100 chars (#20044)
This commits adds a database migration to limit the user status to 100
characters, limits the user status in the UI and makes sure that the
emoji is valid.

Follow up to commit b6f75e231c.
2023-01-30 10:49:08 +02:00
Ayke Halder 9f14d643a5
DEV: use structured data in crawler-linkback-list for referencing only (#16237)
This simplifies the crawler-linkback-list to only be a point of reference to the actual DiscussionForumPosting objects.

See "Summary page": https://developers.google.com/search/docs/advanced/structured-data/carousel?hl=en#summary-page
> [It] defines an ItemList, where each ListItem has only three properties: @type (set to ListItem), position (the position in the list), and url (the URL of a page with full details about that item).
2023-01-30 08:26:55 +01:00
Ayke Halder 137dbaf0dc
DEV: declare post position as simple number in structured data (#16231)
This replaces the position declared as `#123` with the more simple version `123`.

The property position may be of type Integer or Text. A value of type Integer, or more precise of type Text which simply casts to integer, is sufficient here.
See: https://schema.org/position

In category-view the topic-list already uses this notation for the position of topics:
`<meta itemprop="position" content="123">`
2023-01-30 08:07:04 +01:00
Sam 64f7b97d08
FEATURE: hidden site setting to disable search prefix matching (#20058)
Many users seems surprised by prefix matching in search leading to
unexpected results.

Over the years we always would return results starting with a search term
and not expect exact matches.

Meaning a search for `abra` would find `abracadabra`

This introduces the Site Setting `enable_search_prefix_matching` which
defaults to true. (behavior unchanged)

We plan to experiment on select sites with exact matches to see if the
results are less surprising
2023-01-30 12:44:40 +08:00
Alan Guo Xiang Tan 7ec6e6b3d0
PERF: N+1 queries on `/tags` with multiple categories tags (#19906)
When the `tags_listed_by_group` site setting is disable, we were seeing
the N+1 queries problem when multiple `CategoryTag` records are listed.
This commit fixes that by ensuring that we are not filtering through the
category `tags` association after the association has been eager loaded.
2023-01-30 08:53:17 +08:00
Keegan George a4c68d4a2e
FIX: Failing system spec for rate limited search (#20046) 2023-01-27 12:14:29 -08:00
Sam 2c8dfc3dbc
FEATURE: rate limit anon searches per second (#19708) 2023-01-27 10:05:27 -08:00
Bianca Nenciu b6f75e231c
FIX: Limit user status to 100 characters (#20040)
* FIX: Limit user status to 100 characters

* FIX: Make sure the emoji is valid
2023-01-27 16:32:27 +02:00
Bianca Nenciu 8fc11215e1
FIX: Ensure soft-deleted topics can be deleted (#19802)
* FIX: Ensure soft-deleted topics can be deleted

The topic was not found during the deletion process because it was
deleted and `@post.topic` was nil.

* DEV: Use @topic instead of finding the topic every time
2023-01-27 16:15:33 +02:00
Martin Brennan 48eb8d5f5a
Revert "DEV: Delete dead Topic#incoming_email_addresses code (#19970)" (#20037)
This reverts commit 88a972c61b.

It's actually used in some plugins.
2023-01-27 11:27:15 +10:00
Martin Brennan c8f8d9dbb6
DEV: Change slugs/generate endpoint from GET to POST (#19984)
Followup on feedback on PR #19928
https://github.com/discourse/discourse/pull/19928#discussion_r1083687839,
it makes more sense to have this endpoint as a POST rather than
a GET.
2023-01-27 10:58:33 +10:00
David Taylor 798b4bb604
FIX: Ensure anon-cached values are never returned for API requests (#20021)
Under some situations, we would inadvertently return a public (unauthenticated) result to an authenticated API request. This commit adds the `Api-Key` header to our anonymous cache bypass logic.
2023-01-26 13:26:29 +00:00
Penar Musaraj 60990aab55
DEV: Fix flakey assertion in test (#20011)
This assertion was failing in internal builds. I can repro locally if I
set `foobarbaz` to be created after `quxbarbaz`.

For now, I think this complication in the test is unnecessary, hence this
removes the `quxbarbaz` case.
2023-01-25 13:24:13 -05:00
Bianca Nenciu c186a46910
SECURITY: Prevent XSS in local oneboxes (#20008)
Co-authored-by: OsamaSayegh <asooomaasoooma90@gmail.com>
2023-01-25 19:17:21 +02:00
Bianca Nenciu f55e0fe791
SECURITY: Update to exclude tag topic filter (#20006)
Ignores tags specified in exclude_tag topics param that a user does not
have access to.

Co-authored-by: Blake Erickson <o.blakeerickson@gmail.com>
2023-01-25 18:56:22 +02:00
Bianca Nenciu 105fee978d
SECURITY: only show restricted tag lists to authorized users (#20004)
Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2023-01-25 18:55:55 +02:00
Bianca Nenciu cd7c8861ae
SECURITY: Remove bypass for base_url (#19995)
The check used to be necessary because we validated the referrer too and
this bypass was a workaround a bug that is present in some browsers that
do not send the correct referrer.
2023-01-25 13:50:45 +02:00
Natalie Tay d5745d34c2
SECURITY: Limit the character count of group membership requests (#19993)
When creating a group membership request, there is no character
limit on the 'reason' field. This can be potentially be used by
an attacker to create enormous amount of data in the database.

Co-authored-by: Ted Johansson <ted@discourse.org>
2023-01-25 13:50:33 +02:00
Natalie Tay f91ac52a22
SECURITY: Limit the length of drafts (#19989)
Co-authored-by: Loïc Guitaut <loic@discourse.org>
2023-01-25 13:50:21 +02:00
Loïc Guitaut ec2ed5b7f6 FIX: Delete reviewables associated to posts automatically
Currently we don’t have an association between reviewables and posts.
This sometimes leads to inconsistencies in the DB as a post can have
been deleted but an associated reviewable is still present.

This patch addresses this issue simply by adding a new association to
the `Post` model and by using the `dependent: :destroy` option.
2023-01-25 09:45:36 +01:00
Martin Brennan 82182ec0c7
DEV: Add hashtag controller specs (#19983)
This is just cleaning up a TODO I had to add more specs
to this controller -- there are more thorough tests on the
actual HashtagService class and the type-specific hashtag
classes.
2023-01-25 17:13:32 +10:00
Martin Brennan 88a972c61b
DEV: Delete dead Topic#incoming_email_addresses code (#19970)
This code has been dead since b463a80cbf,
we can delete it now.
2023-01-25 09:34:41 +10:00
David Taylor eee97ad29a
DEV: Patch capybara to ignore client-triggered errors (#19972)
In dev/prod, these are absorbed by unicorn. Most commonly, they occur when a client interrupts a message-bus long-polling request.

Also reverts the EPIPE workaround introduced in 011c9b9973
2023-01-24 11:07:29 +00:00
Martin Brennan 63fdb6dd65
FIX: Do not add empty use/svg tags in ExcerptParser (#19969)
There was an issue where if hashtag-cooked HTML was sent
to the ExcerptParser without the keep_svg option, we would
end up with empty </use> and </svg> tags on the parts of the
excerpt where the hashtag was, in this case when a post
push notification was sent.

Fixed this, and also added a way to only display a plaintext
version of the hashtag for cases like this via PrettyText#excerpt.
2023-01-24 14:40:24 +10:00
Vinoth Kannan 799202d50b
FIX: skip email if blank while syncing SSO attributes. (#19939)
Also, return email blank error in `EmailValidator`  when the email is blank.
2023-01-24 09:10:24 +05:30
Martin Brennan 0924f874bd
DEV: Use UploadReference instead of ChatUpload in chat (#19947)
We've had the UploadReference table for some time now in core,
but it was added after ChatUpload was and chat was just never
moved over to this new system.

This commit changes all chat code dealing with uploads to create/
update/delete/query UploadReference records instead of ChatUpload
records for consistency. At a later date we will drop the ChatUpload
table, but for now keeping it for data backup.

The migration + post migration are the same, we need both in case
any chat uploads are added/removed during deploy.
2023-01-24 13:28:21 +10:00
Martin Brennan 110c96e6d7
FIX: Do not count deleted post for upload ref security (#19949)
When checking whether an existing upload should be secure
based on upload references, do not count deleted posts, since
there is still a reference attached to them. This can lead to
issues where e.g. an upload is used for a post then later on
a custom emoji.
2023-01-24 10:01:48 +10:00
Blake Erickson 774feb6614
FEATURE: Add api scope for create invite endpoint (#19964)
Adds an api scope for the POST /invite endpoint.
2023-01-23 16:20:22 -07:00
Blake Erickson 09f5235538
FEATURE: Add api scope for search endpoint (#19955)
Adds two new api scopes for the /search endpoints:

- `/search.json?q=term`
- `/search/query.json?term=term`

see: https://meta.discourse.org/t/search-api-key-permissions/227244
2023-01-23 14:06:57 -07:00
Martin Brennan 641e94fc3c
FEATURE: Allow changing slug on create channel (#19928)
This commit allows us to set the channel slug when creating new chat
channels. As well as this, it introduces a new `SlugsController` which can
generate a slug using `Slug.for` and a name string for input. We call this
after the user finishes typing the channel name (debounced) and fill in
the autogenerated slug in the background, and update the slug input
placeholder.

This autogenerated slug is used by default, but if the user writes anything
else in the input it will be used instead.
2023-01-23 14:48:33 +10:00
Krzysztof Kotlarek ae20ce8654
FIX: TL4 user can see deleted topics (#19946)
New feature that TL4 users can delete/recover topics and post was introduced https://github.com/discourse/discourse/pull/19766

One guardian was missed to ensure that can see deleted topics
2023-01-23 12:02:47 +11:00
Krzysztof Kotlarek 019ec74076
FEATURE: setting which allows TL4 users to deleted posts (#19766)
New setting which allows TL4 users to delete/view/recover posts and topics
2023-01-20 13:31:51 +11:00
Krzysztof Kotlarek f409e977a9
FIX: deleted misconfigured embeddable hosts (#19833)
When EmbeddableHost is configured for a specific category and that category is deleted, then EmbeddableHost should be deleted as well.

In addition, migration was added to fix existing data.
2023-01-20 13:29:49 +11:00
Alan Guo Xiang Tan f122f24b35
SECURITY: Default tags to show count of topics in unrestricted categories (#19916)
Currently, `Tag#topic_count` is a count of all regular topics regardless of whether the topic is in a read restricted category or not. As a result, any users can technically poll a sensitive tag to determine if a new topic is created in a category which the user has not excess to. We classify this as a minor leak in sensitive information.

The following changes are introduced in this commit:

1. Introduce `Tag#public_topic_count` which only count topics which have been tagged with a given tag in public categories.
2. Rename `Tag#topic_count` to `Tag#staff_topic_count` which counts the same way as `Tag#topic_count`. In other words, it counts all topics tagged with a given tag regardless of the category the topic is in. The rename is also done so that we indicate that this column contains sensitive information. 
3. Change all previous spots which relied on `Topic#topic_count` to rely on `Tag.topic_column_count(guardian)` which will return the right "topic count" column to use based on the current scope. 
4. Introduce `SiteSetting.include_secure_categories_in_tag_counts` site setting to allow site administrators to always display the tag topics count using `Tag#staff_topic_count` instead.
2023-01-20 09:50:24 +08:00
Martin Brennan 4d2a95ffe6
FIX: Query UploadReference in UploadSecurity for existing uploads (#19917)
This fixes a longstanding issue for sites with the
secure_uploads setting enabled. What would happen is a scenario
like this, since we did not check all places an upload could be
linked to whenever we used UploadSecurity to check whether an
upload should be secure:

* Upload is created and used for site setting, set to secure: false
  since site setting uploads should not be secure. Let's say favicon
* Favicon for the site is used inside a post in a private category,
  e.g. via a Onebox
* We changed the secure status for the upload to true, since it's been
  used in a private category and we don't check if it's originator
  was a public place
* The site favicon breaks :'(

This was a source of constant consternation. Now, when an upload is _not_
being created, and we are checking if an existing upload should be
secure, we now check to see what the first record in the UploadReference
table is for that upload. If it's something public like a site setting,
then we will never change the upload to `secure`.
2023-01-20 10:24:52 +10:00
Alan Guo Xiang Tan b00e160dae
PERF: Don't parse posts for mentions when user status is disabled (#19915)
Prior to this change, we were parsing `Post#cooked` every time we
serialize a post to extract the usernames of mentioned users in the
post. However, the only reason we have to do this is to support
displaying a user's status beside each mention in a post on the client side when
the `enable_user_status` site setting is enabled. When
`enable_user_status` is disabled, we should avoid having to parse
`Post#cooked` since there is no point in doing so.
2023-01-20 07:58:00 +08:00
Isaac Janzen 292d3677e9
FEATURE: Allow admins to permanently delete revisions (#19913)
# Context
This PR introduces the ability to permanently delete revisions from a post while maintaining the changes implemented by the revisions.
Additional Context: /t/90301

# Functionality
In the case a staff member wants to _remove the visual cue_ that a post has been edited eg.

<img width="86" alt="Screenshot 2023-01-18 at 2 59 12 PM" src="https://user-images.githubusercontent.com/50783505/213293333-9c881229-ab18-4591-b39b-e3419a67907d.png">

while maintaining the changes made in the edits, they can enable the (hidden) site setting of `can_permanently_delete`.
When this is enabled, after _hiding_ the revisions

<img width="149" alt="Screenshot 2023-01-19 at 1 53 35 PM" src="https://user-images.githubusercontent.com/50783505/213546080-2a9e9c55-b3ef-428e-a93d-1b6ba287dfae.png">

there will be an additional button in the history modal to <kbd>Delete revisions</kbd> on a post.

<img width="997" alt="Screenshot 2023-01-19 at 1 49 51 PM" src="https://user-images.githubusercontent.com/50783505/213546333-49042558-50ab-4724-9da7-08bacc68d38d.png">

Since this action is permanent, we display a confirmation dialog prior to triggering the destroy call

<img width="722" alt="Screenshot 2023-01-19 at 1 55 59 PM" src="https://user-images.githubusercontent.com/50783505/213546487-96ea6e89-ac49-4892-b4b0-28996e3c867f.png">

Once confirmed the history modal will close and the post will `rebake` to display an _unedited_ post.

<img width="868" alt="Screenshot 2023-01-19 at 1 56 35 PM" src="https://user-images.githubusercontent.com/50783505/213546608-d6436717-8484-4132-a1a8-b7a348d92728.png">
 
see that there is not a visual que for _revision have been made on this post_ for a post that **HAS** been edited. In addition to this, a user history log for `purge_post_revisions` will be added for each action completed.

# Limits
- Admins are rate limited to 20 posts per minute
2023-01-19 15:09:01 -06:00
David Taylor 5406e24acb
FEATURE: Introduce pg_force_readonly_mode GlobalSetting (#19612)
This allows the entire cluster to be forced into pg readonly mode. Equivalent to running `Discourse.enable_pg_force_readonly_mode` on the console.
2023-01-19 13:59:11 +00:00
Martin Brennan 56a93f7532
FEATURE: Add rake task to mark old hashtag format for rebake (#19876)
Since the new hashtag format has been added, we want site
admins to be able to rebake old posts with the old hashtag
format. This can now be done with `rake hashtags:mark_old_format_for_rebake`
which goes and marks posts with the old cooked version of hashtags
in this format for rebake:

```
<a class=\"hashtag\" href=\"/c/ux/14\">#<span>ux</span></a>
```

c.f. https://meta.discourse.org/t/what-rebake-is-required-for-the-new-autocomplete-styling/249642/12
2023-01-18 10:16:05 +10:00