Previously jobs would fail silently in test mode. Now they will raise the exception and cause the relevant test to fail. This identified a few broken tests, which I will fix in a followup commit
- Plugin developers using OpenID2.0 should migrate to OAuth2 or OIDC. OpenID2.0 APIs will be removed in v2.4.0
- For sites requiring Yahoo login, it can be implemented using the OpenID Connect plugin: https://meta.discourse.org/t/103632
For more information, see https://meta.discourse.org/t/113249
Previously we tried using a sub-query which has some terrible performance
implications, by adding 1 extra query we can eliminate the PG issue
A join to user_stats also is prone to the same slowdown
This optimisation avoids large scans joining the topics table with the
topic_users table.
Previously when a user carried a lot of read state we would have to join
the entire read state with the topics table. This operation would slow down
home page and every topic page. The more read state you accumulated the
larger the impact.
The optimisation helps people who clean up unread, however if you carry
unread from years ago it will only have minimal impact.
A new checkbox has been added to the Tags tab of the category settings modal
which is used when some tags and/or tag groups are restricted to the category,
and all other unrestricted tags should also be allowed.
Default is the same as the previous behaviour: only allow the specified set of
tags and tag groups in the category.
Previously due to #b2dc65f9534ea date on the quoted_posts table could not
be trusted.
This changes it so we use the date on the actual post as the grant date.
Note: there is an edge case where you create a post and only add a quote
a week later. In this case the badge will not be awarded at the correct
time (it will display it was granted a week ago).
That said this is far more rare than the current situation.
This functionality was never supported but before the new review queue
it didn't have any errors. Now the combination of settings is prevented
and existing sites with sso enabled will be migrated to remove invite
only.
The default ranking options ranks by the number of matches which is
highly problematic when posts are stuffed with a keyword. The ranking
will now be divided by the document length which is a much fairer way to
rank.
This commit fixes the follow quality issue with `PostSearchData#raw_data`:
1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.
`Post#cooked`:
```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```
2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.
From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.
`Post#cooked`
```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```
In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.