Commit Graph

467 Commits

Author SHA1 Message Date
jbrw 3593e582a3
FIX - limit number of embedded media items in a post (#10391)
* FIX - limit number of embedded media items in a post

* Add renamed settings to DeprecatedSettings
2020-08-07 12:08:59 -04:00
David Taylor ceb858c70a
PERF: Release post_upload records when downloaded image is removed (#10379)
Previously we would unconditionally keep all images downloaded via pull_hotlinked_images, even if they are later removed from the post. This commit removes that logic, and relies on the existing link_post_uploads process to pick up the downloaded images in `cooked`. Specs are added to ensure this is working correctly for regular hotlinked images, and for oneboxes.
2020-08-06 10:06:34 +10:00
David Taylor cb12a721c4
REFACTOR: Refactor pull_hotlinked_images job
This commit should cause no functional change
- Split into functions to avoid deep nesting
- Register custom field type, and remove manual json parse/serialize
- Recover from deleted upload records

Also adds a test to ensure pull_hotlinked_images redownloads secure images only once
2020-08-05 12:14:59 +01:00
Krzysztof Kotlarek e0d9232259
FIX: use allowlist and blocklist terminology (#10209)
This is a PR of the renaming whitelist to allowlist and blacklist to the blocklist.
2020-07-27 10:23:54 +10:00
jbrw 2aec92d0b4
FEATURE - allow Group Moderators to edit category description (#10292)
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2020-07-23 09:50:00 -04:00
Guo Xiang Tan cbe1dd8ec7
Revert "FIX: Delete related search data when record has been deleted."
This reverts commit ecc799ab56.

This commit does not fix anything because we've always been deleting
records in `Searchable`.
2020-07-09 10:08:35 +08:00
Guo Xiang Tan ecc799ab56
FIX: Delete related search data when record has been deleted. 2020-07-02 12:24:59 +08:00
Régis Hanol 7109d94ee7 FIX: properly invalidate inline oneboxes when rebaking
When rebaking a post we were invalidating _regular_ oneboxes but not inline oneboxes.

DEV: also renamed 'InlineOneboxer.purge' to 'InlineOneboxer.invalidate' to keep
the API consistent with 'Oneboxer.invalidate'
2020-06-24 11:54:54 +02:00
Jarek Radosz 58a5293d9b
FIX: Delete PostUploads on Post deletion (#10090)
Related uploads will then be removed by Jobs::CleanUpUploads
2020-06-19 17:45:08 +02:00
David Taylor a2f80670e1
FIX: Do not count youtube thumbnail when counting post images (#10049) 2020-06-15 20:25:30 +01:00
David Taylor 75b1298e99
DEV: Drop unused image_url column from posts and topics (#9953)
This has been superseded by image_upload_id. The image_url value in API responses is now generated dynamically from the upload record.
2020-06-02 16:21:38 +10:00
Jarek Radosz 00aab49829
FIX: Restore the ability to rebake posts in deleted topics (#9951)
Lost in #9852
2020-06-01 07:04:16 +02:00
David Taylor 28f46c171c
FIX: Pull hotlinked images even when edited by system users (#9890)
Previously the pull hotlinked images job was skipped after system edits. This ensured that we never had an infinite loop of system-edit/pull-hotlinked/system-edit/pull-hotlinked etc.

A side effect was that edits made by system for any other reason (e.g. API, removing full quotes) would prevent pulling hotlinked images. This commit removes the system edit check, and replaces it with another method to avoid an infinite job scheduling loop.
2020-05-29 13:07:47 +01:00
Michael Brown d9a02d1336
Revert "Revert "Merge branch 'master' of https://github.com/discourse/discourse""
This reverts commit 20780a1eee.

* SECURITY: re-adds accidentally reverted commit:
  03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
  instead of the 03d26cd6 parent (which contains security fixes)
2020-05-23 00:56:13 -04:00
Jeff Atwood 20780a1eee Revert "Merge branch 'master' of https://github.com/discourse/discourse"
This reverts commit e62a85cf6f, reversing
changes made to 2660c2e21d.
2020-05-22 20:25:56 -07:00
Martin Brennan f9d55b4941
FEATURE: Update the topic excerpt when the OP is rebaked (#9852)
* We now have a site setting "topic_excerpt_maxlength" that is used when the OP is created or revised to generate a topic excerpt.
* However, posts created before this setting was introduced cannot benefit from this change unless they are revised, and if the topic excerpt length setting is changed that situation is also not covererd.
* This PR makes a change to rebake! to update the topic excerpt IF the post is the OP.
2020-05-22 13:04:15 +10:00
Martin Brennan df68d11c38
FEATURE: Add topic excerpt max length site setting (#9847)
Adds a new topic_excerpt_maxlength site setting.

* When topic excerpt is requested for a post, use the new topic_excerpt_maxlength site setting to limit the size of the excerpt
* Remove code for getting/setting Post.excerpt_size as it is not used anywhere
2020-05-21 13:19:48 +10:00
Kane York 1059557ce1
DEV: Standardize ignored_columns removal comments (#9771) 2020-05-13 13:08:15 -07:00
David Taylor 03818e642a
FEATURE: Include optimized thumbnails for topics (#9215)
This introduces new APIs for obtaining optimized thumbnails for topics. There are a few building blocks required for this:

- Introduces new `image_upload_id` columns on the `posts` and `topics` table. This replaces the old `image_url` column, which means that thumbnails are now restricted to uploads. Hotlinked thumbnails are no longer possible. In normal use (with pull_hotlinked_images enabled), this has no noticeable impact

- A migration attempts to match existing urls to upload records. If a match cannot be found then the posts will be queued for rebake

- Optimized thumbnails are generated during post_process_cooked. If thumbnails are missing when serializing a topic list, then a sidekiq job is queued

- Topic lists and topics now include a `thumbnails` key, which includes all the available images:
   ```
   "thumbnails": [
   {
     "max_width": null,
     "max_height": null,
     "url": "//example.com/original-image.png",
     "width": 1380,
     "height": 1840
   },
   {
     "max_width": 1024,
     "max_height": 1024,
     "url": "//example.com/optimized-image.png",
     "width": 768,
     "height": 1024
   }
   ]
  ```

- Themes can request additional thumbnail sizes by using a modifier in their `about.json` file:
   ```
    "modifiers": {
      "topic_thumbnail_sizes": [
        [200, 200],
        [800, 800]
      ],
      ...
  ```
  Remember that these are generated asynchronously, so your theme should include logic to fallback to other available thumbnails if your requested size has not yet been generated

- Two new raw plugin outlets are introduced, to improve the customisability of the topic list. `topic-list-before-columns` and `topic-list-before-link`
2020-05-05 09:07:50 +01:00
Krzysztof Kotlarek 9bff0882c3
FEATURE: Nokogumbo (#9577)
* FEATURE: Nokogumbo

Use Nokogumbo HTML parser.
2020-05-05 13:46:57 +10:00
Sam Saffron 310a7edee5
DEV: remove unused columns from posts and topics
avg_time on posts and topics have not been used in a year.

This uses a re-runnable ddl transaction diasabled migration to
drop the column, cause it touchs very high traffic table and may
deadlock
2020-04-30 11:22:33 +10:00
Dan Ungureanu e733701887
FEATURE: Make report filters reusable (#9444)
This commit also adds 'include subcategories' report filter
2020-04-22 11:52:50 +03:00
Robin Ward b6b92a562c
FEATURE: New site setting `embed_unlisted` (#9391)
If enabled, posts imported to discourse via embeddings will default to
unlisted until they receive a reply.
2020-04-13 15:17:02 -04:00
Dan Ungureanu 799ddeea3c
FIX: Include subcategories in 'posts' report (#9410) 2020-04-13 18:57:52 +03:00
David Taylor 5ff505cea6
SECURITY: Respect topic permissions when loading draft metadata
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
2020-03-23 11:30:40 +00:00
Penar Musaraj e69b6379ad
FEATURE: Broader support for post uploads in video markup (#9152)
Ensures URLs in the following HTML attributes are included in post uploads:
- video poster
- source src
- track src
2020-03-10 09:01:40 -04:00
Jarek Radosz 85e03a7f68
DEV: Replace `Time.new` with `Time.now` (#9142)
(or `Time.zone.now`)
2020-03-09 17:37:49 +01:00
Martin Brennan 8123538c94
DEV: Minor review fixes and fix bookmark spec logging (#9045)
As per:

https://review.discourse.org/t/fix-never-allow-custom-emoji-to-be-marked-secure-8965/9072
https://review.discourse.org/t/feature-improving-bookmarks-part-2-topic-bookmarking-8954/9038
2020-03-02 15:40:29 +10:00
Martin Brennan 56b16bc68e
FIX: Never allow custom emoji to be marked secure (#8965)
* Because custom emoji count as post "uploads" we were
marking them as secure when updating the secure status for post uploads.
* We were also giving them an access control post id, which meant
broken image previews from 403 errors in the admin custom emoji list.
* We now check if an upload is used as a custom emoji and do not
assign the access control post + never mark as secure.
2020-02-14 11:17:09 +10:00
Martin Brennan e1e74abd4f
FEATURE: Improving bookmarks part 2 -- Topic Bookmarking (#8954)
### UI Changes

If `SiteSetting.enable_bookmarks_with_reminders` is enabled:

* Clicking "Bookmark" on a topic will create a new Bookmark record instead of a post + user action
* Clicking "Clear Bookmarks" on a topic will delete all the new Bookmark records on a topic
* The topic bookmark buttons control the post bookmark flags correctly and vice-versa
Disabled selecting the "reminder type" for bookmarks in the UI because the backend functionality is not done yet (of sending users notifications etc.)

### Other Changes

* Added delete bookmark route (but no UI yet)
* Added a rake task to sync the old PostAction bookmarks to the new Bookmark table, which can be run as many times as we want for a site (it will not create duplicates).
2020-02-13 16:26:02 +10:00
Robin Ward 37888d9818 FIX: Never return the same reply more than once via reply_ids
If our reply tree somehow ends up with cycles or other odd
structures, we only want to consider a reply once, at the first
level in the tree that it appears.
2020-02-03 13:41:18 -05:00
Robin Ward f83362b05b FIX: Don't return post replies from other topics
It seems in some situations replies have been moved to other topics but
the `PostReply` table has not been updated. I will try and fix this in a
follow up PR, but for now this fix ensures that every time we ask a post
for its replies that we restrict it to the same topic.
2020-02-03 13:12:27 -05:00
Martin Brennan 45b37a8bd1
FIX: Resolve pull hotlinked image and broken link issues for secure media URLs (#8777)
When pull_hotlinked_images tried to run on posts with secure media (which had already been downloaded from external sources) we were getting a 404 when trying to download the image because the secure endpoint doesn't allow anon downloads.

Also, we were getting into an infinite loop of pull_hotlinked_images because the job didn't consider the secure media URLs as "downloaded" already so it kept trying to download them over and over.

In this PR I have also refactored secure-media-upload URL checks and mutations into single source of truth in Upload, adding a SECURE_MEDIA_ROUTE constant to check URLs against too.
2020-01-24 11:59:30 +10:00
Martin Brennan 1b3b0708c0
FEATURE: Update upload security status on post move, topic conversion, category change (#8731)
Add TopicUploadSecurityManager to handle post moves. When a post moves around or a topic changes between categories and public/private message status the uploads connected to posts in the topic need to have their secure status updated, depending on the security context the topic now lives in.
2020-01-23 12:01:10 +10:00
Gerhard Schlager ab07b945c2
Merge pull request #8736 from gschlager/rename_reply_id_column
REFACTOR: Rename `post_replies.reply_id` column to `post_replies.reply_post_id`
2020-01-17 17:24:49 +01:00
Martin Brennan 7c32411881
FEATURE: Secure media allowing duplicated uploads with category-level privacy and post-based access rules (#8664)
### General Changes and Duplication

* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file. 
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.

### Viewing Secure Media

* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`

### Removed

We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.

* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
2020-01-16 13:50:27 +10:00
Martin Brennan edbc356593
FIX: Replace deprecated URI.encode, URI.escape, URI.unescape and URI.unencode (#8528)
The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097:

URI.escape
URI.unescape
URI.encode
URI.unencode
escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate.

I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier.

Addressable is now also an explicit gem dependency.
2019-12-12 12:49:21 +10:00
Joffrey JAFFEUX 0d3d2c43a0
DEV: s/\$redis/Discourse\.redis (#8431)
This commit also adds a rubocop rule to prevent global variables.
2019-12-03 10:05:53 +01:00
Leo McArdle 2714149fd2 FEATURE: hide posts from incoming email based on dmarc verdict (#8333) 2019-11-26 15:55:22 +01:00
Dan Ungureanu a992caf741
DEV: Replace magic values (#8398)
Follow-up to 35942f7c7c.
2019-11-25 14:32:19 +02:00
Penar Musaraj 11d22293fb FIX: Allow private media uploads to be reused in login_required sites
In non-login-required sites, we prevent secure uploads already used in PMs from being used in public topics.

In login_required sites, secure uploads should be reusable in any topic, PM or not.
2019-11-21 09:14:06 -05:00
Penar Musaraj 102909edb3 FEATURE: Add support for secure media (#7888)
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access. 

A few notes: 

- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
2019-11-18 11:25:42 +10:00
Robin Ward ab53b485b4 Add setter for `excerpt_size` so plugins can set it more easily 2019-10-21 13:44:32 -04:00
Daniel Waterworth 55a1394342 DEV: pluck_first
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:

pluck_first, equivalent to limit(1).pluck(*columns).first

and pluck_first! which, like other finder methods, raises an exception
when no record is found
2019-10-21 12:08:20 +01:00
Robin Ward f14020dd0a Allow the topic excerpt size to be customizable by plugins 2019-10-16 16:28:28 -04:00
Krzysztof Kotlarek 427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Vinoth Kannan 02731ef33e FIX: include video tags and short urls in 'have_uploads' method.
While checking the existence of upload in posts we must include <video> tags and 'short-url' format of upload URLs.
2019-09-24 23:17:59 +05:30
Vinoth Kannan 301c5a303f FIX: include 'short_path' as src in each_upload_url method. 2019-09-22 15:32:28 +05:30
Roman Rizzi 7d5f3c1338 UX/PERF: Update readers count when a post from another user is read. Don't fetch the post data again just to update the count. (#8078) 2019-09-09 11:29:15 +10:00
Vinoth Kannan aa012d12dc FIX: include 'short_url' as src if upload url not exist
The URL '/images/transparent.png' will be used in the cooked content if upload record not found. In that case we have to use 'short_url' as image src in 'post.each_upload_url' method.
2019-09-02 15:11:22 +05:30
Sam Saffron e53a171916 FIX: hold s3 related distributed locks longer
These operations are pretty expensive and can take multiple minutes due to
networking.

Hold distributed mutex for much longer.
2019-08-15 11:48:44 +10:00
Guo Xiang Tan 8a6ee09008 FIX: `Post#each_upload_url` yields incorrect path to block when CDN is enabled. 2019-07-31 10:00:52 +08:00
Guo Xiang Tan ef46231214 Fix the build. 2019-07-29 20:02:18 +08:00
Guo Xiang Tan faea594436 DEV: Extract common regexps for multisite. 2019-07-29 19:01:36 +08:00
Guo Xiang Tan 8a64b0c8e8 Revert "DEV: Remove unused kwarg and properly check for local missing uploads."
This reverts commit 97769f3d02.

The code is confusing but this change is quite risky. Defer for now
until we can look at it properly.
2019-07-29 14:35:34 +08:00
Guo Xiang Tan 97769f3d02 DEV: Remove unused kwarg and properly check for local missing uploads. 2019-07-29 14:21:06 +08:00
Vinoth Kannan ad04ce9f43 FIX: remove post upload record creation inside 'find_missing_uploads' method. 2019-07-19 01:44:08 +05:30
Vinoth Kannan b1ca64487a FIX: multisite upload urls must have either db name or the word 'short-url'. 2019-06-25 01:19:58 +05:30
Guo Xiang Tan 73a45048a0 FIX: `Upload#short_url` generates incorrect URL when extension is `nil`. 2019-06-19 09:10:50 +08:00
Vinoth Kannan 788f995f30 FIX: skip external urls which has upload url in query string.
Add spec tests for post.each_upload_url method. e8fafbc123
2019-06-11 19:55:02 +05:30
Bianca Nenciu 227c45107d FEATURE: Implement Onebox for posts including polls. (#7539) 2019-05-29 17:05:52 +02:00
Guo Xiang Tan f0620e7118 FEATURE: Support `[description|attachment](upload://<short-sha>)` in MD take 2.
Previous attempt was missing `post_uploads` records.
2019-05-29 09:26:32 +08:00
Vinoth Kannan 0e677daaee FIX: include posts with data-orig-src attribute in have_uploads scope query. 2019-05-16 16:39:38 +05:30
Vinoth Kannan 56ada8374f DEV: wrap find_missing_uploads method in distributed mutex
And skip posts with deleted topics.\ne8fafbc123170dd1f7d2a8adea4e7810585d3e76
2019-05-16 15:17:53 +05:30
Vinoth Kannan 636b75fa16 REFACTOR: remove duplicate reject loop and implicit return
e8fafbc123
2019-05-16 10:04:04 +05:30
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Vinoth Kannan 8c07c272f2 make rubocop happy. 2019-05-09 05:25:44 +05:30
Vinoth Kannan 87cd4701b8 FEATURE: option to skip posts with ignored missing uploads 2019-05-09 05:11:15 +05:30
Guo Xiang Tan 152238b4cf DEV: Prefer `public_send` over `send`. 2019-05-07 09:33:21 +08:00
Robin Ward 31e100530f FEATURE: Flag count in post menu
This change shows a notification number besides the flag icon in the
post menu if there is reviewable content associated with the post.
Additionally, if there is pending stuff to review, the icon has a red
background.

We have also removed the list of links below a post with the flag
status. A reviewer is meant to click the number beside the flag icon to
view the flags. As a consequence of losing those links, we've removed
the ability to undo or ignore flags below a post.
2019-05-06 16:13:31 -04:00
Sam Saffron f8eddd40ad PERF: remove avg_time calculations and regular jobs from posts and topics
After careful analysis of large data-sets it became apparent that avg_time
had no impact whatsoever on "best of" topic scoring. Calculating avg_time
was a very costly operation especially on large databases.

We have some longer term plans of introducing other weighting that is read
time based into our scoring for "best of" and "top" topics, but in the
interim to stop a large amount of work that is not achieving any value we
are removing the jobs.

Column removal will follow once we decide on a new replacement metric.
2019-05-06 15:59:01 +10:00
Vinoth Kannan e8fafbc123 List and restore missing post uploads from S3 inventory. 2019-05-04 01:16:20 +05:30
Rafael dos Santos Silva 1fdeec564b PERF: Move `where` clause up to speed up CalculateAvgTime daily job (#7462)
Cuts down affected posts earlier in the query, so the generated plan
deals with less rows, and runs faster.

https://meta.discourse.org/t/post-calculate-avg-time-taking-up-a-long-time/49750/13?u=falco
2019-04-30 13:34:46 +10:00
Sam Saffron 45285f1477 DEV: remove update_attributes which is deprecated in Rails 6
See: https://github.com/rails/rails/pull/31998

update_attributes is a relic of the past, it should no longer be used.
2019-04-29 17:32:25 +10:00
Dan Ungureanu 57d1dea8a2
FEATURE: Let staff add custom post notices. (#7377) 2019-04-19 17:53:58 +03:00
Vinoth Kannan 8e40c35eb8 FIX: 'have_uploads' scope should include all uploads without multisite 'upload_path' prefix 2019-04-15 01:54:55 +05:30
Guo Xiang Tan d705dd8bb8 Update annotations. 2019-04-11 12:37:24 +08:00
Vinoth Kannan b0600e52b6 FIX: use 'freeze' method again to fix 'cant modify frozen string' error
8d5c900142
2019-04-10 18:30:59 +05:30
Vinoth Kannan 8d5c900142 DEV: add unique missing uploads index in post custom fields
https://review.discourse.org/t/feature-send-missing-post-uploads-stat-to-prometheus/2609/6?u=vinothkannans
2019-04-10 18:09:35 +05:30
Vinoth Kannan d0fe42e2ef FIX: should look through posts for image markdown
Downloaded onebox images only included in the cooked HTML content. So we have to check 'post.cooked' instead of 'raw'. bfdd0fe64c
2019-04-10 13:52:35 +05:30
Guo Xiang Tan c82a929025 PERF: Add `index_for_rebake_old` to `posts`.
The index becomes smaller over time and is much faster.

Follow up to 4791d992dc.
2019-04-09 13:59:46 +08:00
Guo Xiang Tan 28d117898f Update annotations. 2019-04-09 13:27:32 +08:00
Guo Xiang Tan 6525613b89 PERF: Use joins for `Post.for_mailing_list` instead of `NOT IN`.
Joining on the topics table and then filtering the topics against a large set of
ids is so much slower than doing a join on a sub query.
2019-04-06 07:58:55 +08:00
Guo Xiang Tan 8794d940d3 Revert `update_columns` -> `update!` when rebaking/post-processing post.
`update!` goes through validation which means old posts that doesn't
adhere to the existing validations will raise an error.
2019-04-01 16:29:00 +08:00
Guo Xiang Tan cfd507822f
PERF: Improve quality of `PostSearchData#raw_data`. (#7275)
This commit fixes the follow quality issue with `PostSearchData#raw_data`:

1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.

`Post#cooked`:

```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```

2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.

From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.

`Post#cooked`

```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```

In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
2019-04-01 10:14:29 +08:00
Robin Ward b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
Guo Xiang Tan d2a7f29595 Revert "REFACTOR: remove unnecessary parentheses attempt 2 (follow-up on 154f503d)"
This reverts commit 7db2dc717e.

Commit breaks post edits for regular users.
2019-03-14 08:15:40 +08:00
Neil Lalonde 7db2dc717e REFACTOR: remove unnecessary parentheses attempt 2 (follow-up on 154f503d) 2019-03-13 15:33:46 -04:00
Neil Lalonde e9ba4c74f7 Revert "REFACTOR: remove unnecessary parentheses"
Specs fail
2019-03-12 15:00:58 -04:00
Régis Hanol 154f503d2e
REFACTOR: remove unnecessary parentheses 2019-03-12 17:13:21 +01:00
Dan Ungureanu b28b418363
FIX: Various improvements to post notices.
- Notices are visible only by poster and trust level 2+ users.
- Notices are not generated for non-human or staged users.
- Notices are deleted when post is deleted.
2019-03-11 11:19:58 +02:00
Dan Ungureanu 35942f7c7c
FEATURE: Special call-out for new / returning posters. (#7115) 2019-03-08 10:48:35 +02:00
Gerhard Schlager e9ec5238fc DEV: Remove ignored columns 2019-02-08 12:12:38 +01:00
Sam 384135845b FEATURE: introduce ultra_low priority queue
This commit introduces an ultra low priority queue for post rebakes. This
way rebakes can never interfere with regular sidekiq processing for cases
where we perform a large scale rebake.

Additionally it allows Post.rebake_old to be run with rate_limiter: false
to avoid triggering the limiter when rebaking. This is handy for cases
where you want to just force the full rebake and not wait for it to trickle
2019-01-17 14:53:19 +11:00
Bianca Nenciu 7d84648d11 FEATURE: Remove full quotes only from new posts. (#6862) 2019-01-17 13:24:32 +11:00
Robin Ward 95f263995d FIX: Previous annotations were broken 2019-01-11 14:30:19 -05:00
Robin Ward a3839495e0 Update annotations 2019-01-11 12:19:43 -05:00
Sam e08a3f719c FEATURE: push post rebake regular task to low priority queue
This allows us to run regular rebakes without starving the normal queue.

It additionally adds the ability to specify queue with `Jobs.enqueue` so
we can specifically queue a job with lower priority using the `queue` arg.
2019-01-09 08:57:20 +11:00
Sam 70269c7c97 FEATURE: tighter limits on per cluster post rebakes
We have the periodical job that regularly will rebake old posts. This is
used to trickle in update to cooked markdown. The problem is that each rebake
can issue multiple background jobs (post process and pull hotlinked images)

Previously we had no per-cluster limit so cluster running 100s of sites could
flood the sidekiq queue with rebake related jobs.

New system introduces a hard limit of 300 rebakes per 15 minutes across a
cluster to ensure the sidekiq job is not dominated by this.

We also reduced `rebake_old_posts_count` to 80, which is a safer default.
2019-01-04 09:24:46 +11:00
Gerhard Schlager e8053d6e7d FIX: Polls didn't work in imported posts
Imports skip validation of posts, but polls are only created during the validation phase.
2019-01-02 15:26:57 +01:00