Commit Graph

364 Commits

Author SHA1 Message Date
Guo Xiang Tan faea594436 DEV: Extract common regexps for multisite. 2019-07-29 19:01:36 +08:00
Guo Xiang Tan 8a64b0c8e8 Revert "DEV: Remove unused kwarg and properly check for local missing uploads."
This reverts commit 97769f3d02.

The code is confusing but this change is quite risky. Defer for now
until we can look at it properly.
2019-07-29 14:35:34 +08:00
Guo Xiang Tan 97769f3d02 DEV: Remove unused kwarg and properly check for local missing uploads. 2019-07-29 14:21:06 +08:00
Vinoth Kannan ad04ce9f43 FIX: remove post upload record creation inside 'find_missing_uploads' method. 2019-07-19 01:44:08 +05:30
Vinoth Kannan b1ca64487a FIX: multisite upload urls must have either db name or the word 'short-url'. 2019-06-25 01:19:58 +05:30
Guo Xiang Tan 73a45048a0 FIX: `Upload#short_url` generates incorrect URL when extension is `nil`. 2019-06-19 09:10:50 +08:00
Vinoth Kannan 788f995f30 FIX: skip external urls which has upload url in query string.
Add spec tests for post.each_upload_url method. e8fafbc123
2019-06-11 19:55:02 +05:30
Bianca Nenciu 227c45107d FEATURE: Implement Onebox for posts including polls. (#7539) 2019-05-29 17:05:52 +02:00
Guo Xiang Tan f0620e7118 FEATURE: Support `[description|attachment](upload://<short-sha>)` in MD take 2.
Previous attempt was missing `post_uploads` records.
2019-05-29 09:26:32 +08:00
Vinoth Kannan 0e677daaee FIX: include posts with data-orig-src attribute in have_uploads scope query. 2019-05-16 16:39:38 +05:30
Vinoth Kannan 56ada8374f DEV: wrap find_missing_uploads method in distributed mutex
And skip posts with deleted topics.\ne8fafbc123170dd1f7d2a8adea4e7810585d3e76
2019-05-16 15:17:53 +05:30
Vinoth Kannan 636b75fa16 REFACTOR: remove duplicate reject loop and implicit return
e8fafbc123
2019-05-16 10:04:04 +05:30
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Vinoth Kannan 8c07c272f2 make rubocop happy. 2019-05-09 05:25:44 +05:30
Vinoth Kannan 87cd4701b8 FEATURE: option to skip posts with ignored missing uploads 2019-05-09 05:11:15 +05:30
Guo Xiang Tan 152238b4cf DEV: Prefer `public_send` over `send`. 2019-05-07 09:33:21 +08:00
Robin Ward 31e100530f FEATURE: Flag count in post menu
This change shows a notification number besides the flag icon in the
post menu if there is reviewable content associated with the post.
Additionally, if there is pending stuff to review, the icon has a red
background.

We have also removed the list of links below a post with the flag
status. A reviewer is meant to click the number beside the flag icon to
view the flags. As a consequence of losing those links, we've removed
the ability to undo or ignore flags below a post.
2019-05-06 16:13:31 -04:00
Sam Saffron f8eddd40ad PERF: remove avg_time calculations and regular jobs from posts and topics
After careful analysis of large data-sets it became apparent that avg_time
had no impact whatsoever on "best of" topic scoring. Calculating avg_time
was a very costly operation especially on large databases.

We have some longer term plans of introducing other weighting that is read
time based into our scoring for "best of" and "top" topics, but in the
interim to stop a large amount of work that is not achieving any value we
are removing the jobs.

Column removal will follow once we decide on a new replacement metric.
2019-05-06 15:59:01 +10:00
Vinoth Kannan e8fafbc123 List and restore missing post uploads from S3 inventory. 2019-05-04 01:16:20 +05:30
Rafael dos Santos Silva 1fdeec564b PERF: Move `where` clause up to speed up CalculateAvgTime daily job (#7462)
Cuts down affected posts earlier in the query, so the generated plan
deals with less rows, and runs faster.

https://meta.discourse.org/t/post-calculate-avg-time-taking-up-a-long-time/49750/13?u=falco
2019-04-30 13:34:46 +10:00
Sam Saffron 45285f1477 DEV: remove update_attributes which is deprecated in Rails 6
See: https://github.com/rails/rails/pull/31998

update_attributes is a relic of the past, it should no longer be used.
2019-04-29 17:32:25 +10:00
Dan Ungureanu 57d1dea8a2
FEATURE: Let staff add custom post notices. (#7377) 2019-04-19 17:53:58 +03:00
Vinoth Kannan 8e40c35eb8 FIX: 'have_uploads' scope should include all uploads without multisite 'upload_path' prefix 2019-04-15 01:54:55 +05:30
Guo Xiang Tan d705dd8bb8 Update annotations. 2019-04-11 12:37:24 +08:00
Vinoth Kannan b0600e52b6 FIX: use 'freeze' method again to fix 'cant modify frozen string' error
8d5c900142
2019-04-10 18:30:59 +05:30
Vinoth Kannan 8d5c900142 DEV: add unique missing uploads index in post custom fields
https://review.discourse.org/t/feature-send-missing-post-uploads-stat-to-prometheus/2609/6?u=vinothkannans
2019-04-10 18:09:35 +05:30
Vinoth Kannan d0fe42e2ef FIX: should look through posts for image markdown
Downloaded onebox images only included in the cooked HTML content. So we have to check 'post.cooked' instead of 'raw'. bfdd0fe64c
2019-04-10 13:52:35 +05:30
Guo Xiang Tan c82a929025 PERF: Add `index_for_rebake_old` to `posts`.
The index becomes smaller over time and is much faster.

Follow up to 4791d992dc.
2019-04-09 13:59:46 +08:00
Guo Xiang Tan 28d117898f Update annotations. 2019-04-09 13:27:32 +08:00
Guo Xiang Tan 6525613b89 PERF: Use joins for `Post.for_mailing_list` instead of `NOT IN`.
Joining on the topics table and then filtering the topics against a large set of
ids is so much slower than doing a join on a sub query.
2019-04-06 07:58:55 +08:00
Guo Xiang Tan 8794d940d3 Revert `update_columns` -> `update!` when rebaking/post-processing post.
`update!` goes through validation which means old posts that doesn't
adhere to the existing validations will raise an error.
2019-04-01 16:29:00 +08:00
Guo Xiang Tan cfd507822f
PERF: Improve quality of `PostSearchData#raw_data`. (#7275)
This commit fixes the follow quality issue with `PostSearchData#raw_data`:

1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.

`Post#cooked`:

```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```

2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.

From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.

`Post#cooked`

```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```

In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
2019-04-01 10:14:29 +08:00
Robin Ward b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
Guo Xiang Tan d2a7f29595 Revert "REFACTOR: remove unnecessary parentheses attempt 2 (follow-up on 154f503d)"
This reverts commit 7db2dc717e.

Commit breaks post edits for regular users.
2019-03-14 08:15:40 +08:00
Neil Lalonde 7db2dc717e REFACTOR: remove unnecessary parentheses attempt 2 (follow-up on 154f503d) 2019-03-13 15:33:46 -04:00
Neil Lalonde e9ba4c74f7 Revert "REFACTOR: remove unnecessary parentheses"
Specs fail
2019-03-12 15:00:58 -04:00
Régis Hanol 154f503d2e
REFACTOR: remove unnecessary parentheses 2019-03-12 17:13:21 +01:00
Dan Ungureanu b28b418363
FIX: Various improvements to post notices.
- Notices are visible only by poster and trust level 2+ users.
- Notices are not generated for non-human or staged users.
- Notices are deleted when post is deleted.
2019-03-11 11:19:58 +02:00
Dan Ungureanu 35942f7c7c
FEATURE: Special call-out for new / returning posters. (#7115) 2019-03-08 10:48:35 +02:00
Gerhard Schlager e9ec5238fc DEV: Remove ignored columns 2019-02-08 12:12:38 +01:00
Sam 384135845b FEATURE: introduce ultra_low priority queue
This commit introduces an ultra low priority queue for post rebakes. This
way rebakes can never interfere with regular sidekiq processing for cases
where we perform a large scale rebake.

Additionally it allows Post.rebake_old to be run with rate_limiter: false
to avoid triggering the limiter when rebaking. This is handy for cases
where you want to just force the full rebake and not wait for it to trickle
2019-01-17 14:53:19 +11:00
Bianca Nenciu 7d84648d11 FEATURE: Remove full quotes only from new posts. (#6862) 2019-01-17 13:24:32 +11:00
Robin Ward 95f263995d FIX: Previous annotations were broken 2019-01-11 14:30:19 -05:00
Robin Ward a3839495e0 Update annotations 2019-01-11 12:19:43 -05:00
Sam e08a3f719c FEATURE: push post rebake regular task to low priority queue
This allows us to run regular rebakes without starving the normal queue.

It additionally adds the ability to specify queue with `Jobs.enqueue` so
we can specifically queue a job with lower priority using the `queue` arg.
2019-01-09 08:57:20 +11:00
Sam 70269c7c97 FEATURE: tighter limits on per cluster post rebakes
We have the periodical job that regularly will rebake old posts. This is
used to trickle in update to cooked markdown. The problem is that each rebake
can issue multiple background jobs (post process and pull hotlinked images)

Previously we had no per-cluster limit so cluster running 100s of sites could
flood the sidekiq queue with rebake related jobs.

New system introduces a hard limit of 300 rebakes per 15 minutes across a
cluster to ensure the sidekiq job is not dominated by this.

We also reduced `rebake_old_posts_count` to 80, which is a safer default.
2019-01-04 09:24:46 +11:00
Gerhard Schlager e8053d6e7d FIX: Polls didn't work in imported posts
Imports skip validation of posts, but polls are only created during the validation phase.
2019-01-02 15:26:57 +01:00
Vinoth Kannan 2b006c0429 FEATURE: Invalidate broken images cache on Rebuild HTML action 2018-12-26 23:22:07 +05:30
Arpit Jalan bb67ca9d21 FIX: use correct post object when logging exception 2018-12-19 22:17:37 +05:30
Guo Xiang Tan dcf9c6da59 DEV: Don't publish post messages to non-human users. 2018-12-06 08:24:13 +08:00