Commit Graph

383 Commits

Author SHA1 Message Date
Guo Xiang Tan cfd507822f
PERF: Improve quality of `PostSearchData#raw_data`. (#7275)
This commit fixes the follow quality issue with `PostSearchData#raw_data`:

1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.

`Post#cooked`:

```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```

2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.

From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.

`Post#cooked`

```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```

In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
2019-04-01 10:14:29 +08:00
Robin Ward b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
Guo Xiang Tan d2a7f29595 Revert "REFACTOR: remove unnecessary parentheses attempt 2 (follow-up on 154f503d)"
This reverts commit 7db2dc717e.

Commit breaks post edits for regular users.
2019-03-14 08:15:40 +08:00
Neil Lalonde 7db2dc717e REFACTOR: remove unnecessary parentheses attempt 2 (follow-up on 154f503d) 2019-03-13 15:33:46 -04:00
Neil Lalonde e9ba4c74f7 Revert "REFACTOR: remove unnecessary parentheses"
Specs fail
2019-03-12 15:00:58 -04:00
Régis Hanol 154f503d2e
REFACTOR: remove unnecessary parentheses 2019-03-12 17:13:21 +01:00
Dan Ungureanu b28b418363
FIX: Various improvements to post notices.
- Notices are visible only by poster and trust level 2+ users.
- Notices are not generated for non-human or staged users.
- Notices are deleted when post is deleted.
2019-03-11 11:19:58 +02:00
Dan Ungureanu 35942f7c7c
FEATURE: Special call-out for new / returning posters. (#7115) 2019-03-08 10:48:35 +02:00
Gerhard Schlager e9ec5238fc DEV: Remove ignored columns 2019-02-08 12:12:38 +01:00
Sam 384135845b FEATURE: introduce ultra_low priority queue
This commit introduces an ultra low priority queue for post rebakes. This
way rebakes can never interfere with regular sidekiq processing for cases
where we perform a large scale rebake.

Additionally it allows Post.rebake_old to be run with rate_limiter: false
to avoid triggering the limiter when rebaking. This is handy for cases
where you want to just force the full rebake and not wait for it to trickle
2019-01-17 14:53:19 +11:00
Bianca Nenciu 7d84648d11 FEATURE: Remove full quotes only from new posts. (#6862) 2019-01-17 13:24:32 +11:00
Robin Ward 95f263995d FIX: Previous annotations were broken 2019-01-11 14:30:19 -05:00
Robin Ward a3839495e0 Update annotations 2019-01-11 12:19:43 -05:00
Sam e08a3f719c FEATURE: push post rebake regular task to low priority queue
This allows us to run regular rebakes without starving the normal queue.

It additionally adds the ability to specify queue with `Jobs.enqueue` so
we can specifically queue a job with lower priority using the `queue` arg.
2019-01-09 08:57:20 +11:00
Sam 70269c7c97 FEATURE: tighter limits on per cluster post rebakes
We have the periodical job that regularly will rebake old posts. This is
used to trickle in update to cooked markdown. The problem is that each rebake
can issue multiple background jobs (post process and pull hotlinked images)

Previously we had no per-cluster limit so cluster running 100s of sites could
flood the sidekiq queue with rebake related jobs.

New system introduces a hard limit of 300 rebakes per 15 minutes across a
cluster to ensure the sidekiq job is not dominated by this.

We also reduced `rebake_old_posts_count` to 80, which is a safer default.
2019-01-04 09:24:46 +11:00
Gerhard Schlager e8053d6e7d FIX: Polls didn't work in imported posts
Imports skip validation of posts, but polls are only created during the validation phase.
2019-01-02 15:26:57 +01:00
Vinoth Kannan 2b006c0429 FEATURE: Invalidate broken images cache on Rebuild HTML action 2018-12-26 23:22:07 +05:30
Arpit Jalan bb67ca9d21 FIX: use correct post object when logging exception 2018-12-19 22:17:37 +05:30
Guo Xiang Tan dcf9c6da59 DEV: Don't publish post messages to non-human users. 2018-12-06 08:24:13 +08:00
Régis Hanol aea2d8bbeb FIX: properly secure poll message bus
Co-authored-by: Sam <sam.saffron@gmail.com>
2018-12-05 21:27:49 +01:00
Vinoth Kannan 227a49bb32 FEATURE: automatically hide non-TL4 posts when flagged by a TL4 user 2018-10-11 17:11:46 +05:30
Penar Musaraj 34516c72bd
FIX: Recover public actions (likes) when recovering a post (#6412) 2018-10-02 11:25:08 -04:00
Sam 33541c4096 FEATURE: unconditionally omit no-follow for staff
Previously TL2 and below staff would have links
no-followed which was never intended
2018-09-17 12:02:20 +10:00
Guo Xiang Tan d4b05d7bc5 Always link post to uploads in post process.
The operation is cheap anyway so no point skipping.
2018-09-06 14:08:03 +08:00
Guo Xiang Tan 434035f167 FIX: Link post to uploads in `PostCreator`.
* This ensures that uploads are linked to their post on creation
  instead of a background job which may be delayed if Sidekiq
  is facing difficulties.
2018-09-06 11:18:11 +08:00
Gerhard Schlager 409ee66839 Add optional "ignore_case" parameter to posts:remap rake task 2018-08-23 14:49:17 +02:00
Gerhard Schlager 14af90df5b UX: Stop putting usernames in edit reason when changing post owner 2018-08-20 12:28:04 +02:00
Neil Lalonde fd29ecb91a UX: include a flag reason in the post-deleted-by-staff-because-of-flags message 2018-07-30 16:45:46 -04:00
David Taylor 0d0d78841b
FIX: Remove `plugin.enabled?` checks at initialization time (#6166)
Checking `plugin.enabled?` while initializing plugins causes issues in two ways:
  - An application restart is required for changes to take effect. A load-balanced multi-server environment could behave very weirdly if containers restart at different times.
  - In a multisite environment, it takes the `enabled?` setting from the default site. Changes on that site affect all other sites in the cluster.

Instead, `plugin.enabled?` should be checked at runtime, in the context of a request. This commit removes `plugin.enabled?` from many `instance.rb` methods.

I have added a working `plugin.enabled?` implementation for methods that actually affect security/functionality:
  - `post_custom_fields_whitelist`
  - `whitelist_staff_user_custom_field`
  - `add_permitted_post_create_param`
2018-07-25 16:44:09 +01:00
Guo Xiang Tan 214dac05de Update annotations. 2018-07-16 14:19:07 +08:00
Sam 574d447254 FIX: don't attempt to bump draft sequence if no editor
Rare case on old installs
2018-07-11 17:06:49 +10:00
OsamaSayegh f2cc05c6c6 FIX: ignore self-quotes from the same post when saving (#6082) 2018-07-10 16:17:28 +08:00
Guo Xiang Tan 96aca6d7e6
Remove legacy vote post action code. (#6009) 2018-07-09 16:54:18 +08:00
Patrick Gansterer 28dd7fb562 FEATURE: Create hidden posts for received spam emails (#6010)
* Add possibility to add hidden posts with PostCreator

* FEATURE: Create hidden posts for received spam emails

Spamchecker usually have 3 results: HAM, SPAM and PROBABLY_SPAM
SPAM gets usually directly rejected and needs no further handling.
HAM is good message and usually gets passed unmodified.
PROBABLY_SPAM gets an additional header to allow further processing.
This change addes processing capabilities for such headers and marks
new posts created as hidden when received via email.
2018-07-05 11:07:46 +02:00
Guo Xiang Tan f7d22bad90 FEATURE: Forced summary mode for megalodon topics.
This is mainly done for performance reasons and megalodon
topics are usually a byproduct of imports where site setting
limits are not respected.
2018-06-21 14:00:20 +08:00
Guo Xiang Tan ac80360bea PERF: Help postgres make use of index in `Post.summary`. 2018-06-21 13:29:16 +08:00
Guo Xiang Tan 6ddd214476 FIX: `Post#summary` returning posts from other topics. 2018-06-21 12:00:54 +08:00
Sam cb824a6b33 DEV: remove all calls to SqlBuilder use DB.build instead
This is part of the migration to mini_sql, SqlBuilder.new is being
deprecated and replaced with DB.build
2018-06-20 17:53:49 +10:00
Sam 5f64fd0a21 DEV: remove exec_sql and replace with mini_sql
Introduce new patterns for direct sql that are safe and fast.

MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API

- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder

See more at: https://github.com/discourse/mini_sql
2018-06-19 16:13:36 +10:00
Jeff Wong 68e4e6a575 FIX: staged users are still tl0 but do not trigger spam if 1 week old. 2018-06-18 17:20:04 -07:00
Jeff Wong 9e55767f6a FIX: don't punish a user for being previously staged for spam flags. 2018-06-15 12:25:25 -07:00
Vinoth Kannan a6303073a0 Strip images from cooked for topic excerpt 2018-06-11 14:43:53 +05:30
Sam 89ad2b5900 DEV: Rails 5.2 upgrade and global gem upgrade
This updates tests to use latest rails 5 practice
and updates ALL dependencies that could be updated

Performance testing shows that performance has not regressed
if anything it is marginally faster now.
2018-06-07 14:21:33 +10:00
Sam df815d6c0e DEV: prefer using ordering in relation over default scope 2018-05-29 09:34:12 +10:00
Gerhard Schlager ae6236d090 FIX: Changing owner of deleted reply didn't work 2018-05-16 17:03:09 +02:00
Gerhard Schlager ed4c0c4a63 FEATURE: Add option to delete all replies of flagged post 2018-04-24 11:08:05 -04:00
Neil Lalonde 8fc1289172 move topic excerpt code to one method to DRY it up and for extensibility 2018-04-17 15:08:21 -04:00
Guo Xiang Tan 142571bba0 Remove use of `rescue nil`.
* `rescue nil` is a really bad pattern to use in our code base.
  We should rescue errors that we expect the code to throw and
  not rescue everything because we're unsure of what errors the
  code would throw. This would reduce the amount of pain we face
  when debugging why something isn't working as expexted. I've
  been bitten countless of times by errors being swallowed as a
  result during debugging sessions.
2018-04-02 13:52:51 +08:00
Guo Xiang Tan 2f65393706 REFACTOR: Use `Topic#private_message?` to reduce duplication. 2018-03-05 15:39:22 +08:00
Gerhard Schlager c22e56499a FIX: Allow changing post owner even when validations fail 2018-02-27 15:46:20 +01:00
Guo Xiang Tan 226ace1643 Update annotations. 2018-02-20 14:28:58 +08:00
Robin Ward 5466389f4e FIX: Consider oneboxes links wrt to `min_trust_level_to_post_links` 2018-02-08 18:27:40 -05:00
Robin Ward 6b04967e2f FEATURE: Staff members can lock posts
Locking a post prevents it from being edited. This is useful if the user
has posted something which has been edited out, and the staff members don't
want them to be able to edit it back in again.
2018-01-26 14:01:30 -05:00
Sam 0081de30a5 PERF: conserve memory while rebaking posts 2018-01-05 09:54:42 +11:00
Sam c30ccceade correct params 2017-12-27 13:51:16 +11:00
Sam 0c834515a9 FIX: only attempt old rebakes a maximum of 3 times 2017-12-27 12:44:41 +11:00
Gerhard Schlager 727a45185d FIX: regex should behave the same in Ruby and Postgres 2017-12-21 11:26:56 +01:00
Régis Hanol b91f83eb7d Ignore auto-quote/reply when counting replies 2017-12-15 00:38:14 +01:00
Sam f18dda2adc FEATURE: full rebake of all old posts
This limits to 100 post per 15 minutes, so it will take a while.

This will pick up CommonMark and a large amount of onebox fixes.
2017-12-15 10:28:25 +11:00
Régis Hanol 092c976d7c FIX: prevent 💥 when selecting replies to posts quoting themselves 2017-12-15 00:23:51 +01:00
Régis Hanol 5db3d39b05 FIX: Post.reply_ids should also handle quotes 2017-12-14 00:43:48 +01:00
Régis Hanol 1b4483c942 FEATURE: Added 'select +below' and 'select +all replies' options to selecting posts 2017-12-13 22:12:06 +01:00
Arpit Jalan daeb7694bc update annotations 2017-12-05 21:03:20 +05:30
Robin Ward 77f90876d3 REFACTOR: Track manual locked user levels separately from groups 2017-11-27 11:23:44 -05:00
Régis Hanol 678e28794a FIX: properly handle too large & broken images in posts 2017-11-16 15:45:07 +01:00
Neil Lalonde c7d7cb940c FIX: dashboard posts report was including posts in daily data, but not in totals 2017-11-02 18:46:28 -04:00
Gerhard Schlager 4205c1ad2b FIX: postprocessing ignored cook method 2017-10-20 10:26:45 +02:00
Robin Ward 838568cbc3 Refactor flag types for more customization 2017-10-19 13:55:23 -04:00
Kyle Zhao 0342324b47 FEATURE: support regex in rake post:remap (#5201) 2017-10-04 11:47:53 +11:00
Robin Ward 00b190af75 Revert "A safe way to create class variables in a multisite environment."
The approach taken by this interface was flawed. We need a better
solution.
2017-09-29 11:06:12 -04:00
Guo Xiang Tan 30fa5379ce Merge pull request #5195 from henrik/patch-1
100.years.ago -> Time.at(0)
2017-09-28 15:47:06 +08:00
Robin Ward 3e13becf33 A safe way to create class variables in a multisite environment.
This should allow plugins to set class variables that will not
stomp on other plugins.
2017-09-27 13:00:47 -04:00
Robin Ward d7c37d9369 Add front end service for staff controls 2017-09-25 12:25:14 -04:00
Guo Xiang Tan 23b787e0a6 Require dependency otherwise it causes Sidekiq to lock up in development. 2017-09-25 13:48:59 +08:00
Guo Xiang Tan 77d4c4d8dc Fix all the errors to get our tests green on Rails 5.1. 2017-09-25 13:48:58 +08:00
Henrik Nyh 77cc33b231 100.years.ago -> Time.at(0)
A bit more robust ;)
2017-09-19 08:48:50 +01:00
Régis Hanol 797936d2c5 FIX: don't leak whisper count in user card 2017-09-14 20:08:16 +02:00
Gerhard Schlager f3d3129113 FIX: Use default locale for edit reason when owner of post gets changed 2017-09-14 17:17:37 +02:00
David Taylor 1def49cf6c SECURITY: Only publish PM reply messagebus notifications to allowed users 2017-09-08 14:12:15 -07:00
Erick Guan 1146772deb Fix: unlinked topic search model (#5044) 2017-08-15 11:46:57 -04:00
Régis Hanol 4f09a5a7a5 Add 'Post.permitted_create_params' to allow plugins to add new params when creating a post 2017-08-12 04:10:45 +02:00
Neil Lalonde 15a74d6d3e FIX: don't enforce newuser_spam_host_threshold on private messages 2017-08-10 17:19:08 -04:00
Guo Xiang Tan 5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Sam 52ae63d5d7 FIX: when searching PMs also search group PMs
Users belonging to a group could not search for PMs unless explicitly added
to the PM unless admin
2017-05-11 15:59:03 -04:00
Guo Xiang Tan 982e3d04f6 PERF: Allow memory to be freed instead of fetching all the objects into memory at once.
```
MemoryProfiler.report do
  Jobs::UserEmail.new.execute(type: :mailing_list, user_id: user.id)
end.pretty_print
```

Before:
```
Total allocated: 180096119 bytes (1962025 objects)
Total retained:  2194 bytes (16 objects)

allocated memory by gem
-----------------------------------
  66979096  activerecord-4.2.8
  43507184  nokogiri-1.7.1
  43365188  mail-2.6.4
   5960201  activesupport-4.2.8
   5056267  discourse/lib
   4835284  rack-mini-profiler-0.10.1
   3825817  arel-6.0.4
   2186088  i18n-0.8.1
   1719330  discourse/app
```

After:
```
Total allocated: 161935975 bytes (1473940 objects)
Total retained:  2234 bytes (17 objects)

allocated memory by gem
-----------------------------------
  45430264  activerecord-4.2.8
  43568627  nokogiri-1.7.1
  43430754  mail-2.6.4
  11233878  rack-mini-profiler-0.10.1
   5260825  activesupport-4.2.8
   5054491  discourse/lib
   2186088  i18n-0.8.1
   1822494  arel-6.0.4
```
2017-05-03 17:01:57 +08:00
Robin Ward bf9c4a7828 FEATURE: secure_email site setting to prevent data going out in email 2017-04-26 13:05:56 -04:00
Robin Ward 27a73c73f9 FIX: Error when calculating geometric mean of 0 for read timings 2017-03-27 12:45:34 -04:00
Guo Xiang Tan d449f782a3 Revert "FIX: Don't skip callbacks when rebaking posts."
This reverts commit 06c651f8c9.

If site settings are changed, there is a chance that the post
will fail PostValidator's validations.
2017-02-01 10:52:15 +08:00
Guo Xiang Tan 06c651f8c9 FIX: Don't skip callbacks when rebaking posts. 2017-01-25 17:47:13 +08:00
Guo Xiang Tan 32846aad2a FIX: Toggling post's wiki status should not create a new version. 2017-01-20 15:42:33 +08:00
Neil Lalonde e8307ac24c FIX: mailing list mode digest emails included whispers 2017-01-13 13:46:33 -05:00
Sam 2f6a4cc6de remove UserActionObserver, replace with after_save and service
interestingly there was some left over dead code from when stars
existed in the topic_users table
2016-12-22 16:46:53 +11:00
Sam 0a78ae739d Remove SearchObserver, aim is to remove all observers
rails-observers gem is mostly unmaintained and is a pain to carry forward
new implementation contains significantly less magic as a bonus
2016-12-22 13:13:14 +11:00
Sam c04d4171ff FIX: whisper no longer experimental
- Regular users are not notified of whispers
- Regular users no longer have "stuck" topics in unread
- Additional tracking for staff highest post number
- Remove a bunch of unused columns in topics table
2016-12-02 17:03:31 +11:00
Arpit Jalan 382803cb05 FEATURE: include post image in OpenGraph image tag 2016-10-31 15:11:33 +05:30
Guo Xiang Tan efea296c7a FIX: Do not cook post if `Post#raw` has not been changed. 2016-10-24 12:02:38 +08:00
Arpit Jalan a590f35982 FEATURE: allow changing post owners without creating post revision 2016-08-19 23:34:21 +05:30
Robin Ward 19959c6092 Remove unneccessary `return` 2016-08-15 12:58:16 -04:00
Robin Ward aef954784a FIX: `nofollow` was being added during post processing when it shouldn't 2016-08-12 15:35:13 -04:00
Régis Hanol e55e2aff94 FIX: FirstReplyByEmail badge wasn't granted
DEPRECATED: PostProcess badge trigger
2016-08-10 19:24:01 +02:00