Commit Graph

909 Commits

Author SHA1 Message Date
Guo Xiang Tan d151425353
PERF: Delete search data of posts from trashed topics periodically. (#7302)
This keeps both the index and table smaller.
2019-04-03 10:10:41 +08:00
Guo Xiang Tan feb731bffd FIX: Regenerate optimized images instead of migrating from old scheme.
`OptimizedImage.migrate_to_new_scheme` was optimizing optimized images
which we don't need to do. Regnerating the optimized image is way easier.
2019-04-03 09:45:02 +08:00
Guo Xiang Tan d8704c11ca PERF: Better use of index when queueing a topci for search reindex.
Also move `Search::INDEX_VERSION` to `SearchIndexer` which is where the
version is actually being used.
2019-04-02 09:53:37 +08:00
Guo Xiang Tan aa2311a7b0 FIX: Don't reindex posts belonging to a deleted topic for search.
Posts belonging to a deleted topic can't be index for search so we need
to avoid loading those post ids.
2019-04-02 07:36:53 +08:00
Guo Xiang Tan 3fc5dbb045 FIX: Don't attempt to reindex posts that have an empty raw.
If the post ids keep loading, we might end up in a situations where
we're always loading the same post ids over and over again without
indexing anything new.

Follow up to daeda80ada.
2019-04-02 07:13:33 +08:00
Robin Ward 76669bb5a6 FIX: Don't refer to pending review items as flags
They could be queued posts or users, and the notice should reflect that
properly.
2019-04-01 14:46:56 -04:00
Guo Xiang Tan 8794d940d3 Revert `update_columns` -> `update!` when rebaking/post-processing post.
`update!` goes through validation which means old posts that doesn't
adhere to the existing validations will raise an error.
2019-04-01 16:29:00 +08:00
Guo Xiang Tan cfd507822f
PERF: Improve quality of `PostSearchData#raw_data`. (#7275)
This commit fixes the follow quality issue with `PostSearchData#raw_data`:

1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.

`Post#cooked`:

```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```

2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.

From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.

`Post#cooked`

```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```

In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
2019-04-01 10:14:29 +08:00
Guo Xiang Tan daeda80ada
FIX: Don't index posts with empty `Post#raw` for search. (#7263)
* DEV: Remove unnecessary join in `Jobs::ReindexSearch`.

* FIX: Don't index posts with empty `Post#raw` for search.
2019-04-01 10:06:27 +08:00
Tarek Khalil b1cb95fc23
FEATURE: Introduce ignore duration selection (#7266)
* FEATURE: Introducing new UI for tracking User's ignored or muted states
2019-03-29 10:14:53 +00:00
Robin Ward b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
David Taylor 95d5819218 FIX: Re-download hotlinked optimized images (#7249)
* FIX: Download local images, even if download remote is disabled
2019-03-27 21:31:12 +01:00
Guo Xiang Tan b58c965aad FIX: Destroy optimized image if attempting to migrate to new scheme fails. 2019-03-26 14:37:34 +08:00
Neil Lalonde 399e937a38 FIX: prevent sending multiple summary emails due to Sidekiq delays 2019-03-22 12:34:34 -04:00
David Taylor a9d5ffbe3d FIX: Prevent critical emails bypassing disable, and improve email test logic
- The test_email job is removed, because it was always being run synchronously (not in sidekiq)
- 34b29f62 added a bypass for critical emails, to match the spec. This removes the bypass, and removes the spec.
- This adapts the specs for 72ffabf6, so that they check for emails being sent
- This reimplements c2797921, allowing test emails to be sent even when emails are disabled
2019-03-22 17:28:43 +08:00
Vinoth Kannan 1e3cb7575d DEV: Update webhook event attributes even when an error raised 2019-03-21 20:45:21 +05:30
Vinoth Kannan 4c6bfb9b39 DEV: Don't destroy webhook in case of error 2019-03-21 18:34:54 +05:30
Tarek Khalil a31a35b334 FEATURE: Ignored user notification behaviour should be as a muted user (#7227) 2019-03-21 12:15:34 +01:00
Tarek Khalil fed2dd9148 FEATURE: Add scheduled job to purge expired ignored users (#7211) 2019-03-20 11:01:43 +01:00
Régis Hanol 31e06dbcd2 FIX: SES webhook wasn't parsing the message 2019-03-19 11:40:19 +01:00
Guo Xiang Tan 4020c87680 DEV: Refactor tests for `Jobs::CleanUpInactiveUsers`.
* Remove use of 0 in favor of `TrustLevel.levels[:newuser]`.
* Consolidate two tests into a single one.
* Test that disabling the feature works.
* Avoid loading full ActiveRecord object in test when we only need to
know the existence of the record.
2019-03-19 09:57:21 +08:00
Bianca Nenciu 2347661a74 FEATURE: Clean up inactive users. (#7172) 2019-03-18 16:25:15 +01:00
Bianca Nenciu 5114ef958a FIX: Do not trigger post alerts for empty posts. (#7138) 2019-03-15 17:58:43 +01:00
Penar Musaraj 9334d2f4f7
FEATURE: add more granular user option levels for email notifications (#7143)
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by

* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
2019-03-15 10:55:11 -04:00
Tarek Khalil bd6d31c9ec
FEATURE: Add `IgnoredUsersSummary` daily job (#7144)
* FEATURE: Add `IgnoredUsersSummary` daily job

## Why?

This is part of the [Ability to ignore a user feature](https://meta.discourse.org/t/ability-to-ignore-a-user/110254/8).

We want to:

1. Send an automatic group PM that goes out to moderators
2. When {x} users have Ignored the same user, threshold defined by a site setting, default of 5
3. Only send this message every X days which is defined by another site setting
2019-03-14 22:51:43 +00:00
Robin Ward fa5a158683 REFACTOR: Move `queue_jobs` out of `SiteSetting`
It is not a setting, and only relevant in specs. The new API is:

```
Jobs.run_later!        # jobs will be thrown on the queue
Jobs.run_immediately!  # jobs will run right away, avoid the queue
```
2019-03-14 10:47:38 -04:00
Guo Xiang Tan b0c8fdd7da FIX: Properly support defaults for upload site settings. 2019-03-13 16:36:57 +08:00
Bianca Nenciu c6ed86220e FIX: Notify on tag change. (#7119) 2019-03-12 18:09:34 +01:00
Guo Xiang Tan 34b29f62db DEV: Remove the use of stubs and mocks in `Jobs::UserEmail` tests.
We can only be sure that an email is sent when we get a mailer in
`ActionMailer::Deliveries`. A couple of tests were actually incorrect
because it didn't flow through our email sender where there are more
conditions in determining whether an email is sent or not.
2019-03-12 09:39:16 +08:00
Dan Ungureanu 32bae48fd3 DEV: Use User#human? User#bot? (#7140) 2019-03-12 07:58:14 +08:00
David Taylor 0a4562253e DEV: Add 'starting' event to sidekiq log when interval logging enabled 2019-03-08 10:56:36 +00:00
David Taylor e2510d79cc DEV: Improve thread-safety of sidekiq logging 2019-03-08 10:31:49 +00:00
David Taylor 9db05a895a DEV: Add job_id to sidkiq log
This makes it easier to correlate 'pending' logs from the same job
2019-03-08 09:16:13 +00:00
Gerhard Schlager 1121514799 UX: Localize date format in "new user of the month" message 2019-03-06 21:58:25 +01:00
David Taylor df474bceee DEV: Further sidekiq logging stability improvements
- Open the log file in "append" mode. This avoids issues if the file does not exist (and matches standard rails log behavior)
- Correctly parse the interval logging environment variable
2019-03-06 12:50:15 +00:00
David Taylor fe62de68dd DEV: Correct sidekiq logging to avoid thread leak 2019-03-06 10:11:31 +00:00
David Taylor 8b30ed5b7a DEV: Serialize the job parameters in sidekiq logs
Otherwise this can lead to some very large data structures when processing the logs later
2019-03-05 17:44:49 +00:00
David Taylor 8963f1af30
FEATURE: Optional detailed performance logging for Sidekiq jobs (#7091)
By default, this does nothing. Two environment variables are available:

- `DISCOURSE_LOG_SIDEKIQ`

  Set to `"1"` to enable logging. This will log all completed jobs to `log/rails/sidekiq.log`, along with various db/redis/network statistics. This is useful to track down poorly performing jobs.

- `DISCOURSE_LOG_SIDEKIQ_INTERVAL`

  (seconds) Check running jobs periodically, and log their current duration. They will appear in the logs with `status:pending`. This is useful to track down jobs which take a long time, then crash sidekiq before completing.
2019-03-05 11:19:11 +00:00
Régis Hanol 326d892f5e Aadd 'secondary_emails' field in users export
FIX: escape_comma wasn't working in CSV exports
FIX: group_names field wasn't properly serialized
2019-02-27 10:12:20 +01:00
Régis Hanol e3a23116d2
FIX: don't update gravatar if the user has no email 2019-02-20 22:34:43 +01:00
Régis Hanol fc14847c14 PERF: only require aws-sdk-sns gem when it's being used 2019-02-14 11:08:21 +01:00
Vinoth Kannan 484bd82278 FIX: Add onceoff job to remove double quotes from s3 etags 2019-02-14 05:19:41 +05:30
Régis Hanol 4d674acc25 FEATURE: AWS SNS bounce notifications webhooks 2019-02-13 21:26:40 +01:00
Vinoth Kannan b4f713ca52
FEATURE: Use amazon s3 inventory to manage upload stats (#6867) 2019-02-01 10:10:48 +05:30
Robin Ward 78ddc82952 FIX: Respect min_flags_staff_visibility for new flags too
There was a situation where if:

* There were new flags to review that met the visibility threshold

AND

* There were old flags that *didn't* meet the threshold

THEN

a pending flags notification would be sent out. This fixes that case.
Staff should not be notified of flags if they do not meet the threshold
and are old.
2019-01-25 11:27:43 -05:00
Robin Ward 96b2585a91 REFACTOR: Remove unncessary stubs from pending flags reminder
They seem to be calculated fine by the application, and stubbing
makes the tests more brittle and prone to regression.
2019-01-24 13:45:58 -05:00
Robin Ward f32de88dfc FIX: Don't notify of pending flags if min_flags_staff_visibility not met 2019-01-22 11:01:18 -05:00
Guo Xiang Tan 95e3841974 FIX: Remove old reference and use `MiniScheduler::Stat`. 2019-01-18 16:36:11 +08:00
Sam 384135845b FEATURE: introduce ultra_low priority queue
This commit introduces an ultra low priority queue for post rebakes. This
way rebakes can never interfere with regular sidekiq processing for cases
where we perform a large scale rebake.

Additionally it allows Post.rebake_old to be run with rate_limiter: false
to avoid triggering the limiter when rebaking. This is handy for cases
where you want to just force the full rebake and not wait for it to trickle
2019-01-17 14:53:19 +11:00
Bianca Nenciu 7d84648d11 FEATURE: Remove full quotes only from new posts. (#6862) 2019-01-17 13:24:32 +11:00