Commit Graph

56 Commits

Author SHA1 Message Date
Krzysztof Kotlarek bd301c3f08
FIX: improve performance of UserStat.ensure_consistency (#21044) (#21121)
Optimize `UserStatpost_read_count` calculation.

In addition, tests were updated to fail when code is not evaluated. Creation of PostTiming was updating `post_read_count`. Count it has to be reset to ensure that ensure_consitency correctly calculates result.

Extracting users seen in the last hour to separate Common Table Expression reduces the amount of processed rows.

Before
```
Update on user_stats  (cost=267492.07..270822.95 rows=2900 width=174) (actual time=12606.121..12606.127 rows=0 loops=1)
  ->  Hash Join  (cost=267492.07..270822.95 rows=2900 width=174) (actual time=12561.814..12603.689 rows=10 loops=1)
        Hash Cond: (user_stats.user_id = x.user_id)
        Join Filter: (x.c <> user_stats.posts_read_count)
        Rows Removed by Join Filter: 67
        ->  Seq Scan on user_stats  (cost=0.00..3125.34 rows=75534 width=134) (actual time=0.014..39.173 rows=75534 loops=1)
        ->  Hash  (cost=267455.80..267455.80 rows=2901 width=48) (actual time=12558.613..12558.617 rows=77 loops=1)
              Buckets: 4096  Batches: 1  Memory Usage: 39kB
              ->  Subquery Scan on x  (cost=267376.03..267455.80 rows=2901 width=48) (actual time=12168.601..12558.572 rows=77 loops=1)
                    ->  GroupAggregate  (cost=267376.03..267426.79 rows=2901 width=12) (actual time=12168.595..12558.525 rows=77 loops=1)
                          Group Key: pt.user_id
                          ->  Sort  (cost=267376.03..267383.28 rows=2901 width=4) (actual time=12100.490..12352.106 rows=2072830 loops=1)
                                Sort Key: pt.user_id
                                Sort Method: external merge  Disk: 28488kB
                                ->  Nested Loop  (cost=1.28..267209.18 rows=2901 width=4) (actual time=0.040..11528.680 rows=2072830 loops=1)
                                      ->  Nested Loop  (cost=0.86..261390.02 rows=13159 width=8) (actual time=0.030..3492.887 rows=3581648 loops=1)
                                            ->  Index Scan using index_users_on_last_seen_at on users u  (cost=0.42..89.71 rows=28 width=4) (actual time=0.010..0.201 rows=78 loops=1)
                                                  Index Cond: (last_seen_at > '2023-04-11 00:22:49.555537'::timestamp without time zone)
                                            ->  Index Scan using index_post_timings_on_user_id on post_timings pt  (cost=0.44..9287.60 rows=4455 width=8) (actual time=0.081..38.542 rows=45919 loops=78)
                                                  Index Cond: (user_id = u.id)
                                      ->  Index Scan using forum_threads_pkey on topics t  (cost=0.42..0.44 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=3581648)
                                            Index Cond: (id = pt.topic_id)
                                            Filter: ((deleted_at IS NULL) AND ((archetype)::text = 'regular'::text))
                                            Rows Removed by Filter: 0
Planning Time: 0.692 ms
Execution Time: 12612.587 ms
```
After
```
Update on user_stats  (cost=9473.60..12804.30 rows=2828 width=174) (actual time=677.724..677.729 rows=0 loops=1)
  ->  Hash Join  (cost=9473.60..12804.30 rows=2828 width=174) (actual time=672.536..677.706 rows=1 loops=1)
        Hash Cond: (user_stats.user_id = x.user_id)
        Join Filter: (x.c <> user_stats.posts_read_count)
        Rows Removed by Join Filter: 54
        ->  Seq Scan on user_stats  (cost=0.00..3125.34 rows=75534 width=134) (actual time=0.012..23.977 rows=75534 loops=1)
        ->  Hash  (cost=9438.24..9438.24 rows=2829 width=48) (actual time=647.818..647.822 rows=55 loops=1)
              Buckets: 4096  Batches: 1  Memory Usage: 37kB
              ->  Subquery Scan on x  (cost=9381.66..9438.24 rows=2829 width=48) (actual time=647.409..647.805 rows=55 loops=1)
                    ->  HashAggregate  (cost=9381.66..9409.95 rows=2829 width=12) (actual time=647.403..647.786 rows=55 loops=1)
                          Group Key: pt.user_id
                          Batches: 1  Memory Usage: 121kB
                          ->  Nested Loop  (cost=1.86..9367.51 rows=2829 width=4) (actual time=0.056..625.245 rows=120022 loops=1)
                                ->  Nested Loop  (cost=1.44..3692.96 rows=12832 width=8) (actual time=0.047..171.754 rows=217440 loops=1)
                                      ->  Nested Loop  (cost=1.00..254.63 rows=25 width=12) (actual time=0.030..1.407 rows=56 loops=1)
                                            Join Filter: (u.id = user_stats_1.user_id)
                                            ->  Nested Loop  (cost=0.71..243.08 rows=25 width=8) (actual time=0.018..1.207 rows=87 loops=1)
                                                  ->  Index Scan using index_users_on_last_seen_at on users u  (cost=0.42..86.71 rows=27 width=4) (actual time=0.009..0.156 rows=87 loops=1)
                                                        Index Cond: (last_seen_at > '2023-04-11 00:47:07.437568'::timestamp without time zone)
                                                  ->  Index Only Scan using user_stats_pkey on user_stats us  (cost=0.29..5.79 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=87)
                                                        Index Cond: (user_id = u.id)
                                                        Heap Fetches: 87
                                            ->  Index Scan using user_stats_pkey on user_stats user_stats_1  (cost=0.29..0.45 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=87)
                                                  Index Cond: (user_id = us.user_id)
                                                  Filter: (posts_read_count < 10000)
                                                  Rows Removed by Filter: 0
                                      ->  Index Scan using index_post_timings_on_user_id on post_timings pt  (cost=0.44..92.98 rows=4455 width=8) (actual time=0.036..2.492 rows=3883 loops=56)
                                            Index Cond: (user_id = user_stats_1.user_id)
                                ->  Index Scan using forum_threads_pkey on topics t  (cost=0.42..0.44 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=217440)
                                      Index Cond: (id = pt.topic_id)
                                      Filter: ((deleted_at IS NULL) AND ((archetype)::text = 'regular'::text))
                                      Rows Removed by Filter: 0
Planning Time: 1.406 ms
Execution Time: 677.817 ms
```
2023-04-18 10:04:50 +10:00
David Taylor 5a003715d3
DEV: Apply syntax_tree formatting to `app/*` 2023-01-09 14:14:59 +00:00
Krzysztof Kotlarek 09932738e5
FEATURE: whispers available for groups (#17170)
Before, whispers were only available for staff members.

Config has been changed to allow to configure privileged groups with access to whispers. Post migration was added to move from the old setting into the new one.

I considered having a boolean column `whisperer` on user model similar to `admin/moderator` for performance reason. Finally, I decided to keep looking for groups as queries are only done for current user and didn't notice any N+1 queries.
2022-06-30 10:18:12 +10:00
Alan Guo Xiang Tan f618fdf17f
Revert "DEV: Centralize user updates to a single MessageBus channel. (#17058)" (#17115)
This reverts commit 94c3bbc2d1.

At this current point in time, we do not have enough data on whether
this centralisation is the trade-offs of coupling features into a single
channel.
2022-06-17 12:24:15 +08:00
Alan Guo Xiang Tan 94c3bbc2d1
DEV: Centralize user updates to a single MessageBus channel. (#17058)
Introduces an interface to publish user updates on the server side and
helps to reduce the growing number of subscriptions on the client side.
2022-06-13 14:27:43 +08:00
Alan Guo Xiang Tan 7da074d591
DEV: Implement "My Posts" section link for experimental sidebar (#17008) 2022-06-07 10:52:54 +08:00
Alan Guo Xiang Tan 6fb89c153a Revert "DEV: Remove stale ignored_columns from models."
This reverts commit 9f5c8644d0.

Have to revert because the ignored columns have not been dropped.
2022-01-11 11:00:58 +08:00
Alan Guo Xiang Tan 9f5c8644d0 DEV: Remove stale ignored_columns from models. 2022-01-11 10:38:10 +08:00
Bianca Nenciu b1c11d5787
FIX: Select correct topic draft for user (#15234)
The old query could return multiple rows.
2021-12-08 15:23:44 +02:00
Bianca Nenciu 049bc33838
FIX: Update has_topic_draft when draft is updated (#15219)
Current user state regarding the new topic draft was not updated when
the draft was created or destroyed.
2021-12-08 14:40:35 +02:00
Loïc Guitaut a5fbb90df4 FEATURE: Display pending posts on user’s page
Currently when a user creates posts that are moderated (for whatever
reason), a popup is displayed saying the post needs approval and the
total number of the user’s pending posts. But then this piece of
information is kind of lost and there is nowhere for the user to know
what are their pending posts or how many there are.

This patch solves this issue by adding a new “Pending” section to the
user’s activity page when there are some pending posts to display. When
there are none, then the “Pending” section isn’t displayed at all.
2021-11-29 10:26:33 +01:00
Alan Guo Xiang Tan 8226ab1099
PERF: Updating first unread PM for user not respecting limits. (#15056)
In b8c8909a9d, we introduced a regression
where users may have had their `UserStat.first_unread_pm_at` set
incorrectly. This commit introduces a migration to reset `UserStat.first_unread_pm_at` back to
`User#created_at`.

Follow-up to b8c8909a9d.
2021-11-23 12:51:54 +08:00
Alan Guo Xiang Tan 4b4973ee0d
PERF: Reduce records queried in `UserStat.update_first_unread_pm`. (#15016)
The inefficiency here is that we were previously fetching all the
records from `TopicAllowedUser` before filtering against a limited subset of
users based on `User#last_seen_at`.
2021-11-19 15:30:39 +11:00
Jean e7b8e75583
FEATURE: Add post edits count to user activity (#13495) 2021-08-02 10:15:53 -04:00
Bianca Nenciu 300db3d3fa
FIX: Update draft count after creating a post (#13884)
When a post is created, the draft sequence is increased and then older
drafts are automatically executing a raw SQL query. This skipped the
Draft model callbacks and did not update user's draft count.

I fixed another problem related to a raw SQL query from Draft.cleanup!
method.
2021-07-29 17:06:11 +03:00
Bianca Nenciu 760c9a5698
FEATURE: Show draft count in user menu and activity (#13812)
This commit adds the number of drafts a user has next to the "Draft"
label in the user preferences menu and activity tab. The count is
updated via MessageBus when a draft is created or destroyed.
2021-07-27 14:05:33 +03:00
Josh Soref 59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Arpit Jalan c6bf70c870
DEV: annotate models (#11047) 2020-10-27 23:42:33 +05:30
David Taylor c0293339b8
PERF: Do not enqueue digest emails when attempted recently (#10849)
Previously, Jobs::EnqueueDigestEmails would enqueue a digest job for every user, even if there are no topics to send. The digest job would exit, no email would send, and last_emailed_at would not change. 30 minutes later, Jobs::EnqueueDigestEmails would run again and re-enqueue jobs for the same users.

120fa8ad introduced a temporary mitigation for this issue, by randomly selecting a subset of those users each time.

This commit adds a new `digest_attempted_at` column to the `user_stats` table. This column is updated every time a digest job completes for a user. Using this, we can avoid scheduling digest jobs for the same user every 30 minutes. This also removes the random user selection in 120fa8ad, and instead prioritizes users who had digests attempted the longest time ago.
2020-10-07 15:30:38 +01:00
Guo Xiang Tan 9b75d95fc6 PERF: Keep track of first unread PM and first unread group PM for user.
This optimization helps to filter away topics so that the joins on
related tables when querying for unread messages is not expensive.
2020-09-09 14:05:41 +08:00
Rafael dos Santos Silva 28669dfeb2
PERF: Faster TL3 promotion replies needed calculation (#10416)
Removing the LIMIT makes PostgreSQL use index_posts_on_user_id_and_created_at
which is much faster overall.

Before: 22 seconds
After: 100 ms
2020-08-12 11:28:34 -03:00
Kane York 869f9b20a2
PERF: Dematerialize topic_reply_count (#9769)
* PERF: Dematerialize topic_reply_count

It's only ever used for trust level promotions that run daily, or compared to 0. We don't need to track it on every post creation.

* UX: Add symbol in TL3 report if topic reply count is capped

* DEV: Drop user_stats.topic_reply_count column
2020-05-14 15:42:00 -07:00
Jarek Radosz 531016f99b
DEV: Add missing indexes to user_profiles (#8691)
* DEV: Update model annotations
* DEV: Add missing indexes to user_profiles

The columns were changed in 24347ace10 (diff-baa5914c0c7cddf3c8b5cd9139e0d091)
2020-01-09 17:08:55 +01:00
David Taylor 9348d2cfdf
PERF: Cache user badge count in user_stats table (#8610)
This means that we no longer need to run a `count()` on the user_badges table when serializing a user
2019-12-30 11:19:59 +00:00
Joffrey JAFFEUX 0d3d2c43a0
DEV: s/\$redis/Discourse\.redis (#8431)
This commit also adds a rubocop rule to prevent global variables.
2019-12-03 10:05:53 +01:00
Vinoth Kannan 2ae87a27fa make rubocop happy 2019-04-08 17:03:26 +05:30
Guo Xiang Tan f0f3deb32b PERF: Simplify query of `UserStat#update_topic_reply_count`.
For a user with alot of posts, we get a 25% speed up.
2019-04-08 17:48:19 +08:00
Guo Xiang Tan a7a5f90e20 Annotate models. 2019-04-05 17:13:12 +08:00
Sam Saffron 5b837f620b correct unread resetting to handle nulls
Note, to avoid race conditions we are setting last_unread to 10 minutes ago
if there is nothing unread.

This is safer in case of in progress transactions
we don't want to lose unread for any window of time.
2019-04-05 13:42:41 +11:00
Sam Saffron 5f896ae8f7 PERF: Keep track of when a users first unread is
This optimisation avoids large scans joining the topics table with the
topic_users table.

Previously when a user carried a lot of read state we would have to join
the entire read state with the topics table. This operation would slow down
home page and every topic page. The more read state you accumulated the
larger the impact.

The optimisation helps people who clean up unread, however if you carry
unread from years ago it will only have minimal impact.
2019-04-05 12:44:45 +11:00
Sam 236c755d62 FIX: do not store key tracking last seen time indefinitely
UserStat has some special logic to keep adding time read if repeat calls
are made in intervals less than 100 seconds. This is called regularly
when we update read timings on a topic.

We only need to cache this key in redis for 100 seconds, however previously
we would keep it forever, 1 key per user. This has potential of bloating
a very large amount of keys for no longer active users in redis.
2018-12-03 08:35:26 +11:00
Guo Xiang Tan 7534042427 DEV: Update annotations. 2018-11-07 11:11:19 +08:00
Arpit Jalan 72be638728 FEATURE: add external details to user fields 2018-09-20 08:10:51 +05:30
Sam 5f64fd0a21 DEV: remove exec_sql and replace with mini_sql
Introduce new patterns for direct sql that are safe and fast.

MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API

- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder

See more at: https://github.com/discourse/mini_sql
2018-06-19 16:13:36 +10:00
Sam b7023da894 PERF: reduce queries required for post timings
- also freezes a bunch of strings
- bypass active record for an exists query
2018-01-17 15:50:41 +11:00
Neil Lalonde b37e40eea9 FEATURE: show read time in last 60 days 2017-11-16 15:46:51 -05:00
Neil Lalonde 24af9b7d97 FIX: when a topic is deleted, update the post count stats of all user who replied 2017-11-02 18:05:23 -04:00
Guo Xiang Tan 5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Sam 0aed2533ac Revert unread optimisation, has too many edge cases 2017-05-26 09:04:13 -04:00
Sam Saffron 7d59ff67b8 adjust qurey to include messages, once everything is read
then mark first_topic_unread_at to be current time
2017-05-25 18:40:32 -04:00
Sam 29fac1ac18 PERF: improve performance of unread queries
Figuring out what unread topics a user has is a very expensive
operation over time.

Users can easily accumulate 10s of thousands of tracking state rows
(1 for every topic they ever visit)

When figuring out what a user has that is unread we need to join
the tracking state records to the topic table. This can very quickly
lead to cases where you need to scan through the entire topic table.

This commit optimises it so we always keep track of the "first" date
a user has unread topics. Then we can easily filter out all earlier
topics from the join.

We use pg functions, instead of nested queries here to assist the
planner.
2017-05-25 15:07:30 -04:00
Régis Hanol cb99f59ec3 reset bounce score when email is successfully changed 2017-02-20 10:37:01 +01:00
Sam 089b1d164c annotate models
(reminder run RAILS_ENV=test bin/annotate once in a while)
2016-05-30 10:45:32 +10:00
Régis Hanol 8e611ec7a1 FEATURE: handle bounced emails 2016-05-02 23:15:32 +02:00
Sam d0ee32f3ce FIX: correct counts on user summary 2016-01-24 16:39:01 +11:00
Sam cd22b6158c PERF: stop mucking with user stats every 15 minutes
(pushed to twice daily)
2014-08-07 14:20:42 +10:00
Sam cb0ecd9ff1 PERF: store topic views in a topic view table
* cut down on storage of the work Topic, 3 times per row (in 2 indexes)
* only store one view per user per topic
* only store one view per ip per topic
2014-08-04 19:07:55 +10:00
Robin Ward 09d7a697bf OPTIMIZATION: Add index to `post_timings` and adjust the query 2014-08-01 13:14:00 -04:00
Sam e907cca62e annotate 2014-07-31 13:14:40 +10:00
Sam 0f9678fe49 FIX: faster update of all badges
Introduced badge triggers, introduced concept of badge that happens due to a post but has the post hidden

Delta badge grant happens once a minute, backed by redis
2014-07-23 11:46:07 +10:00