discourse

Commit Graph

Author	SHA1	Message	Date
Alan Guo Xiang Tan	9812407f76	FIX: Redo Sidekiq monitoring to restart stuck sidekiq processes (#30198 ) This commit reimplements how we monitor Sidekiq processes that are forked from the Unicorn master process. Prior to this change, we rely on `Jobs::Heartbeat` to enqueue a `Jobs::RunHeartbeat` job every 3 minutes. The `Jobs::RunHeartbeat` job then sets a Redis key with a timestamp. In the Unicorn master process, we then fetch the timestamp that has been set by the job from Redis every 30 minutes. If the timestamp has not been updated for more than 30 minutes, we restart the Sidekiq process. The fundamental flaw with this approach is that it fails to consider deployments with multiple hosts and multiple Sidekiq processes. A Sidekiq process on a host may be in a bad state but the heartbeat check will not restart the process because the `Jobs::RunHeartbeat` job is still being executed by the working Sidekiq processes on other hosts. In order to properly ensure that stuck Sidekiq processes are restarted, we now rely on the [Sidekiq::ProcessSet](https://github.com/sidekiq/sidekiq/wiki/API#processes) API that is supported by Sidekiq. The API provides us with "near real-time (updated every 5 sec) info about the current set of Sidekiq processes running". The API provides useful information like the hostname, pid and also when Sidekiq last did its own heartbeat check. With that information, we can easily determine if a Sidekiq process needs to be restarted from the Unicorn master process.	2024-12-18 12:48:50 +08:00
Kelv	04ba5baec0	DEV: ensure rebaking works even when some users have inconsistent data (#30261 ) * DEV: add db consistency check for UserEmail * DEV: add db consistency check for UserAvatar * DEV: ignore inconsistent data related to user avatars when deciding whether to rebake old posts Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com> --------- Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>	2024-12-16 19:48:25 +08:00
Alan Guo Xiang Tan	f35128c6ed	DEV: Fix broken sidekiq logging due to `eeb01ea0de` (#30199 )	2024-12-10 17:01:25 +08:00
Alan Guo Xiang Tan	eeb01ea0de	DEV: Remove unnecessary thread in `Jobs::Base::JobInstrumenter` take 2 (#30195 ) This reverts commit `766ff723f8`. Ensure that we create the sidekiq log file first before opening it for logging. This avoids any issue of the log file not being present when we initialize an instance of the `Logger`.	2024-12-10 12:44:56 +08:00
Alan Guo Xiang Tan	766ff723f8	Revert "DEV: Remove unnecessary thread in `Jobs::Base::JobInstrumenter` (#30179 )" (#30193 ) This reverts commit `1670ffe82d`.	2024-12-10 09:24:40 +08:00
Alan Guo Xiang Tan	1670ffe82d	DEV: Remove unnecessary thread in `Jobs::Base::JobInstrumenter` (#30179 ) In `Jobs::Base::JobInstrumenter.raw_log`, we were creating an instance of `Queue` and then pushing messages to the queue before popping it off the queue in a thread. However, this complexity is not necessary when we can just write directly to the logger without much overhead. This is how all logging is done in other parts of the app as well.	2024-12-10 06:29:46 +08:00
Bianca Nenciu	5e734516db	DEV: Drop DISCOURSE_LIVE_SLOTS_SIDEKIQ_LIMIT (#29920 ) This was used to track jobs that may leak memory, but proved to be too noisy and not very useful.	2024-11-26 07:21:14 +11:00
Juan David Martínez Cubillos	08440b0035	DEV: Add tl3_custom_promotions plugin modifier to tl3_promotions.rb (#29834 ) * DEV: Add tl3_custom_promotions plugin modifier to tl3_promotions.rb * added tests * added tests for demotions * changed argument order in test	2024-11-22 15:28:43 -05:00
Bianca Nenciu	250a145361	DEV: Fix undefined variable (#29876 ) Follow up to commit `429cf656e7`.	2024-11-21 20:23:20 +02:00
Bianca Nenciu	429cf656e7	FIX: Use FinalDestination::HTTP to push notifications (#29858 ) Sometimes `Jobs::PushNotification` gets stuck, probably because of the network call. This commit replaces `Excon` with `FinalDestination::HTTP` which is safer.	2024-11-21 14:11:51 +11:00
Angus McLeod	ec7de0fd68	Require permitted scopes when registering a client (#29718 )	2024-11-19 15:28:04 -05:00
Keegan George	eb2992a628	REVERT: Check for features sooner (#29746 )	2024-11-13 11:30:34 -08:00
Keegan George	8dc474952b	DEV: Check for features sooner (#29745 ) This PR updates the Job for checking new features from the features feed from every day to every hour. This allows for easier rollout of features.	2024-11-13 10:08:13 -08:00
Alan Guo Xiang Tan	322a3be2db	DEV: Remove logical OR assignment of constants (#29201 ) Constants should always be only assigned once. The logical OR assignment of a constant is a relic of the past before we used zeitwerk for autoloading and had bugs where a file could be loaded twice resulting in constant redefinition warnings.	2024-10-16 10:09:07 +08:00
Yuvaraj J	65a1e149ad	FIX: Notify mailing list subscribers on category change (#28811 ) cf. https://meta.discourse.org/t/email-notifications-dont-get-sent-on-category-change-for-mailing-list-mode-users/308096	2024-10-11 14:47:39 +02:00
Alan Guo Xiang Tan	c1f25cdf5b	FIX: Unicorn master and Sidekiq reopening logs at the same time (#29137 ) In our production environment, we have been seeing Sidekiq processes getting stuck randomly when a USR1 signal is sent to the Unicorn master process. We have not been able to identify the root cause of why the Sidekiq process gets stuck. We however noticed that when the Unicorn master process receives a USR1 signal, it will reopen the logs for the Unicorn master process first before sending a USR1 signal for the Unicorn worker processes to reopen the logs. We figured that we should do the same for the Sidekiq process as well when a USR1 signal is received. In this commit, we introduce an arbitrary delay of 1 second before the Sidekiq process reopens its log files so as to allow enough time for the Unicorn master to finish reopening its logs first. We also do not send reopen logs for the Sidekiq process if the `DISCOURSE_LOG_SIDEKIQ` env is not present because there is no need to reopen any logs.	2024-10-10 08:01:40 +08:00
Ted Johansson	e60876ce49	FIX: Appropriately handle uninstalled problem checks (#28771 ) When running checks, we look to the existing problem check trackers and try to grab their ProblemCheck classes. In some cases this is no longer in the problem check repository, e.g. when the check was part of a plugin that has been uninstalled. In the case where the check was scheduled, this would lead to an error in one of the jobs	2024-09-18 10:11:52 +08:00
Ted Johansson	776b4ec8e2	DEV: Remove old problem check system - Part 1 (#28772 ) We're now using the new, database-backed problem check system. This PR removes parts of the old, Redis-backed system that is now defunct.	2024-09-06 17:00:25 +08:00
Loïc Guitaut	e94707acdf	DEV: Drop `WithServiceHelper` This patch removes the `with_service` helper from the code base. Instead, we can pass a block with actions directly to the `.call` method of a service. This simplifies how to use services: - use `.call` without a block to run the service and get its result object. - use `.call` with a block of actions to run the service and execute arbitrary code depending on the service outcome. It also means a service is now “self-contained” and can be used anywhere without having to include a helper or whatever.	2024-09-05 09:58:20 +02:00
Osama Sayegh	280adda09c	FEATURE: Support designating multiple groups as mods on category (#28655 ) Currently, categories support designating only 1 group as a moderation group on the category. This commit removes the one group limitation and makes it possible to designate multiple groups as mods on a category. Internal topic: t/124648.	2024-09-04 04:38:46 +03:00
Renato Atilio	54d6e52607	FIX: chat mailer log noise (#28616 ) Fixes the log noise caused by a deprecation notice	2024-08-29 11:39:08 -03:00
Bianca Nenciu	95b09dd777	DEV: Log live slots of Sidekiq jobs (#28600 ) Introduce a new log line for Sidekiq jobs that consume more than `DISCOURSE_LIVE_SLOTS_SIDEKIQ_LIMIT` live slots. This is useful to track down jobs that may leak memory. This is enabled only when Sidekiq's job instrumenter is enabled (set `DISCOURSE_LOG_SIDEKIQ` to `1`).	2024-08-29 12:23:27 +03:00
Loïc Guitaut	0636855706	DEV: Allow using an AR relation as a model in services This patch allows using an AR relation as a model in services without fetching associated records. It will just check if the relation is empty or not. In the former case, the execution will stop at that point, as expected.	2024-08-20 16:32:46 +02:00
Martin Brennan	c120c446da	DEV: Cleanup empty method in job (#28395 ) Followup `624dc87321`	2024-08-16 14:10:46 +08:00
Krzysztof Kotlarek	e82e255531	FIX: serialize Flags instead of PostActionType (#28362 ) ### Why? Before, all flags were static. Therefore, they were stored in class variables and serialized by SiteSerializer. Recently, we added an option for admins to add their own flags or disable existing flags. Therefore, the class variable had to be dropped because it was unsafe for a multisite environment. However, it started causing performance problems. ### Solution When a new Flag system is used, instead of using PostActionType, we can serialize Flags and use fragment cache for performance reasons. At the same time, we are still supporting deprecated `replace_flags` API call. When it is used, we fall back to the old solution and the admin cannot add custom flags. In a couple of months, we will be able to drop that API function and clean that code properly. However, because it may still be used, redis cache was introduced to improve performance. To test backward compatibility you can add this code to any plugin ```ruby replace_flags do |flag_settings| flag_settings.add( 4, :inappropriate, topic_type: true, notify_type: true, auto_action_type: true, ) flag_settings.add(1001, :trolling, topic_type: true, notify_type: true, auto_action_type: true) end ```	2024-08-14 12:13:46 +10:00
Krzysztof Kotlarek	559c9dfe0a	REVERT: FIX: serialize Flags instead of PostActionType (#28334 )	2024-08-13 18:32:11 +10:00
David Battersby	0954ae70a6	FEATURE: add delay to native push notifications (#28314 ) This change ensures native push notifications respect the site setting for push_notification_time_window_mins. Previously only web push notifications would account for the delay, now we can bring more consistency between Discourse in browser vs Hub, by applying the same delay strategy to both forms of push notifications.	2024-08-13 12:12:05 +04:00
Krzysztof Kotlarek	094052c1ff	FIX: serialize Flags instead of PostActionType (#28259 ) ### Why? Before, all flags were static. Therefore, they were stored in class variables and serialized by SiteSerializer. Recently, we added an option for admins to add their own flags or disable existing flags. Therefore, the class variable had to be dropped because it was unsafe for a multisite environment. However, it started causing performance problems. ### Solution When a new Flag system is used, instead of using PostActionType, we can serialize Flags and use fragment cache for performance reasons. At the same time, we are still supporting deprecated `replace_flags` API call. When it is used, we fall back to the old solution and the admin cannot add custom flags. In a couple of months, we will be able to drop that API function and clean that code properly. However, because it may still be used, redis cache was introduced to improve performance. To test backward compatibility you can add this code to any plugin ```ruby replace_flags do |flag_settings| flag_settings.add( 4, :inappropriate, topic_type: true, notify_type: true, auto_action_type: true, ) flag_settings.add(1001, :trolling, topic_type: true, notify_type: true, auto_action_type: true) end ```	2024-08-13 11:22:37 +10:00
Alan Guo Xiang Tan	1a09d6b246	FEATURE: Add `live_slots_(start|finish)` for Sidekiq perf logging (#28260 ) This information is helpful in debugging memory spikes when Sidekiq processes jobs.	2024-08-07 15:48:24 +08:00
Natalie Tay	624dc87321	DEV: Avoid initializing max_image_size_kb in initializer (#28209 ) Since this might initialize to the default db's values rather than that of the site.	2024-08-02 23:15:14 +08:00
David Battersby	6ec8728ebf	DEV: refactor live notifications setting in user preferences (#28145 ) This change is mainly a refactor of the desktop notifications service to improve readability and have standardised values for tracking state for current user in regards to the Notification API and Push API. Also improves readability when handling push notification jobs, especially in scenarios where the push_notification_time_window_mins site setting is set to 0, which will allow sending push notifications instantly.	2024-08-02 17:25:15 +04:00
锦心	319075e4dd	FIX: Ensure JsLocaleHelper to not output deprecated translations (#28037 ) * FIX: Ensure JsLocaleHelper only outputs up-to-date translations The old implementation forgot to filter out deprecated translations, causing these translations to incorrectly override the new locale in the frontend. This commit fills in the forgotten where clause, filtering only the up-to-date part. Related meta topic: https://meta.discourse.org/t/outdated-translation-replacement-causing-missing-translation/314352	2024-07-29 15:21:25 +08:00
Alan Guo Xiang Tan	5a37fa3760	FIX: Fix `Jobs::Onceoff.enqueue_all` undefined method for nilClass error (#28073 ) In development, classes are lazy loaded so `Jobs::Onceoff.onceoff_job_klasses` may not have been set. This is not a problem in production cause stuff is eager loaded. Follow-up to `f4d06f195d`	2024-07-25 15:52:42 +08:00
Alan Guo Xiang Tan	f4d06f195d	PERF: Avoid using `ObjectSpace.each_object` in `Jobs::Onceoff.enqueue_all` (#28072 ) We are investigating a memory leak in Sidekiq and saw the following line when comparing heap dumps over time. `Allocated IMEMO 14775 objects of size 591000/7389528 (in bytes) at: /var/www/discourse/app/jobs/onceoff/onceoff.rb:36` That line in question was doing a `.select { |klass| klass < self }` on `ObjectSpace.each_object(Class)`. This for some reason is allocating a whole bunch of `IMEMO` objects which are instruction sequence objects. Instead of diving deeper into why this might be leaking, we can just save our time by switching to an implementation that is more efficient and does not require looping through a ton of objects.	2024-07-25 13:30:56 +08:00
Guhyoun Nam	784c04ea81	FEATURE: Add Mechanism to redeliver all failed webhook events (#27609 ) Background: In order to redrive failed webhook events, an operator has to go through and click on each. This PR is adding a mechanism to retry all failed events to help resolve issues quickly once the underlying failure has been resolved. What is the change?: Previously, we had to redeliver each webhook event. This merge is adding a 'Redeliver Failed' button next to the webhook event filter to redeliver all failed events. If there are no failed webhook events to redeliver, 'Redeliver Failed' gets disabled. If you click it, a window pops up asking the operator to confirm. Failed webhook events will be added to the queue and webhook event list will show the redelivering progress. Every minute, a job will be run to go through 20 events to redeliver. Every hour, a job will clean up the redelivering events which have been stored more than 8 hours.	2024-07-08 15:43:16 -05:00
Keegan George	ea58140032	DEV: Remove summarization code (#27373 )	2024-07-02 08:51:47 -07:00
Sam	61610a61fa	FIX: disallow concurrent downloads of hotlinked images (#27676 )	2024-07-02 10:06:46 +01:00
Jan Cernik	6599b85a75	DEV: Block accidental serialization of entire AR models (#27668 )	2024-07-01 17:08:48 -03:00
Martin Brennan	ffc99253fa	DEV: Resolve TODO comments for martin-brennan I am changing many of these to notes or resolving them as is, most of these I have not actively worked on in years so someone else can work on them when we get to these areas again.	2024-07-01 15:32:30 +10:00
Gabriel Grubba	f3a89620a1	FEATURE: Add WebHookEventsDailyAggregate (#27542 ) * FEATURE: Add WebHookEventsDailyAggregate Add WebHookEventsDailyAggregate model to store daily aggregates of web hook events. Add AggregateWebHooksEvents job to aggregate web hook events daily. Add spec for WebHookEventsDailyAggregate model. * DEV: Update annotations for web_hook_events_daily_aggregate.rb * DEV: Update app/jobs/scheduled/aggregate_web_hooks_events.rb Co-authored-by: Martin Brennan <martin@discourse.org> * DEV: Address review feedback Solves: - https://github.com/discourse/discourse/pull/27542#discussion_r1646961101 - https://github.com/discourse/discourse/pull/27542#discussion_r1646958890 - https://github.com/discourse/discourse/pull/27542#discussion_r1646976808 - https://github.com/discourse/discourse/pull/27542#discussion_r1646979846 - https://github.com/discourse/discourse/pull/27542#discussion_r1646981036 * A11Y: Add translation to retain_web_hook_events_aggregate_days key * FEATURE: Purge old web hook events daily aggregate Solves: https://github.com/discourse/discourse/pull/27542#discussion_r1646961101 * DEV: Update tests for web_hook_events_daily_aggregate Update WebHookEventsDailyAggregate to not use save! at the end Solves: https://github.com/discourse/discourse/pull/27542#discussion_r1646984601 * PERF: Change job query to use WebHook table instead of WebHookEvent table * DEV: Update tests to use `fab!` * DEV: Address code review feedback. Add idempotency to job Add has_many to WebHook * DEV: add test case for job and change job query * DEV: Change AggregateWebHooksEvents job test name --------- Co-authored-by: Martin Brennan <martin@discourse.org>	2024-06-25 13:56:47 -03:00
Alan Guo Xiang Tan	adc824a9bc	FIX: `Jobs::EnsureS3UploadsExistence` broken for multisite (#27401 ) This is a follow-up to `8cf4ed5f88`.	2024-06-10 16:26:39 +08:00
Alan Guo Xiang Tan	8cf4ed5f88	DEV: Introduce hidden `s3_inventory_bucket` site setting (#27304 ) This commit introduces a hidden `s3_inventory_bucket` site setting which replaces the `enable_s3_inventory` and `s3_configure_inventory_policy` site settings. The reason `enable_s3_inventory` and `s3_configure_inventory_policy` site settings are removed is because this feature has technically been broken since it was introduced. When the `enable_s3_inventory` feature is turned on, the app will configure a daily inventory policy for the `s3_upload_bucket` bucket and store the inventories under a prefix in the bucket. The problem here is that once the inventories are created, there is nothing cleaning up all these inventories so whoever has enabled this feature would have been paying the cost of storing a whole bunch of inventory files which are never used. Given that we have not received any complaints about inventory files inflating S3 storage costs, we think that it is very likely that this feature is no longer being used and we are looking to drop support for this feature in the not too distant future. For now, we will still support a hidden `s3_inventory_bucket` site setting which site administrators can configure via the `DISCOURSE_S3_INVENTORY_BUCKET` env.	2024-06-10 13:16:00 +08:00
Loïc Guitaut	2a28cda15c	DEV: Update to latest rubocop-discourse	2024-05-27 18:06:14 +02:00
Mark VanLandingham	971b66e440	DEV: Move webhook event header modifier for redelivery-recalculation (#27177 )	2024-05-24 10:37:10 -05:00
Alan Guo Xiang Tan	df16ab0758	FIX: `S3Inventory` to ignore files older than last backup restore date (#27166 ) This commit updates `S3Inventory#files` to ignore S3 inventory files which have a `last_modified` timestamp which are not at least 2 days older than `BackupMetadata.last_restore_date` timestamp. This check was previously only in `Jobs::EnsureS3UploadsExistence` but `S3Inventory` can also be used via Rake tasks so this protection needs to be in `S3Inventory` and not in the scheduled job.	2024-05-24 10:54:06 +08:00
Jeff Wong	3a3ee5e04a	DEV: replace .each with .find_each for paginated queries (#27159 ) Large batches of reviewables may require paginated queries.	2024-05-23 15:42:21 -07:00
Ted Johansson	3137e60653	DEV: Database backed admin notices (#26192 ) This PR introduces a basic AdminNotice model to store these notices. Admin notices are categorized by their source/type (currently only notices from problem check.) They also have a priority.	2024-05-23 09:29:08 +08:00
Régis Hanol	958437e7dd	FIX: send activity summaries based on "last seen" (#27035 ) instead of "last emailed" so that people getting email notifications (from a watched topic for example) also get the activity summaries. Context - https://meta.discourse.org/t/activity-summary-not-sent-if-other-emails-are-sent/293040 Internal Ref - t/125582 Improvement over `95885645d9`	2024-05-22 10:23:03 +02:00
Isaac Janzen	ede0fa5802	DEV: Update bulk-invite logs and PM template (#27057 ) # Preview <img width="754" alt="Screenshot 2024-05-17 at 8 50 03 AM" src="https://github.com/discourse/discourse/assets/50783505/6710234f-0195-42be-b70e-9d57ba48bb4a"> # New Logs ``` [2024-05-17 08:49:54 -0600] Invalid User Field 'backend name' for 'foobarbing@gmail.com' [2024-05-17 08:49:54 -0600] Invalid Email 'test [2024-05-17 08:49:54 -0600] Invalid Email 'this@$@**.com ```	2024-05-17 12:21:21 -06:00
Mark VanLandingham	9264479c27	DEV: Add modifier for webhook event header generation (#27054 )	2024-05-17 09:33:39 -05:00
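
Note on commit `9812407f76`: the per-host staleness check it describes can be sketched roughly as follows. This is only an illustration, not the code from the commit. The hash keys (`:hostname`, `:pid`, `:last_beat_at`), the method name `stale_local_pids` and the 60 second threshold are all assumptions made for the example; the real change reads this kind of information from `Sidekiq::ProcessSet`.

```ruby
# Returns the pids of processes on `local_hostname` whose last heartbeat is
# older than `max_age_seconds`. Processes on other hosts are ignored, so a
# healthy process elsewhere cannot hide a stuck local one.
# All names and the threshold are hypothetical, chosen for this sketch.
def stale_local_pids(processes, local_hostname:, now:, max_age_seconds:)
  processes
    .select { |info| info[:hostname] == local_hostname }
    .select { |info| now - info[:last_beat_at] > max_age_seconds }
    .map { |info| info[:pid] }
end

now = Time.at(1_000_000)

# Stand-in data shaped like the information the commit message lists:
# hostname, pid and the time of the last heartbeat.
processes = [
  { hostname: "web-1", pid: 101, last_beat_at: now - 5 },
  { hostname: "web-1", pid: 102, last_beat_at: now - 600 },
  { hostname: "web-2", pid: 201, last_beat_at: now - 900 },
]

stale = stale_local_pids(processes, local_hostname: "web-1", now: now, max_age_seconds: 60)
puts stale.inspect
```

Filtering by hostname first is the point of the sketch: under the old Redis-key approach described in the commit message, any working process on any host kept the shared timestamp fresh.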

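Note on commits `f4d06f195d` and `5a37fa3760`: one common way to avoid scanning `ObjectSpace.each_object(Class)` is to record subclasses as they are defined, using Ruby's `inherited` hook. The sketch below assumes that approach; the commit messages do not say which implementation was chosen. Only the accessor name `onceoff_job_klasses` appears in the log, and the class names here are hypothetical.

```ruby
# Hypothetical base class standing in for a once-off job parent.
class OnceoffBase
  class << self
    # Lazily initialised registry, so calling it before any subclass has
    # been loaded returns an empty array instead of nil.
    def onceoff_job_klasses
      @onceoff_job_klasses ||= []
    end

    # Ruby calls this hook every time a subclass is defined.
    def inherited(subclass)
      super
      OnceoffBase.onceoff_job_klasses << subclass
    end
  end
end

class FixSomething < OnceoffBase; end
class MigrateSomething < OnceoffBase; end

registered = OnceoffBase.onceoff_job_klasses
puts registered.map(&:name).inspect
```

The registry costs one array append per subclass definition, whereas the scan described in `f4d06f195d` walks every class in the process on each call.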

1500 Commits