Commit Graph

26 Commits

Author SHA1 Message Date
Alan Guo Xiang Tan 9812407f76
FIX: Redo Sidekiq monitoring to restart stuck sidekiq processes (#30198)
This commit reimplements how we monitor Sidekiq processes that are
forked from the Unicorn master process. Prior to this change, we rely on
`Jobs::Heartbeat` to enqueue a `Jobs::RunHeartbeat` job every 3 minutes.
The `Jobs::RunHeartbeat` job then sets a Redis key with a timestamp. In
the Unicorn master process, we then fetch the timestamp that has been set
by the job from Redis every 30 minutes. If the timestamp has not been
updated for more than 30 minutes, we restart the Sidekiq process. The
fundamental flaw with this approach is that it fails to consider
deployments with multiple hosts and multiple Sidekiq processes. A
sidekiq process on a host may be in a bad state but the heartbeat check
will not restart the process because the `Jobs::RunHeartbeat` job is
still being executed by the working Sidekiq processes on other hosts.

In order to properly ensure that stuck Sidekiq processs are restarted,
we now rely on the [Sidekiq::ProcessSet](https://github.com/sidekiq/sidekiq/wiki/API#processes)
API that is supported by Sidekiq. The API provides us with "near real-time (updated every 5 sec)
info about the current set of Sidekiq processes running". The API
provides useful information like the hostname, pid and also when Sidekiq
last did its own heartbeat check. With that information, we can easily
determine if a Sidekiq process needs to be restarted from the Unicorn
master process.
2024-12-18 12:48:50 +08:00
Alan Guo Xiang Tan 10ff0ee0cc
FIX: Ensure we dispose of MiniRacer::Context before forking daemons (#28361)
This commit updates `Demon::Base#start` to call `Discourse.before_fork`
before forking. According to the docs in `mini_racer`, we need to
"Dispose manually of all MiniRacer::Context objects prior to forking".

This commit is motivated by a segmentation fault which we are seeing in
production when killing a daemon process. Backtrace of the core dump
includes traces of `mini_racer` so we think this is the cause. Note that
we are not 100% sure if this will fix the issue.
2024-08-14 12:45:34 +08:00
Alan Guo Xiang Tan 23c38cbf11
DEV: Log Unicorn worker timeout backtraces to `Rails.logger` (#27257)
This commit introduces the following changes:

1. Introduce the `SignalTrapLogger` singleton which starts a single
   thread that polls a queue to log messages with the specified logger.
   This thread is necessary becasue most loggers cannot be used inside
   the `Signal.trap` context as they rely on mutexes which are not
   allowed within the context.

2. Moves the monkey patch in `freedom_patches/unicorn_http_server_patch.rb` to
   `config/unicorn.config.rb` which is already monkey patching
   `Unicorn::HttpServer`.

3. `Unicorn::HttpServer` will now automatically send a `USR2` signal to
   a unicorn worker 2 seconds before the worker is timed out by the
   Unicorn master.

4. When a Unicorn worker receives a `USR2` signal, it will now log only
   the main thread's backtraces to `Rails.logger`. Previously, it was
   `put`ing the backtraces to `STDOUT` which most people wouldn't read.
   Logging it via `Rails.logger` will make the backtraces easily
   accessible via `/logs`.
2024-06-03 12:51:12 +08:00
David Taylor 6417173082
DEV: Apply syntax_tree formatting to `lib/*` 2023-01-09 12:10:19 +00:00
Peter Zhu c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
David Taylor 1d024f77a6
FEATURE: Allow plugins to register demon processes (#11493)
This allows plugins to call `register_demon_process` with a Class inheriting from Demon::Base. The unicorn master process will take care of spawning, monitoring and restarting the process. This API should be used with extreme caution, but it is significantly cleaner than spawning processes/threads in an `after_initialize` block.

This commit also cleans up the demon spawning logging so that it uses the same format as unicorn worker logging. It also switches to the block form of `fork` to ensure that Demons exit after running, rather than returning execution to where the fork took place.
2020-12-16 09:43:39 +00:00
David Taylor 477538bf2d
DEV: setproctitle on demon processes (#11402)
This makes it easier to identify processes in `ps` output
2020-12-04 09:41:17 +00:00
David Taylor ed6b3b82bd
FIX: Reopen sidekiq log files after rotation (#9429)
Unicorn uses the USR1 to indicate that log files should be reopened. This commit implements the same functionality for our forked sidekiq workers:

- USR1 is intercepted in the unicorn master, and re-issued to all child processes
- USR1 is trapped in the sidekiq processes, and `Unicorn::Util.reopen_logs` is used to re-open log files
2020-04-16 12:13:13 +01:00
Krzysztof Kotlarek 35b1185a08 FIX: Revert Demon::DemonBase back to Demon::Base (#8132)
I introduced DemonBase because I had got some conflict between `demon/base.rb` and `jobs/base.rb`, however, to not rename base class, it is possible to use regex on absolute path in Zeitwerk custom inflector.
2019-10-02 14:54:08 +10:00
Krzysztof Kotlarek 427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
David Taylor e2449f9f23 Revert "Revert "Revert "FIX: Heartbeat check per sidekiq process (#7873)"""
This reverts commit c3497559be.
2019-08-30 11:26:16 +01:00
Sam Saffron c3497559be Revert "Revert "FIX: Heartbeat check per sidekiq process (#7873)""
This reverts commit e805d44965.
We now have mechanisms in place to ensure heartbeat will always
be scheduled even if the scheduler is overloaded per: 098f938b
2019-08-30 10:12:10 +10:00
OsamaSayegh e805d44965 Revert "FIX: Heartbeat check per sidekiq process (#7873)"
This reverts commit 340855da55.
2019-08-27 11:56:23 +00:00
Osama Sayegh 340855da55
FIX: Heartbeat check per sidekiq process (#7873)
* FIX: Heartbeat check per sidekiq process

* Rename method

* Remove heartbeat queues of previous bootups

* Regis feedback

* Refactor before_start

* Update lib/demon/sidekiq.rb

Co-Authored-By: Régis Hanol <regis@hanol.fr>

* Update lib/demon/sidekiq.rb

Co-Authored-By: Régis Hanol <regis@hanol.fr>

* Expire redis keys after 3600 seconds

* Don't use redis to store the list of queues
2019-08-26 09:33:49 +03:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Sam cc3fc87dd7 DEV: handle termination cleanly in autospec 2018-06-19 16:13:36 +10:00
Sam fc05164667 demo script for demonizing using fork exec
minor refinements to demon
2018-01-11 13:51:52 +11:00
Guo Xiang Tan 5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Sam 0e4883a8ae correct regression where monitoring thread crashed out
add logging
2015-06-16 11:16:33 +10:00
Sam 861cd5d9b0 FIX: ensure child demon is correctly terminated from parent on stop 2015-06-15 12:36:47 +10:00
Sam 1721872084 cleanup out-of-memory detection and correction code 2015-03-27 15:44:52 +11:00
Sam 58f3fcbc1a BUGFIX: not terminating self correctly on hangups from parent 2014-06-13 11:15:40 +10:00
Sam 7c57d74e85 FEATURE: unicorn sidekiq will restart sidekiq on complete failure.
(checks every 30 minutes for complete failure)
2014-04-23 13:13:18 +10:00
Sam ead7c52a06 Refactor demonizer in prep for unicorn forking
Upgrade sidekiq
2014-04-17 15:58:00 +10:00
Sam f3cc7360e0 BUGFIX: Correct after_fork semantics
After fork SiteSettings was not getting a new process id,
causing site settings not to refresh properly in unicorn

This code also centralizes the logic
2014-03-31 12:34:13 +11:00
Régis Hanol b56b11d96a add qunit to autospec 2013-11-01 23:57:50 +01:00