DEV: Print out backtraces of all threads when spec times out on CI (#25356)

Why this change?

We have been seeing specs time out on GitHub CI, but we are unable to
debug those timeouts due to a lack of information. This change prints
out the backtraces of all threads right before a spec times out on CI.

What does this change do?

1. Starts a thread on CI which will wait for a spec to start running.
1. Once a spec starts running, the thread will sleep for
   `PER_SPEC_TIMEOUT_SECONDS - 1` seconds.
1. After sleeping, the thread checks if the spec is still running and
   prints out the backtraces of all threads if it is. Otherwise, the
   thread does nothing and runs the next loop.
1. At the end of each spec run, we ensure that the thread is in a
   waiting state for the next spec run to start.
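
The four steps above can be sketched as a small standalone script. This is
an illustration under stated assumptions, not the code in this commit: the
method name, the `DEMO_*` constants and the job labels are hypothetical, and
it swaps the commit's `sleep` plus `Thread#wakeup` for a timed
`ConditionVariable#wait` so the demo behaves deterministically.

```ruby
require "timeout"

# Hypothetical demo values; the real hook waits PER_SPEC_TIMEOUT_SECONDS - 1.
DEMO_TIMEOUT_SECONDS = 0.5
DEMO_WATCHDOG_DELAY_SECONDS = 0.25

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Runs each job (label => callable) under a timeout and returns the labels of
# the jobs that were still running when the watchdog checked on them.
# Labels are assumed to be unique.
def run_jobs_with_watchdog(jobs)
  mutex = Mutex.new
  state_changed = ConditionVariable.new
  current = nil # label of the job in progress, nil when idle
  flagged = []

  Thread.new do
    loop do
      mutex.synchronize do
        # Step 1: wait for a job to start.
        state_changed.wait(mutex) until current
        label = current

        # Step 2: sleep until shortly before the timeout, waking early if
        # the job finishes first.
        deadline = monotonic_now + DEMO_WATCHDOG_DELAY_SECONDS
        while current == label && (remaining = deadline - monotonic_now) > 0
          state_changed.wait(mutex, remaining)
        end

        # Step 3: the job is still running, so record it. The real hook
        # prints the backtraces of all threads here.
        flagged << label if current == label

        # Step 4: return to the waiting state only once this job is done.
        state_changed.wait(mutex) while current == label
      end
    end
  end

  jobs.each do |label, job|
    Timeout.timeout(DEMO_TIMEOUT_SECONDS) do
      mutex.synchronize do
        current = label
        state_changed.signal
      end
      job.call
    end
  rescue Timeout::Error
    # Swallowed so the demo can continue with the next job.
  ensure
    mutex.synchronize do
      current = nil
      state_changed.signal
    end
  end

  mutex.synchronize { flagged.dup }
end

flagged = run_jobs_with_watchdog({
  "fast spec" => -> { sleep 0.01 },
  "slow spec" => -> { sleep 5 },
})
puts "Still running at the watchdog check: #{flagged.inspect}"
```

Re-checking `current` inside the loops means a signal sent before the
watchdog starts waiting is not lost, which is the job the `is_waiting`
handshake does in the commit.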

Note that there is no need for us to tear down or clean up the thread
since the process terminates after running all the tests.
Alan Guo Xiang Tan 2024-01-23 06:55:49 +08:00 committed by GitHub
parent c80aca214e
commit 0478b45fd3
1 changed files with 41 additions and 0 deletions

@@ -458,14 +458,55 @@ RSpec.configure do |config|
  class SpecTimeoutError < StandardError
  end
  mutex = Mutex.new
  condition_variable = ConditionVariable.new
  test_running = false
  is_waiting = false
  backtrace_logger =
    Thread.new do
      loop do
        mutex.synchronize do
          is_waiting = true
          condition_variable.wait(mutex)
          is_waiting = false
        end
        sleep PER_SPEC_TIMEOUT_SECONDS - 1
        if mutex.synchronize { test_running }
          puts "::group::[#{Process.pid}] Threads backtraces 1 second before timeout"
          Thread.list.each do |thread|
            puts "\n"
            thread.backtrace.each { |line| puts line }
            puts "\n"
          end
          puts "::endgroup::"
        end
      rescue StandardError => e
        puts "Error in backtrace logger: #{e}"
      end
    end
  config.around do |example|
    Timeout.timeout(
      PER_SPEC_TIMEOUT_SECONDS,
      SpecTimeoutError,
      "Spec timed out after #{PER_SPEC_TIMEOUT_SECONDS} seconds",
    ) do
      mutex.synchronize do
        test_running = true
        condition_variable.signal
      end
      example.run
    rescue SpecTimeoutError
    ensure
      mutex.synchronize { test_running = false }
      backtrace_logger.wakeup
      sleep 0.01 while !mutex.synchronize { is_waiting }
    end
  end
end
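
For reading the dump in the CI log, it may help to see the output shape in
isolation. The sketch below is a hypothetical helper, not part of this commit
(`format_thread_backtraces` and its `title:` parameter are made up for
illustration). It builds the same `::group::` / `::endgroup::` wrapped text,
which GitHub Actions renders as a collapsible log section, and additionally
labels each thread and tolerates `Thread#backtrace` returning `nil`, which it
does for a thread that is no longer alive.

```ruby
# Hypothetical helper: returns the grouped log text instead of printing it,
# so the result can be inspected or tested.
def format_thread_backtraces(threads = Thread.list, title: "Thread backtraces")
  lines = ["::group::[#{Process.pid}] #{title}"]

  threads.each do |thread|
    lines << ""
    lines << "Thread #{thread.object_id} (status: #{thread.status.inspect})"
    # Thread#backtrace is nil for a dead thread, so fall back to a marker.
    lines.concat(thread.backtrace || ["<no backtrace available>"])
  end

  lines << "::endgroup::"
  lines.join("\n")
end

puts format_thread_backtraces
```

Returning a string rather than calling `puts` per line also keeps the dump
from being interleaved with output from other threads.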