UX: Add a rake task to monitor progress for long rebakes (#27517)

* UX: Add a rake task to monitor progress for long rebakes

When doing mass rebaking, this task will print progress, speed and
expected time to completion (ETC) on a loop. It is meant for rebakes
that take several hours or days.

It will calculate a 10m moving average over the past 6 hours, and print
ETC accordingly.

It also shows Sidekiq stats in case Sidekiq jobs for rebaking were
enqueued separately, instead of using the rebake_posts tasks in this
file (which are 100% synchronous and do not touch Sidekiq at all).

NOTE: only the currently unbaked count at task start time is considered;
this is useful in live communities with lots of traffic, where new posts
might otherwise change the goal posts continuously.

* Satisfy stree
This commit is contained in:
Leonardo Mosquera 2024-07-02 18:52:47 -03:00 committed by GitHub
parent ea58140032
commit a86590ffd6
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 55 additions and 0 deletions

View File

@ -207,6 +207,61 @@ def remap_posts(find, type, ignore_case, replace = "")
i
end
desc "monitor rebaking progress for the current unbaked post count; Ctrl-C to exit"
task "posts:monitor_rebaking_progress", [:csv] => [:environment] do |_, args|
if args[:csv]
puts "utc_time_now,remaining_to_bake,baked_in_last_period,etc_in_days,sidekiq_enqueued,sidekiq_scheduled"
end
# remember last ID right now so the goal post isn't constantly moved by new posts being created
last_id_as_of_now = Post.where(baked_version: nil).order("id desc").first&.id
if last_id_as_of_now.nil?
warn "no posts to bake; all done"
exit
end
report_time_in_mins = 10
window_size_in_hs = 6
deltas = []
last = nil
while true
now = Post.where("id <= ? and baked_version is null", last_id_as_of_now).count
if last
delta_now = last - now
deltas.unshift delta_now
deltas = deltas.take((window_size_in_hs * 60) / report_time_in_mins)
average = deltas.reduce(:+).to_f / deltas.length.to_f / report_time_in_mins.to_f
etc_days = sprintf("%.2f", (now.to_f / average) / 60.0 / 24.0)
else
last = now
etc_days = 999 # fake initial value so that the column is 100% valid floats
end
s = Sidekiq::Stats.new
if args[:csv]
puts [Time.now.utc.iso8601, now, last - now, etc_days, s.enqueued, s.scheduled_size].join(",")
else
puts [
Time.now.utc.iso8601,
"unbaked old posts remaining: #{now}",
"baked in last period: #{last - now}",
"ETC based on #{window_size_in_hs}h avg: #{etc_days} days",
"SK enqueued: #{s.enqueued}",
"SK scheduled: #{s.scheduled_size}",
"waiting #{report_time_in_mins}min",
].join(" - ")
end
last = now
sleep report_time_in_mins * 60
end
end
desc "Remap all posts matching specific string"
task "posts:remap", %i[find replace type ignore_case] => [:environment] do |_, args|
require "highline/import"