FIX: First pass to improve efficiency of secure uploads rake task (#9284)

Get rid of harmful each loop over uploads to update. Instead we put all the unique access control posts for the uploads into a map for fast access (vs using the slow .find through array) and look up the post when it is needed when looping through the uploads in batches.

On a Discourse instance with ~93k uploads, a simplified version of the old method takes > 1 minute, and a simplified version of the new method takes ~18s and uses a lot less memory.
This commit is contained in:
Martin Brennan 2020-03-26 15:59:57 +10:00 committed by GitHub
parent efd5fb665b
commit 6f978bc95c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 13 additions and 8 deletions

View File

@ -710,13 +710,6 @@ task "uploads:secure_upload_analyse_and_update" => :environment do
# secure based on site settings.
uploads_to_update = Upload.includes(:posts, :optimized_images).joins(:post_uploads)
# we do this to avoid a heavier post query, and to make sure we only
# get unique posts AND include deleted posts (unscoped)
unique_access_control_posts = Post.unscoped.select(:id, :topic_id).includes(topic: :category).where(id: uploads_to_update.pluck(:access_control_post_id).uniq)
uploads_to_update.each do |upload|
upload.access_control_post = unique_access_control_posts.find { |post| post.id == upload.access_control_post_id }
end
puts "There are #{uploads_to_update.count} upload(s) that could be marked secure.", ""
# Simply mark all these uploads as secure if login_required because no anons will be able to access them
@ -840,8 +833,20 @@ def determine_upload_security_and_posts_to_rebake(uploads_to_update, mark_secure
uploads_to_adjust_acl_for = []
posts_to_rebake = {}
# we do this to avoid a heavier post query, and to make sure we only
# get unique posts AND include deleted posts (unscoped)
unique_access_control_posts = {}
Post.unscoped.select(:id, :topic_id)
.includes(topic: :category)
.where(id: uploads_to_update.pluck(:access_control_post_id).uniq).find_each do |post|
unique_access_control_posts[post.id] = post
end
i = 0
uploads_to_update.find_each(batch_size: 50) do |upload_to_update|
uploads_to_update.find_each do |upload_to_update|
# fetch the post out of the already populated map to avoid n1s
upload_to_update.access_control_post = unique_access_control_posts[upload_to_update.access_control_post_id]
# we just need to determine the post security here so the ACL is set to the correct thing,
# because the update_upload_ACL method uses upload.secure?