FIX: fix normalize_raw method for nil inputs in migration scripts (#22304)

Various migration scripts define a normalize_raw method that applies custom processing to post contents before storing them in Post.raw and other fields.

These methods normally do not handle nil inputs, even though nil values are relatively common in data dumps.

Because the method is called from many points in a migration script, the script currently tends to fail several times at different points, each time forcing you to fix the data or apply a logic hack and then restart.

This PR generalizes the handling of nil (or otherwise blank) input by returning the string "<missing>".
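
As a rough illustration of the guard, here is a minimal, self-contained sketch. It is not taken from any of the import scripts: the `MISSING_PLACEHOLDER` constant and the sample `posts` data are hypothetical, and the plain-Ruby check `raw.nil? || raw.strip.empty?` only approximates ActiveSupport's `blank?`, which the real scripts use.

```ruby
# Hypothetical stand-in for a migration script's normalize_raw.
MISSING_PLACEHOLDER = "<missing>"

def normalize_raw(raw)
  # Plain-Ruby approximation of ActiveSupport's `blank?`:
  # treats nil, empty, and whitespace-only strings as missing.
  return MISSING_PLACEHOLDER if raw.nil? || raw.strip.empty?

  # Illustrative processing only: drop [color=...] and [/color] BBCode tags.
  raw.gsub(/\[color=[#a-z0-9]+\]/i, "").gsub(%r{\[/color\]}i, "")
end

# Sample rows as they might appear in a data dump (made up for this sketch).
posts = [{ raw: "[color=#1223f3]hi[/color]" }, { raw: nil }, { raw: "   " }]
normalized = posts.map { |post| normalize_raw(post[:raw]) }
```

With the guard in place, the nil and whitespace-only rows map to the placeholder instead of raising NoMethodError partway through the import.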

Pros:

* no more messy repeated crashes and restarts
* consistent behavior across all affected scripts

Cons:

* it might hide data issues
  * On the other hand, we can't print a warning from that method: it is called from inside loops, so the warnings would flood the console.
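
One possible way to soften this trade-off, sketched here purely as an assumption and not something this PR implements, is to count blank inputs and report a single summary once the loop finishes. The `MissingRawCounter` class and its methods are hypothetical names introduced for this example.

```ruby
# Hypothetical helper, not part of this PR: counts blank inputs
# instead of printing one warning per call.
class MissingRawCounter
  attr_reader :count

  def initialize
    @count = 0
  end

  def normalize_raw(raw)
    # Plain-Ruby approximation of ActiveSupport's `blank?`.
    if raw.nil? || raw.strip.empty?
      @count += 1
      return "<missing>"
    end

    # Placeholder processing for the sketch: trim surrounding whitespace.
    raw.strip
  end

  # One line of output for the whole import, however many rows were blank.
  def summary
    "#{@count} post(s) had missing raw content"
  end
end

counter = MissingRawCounter.new
results = ["hello ", nil, "", "world"].map { |raw| counter.normalize_raw(raw) }
puts counter.summary
```

This keeps the console quiet inside the loop while still surfacing how many rows were replaced by the placeholder.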

* FIX: zendesk import script: support nil inputs in normalize_raw
* FIX: return '<missing>' instead of empty string; do it for all methods
Leonardo Mosquera 2023-06-29 13:22:47 -03:00 committed by GitHub
parent f2fe5bc84e
commit c83914e2e5
5 changed files with 10 additions and 0 deletions

@@ -160,6 +160,8 @@ class ImportScripts::Bespoke < ImportScripts::Base
  end
  def normalize_raw!(raw)
    return "<missing>" if raw.blank?
    # purple and #1223f3
    raw.gsub!(/\[color=[#a-z0-9]+\]/i, "")
    raw.gsub!(%r{\[/color\]}i, "")

@@ -196,6 +196,8 @@ class ImportScripts::Jive < ImportScripts::Base
  end
  def normalize_raw!(raw)
    return "<missing>" if raw.blank?
    raw = raw.dup
    raw = raw[5..-6]

@@ -468,6 +468,8 @@ class ImportScripts::Yammer < ImportScripts::Base
  end
  def normalize_raw(raw)
    return "<missing>" if raw.blank?
    raw = raw.gsub('\n', "")
    raw.gsub!(/\[\[user:(\d+)\]\]/) do
      u = Regexp.last_match(1)

@@ -211,6 +211,8 @@ class ImportScripts::Zendesk < ImportScripts::Base
  end
  def normalize_raw(raw)
    return "<missing>" if raw.blank?
    raw = raw.gsub('\n', "")
    raw = ReverseMarkdown.convert(raw)
    raw

@@ -345,6 +345,8 @@ class ImportScripts::ZendeskApi < ImportScripts::Base
  end
  def normalize_raw(raw, user_id)
    return "<missing>" if raw.blank?
    raw = raw.gsub('\n', "")
    raw = ReverseMarkdown.convert(raw)