In a multisite setup, Discourse reported that no backup was running after 60 seconds because the Redis key expired. Also, the thread that listens for a shutdown signal stopped immediately because it didn't detect a running operation.
This also adds an optional `ticket` parameter to `Backuper`, which makes it possible to identify the backup in the `backup_complete` and `backup_failed` events. Both events contain the logs as payload, and moving some methods around ensures that all errors are included in the logs.
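A minimal listener sketch, assuming the two events are published through `DiscourseEvent` and that the payload is a hash containing `:logs` and the optional `:ticket` (the payload shape here is an assumption, not taken from the commit):

```ruby
# Listen for the backup events from a plugin; the payload keys are assumed.
DiscourseEvent.on(:backup_complete) do |payload|
  Rails.logger.info("Backup #{payload[:ticket] || "(no ticket)"} finished with #{payload[:logs].size} log lines")
end

DiscourseEvent.on(:backup_failed) do |payload|
  Rails.logger.warn("Backup #{payload[:ticket] || "(no ticket)"} failed:\n#{payload[:logs].join("\n")}")
end
```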
Discourse automatically sends a private message after a backup or
restore finishes. The private message used to contain the log inline
even when it was very long. A very long log can create issues because
the post ends up exceeding the maximum allowed post length. When that
happens, Discourse will now try to create an upload with the logs. If
that fails, it will trim the log and inline it.
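A rough sketch of that fallback chain; the helper `try_to_upload_logs` is hypothetical and the use of `SiteSetting.max_post_length` as the limit is an assumption:

```ruby
# Hypothetical helper sketching the fallback chain: inline the log when it is
# short enough, otherwise upload it, and only trim it when the upload fails.
def logs_for_private_message(logs)
  return logs if logs.length <= SiteSetting.max_post_length

  upload_url = try_to_upload_logs(logs) # hypothetical; returns nil on failure
  return "The log was too long, see #{upload_url}" if upload_url

  logs.truncate(SiteSetting.max_post_length, omission: "\n\n(log trimmed)")
end
```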
This also ensures that restoring a backup works when it was created with the wrong upload paths in the time between ab4c0a4970 (shortly after v2.6.0.beta1) and this fix.
Follow up to [FIX: Empty backup names with unicode site titles][1]
- Use .presence - "It's cleaner"
- Update spec to use Discourse.system_user so it is more readable
[1]: c8661674d4
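For context, `.presence` collapses the blank check and fallback into one expression; the fallback name below is only an example, not the actual value used by Discourse:

```ruby
# With .presence: a unicode-only title parameterizes to "", presence turns
# that into nil, and the fallback kicks in.
backup_name = SiteSetting.title.parameterize.presence || "discourse"

# Without it:
parameterized = SiteSetting.title.parameterize
backup_name = parameterized.blank? ? "discourse" : parameterized
```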
In order for this to work, the Backuper stores a couple of site settings
in the new backup_metadata table, because the old setting values might
no longer be available on restore.
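A minimal sketch of the idea, assuming a simple name/value schema for `backup_metadata`; which settings are stored and under which keys is an assumption:

```ruby
# Before dumping the database, remember the setting value that the restore
# step will need later.
BackupMetadata.create!(name: "s3_upload_bucket", value: SiteSetting.s3_upload_bucket)

# During restore, read the stored value instead of the live site setting,
# which may have changed or may no longer exist.
old_bucket = BackupMetadata.find_by(name: "s3_upload_bucket")&.value
```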
This reduces the chance of errors where consumers of strings mutate inputs
and reduces the memory usage of the app.
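For illustration, this is the behavior the `# frozen_string_literal: true` magic comment enforces:

```ruby
# frozen_string_literal: true

PREFIX = "backup"
PREFIX << "-2020"
# => FrozenError: can't modify frozen String
# Without the magic comment the mutation would silently change PREFIX for
# every other consumer of the string.
```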
The test suite passes now, but there may be some loose ends, so we will
run a few sites on a branch prior to merging.
tar exits with status 1 when uploads are modified or deleted by a Sidekiq job, so we need to treat it like status 0.
According to the documentation, it should be safe to ignore status 1 ("Some files differ"):
> If tar was given `--create', `--append' or `--update' option, this exit code means that some files were changed while being archived and so the resulting archive does not contain the exact copy of the file set.
Status 2 ("Fatal error") still results in an exception.
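A hedged sketch of that exit-status handling when shelling out to tar; paths and options are placeholders, not the actual backup code:

```ruby
require "open3"

# Archive the uploads directory, tolerating files that change while tar is
# reading them (exit status 1) but failing on fatal errors (status 2+).
def archive_uploads!(archive_path, uploads_path)
  _stdout, stderr, status = Open3.capture3(
    "tar", "--create", "--gzip", "--file", archive_path, uploads_path
  )

  case status.exitstatus
  when 0, 1
    # 1 = "Some files differ": uploads changed while being archived, safe to ignore
    true
  else
    raise "tar failed with exit status #{status.exitstatus}: #{stderr}"
  end
end
```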
* Dashboard doesn't timeout anymore when Amazon S3 is used for backups
* Storage stats are now a proper report with the same caching rules
* Changing the backup_location or s3_backup_bucket settings, or creating and deleting backups, removes the report from the cache (see the sketch after this list)
* It shows the number of backups and the backup location
* It shows the used space for the correct backup location instead of always showing used space on local storage
* It shows the date of the last backup as a relative date
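A sketch of the cache-invalidation idea only, assuming the `:site_setting_changed` event and a `Rails.cache`-backed report cache; the cache key and the event arguments are assumptions, not Discourse's actual report caching:

```ruby
# Drop the cached storage report whenever a backup-related setting changes,
# so the dashboard recomputes it on the next request.
DiscourseEvent.on(:site_setting_changed) do |name, _old_value, _new_value|
  if %i[backup_location s3_backup_bucket].include?(name.to_sym)
    Rails.cache.delete("reports:storage_stats")
  end
end
```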
* Logs exceptions during the cleanup phase, but doesn't stop executing subsequent cleanup tasks (see the sketch below).
* Notifies the user at the end of the cleanup phase, so that the log contains possible errors during that phase.
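A minimal sketch of that "log and keep going" loop; the task names and the `log` helper are placeholders:

```ruby
cleanup_tasks = %i[remove_tar_leftovers unpause_sidekiq refresh_disk_stats]

cleanup_tasks.each do |task|
  begin
    send(task)
  rescue => e
    # Keep the error in the log that gets sent to the user, but don't let it
    # prevent the remaining cleanup tasks from running.
    log("Cleanup step '#{task}' failed: #{e.message}")
  end
end
```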
* `rescue nil` is a really bad pattern to use in our code base.
We should rescue the errors that we expect the code to raise and
not rescue everything just because we're unsure of what errors the
code might raise. Rescuing only the expected errors reduces the
amount of pain we face when debugging why something isn't working
as expected. I've been bitten countless times by errors being
swallowed this way during debugging sessions (see the contrast
below).
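A small contrast of the two patterns, with `params[:version]` standing in for any untrusted input:

```ruby
# Inline `rescue nil` swallows *every* StandardError, including bugs like
# NoMethodError, so failures disappear silently:
version = Integer(params[:version]) rescue nil

# Rescuing only the errors we expect keeps genuine bugs visible:
version =
  begin
    Integer(params[:version])
  rescue ArgumentError, TypeError
    nil
  end
```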