Commit Graph

169 Commits

Author SHA1 Message Date
Martin Brennan c3cd2389fe SECURITY: use strict JSON parsing when parsing backup metadata 2020-01-15 11:24:41 +01:00
Gerhard Schlager e474cda321 REFACTOR: Restoring of backups and migration of uploads to S3 2020-01-14 11:41:35 +01:00
Vinoth Kannan 3b7f5db5ba FIX: parallel spec system needs a dedicated upload folder for each worker. (#8547) 2019-12-18 11:21:57 +05:30
David Taylor 481efebe76 DEV: Update backup/restore pipeline to avoid `cd` (#8347) 2019-11-13 15:52:28 +00:00
Justin DiRose c3f06943c7 FIX: Account for empty uploads directory upon backup restore (#8262)
This commit fixes a case where backup restores would fail if the uploads/default directory is empty.
2019-10-30 09:33:07 -05:00
Krzysztof Kotlarek f530378df3 FIX: Restore for non-multisite is not raising an error on reconnect step (#8237)
Commit f69dacf979 introduced a bug.

Restore works fine for multisite but stopped working for non-multisite.

The reason is that the `establish_connection` method gained a check for whether the multisite instance is available:
```
    def self.instance
      @instance
    end

    def self.establish_connection(opts)
      @instance.establish_connection(opts) if @instance
    end
```
However, the `reload` method doesn't have that check:
```
    def self.reload
      @instance = new(instance.config_filename)
    end
```

To solve it, let's ensure we are in a multisite environment before calling `reload`.
2019-10-24 11:46:22 +11:00
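The failure mode described in f530378df3 can be reproduced with a toy class. This is a minimal sketch, not the real `RailsMultisite::ConnectionManagement`: the class name `ToyConnectionManagement` and the methods `setup`, `clear`, and `reload_if_loaded` are hypothetical, and guarding on `@instance` is an assumption about how "ensure we are in a multisite environment" could be expressed.

```ruby
# Toy stand-in for a connection manager (hypothetical, for illustration only).
class ToyConnectionManagement
  attr_reader :config_filename

  def initialize(config_filename)
    @config_filename = config_filename
  end

  class << self
    attr_reader :instance

    # Hypothetical helper: simulates a multisite environment being configured.
    def setup(config_filename)
      @instance = new(config_filename)
    end

    # Hypothetical helper: simulates a non-multisite environment.
    def clear
      @instance = nil
    end

    # Unguarded, as quoted in the commit message:
    # raises NoMethodError when no instance exists.
    def reload
      @instance = new(instance.config_filename)
    end

    # Guarded variant (assumption): only reload when an instance exists.
    def reload_if_loaded
      reload if @instance
    end
  end
end
```

With no instance configured, `reload` fails on `nil.config_filename`, while `reload_if_loaded` simply returns `nil`.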
Krzysztof Kotlarek f69dacf979 FIX: Reconnect in restore process connects to correct DB (#8218)
The simplified flow of a restore looks like this:
```
migrate_database
reconnect
extract_uploads
```

The problem with the incorrect current database started with this fix: https://github.com/discourse/discourse/commit/025d4ee91f4

The dump task reconnects to the default database: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/railties/databases.rake#L429

We then try to reconnect to the original database with this code:
```
def reconnect_database
  log "Reconnecting to the database..."
  RailsMultisite::ConnectionManagement::establish_connection(db: @current_db)
end
```

This reconnect does not switch us back to the correct database because of this check:
https://github.com/discourse/rails_multisite/blob/master/lib/rails_multisite/connection_management.rb#L181
Basically, it finds an existing handler, assumes that we are connected to the correct DB, and skips this step.

To solve it, we can reload RailsMultisite::ConnectionManagement, which creates a new instance of that class:
https://github.com/discourse/rails_multisite/blob/master/lib/rails_multisite/connection_management.rb#L38
2019-10-23 17:23:50 +11:00
Roman Rizzi 01bc465db8 DEV: Split max decompressed setting for themes and backups (#8179) 2019-10-11 14:38:10 -03:00
Roman Rizzi 5357ab3324 SECURITY: Safely decompress backups when restoring. (#8166)
* SECURITY: Safely decompress backups when restoring.

* Fix tests and update theme_controller_spec to work with zip files instead of .tar.gz
2019-10-09 11:41:16 -03:00
Krzysztof Kotlarek 427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier to reload class chains.

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.

This is a far-reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Gerhard Schlager 1f118b1309 FEATURE: Allow plugins to manipulate site settings during backup restore 2019-08-22 22:41:26 +02:00
Gerhard Schlager d686318133 FIX: Prevent failed remaps during restores
Additional changes:
* Verbose logging of remaps during restores
* Exclude the backup_metadata table from restores
2019-08-12 17:15:01 +02:00
Gerhard Schlager 7cb51d0e40 FIX: Create readonly functions during backup
Temporarily recreate already dropped functions in the discourse_functions schema in order to allow restoring of backups which still reference dropped functions.
2019-08-09 11:39:46 +02:00
Michael Brown 31f583855a DEV: pull static check out of loop
* followup to 08b28680
* as per https://review.discourse.org/t/4713/2
2019-07-23 17:18:16 -04:00
Gerhard Schlager 68b082e1a4 FIX: Ensure that jobs don't run immediately after migrate_to_s3 2019-07-23 17:42:12 +02:00
Gerhard Schlager b73bd7fc1b FIX: Always backup local uploads in addition to files stored on S3 2019-07-19 15:13:05 +02:00
Blake Erickson b0c92bb0b9 REFACTOR: Clean up parameterized title
Follow up to [FIX: Empty backup names with unicode site titles][1]

- Use .presence - "It's cleaner"
- Update spec to use System.system_user so it is more readable

[1]: c8661674d4
2019-07-18 15:49:16 -06:00
Blake Erickson c8661674d4 FIX: Empty backup names with unicode site titles
If a site title contains unicode it may end up with an empty backup
filename because of the rails `parameterize` method we are calling.

This fix ensures that the backup filenames default to "discourse" if the
parameterized site title is empty.

Bug reported [here][1].

[1]: https://meta.discourse.org/t/backup-checksum-and-backup-name-missing-when-unicode-site-name/123192?u=blake
2019-07-17 17:07:10 -06:00
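The fallback described in c8661674d4 can be sketched in plain Ruby. This is an illustration under stated assumptions: `ascii_slug` is a hypothetical, rough stand-in for Rails' `parameterize` (it is not the same algorithm), and `backup_basename` is a made-up name for the place where the default gets applied.

```ruby
# Hypothetical ASCII-only slug helper; a rough stand-in for Rails' `parameterize`.
def ascii_slug(title)
  title
    .downcase
    .gsub(/[^a-z0-9]+/, "-") # collapse every run of other characters into a dash
    .gsub(/\A-+|-+\z/, "")   # trim leading and trailing dashes
end

# Falls back to "discourse" when the slug ends up empty,
# mirroring the default named in the commit message.
def backup_basename(site_title)
  slug = ascii_slug(site_title)
  slug.empty? ? "discourse" : slug
end
```

A title made up entirely of non-ASCII characters produces an empty slug here, so `backup_basename` returns `"discourse"`; a title such as `"My Forum"` becomes `"my-forum"`.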
Michael Brown 08b286808a FIX: backups taken by pg_dump >= 11 are nonportable (#7893) 2019-07-15 18:07:44 -04:00
Gerhard Schlager 5f0d38341e FIX: Remapping during restore was wrong for CDN URLs 2019-07-09 17:34:41 +02:00
Gerhard Schlager 4c1b8c7559 FIX: Remap differently when backup comes from multisite 2019-07-09 16:11:32 +02:00
Gerhard Schlager a65a9a85d5 FEATURE: Remap uploads during restore when S3 or CDN changes
In order for this to work the Backuper stores a couple of site settings
in the new backup_metadata table, because the old setting values might
not be available on restore anymore.
2019-07-09 14:04:16 +02:00
Gerhard Schlager f2dc59d61f FEATURE: Add hidden setting to include S3 uploads in backups 2019-07-09 14:04:16 +02:00
Penar Musaraj f00275ded3 FEATURE: Support private attachments when using S3 storage (#7677)
* Support private uploads in S3
* Use localStore for local avatars
* Add job to update private upload ACL on S3
* Test multisite paths
* update ACL for private uploads in migrate_to_s3 task
2019-06-06 13:27:24 +10:00
Gerhard Schlager f7a2648694 FEATURE: Migrate uploads to S3 during restore 2019-06-04 15:47:36 +02:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Gerhard Schlager 1ddd4a44d5 FIX: Wrong color palette after backup restore 2019-05-07 17:02:57 +02:00
Gerhard Schlager 2487e01c73 FIX: Optimized site icons were missing after backup restore 2019-05-07 17:02:57 +02:00
Guo Xiang Tan ebca588fd0 DEV: Remove unused line of code. 2019-05-02 16:54:10 +08:00
Gerhard Schlager 3aca070311 FIX: Restoring backup shouldn't change disable_emails from "yes" to "non-staff" 2019-04-16 11:48:07 +02:00
Gerhard Schlager 78f8114989 FEATURE: Allow discourse script to skip disabling of emails after restore 2019-03-07 21:49:33 +01:00
Joffrey JAFFEUX 42df20e4f0 typo (#7065) 2019-02-25 16:36:22 +01:00
Gerhard Schlager dc961fecb9 FIX: Outgoing emails were not disabled after restoring backup 2019-02-25 16:07:24 +01:00
Gerhard Schlager 6a8007e5fb FEATURE: Improve handling of backup storage errors 2019-02-20 15:16:49 +01:00
Guo Xiang Tan 8cd4ceba49 DEV: Remove unnecessary `Sidekiq.unpause!` during backup. 2019-02-19 14:01:13 +08:00
Gerhard Schlager 99ad61afb7 FEATURE: Trigger an event after a backup restore 2019-02-18 11:48:03 +01:00
Gerhard Schlager b087719340 FEATURE: Setting for excluding optimized images from backups 2019-02-13 11:10:51 +01:00
Gerhard Schlager 9eb7dea0f1 FEATURE: Setting for compression level of upload in backups 2019-02-12 15:50:31 +01:00
Gerhard Schlager 220944a38a FIX: Unpause sidekiq before adding uploads to backup
tar exits with status 1 when uploads are modified or deleted by a sidekiq job, so we need to treat it like status 0.

According to the documentation it should be safe to ignore status 1 ("Some files differ"):

> If tar was given `--create', `--append' or `--update' option, this exit code means that some files were changed while being archived and so the resulting archive does not contain the exact copy of the file set.

Status 2 ("Fatal error") still results in an exception.
2019-02-12 13:50:50 +01:00
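The exit-status rule from 220944a38a (status 0 and status 1 are acceptable, status 2 is fatal) can be sketched as follows. The constant and method names are hypothetical and not taken from the Discourse code base.

```ruby
# Exit codes as described in the commit message:
# 0 = success, 1 = "Some files differ", 2 = "Fatal error".
TOLERATED_TAR_STATUSES = [0, 1].freeze

# True when the archive can be considered usable.
def tar_ok?(exit_status)
  TOLERATED_TAR_STATUSES.include?(exit_status)
end

# Raises for anything other than the tolerated statuses.
def check_tar!(exit_status)
  raise "tar failed with status #{exit_status}" unless tar_ok?(exit_status)
  exit_status
end
```

In real use, the status would come from something like `$?.exitstatus` after running `tar`; the sketch takes a plain integer so it can be exercised without creating an archive.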
Gerhard Schlager 5bb955dcb7 FIX: Allow restore when latest migration is a post_migration 2019-02-08 17:37:05 +01:00
Gerhard Schlager bdbf77dc38 FIX: Unpause Sidekiq before uploading backup to S3
No need to pause Sidekiq longer than really needed. Uploads to S3 can take a long time.
2019-02-05 21:22:25 +01:00
Vinoth Kannan b4f713ca52 FEATURE: Use amazon s3 inventory to manage upload stats (#6867) 2019-02-01 10:10:48 +05:30
Gerhard Schlager 45b056b615 FIX: Do not show backups stored in subfolder of bucket 2019-01-24 22:28:03 +01:00
Gerhard Schlager c94a2bc69b FIX: Raise or log error when deleting of backup fails 2019-01-24 22:26:50 +01:00
Joffrey JAFFEUX f9648de897 DEV: upgrades from Ember 2.13 to Ember 3.5.1 (#6808)
Co-Authored-By: Bianca Nenciu <nbianca@users.noreply.github.com>
Co-Authored-By: David Taylor <david@taylorhq.com>
2019-01-10 11:06:01 +01:00
Gerhard Schlager 0bc1fa8aa4 FEATURE: Don't create PM for successful automatic backups 2018-12-20 13:34:24 +01:00
Gerhard Schlager 1a8ca68ea3 FEATURE: Improve backup stats on admin dashboard
* Dashboard doesn't timeout anymore when Amazon S3 is used for backups
* Storage stats are now a proper report with the same caching rules
* Changing the backup_location, s3_backup_bucket or creating and deleting backups removes the report from the cache
* It shows the number of backups and the backup location
* It shows the used space for the correct backup location instead of always showing used space on local storage
* It shows the date of the last backup as relative date
2018-12-17 11:35:11 +01:00
Gerhard Schlager 7e1f20b07f FIX: Create CORS rule on S3 only before a backup upload 2018-12-17 00:15:37 +01:00
Gerhard Schlager 99117d664c FEATURE: Multisite support for S3 backup store (#6700) 2018-12-05 10:10:39 +08:00
Sam 1b4f2029d7 FIX: clear theme cache when restoring
Previously, old themes could be cached incorrectly. This also forces
a rebake of old themes to ensure the version can compile cleanly.
2018-11-20 13:37:58 +11:00
Guo Xiang Tan 84d4c81a26 FEATURE: Support backup uploads/downloads directly to/from S3.
This reverts commit 3c59106bac.
2018-10-15 09:43:31 +08:00
Guo Xiang Tan 3c59106bac Revert "FEATURE: Support backup uploads/downloads directly to/from S3."
This reverts commit c29a4dddc1.

We're doing a beta bump soon so un-revert this after that is done.
2018-10-11 11:08:23 +08:00
Gerhard Schlager c29a4dddc1 FEATURE: Support backup uploads/downloads directly to/from S3. 2018-10-11 10:38:43 +08:00
Gerhard Schlager 469a2c36ed FIX: Always unpause Sidekiq after backup and restore
* Logs exceptions during the cleanup phase, but doesn't stop executing subsequent cleanup tasks.
* Notifies the user at the end of the cleanup phase, so that the log contains possible errors during that phase.
2018-09-19 20:35:43 +02:00
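The cleanup behaviour described in 469a2c36ed (log each failure, keep running the remaining tasks, notify at the end) can be sketched like this. The `run_cleanup` method, the step names, and the array used as a log are all hypothetical.

```ruby
# Runs every cleanup step even if earlier ones fail.
# `steps` is a list of [name, callable] pairs; `log` is any object supporting `<<`.
def run_cleanup(steps, log)
  steps.each do |name, step|
    begin
      step.call
    rescue StandardError => e
      # Record the failure but carry on with the next step.
      log << "#{name} failed: #{e.message}"
    end
  end
  # The final notification goes last, so the log already holds any errors.
  log << "cleanup finished"
  log
end
```

For example, passing a step that raises followed by a step that unpauses a job queue still runs the second step, and the log ends with the failure line followed by `"cleanup finished"`.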
Guo Xiang Tan 212ee15804 FIX: Create `BaseDropper` functions in a different schema.
https://meta.discourse.org/t/error-when-restore-db-backup/93145/25?u=tgxworld
2018-08-23 12:52:21 +08:00
Régis Hanol 8f1db615db FIX: don't break restore if function does not exist 2018-07-30 22:11:38 +02:00
Guo Xiang Tan 6740631fdb TEMPFIX: Fix broken restores. 2018-07-27 12:48:16 +08:00
Joffrey JAFFEUX 578c8e861b FIX: refreshes disk_space on backup create/destroy (#6169) 2018-07-25 08:26:30 -04:00
Arpit Jalan 7590128d38 fix typo 2018-07-04 12:01:15 +05:30
Guo Xiang Tan 0af159546a FIX: `BackupRestore::Backuper#remove_tar_leftovers` not cleaning up files.
Wildcard is sanitized when passed to `system()`.
2018-07-04 13:58:39 +08:00
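One way around the wildcard problem in 0af159546a is to expand the pattern in Ruby rather than relying on a shell. When `system` receives its arguments as separate strings, no shell is involved and `*` is passed through literally. The sketch below is an assumption about a possible approach, not the actual fix; `remove_leftovers` is a hypothetical name.

```ruby
require "fileutils"
require "tmpdir"

# Deletes every file in `dir` whose name starts with "<prefix>.".
# The glob is expanded by Ruby, so no shell is needed.
# Returns the number of matching files.
def remove_leftovers(dir, prefix)
  leftovers = Dir.glob(File.join(dir, "#{prefix}.*"))
  FileUtils.rm_f(leftovers)
  leftovers.size
end

# Usage example in a throwaway directory.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "backup.tar.1"))
  FileUtils.touch(File.join(dir, "notes.txt"))
  remove_leftovers(dir, "backup.tar") # removes backup.tar.1, keeps notes.txt
end
```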
Sam 5f64fd0a21 DEV: remove exec_sql and replace with mini_sql
Introduce new patterns for direct sql that are safe and fast.

MiniSql is not prone to the memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and a very convenient API:

- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder

See more at: https://github.com/discourse/mini_sql
2018-06-19 16:13:36 +10:00
Guo Xiang Tan 5da7c2a4ad FIX: Restorer wasn't rolling back if restore fails.
* This only applies to backup files taken with
  pg_dump 10.3+ and pg_dump 9.5.12+.
2018-04-06 09:43:32 +08:00
Guo Xiang Tan 142571bba0 Remove use of `rescue nil`.
* `rescue nil` is a really bad pattern to use in our code base.
  We should rescue errors that we expect the code to throw and
  not rescue everything because we're unsure of what errors the
  code would throw. This would reduce the amount of pain we face
  when debugging why something isn't working as expected. I've
  been bitten countless times by errors being swallowed as a
  result during debugging sessions.
2018-04-02 13:52:51 +08:00
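The point made in 142571bba0 can be illustrated with a small contrast: instead of `rescue nil`, rescue only the error that is expected. The `parse_metadata` method is a hypothetical example and not code from the commit.

```ruby
require "json"

# Rescues only the error we expect from malformed input.
# Anything else is not caught here and propagates to the caller,
# so it shows up during debugging instead of being silently swallowed.
def parse_metadata(text)
  JSON.parse(text)
rescue JSON::ParserError => e
  warn "metadata is not valid JSON: #{e.message}"
  nil
end
```

By comparison, `JSON.parse(text) rescue nil` would return `nil` for every `StandardError`, including ones caused by genuine bugs.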
Michael Brown 63a1e9b60a backup restorer: tidy pg_dump schema portability logic, add test 2018-03-20 10:32:39 +08:00
Guo Xiang Tan da8e15f954 FIX: Restorer was not extracting the patch version in dump file. 2018-03-16 11:09:56 +08:00
Michael Brown 90291318eb restorer: clarify logging 2018-03-15 12:14:08 -04:00
Guo Xiang Tan 5ef75c9c61 Improve grep pattern in restorer. 2018-03-09 15:48:12 +08:00
Guo Xiang Tan 766b41d9f4 Fix version check in restorer. 2018-03-09 15:01:10 +08:00
Guo Xiang Tan 8fd47314d9 FIX: Restore process for dump taken with `pg_dump` 10.3+.
* Since we can no longer restore into a different schema,
  we will move tables in the public schema into the backup schema
  first before restoring the dump file which goes into the public
  schema. The downside to this approach is that we will increase
  the downtime experienced during the restore process. Downtime
  would equal the duration of restoring the dump file.
2018-03-09 13:24:58 +08:00
Guo Xiang Tan a89f3160a5 Add new config to ensure backup/restore connects to PG directly.
* In `pg_dump` 10.3+ and 9.5.12+, the dump file includes
  a `SELECT pg_catalog.set_config('search_path', '', false)`
  which changes the state of the current connection. This is known
  to be problematic with Pgbouncer which reuses connections. As such,
  we'll always try to connect directly to PG during
  the backup/restore process.
2018-03-09 10:28:03 +08:00
Will Jordan a41446a502 single quote password in restore command
> Followup to #3283. Quotes passwords passed to shell for backup restore.
2018-03-01 12:08:35 -08:00
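Quoting a password before it reaches a shell, as in a41446a502, can be sketched with Ruby's standard `Shellwords` module. Note that this swaps in a different technique: `Shellwords.escape` uses backslash escaping rather than the single quotes the commit title mentions. The `pg_password_assignment` helper is hypothetical.

```ruby
require "shellwords"

# Builds a `PGPASSWORD=...` assignment that is safe to embed in a shell command line.
# Shellwords.escape backslash-escapes characters the shell would otherwise interpret.
def pg_password_assignment(password)
  "PGPASSWORD=#{Shellwords.escape(password)}"
end
```

Splitting the result with `Shellwords.split` yields a single token containing the original password, which is a quick way to check that spaces, `$`, and quotes survive intact.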
Sam 88a4ec5f1b FIX: stop forking regular backup jobs 2017-12-21 09:00:48 +11:00
Guo Xiang Tan 5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Leo McArdle d0b027d88d FEATURE: phase 1 of supporting multiple email addresses 2017-07-20 11:22:27 +09:00
Guo Xiang Tan b70d4da858 FIX: Only invite admins when automatic backup fails. 2017-06-15 14:04:22 +08:00
Guo Xiang Tan f6060bfbf6 Invite admins to automatic backups failure topic.
https://meta.discourse.org/t/if-automatic-backup-fails-there-should-be-a-warning/64461
2017-06-14 15:01:11 +09:00
Guo Xiang Tan 5ce8d7a8c5 Log all errors during clean up as well. 2017-06-14 11:03:50 +09:00
Jay Pfaffman 83110a1a81 FIX: allow tar to finish if files change during backup 2017-06-07 13:31:02 -07:00
Guo Xiang Tan a4deb0e47d Fix typo. 2017-03-24 20:59:34 +08:00
Guo Xiang Tan e7c972ac89 FIX: Don't use backticks that take in inputs. 2017-03-17 15:33:51 +08:00
Guo Xiang Tan b49bf889f6 SECURITY: Disallow symlinks when restoring uploads. 2017-03-17 14:27:01 +08:00
Guo Xiang Tan 7139538286 Fix typo. 2016-09-21 16:04:41 +08:00
Guo Xiang Tan 0bf7519a8a FIX: `tar --list` against a `.tar.gz` file takes too long.
This resulted in requests being blocked for an extended amount
of time when initializing the restorer.
2016-09-16 17:11:14 +08:00
Guo Xiang Tan 68637f2164 FIX: Uploads being restored into the wrong directory for multisite. 2016-09-16 14:26:06 +08:00
Guo Xiang Tan f63a797e39 SECURITY: Escape input made to system calls. 2016-09-16 11:58:14 +08:00
Guo Xiang Tan 8f36290c05 FIX: No need to list all the files. 2016-09-16 11:57:35 +08:00
Guo Xiang Tan 7e80810de1 FIX: Raise an error if metadata is not extracted correctly. 2016-08-25 17:20:32 +08:00
Guo Xiang Tan 3e4b02bbd4 FIX: Make sure constant reflects the right backup extension. 2016-08-24 10:28:23 +08:00
Guo Xiang Tan 8539f02b5e FIX: Backuper should return the full path. 2016-08-08 07:49:37 +08:00
Guo Xiang Tan adc8336949 Make sure we track restore/backlog success logs as well. 2016-08-03 16:23:47 +08:00
Guo Xiang Tan b860d1b254 FIX: Ensure uploads directory exists. 2016-08-03 16:23:47 +08:00
Guo Xiang Tan 0a942dbc73 FEATURE: Avoid creating an archive for database only backups. 2016-08-03 16:23:46 +08:00
Guo Xiang Tan 441b98579a FIX: Ensure that our restorer is backwards compatible. 2016-08-02 09:19:56 +08:00
Guo Xiang Tan 76e57ddef3 FIX: Log errors in `ensure` block of restorer. 2016-07-26 10:24:01 +08:00
Guo Xiang Tan 03aa13b2bb FEATURE: Work with compressed version of `pg_dump` during backup and restore. 2016-07-26 10:24:01 +08:00
Guo Xiang Tan 1adfa0a4b5 FEATURE: Add SiteSetting to disable readonly mode during backup. 2016-07-19 17:44:04 +08:00
Guo Xiang Tan b981041f6f Make sure we log failures in `ensure` block. 2016-07-15 11:36:47 +08:00
Neil Lalonde 91e4af0d3d FIX: restore of a backup from an older Discourse version can create new tables in the wrong schema, leading to UndefinedTable errors 2016-07-12 16:26:45 -04:00
Arpit Jalan 166d753bd3 FIX: delete PostgreSQL dump before gzipping archive (#4323) 2016-07-12 14:23:26 +02:00
Arpit Jalan ed53a24dbe FIX: backup was failing on large instances (#4319) 2016-07-11 08:36:20 +01:00