Commit Graph

203 Commits

Author SHA1 Message Date
Matt Palmer bd9c919e06
FIX: don't use etags for post-upload verification (#21923)
They don't work for server-side encryption with customer keys, and so instead we just use Content-MD5 to ensure there was no corruption in transit, which is the best we can do.

See also: https://meta.discourse.org/t/s3-uploads-incompatible-with-server-side-encryption/266853
2023-07-07 09:53:49 +02:00
Emmanuel Ferdman 722180edba
DEV: Typo in an output message of uploads.rake (#22209)
Signed-off-by: emmanuel-ferdman <emmanuelferdman@gmail.com>
2023-06-21 12:00:26 +08:00
Ted Johansson b837459e1d
DEV: Add both safe and unsafe Discourse.store.download methods (#21498)
* DEV: Add both safe and unsafe Discourse.store.download methods

* DEV: Update call sites that can use the safe store download method
2023-05-11 17:27:27 +08:00
Daniel Waterworth 666536cbd1
DEV: Prefer \A and \z over ^ and $ in regexes (#19936) 2023-01-20 12:52:49 -06:00
Martin Brennan 4d2a95ffe6
FIX: Query UploadReference in UploadSecurity for existing uploads (#19917)
This fixes a longstanding issue for sites with the
secure_uploads setting enabled. What would happen is a scenario
like this, since we did not check all places an upload could be
linked to whenever we used UploadSecurity to check whether an
upload should be secure:

* Upload is created and used for site setting, set to secure: false
  since site setting uploads should not be secure. Let's say favicon
* Favicon for the site is used inside a post in a private category,
  e.g. via a Onebox
* We changed the secure status for the upload to true, since it's been
  used in a private category and we don't check if it's originator
  was a public place
* The site favicon breaks :'(

This was a source of constant consternation. Now, when an upload is _not_
being created, and we are checking if an existing upload should be
secure, we now check to see what the first record in the UploadReference
table is for that upload. If it's something public like a site setting,
then we will never change the upload to `secure`.
2023-01-20 10:24:52 +10:00
David Taylor 6417173082
DEV: Apply syntax_tree formatting to `lib/*` 2023-01-09 12:10:19 +00:00
Jarek Radosz e54a3d5ea9
DEV: Add `START_ID` to `uploads:downsize` task (#18992) 2022-11-11 22:51:48 +01:00
Jarek Radosz dc8a7e74f4
FIX: Allow attr updates of over-size-limit uploads (#18986) 2022-11-11 17:56:11 +01:00
Jarek Radosz bc22fe4fdf
DEV: Convert the downsizing script to a rake task (#18976)
…to make it testable!
2022-11-11 13:00:44 +01:00
Daniel Waterworth 167181f4b7
DEV: Quote values when constructing SQL (#18827)
All of these cases should already be safe, but still good to quote for
"defense in depth".
2022-11-01 14:05:13 -05:00
Jan Cernik 08476f17ff
FEATURE: Add dark mode option for category logos (#18460)
Adds a new upload field for a second dark mode category logo. 
This alternative will be used when the browser is in dark mode (similar to the global site setting for a dark logo).
2022-10-07 11:00:44 -04:00
Martin Brennan 8ebd5edd1e
DEV: Rename secure_media to secure_uploads (#18376)
This commit renames all secure_media related settings to secure_uploads_* along with the associated functionality.

This is being done because "media" does not really cover it, we aren't just doing this for images and videos etc. but for all uploads in the site.

Additionally, in future we want to secure more types of uploads, and enable a kind of "mixed mode" where some uploads are secure and some are not, so keeping media in the name is just confusing.

This also keeps compatibility with the `secure-media-uploads` path, and changes new
secure URLs to be `secure-uploads`.

Deprecated settings:

* secure_media -> secure_uploads
* secure_media_allow_embed_images_in_emails -> secure_uploads_allow_embed_images_in_emails
* secure_media_max_email_embed_image_size_kb -> secure_uploads_max_email_embed_image_size_kb
2022-09-29 09:24:33 +10:00
tshenry d1a15d4f8d
FIX: Should be UploadReference instead of UploadReferences (#17361)
This fixes a couple of typos that were introduced in 9db8f00
2022-07-06 11:40:54 -07:00
Gerhard Schlager 96e87af605
FIX: Rake tasks related to uploads were broken (#17085) 2022-06-14 11:05:03 +02:00
Bianca Nenciu 9db8f00b3d
FEATURE: Create upload_references table (#16146)
This table holds associations between uploads and other models. This can be used to prevent removing uploads that are still in use.

* DEV: Create upload_references
* DEV: Use UploadReference instead of PostUpload
* DEV: Use UploadReference for SiteSetting
* DEV: Use UploadReference for Badge
* DEV: Use UploadReference for Category
* DEV: Use UploadReference for CustomEmoji
* DEV: Use UploadReference for Group
* DEV: Use UploadReference for ThemeField
* DEV: Use UploadReference for ThemeSetting
* DEV: Use UploadReference for User
* DEV: Use UploadReference for UserAvatar
* DEV: Use UploadReference for UserExport
* DEV: Use UploadReference for UserProfile
* DEV: Add method to extract uploads from raw text
* DEV: Use UploadReference for Draft
* DEV: Use UploadReference for ReviewableQueuedPost
* DEV: Use UploadReference for UserProfile's bio_raw
* DEV: Do not copy user uploads to upload references
* DEV: Copy post uploads again after deploy
* DEV: Use created_at and updated_at from uploads table
* FIX: Check if upload site setting is empty
* DEV: Copy user uploads to upload references
* DEV: Make upload extraction less strict
2022-06-09 09:24:30 +10:00
Martin Brennan faf5b4d3e9
PERF: Speed up secure media and ACL sync rake tasks (#16849)
Incorporates learnings from /t/64227:

* Changes the code to set access control posts in the rake
  task to be an efficient UPDATE SQL query.
  The original version was timing out with 312017 post uploads,
  the new query took ~3s to run.
* Changes the code to mark uploads as secure/not secure in
  the rake task to be an efficient UPDATE SQL query rather than
  using UploadSecurity. This took a very long time previously,
  and now takes only a few seconds.
* Spread out ACL syncing for uploads into jobs with batches of
  100 uploads at a time, so they can be parallelized instead
  of having to wait ~1.25 seconds for each ACL to be changed
  in S3 serially.

One issue that still remains is post rebaking. Doing this serially
is painfully slow. We have a way to do this in sidekiq via PeriodicalUpdates
but this is limited by max_old_rebakes_per_15_minutes. It would
be better to fan this rebaking out into jobs like we did for the
ACL sync, but that should be done in another PR.
2022-05-23 13:14:11 +10:00
Loïc Guitaut 008b700a3f DEV: Upgrade to Rails 7
This patch upgrades Rails to version 7.0.2.4.
2022-04-28 11:51:03 +02:00
Jarek Radosz 2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Gerhard Schlager 27f1630b01
DEV: Try to download missing uploads from origin URL (#15629) 2022-01-19 11:05:58 +01:00
Peter Zhu c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
Arpit Jalan dec7e19da3
FIX: fix error message for fix_missing_s3 rake task (#13661) 2021-07-07 19:59:03 +05:30
Arpit Jalan 236d6d91b2
FIX: `fix_missing_s3` task fails on failed upload (take 2) (#13660)
ref: 935aadbfdd
2021-07-07 18:53:43 +05:30
Arpit Jalan 935aadbfdd
FIX: do not stop `fix_missing_s3` task if saving an upload failed (#13658)
This commit logs an error and moves to next upload when saving a single
upload record fails when running `uploads:fix_missing_s3` task.
2021-07-07 16:57:24 +05:30
Gerhard Schlager 820068ddaf
FIX: `fix_missing_s3` rake task could fail due to missing upload (#13479) 2021-06-22 17:00:55 +02:00
Josh Soref 59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Arpit Jalan 626b8465ba
FIX: do not validate uploads when running `uploads:fix_missing_s3` task (#13096) 2021-05-20 15:29:09 +10:00
Arpit Jalan 130160537c
FEATURE: add support for "skip_validations" option in UploadCreator (#13094)
FIX: do not validate uploads when running `uploads:fix_missing_s3` task
2021-05-19 20:54:52 +05:30
Gerhard Schlager acc73f2989
FIX: `uploads:fix_missing_s3` rake task used wrong SHA1 (#12495) 2021-03-25 11:35:29 +01:00
Martin Brennan f49e3e5731
DEV: Add security_last_changed_at and security_last_changed_reason to uploads (#11860)
This PR adds security_last_changed_at and security_last_changed_reason to uploads. This has been done to make it easier to track down why an upload's secure column has changed and when. This necessitated a refactor of the UploadSecurity class to provide reasons why the upload security would have changed.

As well as this, a source is now provided from the location which called for the upload's security status to be updated as they are several (e.g. post creator, topic security updater, rake tasks, manual change).
2021-01-29 09:03:44 +10:00
Jarek Radosz 650e3d18c4
DEV: Add a note to S3 migration task (#11738)
A followup to #11703
2021-01-18 17:12:47 +01:00
Michael K Johnson 2a23e54632
FIX: remove migrate_from_s3 task that silently corrupts data (#11703)
Transient errors in migration are ignored, silently corrupting
data, and the migration is incomplete and misses many sources of
uploads, which will lead to an incorrect expectation of independence
from the remote object storage after announcing that the migration
was successful, regardles of whether transient errors permanently
corrupted the data.

Remove this migration until such time as it is re-written to
follow the same pattern has the migration to s3, moving the
core logic out of the task.
2021-01-17 22:33:29 +01:00
Neil Lalonde f1743ff69c
FIX: error "unknown attribute verified" in uploads rake tasks
The new column is verification_status.
verified: nil is now verification_status: unchecked
2020-09-21 10:47:52 -04:00
Martin Brennan 80268357e7
DEV: Change upload verified column to be integer (#10643)
Per review https://review.discourse.org/t/dev-add-verified-to-uploads-and-fill-in-s3-inventory-10406/14180

Change the verified column for Upload to a verified_status integer column, to avoid having NULL as a weird implicit status.
2020-09-17 13:35:29 +10:00
Sam Saffron 80e17f5f9e
DEV: Remove staff bypass on fix missing
After thinking about it, I worry that this will potentially leave a site
setting set when people hit ctrl-c ... feels a tiny bit risky, so leaving
it out.
2020-08-28 12:35:35 +10:00
Sam Saffron ed9323043f
DEV: Allow all uploads when fixing missing s3
If we do not do this we can not properly re-download files
2020-08-28 12:28:59 +10:00
Sam Saffron f7314f8ab4
DEV: Improve s3 upload image analysis
- Introduces uploads:delete_missing_s3 which can be used to "give up" and
delete broken records from the database

- Fixes a bug in fix_missing_s3 - crashing on deleted posts

- Adds more info to analyze_missing_s3
2020-08-27 11:50:07 +10:00
Sam Saffron b60c39fdf0
DEV: search more carefully for missing uploads
rake uploads:analyze_missing_s3 was not looking at all places, amend it so
it looks in all the places where uploads could exist.
2020-08-26 17:48:42 +10:00
Sam Saffron 2a7490149c
DEV: don't fail if in uploads:fix_missing_s3 when fix fails
Previously a single error on a file like invalid extension could fail the
entire rake task
2020-08-18 17:55:49 +10:00
Sam Saffron 24fe08230f
FEATURE: ensure posts are rebaked when missing is fixed
This ensures any corrupt optimized images are removed and re-created
2020-08-18 15:37:24 +10:00
Sam Saffron 5be26c7d24
DEV: add error handling in case download fails 2020-08-13 13:48:23 +10:00
Sam Saffron 5011435ec7
DEV: do not correct sha when correctly uploads 2020-08-13 11:52:57 +10:00
Sam Saffron f48fa30ecd
DEV: fix_missing_s3 attempts to re-download if missing
unverified uploads get re-downloaded and corrected if they exist in the task
2020-08-13 11:22:14 +10:00
David Taylor 451b9b245f
DEV: Rename new upload rake tasks
These tasks are s3-specific, so update the names to make that very clear
2020-08-12 23:26:13 +01:00
Sam Saffron ec173a72d9
DEV: add more counts to analyze missing 2020-08-12 17:28:58 +10:00
Sam Saffron 97fe68e980
DEV: look for avatars when analyzing missing uploads 2020-08-12 14:04:21 +10:00
Sam Saffron 990cd8c2e4
FEATURE: introduce tasks for dealing with legacy broken uploads
`rake uploads:analyze_missing` can be used get rich information regarding
uploads missing from s3 (where verified is false)

`rake uploads:fix_missing` is a work in progress task for automatically
correcting certain historic issues. At the moment it simply rebakes all
posts with missing uploads, but it will improve over time
2020-08-12 13:33:04 +10:00
Michael K Johnson 81e6bc7a0f
FEATURE: Add uploads:batch_migrate_from_s3 task to limit total posts migrated at once (#9933)
Allow limiting the number of migrations to do at once, both to do migrations that
have impact limited to multiple off-peak usage hours to reduce user impact from
a migration, and to allow tests that do only a very small number for test
purposes. ("Give me a ping, Vasili. One ping only, please.")
2020-06-04 09:48:11 +10:00
Martin Brennan 6f978bc95c
FIX: First pass to improve efficiency of secure uploads rake task (#9284)
Get rid of harmful each loop over uploads to update. Instead we put all the unique access control posts for the uploads into a map for fast access (vs using the slow .find through array) and look up the post when it is needed when looping through the uploads in batches.

On a Discourse instance with ~93k uploads, a simplified version of the old method takes > 1 minute, and a simplified version of the new method takes ~18s and uses a lot less memory.
2020-03-26 15:59:57 +10:00
Martin Brennan efd5fb665b
DEV: Fix flaky time sensitive uploads.rake specs (#9283)
Also fix issues in spec where certain uploads were not considered secure
2020-03-26 13:31:39 +10:00
Martin Brennan 0388653a4d
DEV: Upload and secure media retroactive rake task improvements (#9027)
* Add uploads:sync_s3_acls rake task to ensure the ACLs in S3 are the correct (public-read or private) setting based on upload security

* Improved uploads:disable_secure_media to be more efficient and provide better messages to the user.

* Rename uploads:ensure_correct_acl task to uploads:secure_upload_analyse_and_update as it does more than check the ACL

* Many improvements to uploads:secure_upload_analyse_and_update

* Make sure that upload.access_control_post is unscoped so deleted posts are still fetched, because they still affect the security of the upload.

* Add escape hatch for capture_stdout in the form of RAILS_ENABLE_TEST_STDOUT. If provided the capture_stdout code will be ignored, so you can see the output if you need.
2020-03-03 10:03:58 +11:00