Commit Graph

210 Commits

Author SHA1 Message Date
Martin Brennan 1a3b9a7352
DEV: Secure upload rake task improvements (#29484)
This commit changes the uploads:secure_upload_analyse_and_update
and uploads:disable_secure_uploads to no longer rebake affected
posts inline. This just took way too long, and if the task stalled
you couldn't be sure if the rest of it completed.

Instead, we can update the baked_version of affected posts and
utilize our PeriodicalUpdates job to gradually rebake them. I added
warnings about increasing the site setting rebake_old_posts_count and
the global setting max_old_rebakes_per_15_minutes before doing this
as well.

For good measure, the affected post IDs are written to a JSON file too.
2024-10-31 13:33:11 +10:00
Martin Brennan ffc99253fa
DEV: Resolve TODO comments for martin-brennan
I am changing many of these to notes or resolving them as is,
most of these I have not actively worked on in years so someone
else can work on them when we get to these areas again.
2024-07-01 15:32:30 +10:00
Alan Guo Xiang Tan dc55b645b2
DEV: Allow site administrators to mark S3 uploads with a missing status (#27222)
This commit introduces the following changes which allows a site
administrator to mark `Upload` records with the `s3_file_missing`
verification status which will result in the `Upload` record being ignored when
`Discourse.store.list_missing_uploads` is ran on a site where S3 uploads
are enabled and `SiteSetting.enable_s3_inventory` is set to `true`.

1. Introduce `s3_file_missing` to `Upload.verification_statuses`
2. Introduce `Upload.mark_invalid_s3_uploads_as_missing` which updates
   `Upload#verification_status` of all `Upload` records from `invalid_etag` to `s3_file_missing`.
3. Introduce `rake uploads:mark_invalid_s3_uploads_as_missing` Rake task
   which allows a site administrator to change `Upload` records with
`invalid_etag` verification status to the `s3_file_missing`
verificaton_status.
4. Update `S3Inventory` to ignore `Upload` records with the
   `s3_file_missing` verification status.
2024-05-30 08:37:38 +08:00
misaka4e21 ba357dd6cc FIX: ignore SVGs when regenerating missing optimized images.
When running `rake uploads:regenerate_missing_optimized`,
a `Discourse::InvalidAccess` will be raised if an SVG
file is being processed as `OptimizedImage.prepend_decoder!`
doesn't support the svg extension. This commit simply copies
the original SVG file as the thumbnail, just like currently
`OptimizedImage.create_for` does.
2024-05-07 14:39:04 +02:00
Martin Brennan 0d0dbd391a
DEV: Rename with_secure_uploads? to should_secure_uploads? on Post (#26549)
This method name is a bit confusing; with_secure_uploads implies
it may return a block or something with the uploads of the post,
and has_secure_uploads implies that it's checking whether the post
is linked to any secure uploads.

should_secure_uploads? communicates the true intent of this method --
which is to say whether uploads attached to this post should be
secure or not.
2024-04-09 13:23:11 +10:00
Sérgio Saquetim 0cfc42e0e6
FEATURE: Add dark mode option for category backgrounds (#24003)
Adds a new upload field for a dark mode category background that will be used as an alternative when Discourse is using a dark mode theme.
2023-10-20 12:48:06 +00:00
Martin Brennan c532f6eb3d
FEATURE: Secure uploads in PMs only (#23398)
This adds a new secure_uploads_pm_only site setting. When secure_uploads
is true with this setting, only uploads created in PMs will be marked
secure; no uploads in secure categories will be marked as secure, and
the login_required site setting has no bearing on upload security
either.

This is meant to be a stopgap solution to prevent secure uploads
in a single place (private messages) for sensitive admin data exports.
Ideally we would want a more comprehensive way of saying that certain
upload types get secured which is a hybrid/mixed mode secure uploads,
but for now this will do the trick.
2023-09-06 09:39:09 +10:00
Matt Palmer bd9c919e06
FIX: don't use etags for post-upload verification (#21923)
They don't work for server-side encryption with customer keys, and so instead we just use Content-MD5 to ensure there was no corruption in transit, which is the best we can do.

See also: https://meta.discourse.org/t/s3-uploads-incompatible-with-server-side-encryption/266853
2023-07-07 09:53:49 +02:00
Emmanuel Ferdman 722180edba
DEV: Typo in an output message of uploads.rake (#22209)
Signed-off-by: emmanuel-ferdman <emmanuelferdman@gmail.com>
2023-06-21 12:00:26 +08:00
Ted Johansson b837459e1d
DEV: Add both safe and unsafe Discourse.store.download methods (#21498)
* DEV: Add both safe and unsafe Discourse.store.download methods

* DEV: Update call sites that can use the safe store download method
2023-05-11 17:27:27 +08:00
Daniel Waterworth 666536cbd1
DEV: Prefer \A and \z over ^ and $ in regexes (#19936) 2023-01-20 12:52:49 -06:00
Martin Brennan 4d2a95ffe6
FIX: Query UploadReference in UploadSecurity for existing uploads (#19917)
This fixes a longstanding issue for sites with the
secure_uploads setting enabled. What would happen is a scenario
like this, since we did not check all places an upload could be
linked to whenever we used UploadSecurity to check whether an
upload should be secure:

* Upload is created and used for site setting, set to secure: false
  since site setting uploads should not be secure. Let's say favicon
* Favicon for the site is used inside a post in a private category,
  e.g. via a Onebox
* We changed the secure status for the upload to true, since it's been
  used in a private category and we don't check if it's originator
  was a public place
* The site favicon breaks :'(

This was a source of constant consternation. Now, when an upload is _not_
being created, and we are checking if an existing upload should be
secure, we now check to see what the first record in the UploadReference
table is for that upload. If it's something public like a site setting,
then we will never change the upload to `secure`.
2023-01-20 10:24:52 +10:00
David Taylor 6417173082
DEV: Apply syntax_tree formatting to `lib/*` 2023-01-09 12:10:19 +00:00
Jarek Radosz e54a3d5ea9
DEV: Add `START_ID` to `uploads:downsize` task (#18992) 2022-11-11 22:51:48 +01:00
Jarek Radosz dc8a7e74f4
FIX: Allow attr updates of over-size-limit uploads (#18986) 2022-11-11 17:56:11 +01:00
Jarek Radosz bc22fe4fdf
DEV: Convert the downsizing script to a rake task (#18976)
…to make it testable!
2022-11-11 13:00:44 +01:00
Daniel Waterworth 167181f4b7
DEV: Quote values when constructing SQL (#18827)
All of these cases should already be safe, but still good to quote for
"defense in depth".
2022-11-01 14:05:13 -05:00
Jan Cernik 08476f17ff
FEATURE: Add dark mode option for category logos (#18460)
Adds a new upload field for a second dark mode category logo. 
This alternative will be used when the browser is in dark mode (similar to the global site setting for a dark logo).
2022-10-07 11:00:44 -04:00
Martin Brennan 8ebd5edd1e
DEV: Rename secure_media to secure_uploads (#18376)
This commit renames all secure_media related settings to secure_uploads_* along with the associated functionality.

This is being done because "media" does not really cover it, we aren't just doing this for images and videos etc. but for all uploads in the site.

Additionally, in future we want to secure more types of uploads, and enable a kind of "mixed mode" where some uploads are secure and some are not, so keeping media in the name is just confusing.

This also keeps compatibility with the `secure-media-uploads` path, and changes new
secure URLs to be `secure-uploads`.

Deprecated settings:

* secure_media -> secure_uploads
* secure_media_allow_embed_images_in_emails -> secure_uploads_allow_embed_images_in_emails
* secure_media_max_email_embed_image_size_kb -> secure_uploads_max_email_embed_image_size_kb
2022-09-29 09:24:33 +10:00
tshenry d1a15d4f8d
FIX: Should be UploadReference instead of UploadReferences (#17361)
This fixes a couple of typos that were introduced in 9db8f00
2022-07-06 11:40:54 -07:00
Gerhard Schlager 96e87af605
FIX: Rake tasks related to uploads were broken (#17085) 2022-06-14 11:05:03 +02:00
Bianca Nenciu 9db8f00b3d
FEATURE: Create upload_references table (#16146)
This table holds associations between uploads and other models. This can be used to prevent removing uploads that are still in use.

* DEV: Create upload_references
* DEV: Use UploadReference instead of PostUpload
* DEV: Use UploadReference for SiteSetting
* DEV: Use UploadReference for Badge
* DEV: Use UploadReference for Category
* DEV: Use UploadReference for CustomEmoji
* DEV: Use UploadReference for Group
* DEV: Use UploadReference for ThemeField
* DEV: Use UploadReference for ThemeSetting
* DEV: Use UploadReference for User
* DEV: Use UploadReference for UserAvatar
* DEV: Use UploadReference for UserExport
* DEV: Use UploadReference for UserProfile
* DEV: Add method to extract uploads from raw text
* DEV: Use UploadReference for Draft
* DEV: Use UploadReference for ReviewableQueuedPost
* DEV: Use UploadReference for UserProfile's bio_raw
* DEV: Do not copy user uploads to upload references
* DEV: Copy post uploads again after deploy
* DEV: Use created_at and updated_at from uploads table
* FIX: Check if upload site setting is empty
* DEV: Copy user uploads to upload references
* DEV: Make upload extraction less strict
2022-06-09 09:24:30 +10:00
Martin Brennan faf5b4d3e9
PERF: Speed up secure media and ACL sync rake tasks (#16849)
Incorporates learnings from /t/64227:

* Changes the code to set access control posts in the rake
  task to be an efficient UPDATE SQL query.
  The original version was timing out with 312017 post uploads,
  the new query took ~3s to run.
* Changes the code to mark uploads as secure/not secure in
  the rake task to be an efficient UPDATE SQL query rather than
  using UploadSecurity. This took a very long time previously,
  and now takes only a few seconds.
* Spread out ACL syncing for uploads into jobs with batches of
  100 uploads at a time, so they can be parallelized instead
  of having to wait ~1.25 seconds for each ACL to be changed
  in S3 serially.

One issue that still remains is post rebaking. Doing this serially
is painfully slow. We have a way to do this in sidekiq via PeriodicalUpdates
but this is limited by max_old_rebakes_per_15_minutes. It would
be better to fan this rebaking out into jobs like we did for the
ACL sync, but that should be done in another PR.
2022-05-23 13:14:11 +10:00
Loïc Guitaut 008b700a3f DEV: Upgrade to Rails 7
This patch upgrades Rails to version 7.0.2.4.
2022-04-28 11:51:03 +02:00
Jarek Radosz 2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Gerhard Schlager 27f1630b01
DEV: Try to download missing uploads from origin URL (#15629) 2022-01-19 11:05:58 +01:00
Peter Zhu c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
Arpit Jalan dec7e19da3
FIX: fix error message for fix_missing_s3 rake task (#13661) 2021-07-07 19:59:03 +05:30
Arpit Jalan 236d6d91b2
FIX: `fix_missing_s3` task fails on failed upload (take 2) (#13660)
ref: 935aadbfdd
2021-07-07 18:53:43 +05:30
Arpit Jalan 935aadbfdd
FIX: do not stop `fix_missing_s3` task if saving an upload failed (#13658)
This commit logs an error and moves to next upload when saving a single
upload record fails when running `uploads:fix_missing_s3` task.
2021-07-07 16:57:24 +05:30
Gerhard Schlager 820068ddaf
FIX: `fix_missing_s3` rake task could fail due to missing upload (#13479) 2021-06-22 17:00:55 +02:00
Josh Soref 59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Arpit Jalan 626b8465ba
FIX: do not validate uploads when running `uploads:fix_missing_s3` task (#13096) 2021-05-20 15:29:09 +10:00
Arpit Jalan 130160537c
FEATURE: add support for "skip_validations" option in UploadCreator (#13094)
FIX: do not validate uploads when running `uploads:fix_missing_s3` task
2021-05-19 20:54:52 +05:30
Gerhard Schlager acc73f2989
FIX: `uploads:fix_missing_s3` rake task used wrong SHA1 (#12495) 2021-03-25 11:35:29 +01:00
Martin Brennan f49e3e5731
DEV: Add security_last_changed_at and security_last_changed_reason to uploads (#11860)
This PR adds security_last_changed_at and security_last_changed_reason to uploads. This has been done to make it easier to track down why an upload's secure column has changed and when. This necessitated a refactor of the UploadSecurity class to provide reasons why the upload security would have changed.

As well as this, a source is now provided from the location which called for the upload's security status to be updated as they are several (e.g. post creator, topic security updater, rake tasks, manual change).
2021-01-29 09:03:44 +10:00
Jarek Radosz 650e3d18c4
DEV: Add a note to S3 migration task (#11738)
A followup to #11703
2021-01-18 17:12:47 +01:00
Michael K Johnson 2a23e54632
FIX: remove migrate_from_s3 task that silently corrupts data (#11703)
Transient errors in migration are ignored, silently corrupting
data, and the migration is incomplete and misses many sources of
uploads, which will lead to an incorrect expectation of independence
from the remote object storage after announcing that the migration
was successful, regardles of whether transient errors permanently
corrupted the data.

Remove this migration until such time as it is re-written to
follow the same pattern has the migration to s3, moving the
core logic out of the task.
2021-01-17 22:33:29 +01:00
Neil Lalonde f1743ff69c
FIX: error "unknown attribute verified" in uploads rake tasks
The new column is verification_status.
verified: nil is now verification_status: unchecked
2020-09-21 10:47:52 -04:00
Martin Brennan 80268357e7
DEV: Change upload verified column to be integer (#10643)
Per review https://review.discourse.org/t/dev-add-verified-to-uploads-and-fill-in-s3-inventory-10406/14180

Change the verified column for Upload to a verified_status integer column, to avoid having NULL as a weird implicit status.
2020-09-17 13:35:29 +10:00
Sam Saffron 80e17f5f9e
DEV: Remove staff bypass on fix missing
After thinking about it, I worry that this will potentially leave a site
setting set when people hit ctrl-c ... feels a tiny bit risky, so leaving
it out.
2020-08-28 12:35:35 +10:00
Sam Saffron ed9323043f
DEV: Allow all uploads when fixing missing s3
If we do not do this we can not properly re-download files
2020-08-28 12:28:59 +10:00
Sam Saffron f7314f8ab4
DEV: Improve s3 upload image analysis
- Introduces uploads:delete_missing_s3 which can be used to "give up" and
delete broken records from the database

- Fixes a bug in fix_missing_s3 - crashing on deleted posts

- Adds more info to analyze_missing_s3
2020-08-27 11:50:07 +10:00
Sam Saffron b60c39fdf0
DEV: search more carefully for missing uploads
rake uploads:analyze_missing_s3 was not looking at all places, amend it so
it looks in all the places where uploads could exist.
2020-08-26 17:48:42 +10:00
Sam Saffron 2a7490149c
DEV: don't fail if in uploads:fix_missing_s3 when fix fails
Previously a single error on a file like invalid extension could fail the
entire rake task
2020-08-18 17:55:49 +10:00
Sam Saffron 24fe08230f
FEATURE: ensure posts are rebaked when missing is fixed
This ensures any corrupt optimized images are removed and re-created
2020-08-18 15:37:24 +10:00
Sam Saffron 5be26c7d24
DEV: add error handling in case download fails 2020-08-13 13:48:23 +10:00
Sam Saffron 5011435ec7
DEV: do not correct sha when correctly uploads 2020-08-13 11:52:57 +10:00
Sam Saffron f48fa30ecd
DEV: fix_missing_s3 attempts to re-download if missing
unverified uploads get re-downloaded and corrected if they exist in the task
2020-08-13 11:22:14 +10:00
David Taylor 451b9b245f
DEV: Rename new upload rake tasks
These tasks are s3-specific, so update the names to make that very clear
2020-08-12 23:26:13 +01:00