discourse

Commit Graph

Author	SHA1	Message	Date
Martin Brennan	b500949ef6	FEATURE: Initial implementation of direct S3 uploads with uppy and stubs (#13787) This adds a few different things to allow for direct S3 uploads using uppy. These changes are still not the default. There are hidden `enable_experimental_image_uploader` and `enable_direct_s3_uploads` settings that must be turned on for any of this code to be used, and even if they are turned on only the User Card Background for the user profile actually uses uppy-image-uploader. A new `ExternalUploadStub` model and database table is introduced in this pull request. This is used to keep track of uploads that are uploaded to a temporary location in S3 with the direct to S3 code, and they are eventually deleted a) when the direct upload is completed and b) after a certain time period of not being used. ### Starting a direct S3 upload When an S3 direct upload is initiated with uppy, we first request a presigned PUT URL from the new `generate-presigned-put` endpoint in `UploadsController`. This generates an S3 key in the `temp` folder inside the correct bucket path, along with any metadata from the clientside (e.g. the SHA1 checksum described below). This will also create an `ExternalUploadStub` and store the details of the temp object key and the file being uploaded. Once the clientside has this URL, uppy will upload the file direct to S3 using the presigned URL. Once the upload is complete we go to the next stage. ### Completing a direct S3 upload Once the upload to S3 is done we call the new `complete-external-upload` route with the unique identifier of the `ExternalUploadStub` created earlier. Only the user who made the stub can complete the external upload. One of two paths is followed via the `ExternalUploadManager`. 1. If the object in S3 is too large (currently 100mb defined by `ExternalUploadManager::DOWNLOAD_LIMIT`) we do not download and generate the SHA1 for that file. Instead we create the `Upload` record via `UploadCreator` and simply copy it to its final destination on S3 then delete the initial temp file. Several modifications to `UploadCreator` have been made to accommodate this. 2. If the object in S3 is small enough, we download it. When the temporary S3 file is downloaded, we compare the SHA1 checksum generated by the browser with the actual SHA1 checksum of the file generated by ruby. The browser SHA1 checksum is stored on the object in S3 with metadata, and is generated via the `UppyChecksum` plugin. Keep in mind that some browsers will not generate this due to compatibility or other issues. We then follow the normal `UploadCreator` path with one exception. To cut down on having to re-upload the file again, if there are no changes (such as resizing etc) to the file in `UploadCreator` we follow the same copy + delete temp path that we do for files that are too large. 3. Finally we return the serialized upload record back to the client. There are several errors that could happen that are handled by `UploadsController` as well. Also in this PR is some refactoring of `displayErrorForUpload` to handle both uppy and jquery file uploader errors.	2021-07-28 08:42:25 +10:00
Gerhard Schlager	157f10db4c	FEATURE: Use path from existing URL of uploads and optimized images (#13177) Discourse shouldn't dynamically calculate the path of uploads and optimized images after a file has been stored on disk or S3. Otherwise it might calculate the wrong path if the SHA1 or extension stored in the database doesn't match the actual file path.	2021-05-27 17:42:25 +02:00
Jarek Radosz	64ce12a758	FIX: `OptimizedImage#filesize` (#10095) `OptimizedImage#filesize` calls `Discourse.store.download` with an OptimizedImage as an argument. It would in turn attempt to call `#original_filename` and `#secure?` on that object. Both would fail as these methods do not exist on OptimizedImage, only on Upload. We didn't know about these issues because: 1. `#calculate_filesize` is not called often, because the filesize is saved on OptimizedImage creation, so it's used mostly for manual filesize recalculation 2. we were using `rescue nil` which swallows all errors	2020-07-06 17:01:29 +02:00
Jarek Radosz	3d55f2e3b7	FIX: Improvements and fixes to the image downsizing script (#9950) Fixed bugs, added specs, extracted the upload downsizing code to a class, added support for non-S3 setups, changed it so that images aren't downloaded twice. This code has been tested on production and successfully resized ~180k uploads. Includes: * DEV: Extract upload downsizing logic * DEV: Add support for non-S3 uploads * DEV: Process only images uploaded by users * FIX: Incorrect usage of `count` and `exist?` typo * DEV: Spec S3 image downsizing * DEV: Avoid downloading images twice * DEV: Update filesizes earlier in the process * DEV: Return false on invalid upload * FIX: Download images that are currently above the limit (If the image size limit is decreased, then there was no way to resize those images that now fall outside the allowed size range) * Update script/downsize_uploads.rb (Co-authored-by: Régis Hanol <regis@hanol.fr>)	2020-06-11 14:47:59 +02:00
Jarek Radosz	27ad562ff5	DEV: Rubocop fix	2020-06-01 06:07:07 +02:00
Jarek Radosz	7df688d108	FIX: Handle files removed between `glob` and `mtime`	2020-06-01 05:50:50 +02:00
Sam Saffron	0cbaa8d813	FEATURE: extend duration allowed for download Previously we would raise a warning in the logs if downloading a file (from s3) takes longer than 60 seconds. At scale this happens reasonably frequently. 1. Raised the duration to 3 minutes 2. Pulled the resizing mutex out of the downloading mutex so we have less and clearer error logs	2020-05-15 12:45:47 +10:00
Sam Saffron	d0d5a138c3	DEV: stop freezing frozen strings We have the `# frozen_string_literal: true` comment on all our files. This means all string literals are frozen. There is no need to call #freeze on any literals. For files with `# frozen_string_literal: true` ``` puts %w{a b}[0].frozen? => true puts "hi".frozen? => true puts "a #{1} b".frozen? => true puts ("a " + "b").frozen? => false puts (-("a " + "b")).frozen? => true ``` For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby	2020-04-30 16:48:53 +10:00
Jarek Radosz	c1c211365a	FIX: Improve clearing store cache (#9568) 1. Shorter 2. Simpler 3. Doesn't depend on external binaries 4. Doesn't fail on large amounts of files 5. Hopefully eliminates flaky spec errors	2020-04-28 17:24:04 +02:00
David Taylor	ba616ffb50	DEV: Use a tmp directory for storing uploads in tests (#9554) This prevents development-mode upload files from polluting the test environment	2020-04-28 14:03:04 +01:00
Jarek Radosz	63a4aa65ff	DEV: Ignore `ls` errors when clearing FileStore cache (#8780) A race condition issue is possible when multiple threads/processes are calling this method. `ls` prints out to stderr "cannot access '...': No such file or directory" if any of the files it's currently trying to list are being removed by the `xargs rm -rf` in another process. That doesn't affect the result, but it did raise an error before this change. Tested on a production instance where the original issue was observed. Co-Authored-By: Régis Hanol <regis@hanol.fr>	2020-01-27 02:59:54 +01:00
Martin Brennan	7c32411881	FEATURE: Secure media allowing duplicated uploads with category-level privacy and post-based access rules (#8664) ### General Changes and Duplication * We now consider a post `with_secure_media?` if it is in a read-restricted category. * When uploading we now set an upload's secure status straight away. * When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file. * Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is). * When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload. ### Viewing Secure Media * The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions * If there is no `access_control_post` then we just deliver the media. This should be a rare occurrence and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor` ### Removed We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled. * We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context. * We no longer have to set `secure: false` for uploads when uploading for a theme component.	2020-01-16 13:50:27 +10:00
Vinoth Kannan	3b7f5db5ba	FIX: parallel spec system needs a dedicated upload folder for each worker. (#8547)	2019-12-18 11:21:57 +05:30
Vinoth Kannan	d3e7768ea8	Revert "FIX: parallel spec system needs needs a dedicated upload folder for each worker. (#8372)" This reverts commit `42e5176bc3`.	2019-11-19 15:02:18 +05:30
Vinoth Kannan	42e5176bc3	FIX: parallel spec system needs needs a dedicated upload folder for each worker. (#8372)	2019-11-19 13:16:20 +05:30
Penar Musaraj	102909edb3	FEATURE: Add support for secure media (#7888) This PR introduces a new secure media setting. When enabled, it prevents unauthorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required` is disabled, only media in private messages will be protected from unauthorized access. A few notes: - the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads - the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured - upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status - when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error - when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3	2019-11-18 11:25:42 +10:00
David Taylor	1998be3b27	DEV: Raise errors when cleaning the download cache, and fix for macOS (#8319) POSIX's `head` specification states: "The application shall ensure that the number option-argument is a positive decimal integer" Negative values are supported on GNU `head`, so this works in the discourse docker image. However, in some environments (e.g. macOS), the system `head` version fails with a negative `n` parameter. This commit does two things: 1. Checks the status at each stage of the pipe, so it cannot fail silently 2. Flips the `ls` command to list in descending time order, and uses `tail -n +501` instead of `head -n -500`. The visible result is that macOS users no longer see `head: illegal line count -- -500` printed throughout the test suite.	2019-11-08 15:34:03 +00:00
Vinoth Kannan	b7830680b6	DEV: use cdn url to download the external uploads to local.	2019-06-06 19:17:19 +05:30
David Taylor	ef660d5a3e	FIX: Return consistent character encodings when downloading S3 uploads Net::HTTP always returns ASCII-8BIT encoding. File.read auto-detects the encoding. This leads to an encoding inconsistency between a fresh download, and a cached download. This commit ensures all downloaded files are treated equally, by always returning the cached version from the filesystem, even during initial download. One symptom of this problem is during theme exports: https://meta.discourse.org/t/116907 Related ruby ticket: https://bugs.ruby-lang.org/issues/2567	2019-05-17 11:27:00 +01:00
Sam Saffron	30990006a9	DEV: enable frozen string literal on all files This reduces chances of errors where consumers of strings mutate inputs and reduces memory usage of the app. Test suite passes now, but there may be some stuff left, so we will run a few sites on a branch prior to merging	2019-05-13 09:31:32 +08:00
Guo Xiang Tan	b0c8fdd7da	FIX: Properly support defaults for upload site settings.	2019-03-13 16:36:57 +08:00
Vinoth Kannan	f94c0283b2	FIX: Use correct version when generating file path for optimized image (#6871)	2019-01-11 18:35:38 +05:30
Régis Hanol	5381096bfd	PERF: new 'migrate_to_s3' rake task	2018-12-26 17:34:49 +01:00
Rishabh	05a4f3fb51	FEATURE: Multisite support for S3 image stores (#6689) * FEATURE: Multisite support for S3 image stores * Use File.join to concatenate all paths & fix linting on multisite/s3_store_spec.rb	2018-11-29 12:11:48 +08:00
Vinoth Kannan	fd272eee44	FEATURE: Make uploads:missing task compatible with s3 uploads	2018-11-27 00:54:51 +05:30
Régis Hanol	448e2fe1a2	FIX: properly delete files in the download cache	2018-07-04 18:18:39 +02:00
Régis Hanol	5f4f617689	FIX: cache_file storage cleanup logic was wrong https://meta.discourse.org/t/68296	2018-01-18 17:00:04 +01:00
Guo Xiang Tan	8cc8010564	Maintain backwards compatibility before `Jobs::MigrateUploadExtensions` runs.	2017-08-03 11:56:55 +09:00
Neil Lalonde	83011045c8	fix rubocop offenses	2017-07-31 11:59:16 -04:00
Jakub Macina	677267ae78	Add onceoff job for uploads migration of column extension. Simplify filetype search and related rspec tests.	2017-07-12 17:19:27 +02:00
Jakub Macina	8c445e9f17	Fix backend code for searching by a filetype as a combination of uploads and topic links. Add rspec test for extracting file extension in upload.	2017-07-06 19:19:31 +02:00
Robin Ward	cdbe027c1c	Refactor `FileHelper` to use keyword arguments.	2017-05-24 13:54:26 -04:00
Guo Xiang Tan	3378ee223f	FIX: Incorrect path being passed to `S3Store#remove_file`.	2016-08-15 11:35:30 +08:00
Guo Xiang Tan	1779a9634a	FIX: Make sure we raise an error when method is not implemented.	2016-08-12 11:43:57 +08:00
Régis Hanol	5169bcdb6e	FIX: httpshttps ultra secure URLs	2016-06-30 16:55:01 +02:00
Robin Ward	f155ff8270	FIX: Tests would fail if your test db's optimized image ids were high	2015-10-16 17:08:41 -04:00
Régis Hanol	81a699e2b0	better support for mixed content	2015-06-01 17:49:58 +02:00
Régis Hanol	56f077db69	FIX: optimized images fail if source is remote and S3 is disabled	2015-06-01 11:13:56 +02:00
Régis Hanol	5a143c0c6e	storage engines refactor	2015-05-29 18:39:47 +02:00
Régis Hanol	8e7bfd0f29	FIX: automatically growing uploads tree	2015-05-28 01:03:24 +02:00
Sam	a988cd5abe	FIX: redirect to CDN avatar for s3 avatars	2015-05-27 12:02:57 +10:00
Régis Hanol	31e9cafe0e	FEATURE: use original filename when clicking the download link in the lightbox	2014-10-15 19:20:04 +02:00
Régis Hanol	652cc3efba	FEATURE: new rake task to clean up uploads & thumbnails	2014-09-29 18:31:53 +02:00
Régis Hanol	542d54e6bf	BUGFIX: uploads to S3	2014-04-15 13:04:14 +02:00
Régis Hanol	e732aa8a86	BUGFIX: we should not store absolute urls for locally uploaded avatar templates Highly recommended to run: `RAILS_ENV=production bundle exec rake avatars:regenerate` to fix the avatar templates stored in the database.	2014-01-07 17:45:06 +01:00
Régis Hanol	52160179f8	add a tombstone for extra safety	2013-11-27 22:05:11 +01:00
Régis Hanol	37fd7ab574	pull hotlinked images	2013-11-05 19:07:29 +01:00
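
Several commits above (1998be3b27, c1c211365a, 7df688d108) deal with trimming the download cache: keep the most recent files, delete the rest, avoid external binaries, and tolerate files that vanish between `glob` and `mtime`. The idea can be sketched in plain Ruby as below. This is only an illustration and not the code from those commits: the helper name `purge_old_cache_files` is hypothetical, and the default of 500 is taken from the `tail -n +501` figure quoted in 1998be3b27.

```ruby
require "fileutils"

# Hypothetical helper (not from the Discourse codebase): keep the `keep`
# most recently modified entries in `dir` and delete the rest.
def purge_old_cache_files(dir, keep: 500)
  entries = Dir.glob(File.join(dir, "*")).filter_map do |path|
    begin
      [path, File.mtime(path)]
    rescue Errno::ENOENT
      # Another process removed the file between glob and mtime; skip it.
      nil
    end
  end

  # Sort newest first; everything past the first `keep` entries is stale.
  stale = entries.sort_by { |_path, mtime| mtime }.reverse.drop(keep).map(&:first)

  # rm_f does not raise if a file has already disappeared.
  stale.each { |path| FileUtils.rm_f(path) }
  stale
end
```

Calling `purge_old_cache_files("/some/cache/dir", keep: 500)` returns the list of paths it tried to remove. `filter_map` requires Ruby 2.7 or later.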

47 Commits
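
Commit b500949ef6 describes comparing the SHA1 checksum generated by the browser with the SHA1 that Ruby computes for the downloaded temporary file. A minimal sketch of that comparison follows, using only Ruby's standard `Digest` library. The helper name `checksum_matches?` is hypothetical, and treating a missing client checksum as a mismatch is an assumption of this sketch; the commit message notes that some browsers do not generate a checksum but does not say how that case is handled.

```ruby
require "digest"

# Hypothetical helper (not from the Discourse codebase): returns true when
# the client-supplied SHA1 matches the SHA1 of the file at `path`.
def checksum_matches?(path, client_sha1)
  # Assumption of this sketch: no checksum from the client counts as a mismatch.
  return false if client_sha1.nil? || client_sha1.empty?

  # Digest::SHA1.file streams the file rather than loading it all into memory.
  Digest::SHA1.file(path).hexdigest == client_sha1.downcase
end
```

The hex digest is 40 characters long, the same length as the `SecureRandom.hex(20)` placeholder mentioned in 7c32411881.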