Commit Graph

174 Commits

Author SHA1 Message Date
Martin Brennan 72f139191e
FIX: S3 store has_been_uploaded? was not taking into account s3 bucket path (#9810)
In some cases, between Discourse forums the hostname of a URL could match if they are hosting S3 files on the same bucket but the S3 bucket path might not. So e.g. https://testbucket.somesite.com/testpath/some/file/url.png vs https://testbucket.somesite.com/prodpath/some/file/url.png. So has_been_uploaded? was returning true for the second URL, even though it may have been uploaded on a different Discourse forum.

This is a very rare case but must be accounted for, because this impacts UrlHelper.is_local which mistakenly thinks the file has already been downloaded and thus allows the URL to be cooked, where we want to return the full URL to be downloaded using PullHotlinkedImages.
2020-05-20 10:40:38 +10:00
Sam Saffron 0cbaa8d813
FEATURE: extend duration allowed for download
Previously we would raise a warning in the logs if downloading
a file (from s3) takes longer than 60 seconds.

At scale this happens reasonably frequently.

1. Raised the duration to 3 minutes

2. Pulled the resizing mutex out of the downloading mutex
so we have less and clearer error logs
2020-05-15 12:45:47 +10:00
Sam Saffron d0d5a138c3
DEV: stop freezing frozen strings
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.

For files with `# frozen_string_literal: true`

```
puts %w{a b}[0].frozen?
=> true

puts "hi".frozen?
=> true

puts "a #{1} b".frozen?
=> true

puts ("a " + "b").frozen?
=> false

puts (-("a " + "b")).frozen?
=> true
```

For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
2020-04-30 16:48:53 +10:00
Jarek Radosz c1c211365a
FIX: Improve clearing store cache (#9568)
1. Shorter
2. Simpler
3. Doesn't depend on external binaries
4. Doesn't fail on large amounts of files
5. Hopefully eliminates flaky spec errors
2020-04-28 17:24:04 +02:00
David Taylor ba616ffb50
DEV: Use a tmp directory for storing uploads in tests (#9554)
This avoids development-mode upload files from polluting the test environment
2020-04-28 14:03:04 +01:00
Gerhard Schlager c6b411f6c1 FIX: Restore to S3 didn't work without env variables
The `uplaods:migrate_to_s3` rake task should always use the environment variables, because you usually don't want to break your site's uploads during the migration. But restoring a backup should work with site settings as well as environment variables, otherwise you can't restore uploads to S3 from the web interface.
2020-04-19 20:24:40 +02:00
Gerhard Schlager baae0e7446 FIX: Infinite loop in migrate_to_s3 rake task 2020-04-19 20:24:40 +02:00
Gerhard Schlager 5bffb033df FIX: The migrate_to_s3 rake task couldn't find the AWS SDK 2020-03-26 16:41:10 +01:00
Gerhard Schlager 93b8b04b06 FIX: Migrating uploads to S3 could miss files
The rake task aborted the migration with "Already migrated" when all upload URLs linked to the correct S3 bucket even though the files didn't exist on S3. By removing the first check we force the rake task to check for the existance of uploads on S3.
2020-03-04 12:50:48 +01:00
Gerhard Schlager 0adab26e45 FIX: Don't count ignored, missing uploads in migration to S3 2020-02-12 16:18:52 +01:00
Jarek Radosz 63a4aa65ff
DEV: Ignore `ls` errors when clearing FileStore cache (#8780)
A race condition issue is possible when multiple thread/processes are calling this method.
`ls` prints out to stderr "cannot access '...': No such file or directory" if any of the files it's currently trying to list are being removed by the `xargs rm -rf` in an another process. That doesn't affect the result, but it did raise an error before this change.

Tested on a production instance where the original issue was observed.

Co-Authored-By: Régis Hanol <regis@hanol.fr>
2020-01-27 02:59:54 +01:00
Martin Brennan 7c32411881
FEATURE: Secure media allowing duplicated uploads with category-level privacy and post-based access rules (#8664)
### General Changes and Duplication

* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file. 
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.

### Viewing Secure Media

* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`

### Removed

We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.

* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
2020-01-16 13:50:27 +10:00
Gerhard Schlager e474cda321 REFACTOR: Restoring of backups and migration of uploads to S3 2020-01-14 11:41:35 +01:00
Vinoth Kannan 3b7f5db5ba
FIX: parallel spec system needs a dedicated upload folder for each worker. (#8547) 2019-12-18 11:21:57 +05:30
Vinoth Kannan d3e7768ea8 Revert "FIX: parallel spec system needs needs a dedicated upload folder for each worker. (#8372)"
This reverts commit 42e5176bc3.
2019-11-19 15:02:18 +05:30
Vinoth Kannan 42e5176bc3
FIX: parallel spec system needs needs a dedicated upload folder for each worker. (#8372) 2019-11-19 13:16:20 +05:30
Penar Musaraj 102909edb3 FEATURE: Add support for secure media (#7888)
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access. 

A few notes: 

- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
2019-11-18 11:25:42 +10:00
Penar Musaraj 067696df8f DEV: Apply Rubocop redundant return style 2019-11-14 15:10:51 -05:00
David Taylor 1998be3b27
DEV: Raise errors when cleaning the download cache, and fix for macOS (#8319)
POSIX's `head` specification states: "The application shall ensure that the number option-argument is a positive decimal integer"

Negative values are supported on GNU `head`, so this works in the discourse docker image. However, in some environments (e.g. macOS), the system `head` version fails with a negative `n` parameter.

This commit does two things:

Checks the status at each stage of the pipe, so it cannot fail silently
Flip the `ls` command to list in descending time order, and use `tail -n +501` instead of `head -n -500`.

The visible result is that macOS users no longer see head: illegal line count -- -500 printed throughout the test suite.
2019-11-08 15:34:03 +00:00
Daniel Waterworth 55a1394342 DEV: pluck_first
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:

pluck_first, equivalent to limit(1).pluck(*columns).first

and pluck_first! which, like other finder methods, raises an exception
when no record is found
2019-10-21 12:08:20 +01:00
Gerhard Schlager 24877a7b8c FIX: Correctly encode non-ASCII filenames in HTTP header
Backport of fix from Rails 6: 890485cfce
2019-08-07 19:10:50 +02:00
Rafael dos Santos Silva 606c0ed14d
FIX: S3 uploads were missing a cache-control header (#7902)
Admins still need to run the rake task to fix the files who where uploaded previously.
2019-08-06 14:55:17 -03:00
Gerhard Schlager f2dc59d61f FEATURE: Add hidden setting to include S3 uploads in backups 2019-07-09 14:04:16 +02:00
Penar Musaraj 03805e5a76
FIX: Ensure lightbox image download has correct content disposition in S3 (#7845) 2019-07-04 11:32:51 -04:00
Vinoth Kannan b7830680b6 DEV: use cdn url to download the external uploads to local. 2019-06-06 19:17:19 +05:30
Penar Musaraj f00275ded3 FEATURE: Support private attachments when using S3 storage (#7677)
* Support private uploads in S3
* Use localStore for local avatars
* Add job to update private upload ACL on S3
* Test multisite paths
* update ACL for private uploads in migrate_to_s3 task
2019-06-06 13:27:24 +10:00
Guo Xiang Tan a3938f98f8 Revert changes to `FileStore::S3Store#path_for` in f0620e7118.
There are some places in the code base that assumes the method should
return nil.
2019-05-29 18:39:07 +08:00
Guo Xiang Tan f0620e7118 FEATURE: Support `[description|attachment](upload://<short-sha>)` in MD take 2.
Previous attempt was missing `post_uploads` records.
2019-05-29 09:26:32 +08:00
Penar Musaraj 7c9fb95c15 Temporarily revert "FEATURE: Support `[description|attachment](upload://<short-sha>)` in MD. (#7603)"
This reverts commit b1d3c678ca.

We need to make sure post_upload records are correctly stored.
2019-05-28 16:37:01 -04:00
Guo Xiang Tan b1d3c678ca FEATURE: Support `[description|attachment](upload://<short-sha>)` in MD. (#7603) 2019-05-28 11:18:21 -04:00
David Taylor ef660d5a3e FIX: Return consistent character encodings when downloading S3 uploads
Net::HTTP always returns ASCII-8BIT encoding. File.read auto-detects the encoding. This leads to an encoding inconsistency between a fresh download, and a cached download. This commit ensures all downloaded files are treated equally, by always returning the cached version from the filesystem, even during initial download.

One symptom of this problem is during theme exports: https://meta.discourse.org/t/116907

Related ruby ticket: https://bugs.ruby-lang.org/issues/2567
2019-05-17 11:27:00 +01:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Guo Xiang Tan 5b934cb33d FIX: Error when trying to move the same file to tombstone.
If an optimized image is destroyed when a previous similar optimized
image is already placed in the tombstone, `FileUtils.move` will blow up.
2019-04-24 16:47:36 +08:00
Guo Xiang Tan 243fb8d9ad Fix the build. 2019-03-13 17:39:07 +08:00
Guo Xiang Tan b0c8fdd7da FIX: Properly support defaults for upload site settings. 2019-03-13 16:36:57 +08:00
Régis Hanol 664e90bd17 FIX: ensure local images use local CDN when uploads are stored on S3
When the S3 store was enabled, we were only applying the S3 CDN.
So all images stored locally, like the emojis, were never put on the local CDN.

Fixed a bunch of CookedPostProcessor test by adding a call to 'optimize_urls'
in order to get final URLs.

I also removed the unnecessary PrettyText.add_s3_cdn method since this is already
handled in the CookedPostProcessor.
2019-02-20 19:24:38 +01:00
Vinoth Kannan 563b953224 DEV: Add 'backfill_etags_' to the method name since it also backfilling the etags 2019-02-19 21:54:35 +05:30
Vinoth Kannan 0472bd4adc FIX: Remove 'backfill_etags' keyword argument from 'uploads:missing' rake task
And etags backfilling code is optimized
2019-02-15 00:34:35 +05:30
Vinoth Kannan 7b5931013a Update rake task to backfill etags from s3 inventory 2019-02-14 05:18:06 +05:30
Vinoth Kannan b4f713ca52
FEATURE: Use amazon s3 inventory to manage upload stats (#6867) 2019-02-01 10:10:48 +05:30
Vinoth Kannan f94c0283b2
FIX: Use correct version when generating file path for optimized image (#6871) 2019-01-11 18:35:38 +05:30
Vinoth Kannan 75dbb98cca FEATURE: Add S3 etag value to uploads table (#6795) 2019-01-04 14:16:22 +08:00
Régis Hanol 5381096bfd PERF: new 'migrate_to_s3' rake task 2018-12-26 17:34:49 +01:00
Rishabh cae5ba7356 FIX: Ensure that multisite s3 uploads are tombstoned correctly (#6769)
* FIX: Ensure that multisite uploads are tombstoned into the correct paths

* Move multisite specs to spec/multisite/s3_store_spec.rb
2018-12-19 13:32:32 +08:00
Rishabh 503ae1829f FIX: All multisite upload paths should start with /uploads/default/.. (#6707) 2018-12-03 12:04:14 +08:00
Rishabh 871d4543cc FIX: Use File.join for relative_base_url, fix spec 2018-11-29 09:49:56 +05:30
Rishabh 05a4f3fb51 FEATURE: Multisite support for S3 image stores (#6689)
* FEATURE: Multisite support for S3 image stores

* Use File.join to concatenate all paths & fix linting on multisite/s3_store_spec.rb
2018-11-29 12:11:48 +08:00
Vinoth Kannan bcdf5b2f47 DEV: improve missing uploads query and skip checking file size 2018-11-27 02:21:33 +05:30
Vinoth Kannan 4ccf9d28eb Remove trailing whitespaces 2018-11-27 01:15:29 +05:30
Vinoth Kannan fd272eee44 FEATURE: Make uploads:missing task compatible with s3 uploads 2018-11-27 00:54:51 +05:30
Guo Xiang Tan ce6a0a5e9e FIX: Moving upload to tombstone should update modification time.
A upload created a long time ago will be nuked from the tombstone
immediately if it gets deleted.
2018-09-18 10:48:29 +08:00
Guo Xiang Tan e1b16e445e Rename `FileHelper.is_image?` -> `FileHelper.is_supported_image?`. 2018-09-12 09:22:28 +08:00
Guo Xiang Tan 8496537590 Add `RECOVER_FROM_S3` to `uploads:list_posts_with_broken_images` rake task. 2018-09-10 15:14:30 +08:00
Sam 5d96809abd FIX: improve support for subfolder S3 CDN 2018-08-22 12:31:13 +10:00
Sam f5142861e5 Revert "Revert "FIX: upload URLs from S3 on subfolder installs""
This reverts commit 26c96e97e5.

We have no choice but to run this code
2018-08-22 11:31:33 +10:00
Sam 26c96e97e5 Revert "FIX: upload URLs from S3 on subfolder installs"
This reverts commit 357df2ff4f.
2018-08-22 10:51:40 +10:00
Neil Lalonde 357df2ff4f FIX: upload URLs from S3 on subfolder installs 2018-08-21 14:58:55 -04:00
Guo Xiang Tan aafff740d2 Add `FileStore::S3Store#copy_file`. 2018-08-08 11:30:34 +08:00
Andrew Schleifer dba22bbde2 rollback changes
This reverts:
* 1baba84c438e "fix s3 subfolders harder"
* ea5e57938edf "fix test for absolute_base_url change"
2018-07-06 17:16:40 -05:00
Andrew Schleifer 52e9f49ec1 fix s3 subfolders harder
specifically, include the folder in absolute_base_url
2018-07-06 16:28:40 -05:00
Régis Hanol 448e2fe1a2 FIX: properly delete files in the download cache 2018-07-04 18:18:39 +02:00
Andrew Schleifer 4be0e31459 fix s3_cdn_url when the s3 bucket contains a folder 2018-05-23 15:51:02 -05:00
Régis Hanol 5f4f617689 FIX: cache_file storage cleanup logic was wrong
https://meta.discourse.org/t/68296
2018-01-18 17:00:04 +01:00
Sam 70bb2aa426 FEATURE: allow specifying s3 config via globals
This refactors handling of s3 so it can be specified via GlobalSetting

This means that in a multisite environment you can configure s3 uploads
without actual sites knowing credentials in s3

It is a critical setting for situations where assets are mirrored to s3.
2017-10-06 16:20:01 +11:00
Guo Xiang Tan 8cc8010564 Maintain backwards compatibility before `Jobs::MigrateUploadExtensions` runs. 2017-08-03 11:56:55 +09:00
Neil Lalonde 83011045c8 fix rubocop offenses 2017-07-31 11:59:16 -04:00
Neil Lalonde 5d528f0d15 Merge pull request #4958 from dmacjam/search_posts_by_filetype
FEATURE: Search posts by filetype
2017-07-31 11:55:34 -04:00
Guo Xiang Tan 5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Neil Lalonde d8c27e3871 Merge branch 'master' into search_posts_by_filetype 2017-07-25 14:41:20 -04:00
Jakub Macina 677267ae78 Add onceoff job for uploads migration of column extension. Simplify filetype search and related rspec tests. 2017-07-12 17:19:27 +02:00
Jakub Macina 8c445e9f17 Fix backend code for searching by a filetype as a combination of uploads and topic links. Add rspec test for extracting file extension in upload. 2017-07-06 19:19:31 +02:00
Régis Hanol 5d63a7f4a6 FIX: pull hotlinked images even when they have no extension 2017-06-13 13:27:05 +02:00
Robin Ward cdbe027c1c Refactor `FileHelper` to use keyword arguments. 2017-05-24 13:54:26 -04:00
Guo Xiang Tan f534f041a0 FIX: Ensure directory exists. 2017-04-07 15:50:17 +08:00
Matt Palmer da7a44064b Fix purge_tombstone for the brave new world of secure command execution 2017-03-22 10:27:07 +11:00
Guo Xiang Tan e7c972ac89 FIX: Don't use backticks that take in inputs. 2017-03-17 15:33:51 +08:00
Régis Hanol 93dfc87b99 FIX: always set the 'content_type' when storing a file on S3 2016-10-17 19:16:29 +02:00
Guo Xiang Tan 3141c179f7 REFACTOR: Get bucket name from S3Helper. 2016-08-19 14:08:37 +08:00
Guo Xiang Tan 7ff1f6cb9d Allow custom bucket name for `FileStore::S3Store`. 2016-08-16 15:25:42 +08:00
Guo Xiang Tan 205be0d044 Remove unused require. 2016-08-15 21:58:55 +08:00
Guo Xiang Tan 0433163866 FEATURE: Support subfolders in `SiteSetting.s3_backup_bucket`. 2016-08-15 16:14:51 +08:00
Guo Xiang Tan aa5de3c40a FEATURE: Support subfolders in S3 bucket name.
This commit also fixes a bug where s3 uploads are not
moved to a tombstone folder when removed.
2016-08-15 13:07:41 +08:00
Guo Xiang Tan 3378ee223f FIX: Incorrect path being passed to `S3Store#remove_file`. 2016-08-15 11:35:30 +08:00
Guo Xiang Tan 1779a9634a FIX: Make sure we raise an error when method is not implemented. 2016-08-12 11:43:57 +08:00
Hu Ming f8a12d4940 Add support for AWS cn (#4327) 2016-07-14 16:56:09 +02:00
Régis Hanol 5169bcdb6e FIX: httpshttps ultra secure URLs 2016-06-30 16:55:01 +02:00
Régis Hanol 1caaf5208f move tombstone under 'uploads/' for easier deployment 2016-05-30 09:46:27 +02:00
Guo Xiang Tan 26084397c1
FIX: Check if file exists upfront. 2016-05-23 10:25:53 +08:00
Rafael dos Santos Silva c2e0eeb089 Separate relative base_url and upload_path
This makes easier to reason about paths
2016-03-10 00:47:18 -03:00
Rafael dos Santos Silva 4d01328bc2 FIX: Image Lightbox on Subfolder Install
Since LocalSotre.is_relative wasn't subfolder aware, images on subfolder install weren't getting lightboxed.
2016-03-02 18:44:15 -03:00
Robin Ward f155ff8270 FIX: Tests would fail if your test db's optimized image ids were high 2015-10-16 17:08:41 -04:00
Luke GB acc05dd3a5 Fix LocalStore.remove_file to not raise if source doesn't exist
FileUtils.move actually ends up raising an "unknown file type" error if the file doesn't exist instead of Errno::ENOENT.

It's possible to rescue this, but in the end it's easier to just ask move not to throw an error, since we're going to throw it away anyway.
2015-07-12 14:14:12 +01:00
Régis Hanol 81a699e2b0 better support for mixed content 2015-06-01 17:49:58 +02:00
Régis Hanol 56f077db69 FIX: optimized images fail if source is remote and S3 is disabled 2015-06-01 11:13:56 +02:00
Régis Hanol 5a143c0c6e storage engines refactor 2015-05-29 18:39:47 +02:00
Régis Hanol 8e7bfd0f29 FIX: automatically growing uploads tree 2015-05-28 01:03:24 +02:00
Régis Hanol 83d2b59fc3 FIX: s3 endpoint when using 'us-east-1' region 2015-05-27 17:50:49 +02:00
Sam a988cd5abe FIX: redirect to CDN avatar for s3 avatars 2015-05-27 12:02:57 +10:00
Régis Hanol a5d93c6705 FIX: undefined method 'max_file_size_kb' 2015-05-26 16:39:41 +02:00
Régis Hanol a797f7c664 FIX: properly handle images when using 's3_cdn_url' 2015-05-26 11:47:33 +02:00