Gerhard Schlager
a597ef7131
PERF: Speed up S3 inventory updates (#19110)
...
The UPDATE statement could lock the `uploads` table for a very long time
when the `verification_status` of lots of uploads changed. Splitting up
and simplifying the UPDATE solves that problem.
Also, this change ensures that only the needed data from the inventory
gets inserted into the `TEMP TABLE`. For example, there's no need to
have records for optimized images in that table when the `uploads` table
gets updated.
2022-11-20 21:52:30 +01:00
Peter Zhu
c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
...
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
Sam
e0c952290b
FIX: increase inventory lag for s3 to 2 days (#11606)
...
Inventory on S3 has always lagged. Over the past few weeks we have noticed that
1 day of lag is not enough.
We are increasing this to 2 days to ensure that we do not get false positive
reports.
2020-12-30 16:05:42 +11:00
Penar Musaraj
9f6c4ad71a
FIX: inconsistency in S3 inventory config (#11112)
...
Ensures it matches S3 inventory config generation in our hosting.
2020-11-05 08:39:40 -05:00
Martin Brennan
80268357e7
DEV: Change upload verified column to be integer (#10643)
...
Per review https://review.discourse.org/t/dev-add-verified-to-uploads-and-fill-in-s3-inventory-10406/14180
Change the verified column for Upload to a verified_status integer column, to avoid having NULL as a weird implicit status.
2020-09-17 13:35:29 +10:00
Penar Musaraj
06b4ca5dc7
FIX: Mark only uploads as verified/unverified in S3 inventory
2020-09-14 10:21:34 -04:00
David Taylor
bd0a7553c4
DEV: Detect when s3 inventory failure is caused by etag difference (#10427)
2020-08-13 09:30:28 +10:00
Martin Brennan
b950b3fb3f
DEV: Add verified to uploads and fill in S3 inventory (#10406)
...
When we run the S3 inventory, mark uploads that exist as verified true, those that don't as verified false, and uploads not included in the check / not yet checked as verified nil.
2020-08-11 14:43:51 +10:00
David Taylor
16c65a94f7
PERF: Preload S3 inventory data for multisite clusters
2020-07-29 10:31:55 +01:00
David Taylor
ec4024fe6d
FIX: Keep by_users check in S3 inventory
...
Partial revert of 8515d8fa
- the by_users check is ensuring we don't raise errors for fixtures
2020-07-21 17:19:56 +01:00
David Taylor
8515d8fae5
FIX: Improve S3 inventory logic
...
Previously we considered 'upload rows without etags' to be exempt from the check. This is bad, because older/migrated sites might not have etags on all their uploads. We should consider rows without etags to be broken, since we can't check them against the inventory.
This also removes the `by_users` scope. We need all uploads to be working, even ones created by the system user.
2020-07-21 15:55:53 +01:00
David Taylor
3d65678a13
DEV: Add timestamp columns to optimized_images table (#10199)
...
This allows us to filter by created/updated date when comparing to an S3 inventory.
2020-07-14 11:50:33 +01:00
David Taylor
7f2b5a446a
PERF: Remove post_upload recovery in daily EnsureS3UploadsExistence job (#10173)
...
This is a very expensive process, and it should only be required in exceptional circumstances. It is possible to run a similar recovery using `rake uploads:recover` (5284d41a8e/lib/upload_recovery.rb (L135-L184)).
2020-07-06 16:26:40 +01:00
Sam Saffron
38a30a6e96
DEV: correct regression and correct tests
...
The etag change in 31976ecf was incorrect, so revert it.
Also correct a regression in the test suite.
2020-07-06 10:56:19 +10:00
Sam Saffron
31976ecfeb
PERF: only update etag when it changes
...
Previously when synchronizing upload etags we would update every single one
regardless of change.
2020-07-06 10:40:04 +10:00
Jarek Radosz
73b04976e5
FIX: Use updated_at in the S3 inventory job (#8823)
...
When we change an upload's sha1 (e.g. when resizing images), it won't match the data in the most recent S3 inventory index. With this change, uploads that have been updated since the inventory was generated are ignored.
2020-01-31 11:02:44 +01:00
Vinoth Kannan
3b7f5db5ba
FIX: parallel spec system needs a dedicated upload folder for each worker. (#8547)
2019-12-18 11:21:57 +05:30
Osama Sayegh
68708db721
DEV: `S3Inventory#unsorted_files` should always return an array (#8034)
2019-08-23 17:59:31 +10:00
Sam Saffron
e53a171916
FIX: hold s3 related distributed locks longer
...
These operations are pretty expensive and can take multiple minutes due to
networking.
Hold distributed mutex for much longer.
2019-08-15 11:48:44 +10:00
Vinoth Kannan
9919ee1900
FIX: remove the tmp inventory files after the s3 uploads check.
2019-08-13 11:52:57 +05:30
Guo Xiang Tan
8a64b0c8e8
Revert "DEV: Remove unused kwarg and properly check for local missing uploads."
...
This reverts commit 97769f3d02.
The code is confusing but this change is quite risky. Defer for now
until we can look at it properly.
2019-07-29 14:35:34 +08:00
Guo Xiang Tan
97769f3d02
DEV: Remove unused kwarg and properly check for local missing uploads.
2019-07-29 14:21:06 +08:00
Vinoth Kannan
47deb8b3da
FIX: use same id for both original & optimized inventories in multisite setup.
2019-07-25 14:16:47 +05:30
Vinoth Kannan
ad04ce9f43
FIX: remove post upload record creation inside 'find_missing_uploads' method.
2019-07-19 01:44:08 +05:30
Vinoth Kannan
35d6fff69e
PERF: use url instead of file key in temporary inventory table.
2019-06-13 22:03:58 +05:30
David Taylor
ed21128ee6
FIX: Do not change directory when decompressing S3 inventory
...
In sidekiq, jobs are run in multiple threads within the same process. `cd` affects the entire process, so it can cause unexpected issues in other running jobs.
2019-06-13 17:13:50 +01:00
Vinoth Kannan
d74ee9dbce
DEV: skip S3 inventory records without correct multisite prefix.
2019-06-08 18:36:06 +05:30
Vinoth Kannan
2941c77abc
FIX: skip upload recovery if file not found in s3
2019-05-21 00:06:36 +05:30
Vinoth Kannan
2a7065c505
FIX: skip uploads without etag in s3 inventory check.
2019-05-20 00:09:52 +05:30
Vinoth Kannan
3172172b52
remove unused local variable
...
ec84c87ddb
2019-05-16 15:39:13 +05:30
Vinoth Kannan
ec84c87ddb
FIX: skip validation while recovering uploads from s3
...
TODO: add tests
2019-05-16 15:37:11 +05:30
Vinoth Kannan
40328f055e
FIX: retrieve original filename from s3 object's content disposition header
2019-05-16 09:47:22 +05:30
Guo Xiang Tan
dd49be27d3
DEV: Fix undefined variable.
...
Follow up to e8fafbc123.
2019-05-16 11:28:48 +08:00
Vinoth Kannan
f5a217be92
Fix typo in condition value.
2019-05-07 17:09:08 +05:30
Vinoth Kannan
e8fafbc123
List and restore missing post uploads from S3 inventory.
2019-05-04 01:16:20 +05:30
Vinoth Kannan
73418aaf73
DEV: Add bucket folder path to inventory id
2019-05-02 04:35:35 +05:30
Vinoth Kannan
a8f410a9c5
FEATURE: Create new helper method 'Discourse.stats' (#7388)
2019-04-17 12:45:04 +05:30
Vinoth Kannan
35431a8ddb
FIX: set missing count in redis even if zero
2019-04-04 20:05:57 +05:30
Vinoth Kannan
df6ef856e6
DEV: save missing s3 uploads count in redis
2019-04-04 19:05:57 +05:30
Guo Xiang Tan
243fb8d9ad
Fix the build.
2019-03-13 17:39:07 +08:00
Vinoth Kannan
da1ff2da2c
FIX: Create and consume temp table inside a transaction (#7030)
...
To prevent access issues with pgbouncer, which runs in transaction pooling mode.
2019-02-20 13:52:40 +11:00
Vinoth Kannan
563b953224
DEV: Add 'backfill_etags_' to the method name since it also backfills the etags
2019-02-19 21:54:35 +05:30
Vinoth Kannan
0472bd4adc
FIX: Remove 'backfill_etags' keyword argument from 'uploads:missing' rake task
...
The etag backfilling code is also optimized.
2019-02-15 00:34:35 +05:30
Vinoth Kannan
b5fbd7385f
FIX: run the rake task only for uploads created more than a day before the inventory date
2019-02-14 17:53:08 +05:30
Vinoth Kannan
a9a8855739
DEV: Get only matching records to backfill etags
2019-02-14 06:27:18 +05:30
Vinoth Kannan
e2f7db5549
Fix typo
2019-02-14 05:56:30 +05:30
Vinoth Kannan
7b5931013a
Update rake task to backfill etags from s3 inventory
2019-02-14 05:18:06 +05:30
Vinoth Kannan
b8d2549922
FIX: OptimizedImage model doesn't have 'created_at' date column
2019-02-14 03:46:00 +05:30
Vinoth Kannan
426bd810f1
FIX: S3 inventory can have duplicate etags
2019-02-14 03:44:14 +05:30
Vinoth Kannan
1045bbc35b
FIX: S3 inventory data can be split into multiple CSV files
2019-02-14 03:41:52 +05:30