Commit Graph

211 Commits

Author SHA1 Message Date
Sam Saffron 7f3a30d79f FIX: blank cooked markdown could raise an exception in logs
Previously if somehow a user created a blank markdown document using tag
tricks (eg `<p></p><p></p><p></p><p></p><p></p><p></p>`) and so on, we would
completely strip the document down to blank on post process due to onebox
hack.

Needs a followup cause I am still unclear about the reason for empty p stripping
and it can cause some unclear cases when we re-cook posts.
2020-01-29 11:37:25 +11:00
Martin Brennan ab3bda6cd0
FIX: Mitigate issue where legacy pre-secure hotlinked media would not be redownloaded (#8802)
Basically, say you had already downloaded a certain image from a certain URL
using pull_hotlinked_images and the onebox. The upload would be stored
by its sha as an upload record. Whenever you linked to the same URL again
in a post (e.g. in our case an og:image on review.discourse) we would
would reuse the original upload record because of the sha1.

However when you turned on secure media this could cause problems as
the first post that uses that upload after secure media is enabled
will set the access control post for the upload to the new post.
Then if the post is deleted every single onebox/link to that same image
URL will fail forever with 403 as the secure-media-uploads URL fails
if the access control post has been deleted.

To fix this when cooking posts and pulling hotlinked images, we only
allow using an original upload by URL if its access control post
matches the current post, and if the original_sha1 is filled in,
meaning it was uploaded AFTER secure media was enabled. otherwise
we just redownload the media again to be safe, as the URL will always
be new then.
2020-01-29 10:11:38 +10:00
Martin Brennan 45b37a8bd1
FIX: Resolve pull hotlinked image and broken link issues for secure media URLs (#8777)
When pull_hotlinked_images tried to run on posts with secure media (which had already been downloaded from external sources) we were getting a 404 when trying to download the image because the secure endpoint doesn't allow anon downloads.

Also, we were getting into an infinite loop of pull_hotlinked_images because the job didn't consider the secure media URLs as "downloaded" already so it kept trying to download them over and over.

In this PR I have also refactored secure-media-upload URL checks and mutations into single source of truth in Upload, adding a SECURE_MEDIA_ROUTE constant to check URLs against too.
2020-01-24 11:59:30 +10:00
Martin Brennan 4646a38ae6
FIX: Use presigned URL to avoid 403 when pulling hotlinked images for secure media (#8764)
When we were pulling hotlinked images for oneboxes in the CookedPostProcessor, we were using the direct S3 URL, which returned a 403 error and thus did not set widths and heights of the images. We now cook the URL first based on whether the upload is secure before handing off to FastImage.
2020-01-23 09:31:46 +10:00
Bianca Nenciu 1bccd8eca9
FIX: Remove full nested quotes on direct reply (#8581)
It used to check how many quotes were inside a post, without taking
considering that some quotes can contain other quotes. This commit
selects only top level quotes.

I had to use XPath because I could not find an equivalent CSS
selector.
2019-12-20 10:24:34 +02:00
Dan Ungureanu ebe6fa95be
FIX: Optimize images in Onebox (#8471)
This commit ensures that images in Onebox are being optimized, but not
converted to lightbox too.
2019-12-09 15:39:25 +02:00
Jarek Radosz 02ca6fa6c8 DEV: See if the store is external before checking disk space (#8480)
`available_disk_space` calls `df` which exits with an error if the `uploads` path doesn't exist. That's often the case when the `Discourse.store.external?` is true.

By doing the `external?` check first the `disable_if_low_on_disk_space` does less work and doesn't output any errors to the console.
2019-12-09 12:48:45 +11:00
Jarek Radosz d07f039468 FIX: Secure Upload URLs in lightbox (#8451)
This fixes the following issues:

* The link element on the lightbox which pops open the lightbox was linking to the S3 URL with a private ACL instead of the secure media URL for the image
* Change to use `@post.with_secure_media?` in `CookedPostProcessor` for URL cooking, as in some cases, like when a post is edited and an upload is added, `upload.secure?` can be false which resulted in `srcset` URLs not being cooked correctly to secure media upload urls.
2019-12-05 09:13:09 +10:00
Dan Ungureanu 1e0c2235a3
FIX: Optimize quoted images (#8427)
Only images that were part of a lightbox used to be optimized. This
patch ensures that quoted images are also optimized.
2019-11-29 15:18:42 +02:00
Penar Musaraj 102909edb3 FEATURE: Add support for secure media (#7888)
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access. 

A few notes: 

- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
2019-11-18 11:25:42 +10:00
Penar Musaraj 067696df8f DEV: Apply Rubocop redundant return style 2019-11-14 15:10:51 -05:00
Joe ce0bac7a3d FEATURE: fallback to image alt before filename if there's no title in lightboxes (#8286)
* use image alt as a fallback when there's no title

* update spec

we used to check that the overlay information is added when the image has a titie. This adds 2 more scenarios. One where an image has both a title and an alt, in which case the title should be used and alt ignored.

The other is when there's only an alt, it should then be used to generate the overlay
2019-11-04 10:15:14 +11:00
Arpit Jalan 1e9d9d9346
FIX: respect `tl3 links no follow` setting (#8232) 2019-10-22 22:41:04 +05:30
Krzysztof Kotlarek 427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Bianca Nenciu 0d22beb81d
FIX: Improve Onebox detection (#8019)
Follow-up to 7c83d2eeb2.
2019-09-10 13:59:48 +03:00
Bianca Nenciu 7c83d2eeb2 FIX: Award 'First Onebox' badge just for Oneboxed URLs. (#7974) 2019-08-08 18:45:18 +02:00
Sam Saffron 67f5ad5ac0 FEATURE: allow post process mutex to be held longer
Previously we would only hold the post process mutex for 1 minute, that is
not enough when processing a post with lots of images. This raises the bar
to 10 minutes.

It also cleans up error reporting around distributed mutexes expiring. We
used to double report.
2019-08-05 11:57:35 +10:00
Osama Sayegh 65a6f3080e FIX: don't disable download_remote_images_to_local if site uses S3 (#7861) 2019-07-05 13:36:03 +10:00
Penar Musaraj 03805e5a76
FIX: Ensure lightbox image download has correct content disposition in S3 (#7845) 2019-07-04 11:32:51 -04:00
David Taylor e3a9a2d2dd FIX: Avoid infinite loop if disk space is low
We now continue to enqueue the pull_hotlinked_images job for optimized images, even if disk space is low
2019-06-07 14:24:22 +01:00
David Taylor 65b0cafc03 FIX: Always schedule pull_hotlinked_images in cooked_post_processor
The job is now used to pull optimized images, and images from other sites on the same CDN. This needs to run even if download_remote_images is false
2019-06-07 13:08:23 +01:00
Régis Hanol 9756e35956 REVERT: FIX: handle clicks counters in quotes
Not quite a full revert of 7696b92c8c that isn't
actually required.
2019-06-04 11:59:44 +02:00
Guo Xiang Tan f54e4b71b1 DEV: Make `CookedPostProcessor#post_process_images` method private. 2019-05-27 11:28:37 +08:00
Régis Hanol 7696b92c8c FIX: handle clicks counters in full quotes 2019-05-17 14:17:29 +02:00
Régis Hanol fd5c5e326f FIX: remove full quote on direct replies when "typographed"
Use the cooked version of the post and the quote to compare their content in
order to take into account the "typographer" option of the markdown pipeline.
2019-05-15 17:49:29 +02:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
David Taylor 2c6b595eed FIX: Process image onebox correctly when image is wrapped in a link
The instagram onebox sometimes surrounds the image with an `<a>` tag, which was breaking the aspect ratio logic, and therefore causing posts to change height on load.
2019-05-10 10:02:40 +01:00
Arpit Jalan 6f5d7f987e FIX: rescue InvalidURIError when removing user ids from links 2019-04-25 12:36:31 +05:30
Dan Ungureanu b706a1b08d FEATURE: Remove user IDs from internal URLs. (#7406) 2019-04-23 12:45:41 +10:00
Maja Komel b0053f3a1c FEATURE: bump onebox version, add styling for new reddit image onebox 2019-04-04 11:24:30 +02:00
Guo Xiang Tan cfd507822f
PERF: Improve quality of `PostSearchData#raw_data`. (#7275)
This commit fixes the follow quality issue with `PostSearchData#raw_data`:

1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.

`Post#cooked`:

```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```

2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.

From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.

`Post#cooked`

```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```

In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
2019-04-01 10:14:29 +08:00
Penar Musaraj 51e08feb7e DEV: Refactor icons used in lightbox HTML
Uses <svg> elements instead of hacky CSS pseudoelements

Adds a migration to mark posts with lightboxes as needing a rebake
2019-03-22 11:52:06 -04:00
Guo Xiang Tan 58b0e945bd
UX: Lightbox support for image uploader. (#7034) 2019-02-21 10:13:37 +08:00
Régis Hanol 664e90bd17 FIX: ensure local images use local CDN when uploads are stored on S3
When the S3 store was enabled, we were only applying the S3 CDN.
So all images stored locally, like the emojis, were never put on the local CDN.

Fixed a bunch of CookedPostProcessor test by adding a call to 'optimize_urls'
in order to get final URLs.

I also removed the unnecessary PrettyText.add_s3_cdn method since this is already
handled in the CookedPostProcessor.
2019-02-20 19:24:38 +01:00
Dan Ungureanu 10dad7d013 FIX: Use CDN for optimized loading images. (#7006)
We missed a few spots in the cooked post processor where images where not loaded using CDN, causing
uneeded load and requests against the server
2019-02-20 13:55:08 +11:00
Bianca Nenciu 7d84648d11 FEATURE: Remove full quotes only from new posts. (#6862) 2019-01-17 13:24:32 +11:00
Sam 35b59cfa78 SECURITY: escape title HTML for inline onebox 2019-01-10 12:02:05 +11:00
Jeff Atwood a74e49c87c use proper typographical × instead of x 2018-12-24 20:33:17 -08:00
Robin Ward 662cfc416b FEATURE: Show a blurry preview when lazy loading images
This generates a 10x10 PNG thumbnail for each lightboxed image.
If Image Lazy Loading is enabled (IntersectionObserver API) then
we'll load the low res version when offscreen. As the image scrolls
in we'll swap it for the high res version.

We use a WeakMap to track the old image attributes. It's much less
memory than storing them as `data-*` attributes and swapping them
back and forth all the time.
2018-12-19 01:57:30 +08:00
Robin Ward e593d68beb Use an options hash instead of boolean parameters 2018-12-19 01:57:30 +08:00
Bianca Nenciu 825ae86857 FEATURE: Remove full quote only if first paragraph. (#6793) 2018-12-18 15:46:20 +01:00
Vinoth Kannan 0d3c1cde90 FIX: Use find_by_id method to prevent record not found exception 2018-12-15 03:19:45 +05:30
Bianca Nenciu 7cac04e1a8 * FEATURE: Adds site setting to let quotes on direct replies.
* DEV: Added test.
* FIX: Do not bump topic when removing full quotes.
2018-12-12 15:42:53 +01:00
Bianca Nenciu 41e184280d FEATURE: Remove full quotes of direct replies. (#6729) 2018-12-07 13:07:11 +01:00
Guo Xiang Tan 56034c733a UX: Strip class when link is not oneboxed due to site setting limits. 2018-11-29 14:33:01 +08:00
Guo Xiang Tan a1e77aa2ed
FEATURE: Reimplement `SiteSetting.max_oneboxes_per_post`. (#6668)
Previously, the site setting was only effective on the client side of
things. Once the site setting was been reached, all oneboxes are not
rendered. This commit changes it such that the site setting is respected
both on the client and server side. The first N oneboxes are rendered and
once the limit has been reached, subsequent oneboxes will not be
rendered.
2018-11-27 16:00:31 +08:00
Penar Musaraj 03deda2147
Upgrade to FontAwesome 5 (take two) (#6673)
* Add missing icons to set

* Revert FA5 revert

 This reverts commit 42572ff

* use new SVG syntax in locales

* Noscript page changes (remove login button, center "powered by" footer text)

* Cast wider net for SVG icons in settings

- include any _icon setting for SVG registry (offers better support for plugin settings)

- let themes store multiple pipe-delimited icons in a setting

- also replaces broken onebox image icon with SVG reference in cooked post processor

* interpolate icons in locales

* Fix composer whisper icon alignment

* Add support for stacked icons

* SECURITY: enforce hostname to match discourse hostname

This ensures that the hostname rails uses for various helpers always matches
the Discourse hostname

* load SVG sprite with pre-initializers

* FIX: enable caching on SVG sprites

* PERF: use JSONP for SVG sprites so they are served from CDN

This avoids needing to deal with CORS for loading of the SVG

Note, added the svg- prefix to the filename so we can quickly tell in
dev tools what the file is

* Add missing SVG sprite JSONP script to CSP

* Upgrade to FA 5.5.0

* Add support for all FA4.7 icons

- adds complete frontend and backend for renamed FA4.7 icons

- improves performance of SvgSprite.bundle and SvgSprite.all_icons

* Fix group avatar flair preview

- adds an endpoint at /svg-sprites/search/:keyword

- adds frontend ajax call that pulls icon in avatar flair preview even when it is not in subset

* Remove FA 4.7 font files
2018-11-26 16:49:57 -05:00
Guo Xiang Tan 565603ad0d Remove unused variable. 2018-11-26 14:45:00 +08:00
Guo Xiang Tan 3188d3506d Re-add option that was removed by mistake in 482013a1d4. 2018-11-26 14:24:23 +08:00
Guo Xiang Tan 482013a1d4 FIX: Group mentions missing after post processing. 2018-11-26 12:57:07 +08:00