discourse

Author	SHA1	Message	Date
Gerhard Schlager	7fd63b34b1	DEV: Make it obvious that `joined` translation is used by onebox (#20158 ) This also moves the date as interpolation key into the string which makes translation easier.	2023-02-03 10:02:14 +08:00
Bianca Nenciu	c186a46910	SECURITY: Prevent XSS in local oneboxes (#20008 ) Co-authored-by: OsamaSayegh <asooomaasoooma90@gmail.com>	2023-01-25 19:17:21 +02:00
Daniel Waterworth	666536cbd1	DEV: Prefer \A and \z over ^ and $ in regexes (#19936 )	2023-01-20 12:52:49 -06:00
David Taylor	6417173082	DEV: Apply syntax_tree formatting to `lib/*`	2023-01-09 12:10:19 +00:00
Martin Brennan	9d50790530	FIX: Allow svg in oneboxer in certain cases (#19253 ) When doing local oneboxes we sometimes want to allow SVGs in the final preview HTML. The main case currently is for the new cooked hashtags, which include an SVG icon. SVGs will be included in local oneboxes via `ExcerptParser` _only_ if they have the d-icon class, and if the caller for `post.excerpt` specifies the `keep_svg: true` option.	2022-11-30 12:42:15 +10:00
Jarek Radosz	49fa2e93c2	DEV: Clean up twitter onebox code (#18012 )	2022-08-21 19:26:24 +02:00
David Taylor	3c81683955	DEV: Rename `UriHelper.escape_uri` to `.normalized_encode` This is a much better description of its function. It performs idempotent normalization of a URL. If consumers truly need to `encode` a URL (including double-encoding of existing encoded entities), they can use the existing `.encode` method.	2022-08-09 11:55:25 +01:00
Martin Brennan	61b9e3ee30	FIX: InlineOneboxer watched word censor error (#16921 ) In `7328a2bfb0` we changed the InlineOneboxer#onebox_for method to run the title of the onebox through WatchedWord#censor_text. However, it is allowable for the title to be nil, which was causing this error in production: > NoMethodError : undefined method gsub for nil:NilClass We just need to check whether the title is nil before trying to censor it.	2022-05-26 14:01:44 +10:00
Bianca Nenciu	6c8f491dc3	DEV: Allow plugins to register Onebox handlers (#16870 ) This targets only the local Oneboxes and allows plugins to customize regular or inline Oneboxes for routes inside the site.	2022-05-23 20:02:02 +03:00
Osama Sayegh	d15867463f	FEATURE: Site setting for blocking onebox of URLs that redirect (#16881 ) Meta topic: https://meta.discourse.org/t/prevent-to-linkify-when-there-is-a-redirect/226964/2?u=osama. This commit adds a new site setting `block_onebox_on_redirect` (default off) for blocking oneboxes (full and inline) of URLs that redirect. Note that an initial http → https redirect is still allowed if the redirect location is identical to the source (minus the scheme of course). For example, if a user includes a link to `http://example.com/page` and the link resolves to `https://example.com/page`, then the link will onebox (assuming it can be oneboxed) even if the setting is enabled. The reason for this is that a user may type out a URL with http (e.g. when the URL is short and memorable), and since a lot of sites support TLS with http traffic automatically redirected to https, we should still allow the URL to onebox.	2022-05-23 13:52:06 +03:00
Loïc Guitaut	46176b7dd7	DEV: Don’t patch Sanitize::Config Currently we’re reopening the `Sanitize::Config` class (which is part of the `sanitize` gem) to put our custom config for Onebox in it. This is unnecessary as we can simply create a dedicated module to hold our custom configuration.	2022-04-06 17:10:51 +02:00
Osama Sayegh	b0656f3ed0	FIX: Apply onebox blocked domain checks on every redirect (#16150 ) The `blocked onebox domains` setting lets site owners change what sites are allowed to be oneboxed. When a link is entered into a post, Discourse checks the domain of the link against that setting and blocks the onebox if the domain is blocked. But if there's a chain of redirects, then only the final destination website is checked against the site setting. This commit amends that behavior so that every website in the redirect chain is checked against the site setting, and if anything is blocked the original link doesn't onebox at all in the post. The `Discourse-No-Onebox` header is also checked in every response and the onebox is blocked if the header is set to "1". Additionally, Discourse will now include the `Discourse-No-Onebox` header with every response if the site requires login to access content. This is done to signal to a Discourse instance that it shouldn't attempt to onebox other Discourse instances if they're login-only. Non-Discourse websites can also include that header if they don't wish to have Discourse onebox their content. Internal ticket: t59305.	2022-03-11 09:18:12 +03:00
Natalie Tay	aac9f43038	Only block domains at the final destination (#15689 ) In an earlier PR, we decided that we only want to block a domain if the blocked domain in the SiteSetting is the final destination (/t/59305). That PR used `FinalDestination#get`. `resolve` however is used several places but blocks domains along the redirect chain when certain options are provided. This commit changes the default options for `resolve` to not do that. Existing users of `FinalDestination#resolve` are - `Oneboxer#external_onebox` - our onebox helper `fetch_html_doc`, which is used in amazon, standard embed and youtube - these folks already go through `Oneboxer#external_onebox` which already blocks correctly	2022-01-31 15:35:12 +08:00
David Taylor	03998e0a29	FIX: Use CDN URL for internal onebox avatars (#15077 ) This commit will also trigger a background rebake for all existing posts with internal oneboxes	2021-11-25 12:07:34 +00:00
jbrw	2f28ba318c	FEATURE: Onebox can match engines based on the content_type (#13876 ) * FEATURE: Onebox can match engines based on the content_type `FinalDestination` now returns the `content_type` of a resolved URL. `Oneboxer` passes this value to `Onebox` itself. Onebox engines can now specify a `matches_content_type` regex of content_types that the engine can handle, regardless of the URL. `ImageOnebox` will match URLs with a content type of `image/png`, `jpg`, `gif`, `bmp`, `tif`, etc. This will allow images that exist at a URL without a file type extension to be correctly rendered, assuming a valid `content_type` is returned.	2021-07-30 13:36:30 -04:00
Arpit Jalan	05bdbd9f97	SECURITY: Onebox canonical links bypassing FinalDestination checks (#13605 )	2021-07-01 20:09:29 +05:30
Joffrey JAFFEUX	e50b7e9111	SECURITY: ensures timeouts are correctly used on connect (#13455 )	2021-06-21 17:34:01 +02:00
Bianca Nenciu	d184fe59ca	FEATURE: Censor Oneboxes (#12902 ) Previously onebox content was not passed by the censor regex, meaning you could sneak in censored words via onebox.	2021-06-03 11:39:12 +10:00
jbrw	461a2c334b	FIX: return an empty result if response from Amazon is missing expected attributes (#13173 ) * FIX: return an empty result if response from Amazon is missing attributes Check we have the basic attributes required to construct a Onebox for Amazon. This is an attempt to handle scenarios where we receive a valid 200-status response from an Amazon request that does not include the data we’re expecting. * Update lib/onebox/engine/amazon_onebox.rb Co-authored-by: Régis Hanol <regis@hanol.fr>	2021-06-01 16:23:18 -04:00
jbrw	a24b6daa87	FIX: An unresolved blank uri should attempt an alternate Oneboxing strategy, if available (#13070 )	2021-05-14 15:23:20 -04:00
jbrw	19182b1386	DEV: Oneboxer wildcard subdomains (#13015 ) * DEV: Allow wildcards in Oneboxer optional domain Site Settings Allows a wildcard to be used as a subdomain on Oneboxer-related SiteSettings, e.g.: - `force_get_hosts` - `cache_onebox_response_body_domains` - `force_custom_user_agent_hosts` * DEV: fix typos * FIX: Try doing a GET after receiving a 500 error from a HEAD By default we try to do a `HEAD` request. If this results in a 500 error response, we should try to do a `GET` * DEV: `force_get_hosts` should be a hidden setting * DEV: Oneboxer Strategies Have an alternative oneboxing ‘strategy’ (i.e., set of options) to use when an attempt to generate a Onebox fails. Keep track of any non-default strategies that were used on a particular host, and use that strategy for that host in the future. Initially, the alternate strategy (`force_get_and_ua`) forces the FinalDestination step of Oneboxing to do a `GET` rather than `HEAD`, and forces a custom user agent. * DEV: change stubbed return code The stubbed status code needs to be a value not recognized by FinalDestination	2021-05-13 15:48:35 -04:00
Bianca Nenciu	21d1ee1065	FIX: Use Nokogiri and Loofah consistently (#12693 ) CookedPostProcessor used Loofah to parse the cooked content of a post and Nokogiri to parse cooked Oneboxes. Even though Loofah is built on top of Nokogiri, replacing an element from the cooked post (a Nokogiri node) with a parsed onebox (a Loofah node) produced a strange result which included XML namespaces. Removing the mix and using Loofah to parse Oneboxes fixed the problem.	2021-04-14 18:09:55 +03:00
jbrw	50252d803e	DEV: stub youtube embed requests (#12637 ) * DEV: stub youtube embed requests * DEV: Ignore redirects on youtube.com when oneboxing	2021-04-07 13:32:27 -04:00
jbrw	68d0916eb5	FEATURE: Oneboxer cache response body (#12562 ) * FEATURE: Cache successful HTTP GET requests during Oneboxing Some oneboxes may fail when making excessive and/or odd requests against the target domains. This change provides a simple mechanism to cache the results of successful GET requests as part of the oneboxing process, with the goal of reducing repeated requests and ultimately improving the rate of successful oneboxing. To enable: Set `SiteSetting.cache_onebox_response_body` to `true` Add the domains you’re interested in caching to `SiteSetting.cache_onebox_response_body_domains` e.g. `example.com|example.org|example.net` Optionally set `SiteSetting.cache_onebox_user_agent` to a user agent string of your choice to use when making requests against domains in the above list. * FIX: Swap order of duration and value in redis call The correct order for `setex` arguments is `key`, `duration`, and `value`. Duration and value had been flipped, however the code would not have thrown an error because we were caching the value of `1.day.to_i` for a period of 1 second… The intention appears to be to set a value of 1 (purely as a flag) for a period of 1 day.	2021-03-31 13:19:34 -04:00
jbrw	aed97c7bab	FIX: Add amazon sites to force_get_hosts (#12341 ) It has been observed that doing a HEAD against an Amazon store URL may result in a 405 error being returned. Skipping the HEAD request may result in an improved oneboxing experience when requesting these URLs.	2021-03-10 14:42:17 -05:00
jbrw	fff8a24f2b	FIX: Don’t display error if only error is a missing image (#12216 ) `Onebox.preview` can return 0-to-n errors, where the errors are missing OpenGraph attributes (e.g. title, description, image, etc.). If any of these attributes are missing, we construct an error message and attach it to the Oneboxer preview HTML. The error message is something like: “Sorry, we were unable to generate a preview for this web page, because the following oEmbed / OpenGraph tags could not be found: description, image” However, if the only missing tag is `image` we don’t need to display the error, as we have enough other data (title, description, etc.) to construct a useful/complete Onebox.	2021-02-25 14:30:40 -05:00
Martin Brennan	13c2a4886f	FEATURE: Add disable_onebox_media_download_controls hidden site setting (#12208 ) Uses discourse/onebox@ff9ec90 Adds a hidden site setting called disable_onebox_media_download_controls which will add controlslist="nodownload" to video and audio oneboxes, and also to the local video and audio oneboxes within Discourse.	2021-02-25 12:39:15 +10:00
Dan Ungureanu	2d51833ca9	FIX: Make Oneboxer#apply insert block Oneboxes correctly (#11449 ) It used to insert block Oneboxes inside paragraphs which resulted in invalid HTML. This needed an additional parsing for removal of empty paragraphs and the resulting HTML could still be invalid. This commit ensures that block Oneboxes are inserted correctly, by splitting the paragraph containing the link and putting the block between the two. Paragraphs left with nothing but whitespaces will be removed. Follow up to `7f3a30d79f`.	2020-12-14 17:49:37 +02:00
jbrw	da9b837da0	DEV: More robust processing of URLs (#11361 ) * DEV: More robust processing of URLs The previous `UrlHelper.encode_component(CGI.unescapeHTML(UrlHelper.unencode(uri))` method would naively process URLs, which could result in a badly formed response. `Addressable::URI.normalized_encode(uri)` appears to deal with these edge-cases in a more robust way. * DEV: onebox should use UrlHelper * DEV: fix spec * DEV: Escape output when rendering local links	2020-12-03 17:16:01 -05:00
jbrw	51f9a56137	FEATURE: Onebox local categories (#11311 ) * FEATURE: onebox for local categories This commit adjusts the category onebox to look more like the category boxes do on the category page. Co-authored-by: Jordan Vidrine <jordan@jordanvidrine.com>	2020-11-25 10:53:05 +11:00
jbrw	331236d6d7	Onebox improved error handling and support for Instagram Access Tokens (#11253 ) * FEATURE: display error if Oneboxing fails due to HTTP error - display warning if onebox URL is unresolvable - display warning if attributes are missing * FEATURE: Use new Instagram oEmbed endpoint if access token is configured Instagram requires an Access Token to access their oEmbed endpoint. The requirements (from https://developers.facebook.com/docs/instagram/oembed/) are as follows: - a Facebook Developer account, which you can create at developers.facebook.com - a registered Facebook app - the oEmbed Product added to the app - an Access Token - The Facebook app must be in Live Mode The generated Access Token, once added to SiteSetting.facebook_app_access_token, will be passed to onebox. Onebox can then use this token to access the oEmbed endpoint to generate a onebox for Instagram. * DEV: update user agent string * DEV: don’t do HEAD requests against news.yahoo.com * DEV: Bump onebox version from 2.1.5 to 2.1.6 * DEV: Avoid re-reading templates * DEV: Tweaks to onebox mustache templates * DEV: simplified error message for missing onebox data * Apply suggestions from code review Co-authored-by: Gerhard Schlager <mail@gerhard-schlager.at>	2020-11-18 12:55:16 -05:00
David Taylor	a3577435f7	FEATURE: Additional control of iframes in oneboxes (#10523 ) This commit adds a new site setting "allowed_onebox_iframes". By default, all onebox iframes are allowed. When the list of domains is restricted, Onebox will automatically skip engines which require those domains, and use a fallback engine.	2020-08-27 20:12:13 +01:00
Krzysztof Kotlarek	e0d9232259	FIX: use allowlist and blocklist terminology (#10209 ) This PR renames whitelist to allowlist and blacklist to blocklist.	2020-07-27 10:23:54 +10:00
Dan Ungureanu	fe284ffd06	Revert "DEV: Remove useless code (#10130 )" Some oneboxes still generate empty P tags (video oneboxes). This reverts commit `c299d02287`.	2020-06-29 13:56:28 +03:00
Dan Ungureanu	c299d02287	DEV: Remove useless code (#10130 ) protection is not needed and can easily be bypassed with empty divs anyway.	2020-06-29 17:49:30 +10:00
Guo Xiang Tan	b28d97b64a	FIX: Bump onebox for twitch video and clips embedding fix.	2020-06-24 11:00:30 +08:00
Régis Hanol	91c89df68a	FIX: onebox local topic when using slug-less URL When linking to a topic in the same Discourse, we try to onebox the link to show the title and other various information depending on whether it's a "standard" or "inline" onebox. However, we were not properly detecting links to topics that had no slugs (eg. https://meta.discourse.org/t/1234).	2020-06-23 17:18:38 +02:00
Krzysztof Kotlarek	9bff0882c3	FEATURE: Nokogumbo (#9577 ) * FEATURE: Nokogumbo Use Nokogumbo HTML parser.	2020-05-05 13:46:57 +10:00
Joffrey JAFFEUX	fd4ce6ab8f	DEV: hbs extensions are misleading in this case (#9170 ) This would also prevent any linting tool from attempting to lint this incorrectly.	2020-03-11 14:42:14 +01:00
Dan Ungureanu	ec40242b5c	FIX: Make inline oneboxes work with secured topics in secured contexts (#8895 )	2020-02-12 12:11:28 +02:00
Penar Musaraj	4b6a47be48	DEV: do not persist force_custom_user_agent_hosts setting Followup to f029e2	2020-02-06 11:56:54 -05:00
Penar Musaraj	f029e2eaf6	FEATURE: Add site setting for specific hosts using custom user agent when oneboxing Followup to #00c406	2020-02-06 10:32:42 -05:00
Sam Saffron	7f3a30d79f	FIX: blank cooked markdown could raise an exception in logs Previously if somehow a user created a blank markdown document using tag tricks (eg `<p></p><p></p><p></p><p></p><p></p><p></p>`) and so on, we would completely strip the document down to blank on post process due to onebox hack. Needs a followup cause I am still unclear about the reason for empty p stripping and it can cause some unclear cases when we re-cook posts.	2020-01-29 11:37:25 +11:00
Martin Brennan	65481858c2	FEATURE: Use upload:// short URL for videos and audio in composer (#8760 ) For consistency this PR introduces using custom markdown and short upload:// URLs for video and audio uploads, rather than just treating them as links and relying on the oneboxer. The markdown syntax for videos is ![file text|video](upload://123456.mp4) and for audio it is ![file text|audio](upload://123456.mp3). This is achieved in discourse-markdown-it by modifying the rules for images in markdown-it via md.renderer.rules.image. We return HTML instead of the token when we encounter audio or video after | and the preview renders that HTML. Also when uploading an audio or video file we insert the relevant markdown into the composer.	2020-01-23 09:41:39 +10:00
Joffrey JAFFEUX	0d3d2c43a0	DEV: s/\$redis/Discourse\.redis (#8431 ) This commit also adds a rubocop rule to prevent global variables.	2019-12-03 10:05:53 +01:00
Martin Brennan	901054fd75	FIX: Cache failed onebox URL request server-side (#8421 ) We already cache failed onebox URL requests client-side, we now want to cache this on the server-side for extra protection. Failed onebox previews will be cached for 1 hour, and any more requests for that URL will fail with a 404 status. Forcing a rebake via the Rebake HTML action will delete the failed URL cache (like how the oneboxer preview cache is deleted).	2019-11-28 07:48:29 +10:00
Arpit Jalan	520a83aa62	FIX: correct hostname in vimeo.com	2019-11-27 14:52:28 +05:30
Arpit Jalan	52c8cab7f2	FIX: bypass finaldestination check for Vimeo links.	2019-11-27 14:00:46 +05:30
Sam Saffron	0fb497eb23	DEV: use Discourse.cache over Rails.cache Discourse.cache is a more consistent method to use and offers clean fallback if you are skipping redis. This is part of a larger change that both optimizes Discourse.cache and omits use of setex on $redis in favor of consistently using discourse cache. Bench does reveal that use of Rails.cache and Discourse.cache is 1.25x slower than redis.setex / get, so a re-implementation will follow prior to porting.	2019-11-27 12:36:19 +11:00
Penar Musaraj	102909edb3	FEATURE: Add support for secure media (#7888 ) This PR introduces a new secure media setting. When enabled, it prevents unauthorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required` is disabled, only media in private messages will be protected from unauthorized access. A few notes: - the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads - the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured - upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status - when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error - when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3	2019-11-18 11:25:42 +10:00

127 Commits