Commit Graph

134 Commits

Author SHA1 Message Date
Ted Johansson b2a5f5802a
DEV: Replace custom Onebox symbolize_keys implementation with ActiveSupport (#23828)
We have a custom implementation of #symbolize_keys in our Onebox helpers. This is likely a legacy from when Onebox was a standalone gem. This change replaces all usages with either #deep_symbolize_keys from ActiveSupport, or appropriate option to the JSON parser gem used.
2023-10-09 09:32:09 +02:00
Ted Johansson 60e624e768
DEV: Replace custom Onebox blank implementation with ActiveSupport (#23827)
We have a custom implementation of #blank? in our Onebox helpers. This is likely a legacy from when Onebox was a standalone gem. This change replaces all usages with respective incarnations of #blank?, #present?, and #presence from ActiveSupport. It changes a bunch of "unless blank" to "if present" as well.
2023-10-07 19:54:26 +02:00
Rafael dos Santos Silva d10e9a6c1d
FEATURE: Onebox and Download for WEBP and AVIF (#23235)
This adds support for oneboxing WEBP and AVIF images in posts and fixing
oneboxing fixes download remote images for those formats too.

Reported in https://meta.discourse.org/t/-/276433?u=falco
2023-08-24 16:44:06 -03:00
Jarek Radosz 94649565ce
DEV: Correct `Style/RedundantReturn` rubocop issues (#23052) 2023-08-10 02:03:38 +02:00
Joffrey JAFFEUX df7dab9dce
FIX: ensures generic onebox has width/height for thumbnail (#23040)
Prior to this fix we would output an image with no width/height which would then bypass a large part of `CookedProcessorMixin` and have no aspect ratio. As a result, an image with no size would cause layout shift.

It also removes a fix for oneboxes in chat messages due to this case.
2023-08-09 20:31:11 +02:00
Roman Agilov 3eac47443f
FEATURE: Add audio.com onebox provider (#22936)
* Audio.com provider added to onebox
* added specs for audio.com onebox provider
2023-08-08 16:55:04 +10:00
Ryan Vandersmith 44a104dff8
FIX: Update "Embed Motoko" Onebox URLs (#22198)
Embed Motoko service's primary URL is transiting from embed.smartcontracts.org to embed.motoko.org, this PR updates the Onebox logic to work for either domain.
2023-07-26 09:41:01 +08:00
Blake Erickson 9e8010df8b
DEV: Use thumbnail url for wikimedia onebox image (#22620)
Wikimedia provides a thumbnail url for its images, so we should use that
for oneboxes instead of the full-size image. Because the size of the
  onebox image we display is quite small anyways the thumbnail wikimedia
  provides should suffice and will save bandwidth.

See: https://meta.discourse.org/t/264039
2023-07-14 12:20:18 -06:00
Rafael dos Santos Silva 3fd327c458
FEATURE: Basic support for threads.net onebox (#22471) 2023-07-06 16:02:49 -03:00
Jan Cernik 77732cd2b4
FIX: Minor Twitter onebox improvements (#22387) 2023-07-03 19:53:12 -03:00
Jan Cernik 24c90534fb
FIX: Use Twitter API v2 for oneboxes and restore OpenGraph fallback (#22187) 2023-06-22 14:39:02 -03:00
Richard 22ef6a7c29 Fix Twitch onebox multisite issue 2023-05-15 16:45:33 +02:00
Penar Musaraj a67c96438c
UX: Fix user onebox layout (#21284) 2023-04-28 09:50:49 -04:00
Loïc Guitaut 8b67a534a0 FIX: Allow floats for zoom level in Google Maps onebox
Sometimes we get Maps URL containing a zoom level as a float (17.5z and
not 17z) but this doesn’t work with our current onebox implementation.

While Google accepts those float zoom levels, it removes automatically
the floating part in the URL (thus when visiting a Maps URL containing
17.5z, the URL will be rewritten shortly after as 17z). When putting a
float zoom level in an embedded URL, this actually breaks (Maps API
returns a 400 error).

This patch addresses the issue by allowing the onebox engine to match on
a zoom level expressed as a float but we only keep the integer part thus
rendering properly maps.
2023-03-01 12:45:33 +01:00
Loïc Guitaut f7c57fbc19 DEV: Enable `unless` cops
We discussed the use of `unless` internally and decided to enforce
available rules from rubocop to restrict its most problematic uses.
2023-02-21 10:30:48 +01:00
Gerhard Schlager 7fd63b34b1
DEV: Make it obvious that `joined` translation is used by onebox (#20158)
This also moves the date as interpolation key into the string which makes translation easier.
2023-02-03 10:02:14 +08:00
Jan Cernik aecd8b1eff
FIX: Add support for multiple TikTok aspect ratios (#20064) 2023-01-30 18:12:01 -03:00
Jan Cernik d0c820e816
FEATURE: Add better TikTok onebox support (#19934) 2023-01-23 09:49:02 -03:00
Daniel Waterworth 666536cbd1
DEV: Prefer \A and \z over ^ and $ in regexes (#19936) 2023-01-20 12:52:49 -06:00
Loïc Guitaut 14d97f9cf1 FEATURE: Show more context in Discourse topic oneboxes
Currently when generating a onebox for Discourse topics, some important
context is missing such as categories and tags.

This patch addresses this issue by introducing a new onebox engine
dedicated to display this information when available. Indeed to get this
new information, categories and tags are exposed in the topic metadata
as opengraph tags.
2023-01-11 14:22:53 +01:00
David Taylor 6417173082
DEV: Apply syntax_tree formatting to `lib/*` 2023-01-09 12:10:19 +00:00
Ryan Vandersmith e6439e89cf
FEATURE: Onebox for Embed Motoko (#19293) 2022-12-16 09:59:40 -05:00
Rafael dos Santos Silva d247e5d37c
FEATURE: Youtube Short onebox support (#19335)
* FEATURE: Youtube Shorts onebox support

Co-authored-by: Canapin <canapin@gmail.com>
2022-12-06 11:56:48 -03:00
David Taylor 68b4fe4cf8
SECURITY: Expand and improve SSRF Protections (#18815)
See https://github.com/discourse/discourse/security/advisories/GHSA-rcc5-28r3-23rr

Co-authored-by: OsamaSayegh <asooomaasoooma90@gmail.com>
Co-authored-by: Daniel Waterworth <me@danielwaterworth.com>
2022-11-01 16:33:17 +00:00
Bianca Nenciu 266e165885
FIX: Use only first line from commit message (#18724)
Linking a commit from a GitHub pull request included the complete commit
message, instead of just the first line. The rest of the commit message
will be added to the body of the Onebox.
2022-10-24 22:26:48 +03:00
Bianca Nenciu 73e9875a1d
FEATURE: Handle oneboxes for complex GitHub URLs (#18474)
GitHub PR URLs can link to a commit of the PR, a comment or a review
discussion.
2022-10-06 20:26:04 +03:00
David Taylor 2e00d4d024
DEV: Fix flaky twitter onebox behavior (#18141)
The order in which Onebox engines are loaded is not guaranteed. Occasionally during tests, the twitter engine would be loaded before the instagram engine, and cause the Instagram Onebox spec to fail due to the lack of `Onebox.options.twitter_client`.

This commit makes the load order of Onebox engines consistent, and fixes the issue in the twitter_status_onebox.
2022-08-31 08:42:55 +08:00
Bianca Nenciu 626d50c15c
FIX: Disable Twitter onebox without API support (#17519)
Twitter removed OpenGraph tags from their pages. We can no longer
extract all the information (for example, the quoted tweet) we need
to render Oneboxes without using their API.
2022-08-17 18:32:48 +03:00
Bianca Nenciu e7f04a8674
FIX: Use URI#merge to merge base and relative URLs (#17454)
The old implementation did not handle all cases, such as the case when
`src` is a relative URL that starts with `..`.
2022-07-18 14:17:54 +03:00
Ghassan Maslamani d0a4bc636f
FIX: Vimeo regex pattern (#17277)
Vimeo has two url structure:

- Normal /video_id

- Private/Unlisted /video_id/hash_string

This changes change the regex pattern thus it would be able to

catch both. Also it tolerate trailing slash.

This shall fixes:

https://meta.discourse.org/t/vimeo-embed-urls-parsed-incorrectly-in-email/231042
2022-06-30 13:13:25 -03:00
Rafael dos Santos Silva f130ec35d9
FEATURE: Use full post width for Vimeo embeds (#17289) 2022-06-30 13:08:24 -03:00
jbrw 9874fe3fb3
FIX: Improve mixcloud oneboxing (#17237)
- Sets `https://www.mixcloud.com` as a `requires_iframe_origins` to allow the iframe content to be displayed
- Attempts to render something approximating the Mixcloud content in the preview pane of the Composer, rather than just displaying a large version of the artwork associated with the link
2022-06-27 08:32:24 +10:00
Penar Musaraj 3baefa25b5
FIX: Use first supported type item when JSON-LD returns array (#17217) 2022-06-23 13:02:01 -04:00
Kris dfba188bd6
UX: remove extra whitespace in github onebox (#17111) 2022-06-17 11:09:45 +10:00
Jarek Radosz f723b4c322
FIX: Handle sites with more than 1 JSON-LD element (#17095)
A followup to #17007
2022-06-15 02:55:55 +02:00
sansnumero f0c6dd5682
Add support for JSON LD in Onebox (#17007)
* FIX: Fix a bug that is accessing the values in a hash wrongly and write tests

I decided to write tests in order to be confident in my refactor that's in the next commit.
Meanwhile I have discovered a potential bug. The `title_attr` key was accessed as a string,
but all the keys are actually symbols so it was never evaluated to be true.

irb(main):025:0> d = {key: 'value'}
=> {:key=>"value"}
irb(main):026:0> d['key']
=> nil
irb(main):027:0> d[:key]
=> "value"

* DEV: Extract methods for readability

I will be adding a new method following the conventions in place for adding a new normalizer. And this will make the readability of the `raw` block even more difficult; so I am extracting self contained private methods beforehand.

* FEATURE: Parse JSON-LD and introduce Movie object

JSON LD data is very easily transferable to Ruby objects because they contain types. If these types are mapped to Ruby objects, it is also better to make all the parsed data very explicit and easily extendable.

JSON-LD has many more standardized item types, with a full list here: https://schema.org/docs/full.html
However in order to decrease the scope, I only adapted the movie type.

* DEV: Change inheritance between normalizers

Normalizers are not supposed to have an inheritance relationships amongst each other. They are all normalizers, but all normalizing separate protocols. This is why I chose to extract a parent class and relieve Open Graph off that responsibility. Removing the parent class altogether could also a possibility, but I am keeping the scope limited to having a more accurate representation of the normalizers while making it easier to add a new one.

* Lint changes

* Bring back the Oembed OpenGraph inheritance

There is one test that caught that this inheritance was necessary. I still think modelling wise this inheritance shouldn't exist, but this can be tackled separately.

* Return empty hash if the json received is invalid

Before this change if there was a parsing error with JSON it would throw an exception. The goal of this commit is to rescue that exception and then log a warning. I chose to use Discourse's logger wrapper `warn_exception` to have the backtrace and not just used Rails logger. I considered raising an `InvalidParameters` error however if the JSON here is invalid it should not block showing of the Onebox, so logging is enough.

* Prep to support more JSONLD schema types with case

* Extract mustache template object created from JSONLD
2022-06-13 17:32:34 +02:00
Mayfield 99b0578b4c
FIX: escape youtube title when constructing onebox preview html (#16999) 2022-06-08 13:42:37 +08:00
David Taylor 8fe3934856
UX: Make YouTube playlist onebox full width to match video onebox (#16936) 2022-05-27 10:39:12 +01:00
Loïc Guitaut 46176b7dd7 DEV: Don’t patch Sanitize::Config
Currently we’re reopening the `Sanitize::Config` class (which is part of
the `sanitize` gem) to put our custom config for Onebox in it. This is
unnecessary as we can simply create a dedicated module to hold our
custom configuration.
2022-04-06 17:10:51 +02:00
David Taylor ff93833fdf
UX: Use committed date for GitHub oneboxes (#16318)
Our copy says 'committed {date}`, but we were previously using the commit's authored date
2022-03-30 09:16:28 +08:00
jbrw 528c3e311a
FIX: Only display the first listed price (#16138)
Multiple prices may be returned by Amazon (e.g. for new, and also for used). We should only display the first price.
2022-03-08 15:24:45 -05:00
jbrw fc30669db2
FIX: Support new layout on Amazon product pages (#16091)
Some product pages on Amazon are using a new HTML structure, meaning the previous Onebox engine was unable to gather the price and/or description. This change should allow these pages to be Oneboxed.
2022-03-04 18:31:53 -05:00
Jarek Radosz 2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Alan Guo Xiang Tan 7afe768d60
DEV: Add tests for wistia onebox. (#15860)
Follow-up to 4ef56b0ca4
2022-02-08 13:04:32 +08:00
jbrw 4ef56b0ca4
FIX: Explicitly set `allowfullscreen` on Wistia Oneboxes (#15828) 2022-02-08 13:02:32 +11:00
Rafael dos Santos Silva 5b5cbbfe5c
FEATURE: Onebox for news.ycombinator.com (#15781) 2022-02-03 13:39:21 -03:00
Natalie Tay aac9f43038
Only block domains at the final destination (#15689)
In an earlier PR, we decided that we only want to block a domain if 
the blocked domain in the SiteSetting is the final destination (/t/59305). That 
PR used `FinalDestination#get`. `resolve` however is used several places
 but blocks domains along the redirect chain when certain options are provided.

This commit changes the default options for `resolve` to not do that. Existing
users of `FinalDestination#resolve` are
- `Oneboxer#external_onebox`
- our onebox helper `fetch_html_doc`, which is used in amazon, standard embed 
and youtube
  - these folks already go through `Oneboxer#external_onebox` which already
  blocks correctly
2022-01-31 15:35:12 +08:00
Bianca Nenciu 847c77de65
FIX: Add another method to check binary file (#15648)
This method looks for a NULL byte that is not usually contained in text
files. Follow up to 376799b1a4.
2022-01-20 23:47:18 +02:00
Bianca Nenciu 376799b1a4
FIX: Hide excerpt of binary files in GitHub onebox (#15639)
Oneboxer did not know if a file is binary or not and always tried to
show an excerpt of the file.
2022-01-19 14:45:36 +02:00
jbrw 2909b8b820
FIX: origins_to_regexes should always return an array (#15589)
If the SiteSetting `allowed_onebox_iframes` contains a value of `*`, it will use the values of `all_iframe_origins` during the Oneboxing process. If `all_iframe_origins` itself contains a value of `*`, `origins_to_regexes` will try to return a "catch-all" regex.

Other code assumes `origins_to_regexes`will return an array, so this change ensures the `*` case will return an array containing only the catch-all regex.
2022-01-17 12:48:41 -05:00