This is a much better description of its function. It performs idempotent normalization of a URL. If consumers truly need to `encode` a URL (including double-encoding of existing encoded entities), they can use the existing `.encode` method.
Flips content_security_policy_frame_ancestors default to enabled, and
removes HTTP_REFERER checks on embed requests, as the new referer
privacy options made the check fragile.
The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097:
URI.escape
URI.unescape
URI.encode
URI.unencode
escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate.
I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier.
Addressable is now also an explicit gem dependency.
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
Related to https://meta.discourse.org/t/host-is-invalid-error-when-tld-is-longer-than-7-characters/46081.
Using Discourse `v2.4.0.beta2 +119`, I can't add an host (when embedding, cf. `/admin/customize/embedding`) ending with `.engineering`.
Turns out current regex limits to 10 characters.
Fix is dumb: it only allows for up to 24 chars, which is the **current** max TLD length, see https://stackoverflow.com/a/22038535/1907212.
---
Maybe a better (and longer-term) fix would be to allow for up to 64 chars, which I understand comes from the RFC.
I'm not at ease with regexes, so can't be sure about it, but [this suggestion](https://meta.discourse.org/t/host-is-invalid-error-when-tld-is-longer-than-7-characters/46081/8?u=julienma) seems pretty good:
> rules of DNS labels are:
>
> - All labels are 1 to 63 characters, case insensitive A to Z, 0 to 9 and - (hyphen), all from ASCII.
> - No labels may start with a hyphen.
> - No top level domain label may start with a number.
>
>That means a regexp for a valid domain name would look like:
>
>`/^([a-z0-9][a-z0-9-]{0,62}\.)+[a-z][a-z0-9-]{0,62}\.?$/`
>
>Domains that are just a TLD are sufficiently bizarre as to be worth ignoring.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
* `rescue nil` is a really bad pattern to use in our code base.
We should rescue errors that we expect the code to throw and
not rescue everything because we're unsure of what errors the
code would throw. This would reduce the amount of pain we face
when debugging why something isn't working as expexted. I've
been bitten countless of times by errors being swallowed as a
result during debugging sessions.