The Interactivity API has been rendering client data in a SCRIPT element with the
type `application/json` so that it's not executed as a script, but is available
to one. The data runs through `wp_json_encode()` and is encoded with some flags
to ensure that potentially-dangerous characters are escaped.
However, this can lead to some challenges. Eagerly escaping when not necessary
can make the data difficult to comprehend when reading the output HTML. For example,
all non-ASCII Unicode characters are escaped with their code point equivalent.
This results in `\ud83c\udd70` instead of `🅰`.
In this patch, the flags for JSON encoding are refined to ensure what's necessary
while relaxing other rules (leaving in those Unicode characters if the blog charset
is UTF-8). This makes for Interactivity API data that's quicker as a human reader
to decipher and diagnose.
In summary:
- This data is JSON encoded and printed in a `<script type="application/json">` tag.
- If we ensure that `<` is never printed inside the data, it should be impossible to
break out of the script tag and the browser treats everything as the element's `textContent`.
- All other escaping becomes unnecessary at that point, including unicode escaping
if the page uses the UTF-8 charset (the same encoding as JSON).
See https://github.com/WordPress/wordpress-develop/pull/6433#pullrequestreview-2043218338
Developed in https://github.com/WordPress/wordpress-develop/pull/6520
Discussed in https://core.trac.wordpress.org/ticket/61170Fixes: #61170
Follow-up to: [57563].
Props: bjorsch, dmsnell, jonsurrell, sabernhardt, westonruter.
Built from https://develop.svn.wordpress.org/trunk@58159
git-svn-id: http://core.svn.wordpress.org/trunk@57622 1a063a9b-81f0-0310-95a4-ce76da25c4cd