WordPress


dmsnell 5b3b3f7df2 HTML API: Add `normalize()` to give us the HTML we always wanted. HTML often appears in ways that are unexpected. It may be missing implicit tags, may have unquoted, single-quoted, or double-quoted attributes, may contain duplicate attributes, may contain unescaped text content, or any number of other possible invalid constructions. The HTML API understands all of these inputs, but downline parsers may not, and HTML snippets which are safe on their own may introduce problems when joined with other HTML snippets. This patch introduces the `serialize()` method on the HTML Processor, which prints a fully-normative HTML output, eliminating invalid markup along the way. It produces a string which contains every missing tag, double-quoted attributes, and no duplicates. A `normalize()` static method on the HTML Processor provides a convenient wrapper for constructing a fragment parser and immediately serializing. Subclasses relying on the `serialize_token()` method may perform structural HTML modifications with as much security as the upcoming `\Dom\HTMLDocument()` parser will, though these are not able to provide the full safety that will eventually appear with `set_inner_html()`. Further work may explore serializing to XML (which involves a number of other important transformations) and adding constraints to serialization (such as only allowing inline/flow/formatting elements and text). Developed in https://github.com/wordpress/wordpress-develop/pull/7331 Discussed in https://core.trac.wordpress.org/ticket/62036 Props dmsnell, jonsurrell, westonruter. Fixes #62036. Built from https://develop.svn.wordpress.org/trunk@59076 git-svn-id: http://core.svn.wordpress.org/trunk@58472 1a063a9b-81f0-0310-95a4-ce76da25c4cd		2024-09-20 22:32:17 +00:00
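
The commit message above describes `normalize()` as a static wrapper that builds a fragment parser and serializes it straight away. A minimal sketch of calling it follows, assuming WordPress is loaded and includes this commit. The helper name `example_normalize_snippet()` and the sample markup are hypothetical and not part of WordPress. The null check is a precaution, on the assumption that the method may decline markup it does not support.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): normalize an HTML snippet
 * when the HTML Processor is available, otherwise return null.
 */
function example_normalize_snippet( string $html ): ?string {
	// Assumption: WordPress is loaded and ships WP_HTML_Processor::normalize().
	if ( ! class_exists( 'WP_HTML_Processor' ) || ! method_exists( 'WP_HTML_Processor', 'normalize' ) ) {
		return null;
	}

	// Assumption: normalize() may return null for markup it cannot handle.
	return WP_HTML_Processor::normalize( $html );
}

// Sample input with unquoted and duplicate attributes, as the commit describes.
$messy = '<a href=#anchor v=5 href="/" enabled>One</a another v=5>';
$clean = example_normalize_snippet( $messy );

echo null === $clean ? "Could not normalize.\n" : $clean . "\n";
```

Outside WordPress the helper returns null, so the snippet runs without a fatal error in either environment.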
class-wp-html-active-formatting-elements.php	HTML API: Add missing tags in IN BODY insertion mode to HTML Processor.	2024-07-22 22:24:15 +00:00
class-wp-html-attribute-token.php	HTML API: Track spans of text with (offset, length) instead of (start, end).	2023-12-10 13:19:28 +00:00
class-wp-html-decoder.php	HTML API: Add missing `@global` tag on HTML Decoder.	2024-09-02 20:55:14 +00:00
class-wp-html-doctype-info.php	HTML API: Parse DOCTYPE tokens and set HTML parser mode accordingly.	2024-08-23 14:55:15 +00:00
class-wp-html-open-elements.php	HTML API: Only examine HTML nodes in `pop_until()` in stack of open elements.	2024-09-04 19:25:14 +00:00
class-wp-html-processor-state.php	HTML API: Respect document compat mode when handling CSS class names.	2024-09-04 04:34:15 +00:00
class-wp-html-processor.php	HTML API: Add `normalize()` to give us the HTML we always wanted.	2024-09-20 22:32:17 +00:00
class-wp-html-span.php	HTML API: Add PHP type annotations.	2024-07-19 23:44:16 +00:00
class-wp-html-stack-event.php	HTML API: Add PHP type annotations.	2024-07-19 23:44:16 +00:00
class-wp-html-tag-processor.php	HTML API: Add `normalize()` to give us the HTML we always wanted.	2024-09-20 22:32:17 +00:00
class-wp-html-text-replacement.php	HTML API: Add PHP type annotations.	2024-07-19 23:44:16 +00:00
class-wp-html-token.php	HTML API: Add support for SVG and MathML (Foreign content)	2024-08-08 07:25:15 +00:00
class-wp-html-unsupported-exception.php	HTML API: Add context to Unsupported_Exception class for improved debugging.	2024-07-12 22:29:13 +00:00
html5-named-character-references.php	Introduce Token Map: An optimized static translation class.	2024-05-23 19:56:08 +00:00