Previously, there were a few cases where the modifiable text read from an HTML comment differed slightly from the value of its inner text as parsed by a browser. These cases arise from the specific way that invalid HTML syntax tokens become "bogus comments."
This patch adds a new method to the Tag Processor that makes it possible to distinguish these cases, which matters when copying or serializing HTML from one source to another. Similar code was already in use in the html5lib tests; this patch also simplifies the test runner, which shows that the method was already needed.
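For illustration only, here is a rough Python sketch of the comment data a spec-following parser would store for the common bogus-comment forms, based on the tokenizer rules in the HTML standard. It is not the Tag Processor's implementation, the name `bogus_comment_data` is hypothetical, and it assumes HTML content (in SVG or MathML, `<![CDATA[` opens a real CDATA section instead).

```python
from typing import Optional


def bogus_comment_data(markup: str) -> Optional[str]:
    """Return the comment data a browser would keep for a bogus comment.

    Returns None when the markup is not one of the bogus-comment forms
    handled by this sketch (for example a normal comment or an end tag).
    """
    if markup.startswith("<?"):
        # "<?" opens a bogus comment and the "?" is kept as part of the data.
        start = 1
    elif markup.startswith("<!"):
        rest = markup[2:]
        # "<!--" is a normal comment and "<!DOCTYPE" is a doctype.
        if rest.startswith("--") or rest[:7].lower() == "doctype":
            return None
        start = 2
    elif markup.startswith("</"):
        next_char = markup[2:3]
        # "</" followed by an ASCII letter is an end tag; "</>" and a bare
        # "</" at the end of input do not produce a comment at all.
        if next_char in ("", ">") or (next_char.isascii() and next_char.isalpha()):
            return None
        start = 2
    else:
        return None

    # A bogus comment runs to the first ">" or to the end of the input.
    end = markup.find(">", start)
    data = markup[start:] if end == -1 else markup[start:end]
    # NULL characters are replaced with U+FFFD inside bogus comments.
    return data.replace("\x00", "\ufffd")


print(bogus_comment_data("<![CDATA[text]]>"))  # [CDATA[text]]
print(bogus_comment_data("<?wp-bit data?>"))   # ?wp-bit data?
print(bogus_comment_data("</%url>"))           # %url
```

A reader that exposes only the inner span of such a token (for example `text` for the CDATA lookalike) loses the surrounding characters that a browser keeps, which is the kind of difference described above.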
Developed in https://github.com/wordpress/wordpress-develop/pull/7342
Discussed in https://core.trac.wordpress.org/ticket/62036
Props dmsnell, jonsurrell.
See #62036.
Built from https://develop.svn.wordpress.org/trunk@59075
git-svn-id: http://core.svn.wordpress.org/trunk@58471 1a063a9b-81f0-0310-95a4-ce76da25c4cd