Currently the Tag Processor assumes that an input document is a ''full'' HTML document. Because of this, if there's lingering content after the last tag match it will treat that content as plaintext and skip over it. This is fine for the Tag Processor because if there is lingering content that isn't a valid tag then there's nothing for `next_tag()` to match.
However, in order to support a number of feature expansions it is important to recognize that the remaining content ''may'' involve partial syntax elements, such as incomplete tags, attributes, or comments.
In this patch we're adding a mode inside the Tag Processor which will flip when we start parsing HTML syntax but the document finishes before the token does. This will provide the ability to:
- extend the input document,
- avoid misinterpreting syntax as text, and
- guess if we have a complete document, know if we have an incomplete document.
In the process of building this patch a few fixes were identified and fixed in the Tag Processor, namely in the handling of incomplete syntax elements.
Props dmsnell, jonsurrell.
Fixes#60122, #60108.
Built from https://develop.svn.wordpress.org/trunk@57211
git-svn-id: http://core.svn.wordpress.org/trunk@56717 1a063a9b-81f0-0310-95a4-ce76da25c4cd