Similar to interpolation, we do not want to completely remove whitespace
nodes that are siblings of an expansion.
For example, the following template
```html
<div>
<strong>items left<strong> {count, plural, =1 {item} other {items}}
</div>
```
was being collapsed to
```html
<div><strong>items left<strong>{count, plural, =1 {item} other {items}}</div>
```
which results in the text looking like
```
items left4
```
instead it should be collapsed to
```html
<div><strong>items left<strong> {count, plural, =1 {item} other {items}}</div>
```
which results in the text looking like
```
items left 4
```
---
**Analysis of the code and manual testing has shown that this does not cause
the generated ids to change, so there is no breaking change here.**
PR Close#31962
Leading trivia, such as whitespace or comments, is
confusing for developers looking at source-mapped
templates, since they expect the source-map segment
to start after the trivia.
This commit adds skipping trivial characters to the lexer;
and then implements that in the template parser.
PR Close#30095
This PR alligns markup language lexer with the previous behaviour in version 7.x:
https://stackblitz.com/edit/angular-iancj2
While this behaviour is not perfect (we should be giving users an error message
here about invalid HTML instead of assuming text node) this is probably best we
can do without more substential re-write of lexing / parsing infrastructure.
This PR just fixes#29231 and restores VE behaviour - a more elaborate fix will
be done in a separate PR as it requries non-trivial rewrites.
PR Close#29328
This PR alligns markup language lexer with the previous behaviour in version 7.x:
https://stackblitz.com/edit/angular-iancj2
While this behaviour is not perfect (we should be giving users an error message
here about invalid HTML instead of assuming text node) this is probably best we
can do without more substential re-write of lexing / parsing infrastructure.
This PR just fixes#29231 and restores VE behaviour - a more elaborate fix will
be done in a separate PR as it requries non-trivial rewrites.
PR Close#29328
Previously the start of a character indicated by an escape sequence
was being incorrectly computed by the lexer, which caused tokens
to include the start of the escaped character sequence in the
preceding token. In particular this affected the name extracted
from opening tags if the name was terminated by an escape sequence.
For example, `<t\n>` would have the name `t\` rather than `t`.
This fix refactors the lexer to use a "cursor" object to iterate over
the characters in the template source. There are two cursor implementations,
one expects a simple string, the other expects a string that contains
JavaScript escape sequences that need to be unescaped.
PR Close#28978
The parts of a token are supposed to be an array of not-null strings,
but we were using `null` for tags that had no prefix. This has been
fixed to use the empty string in such cases, which allows the `null !`
hack to be removed.
PR Close#28978
When tokenizing markup (e.g. HTML) element attributes
can have quoted or unquoted values (e.g. `a=b` or `a="b"`).
The `ATTR_VALUE` tokens were capturing the quotes, which
was inconsistent and also affected source-mapping.
Now the tokenizer captures additional `ATTR_QUOTE` tokens,
which the HTML related parsers understand and factor into their
token parsing.
PR Close#28055
In order to support source mapping of templates, we need
to be able to tokenize the template in its original context.
When the template is defined inline as a JavaScript string
in a TS/JS source file, the tokenizer must be able to handle
string escape sequences, such as `\n` and `\"` as they
appear in the original source file.
This commit teaches the lexer how to unescape these
sequences, but only when the `escapedString` option is
set to true. Otherwise there is no change to the tokenizing
behaviour.
PR Close#28055
The lexer that does the tokenizing can now process only a part the source
string, by passing a `range` property in the `options` argument. The
locations of the nodes that are tokenized will now take into account the
position of the span in the context of the original source string.
This `range` option is, in turn, exposed from the template parser as well.
Being able to process parts of files helps to enable SourceMap support
when compiling inline component templates.
PR Close#28055
This commit consolidates the options that can modify the
parsing of text (e.g. HTML, Angular templates, CSS, i18n)
into an AST for further processing into a single `options`
hash.
This makes the code cleaner and more readable, but also
enables us to support further options to parsing without
triggering wide ranging changes to code that should not
be affected by these new options. Specifically, it will let
us pass information about the placement of a template
that is being parsed in its containing file, which is essential
for accurate SourceMap processing.
PR Close#28055