[Docs] Add clarification to analysis example (#31826)
There have been at least two PRs trying to fix the spelling of "lazi" because the example does not make it clear that the `english` analyzer stems each token. This adds a short description of the analysis process to make this clearer.

Relates to #31797
parent 03adbf2a39
commit 3c11c7c261
@@ -13,15 +13,18 @@ defined per index.
 [float]
 == Index time analysis
 
-For instance at index time, the built-in <<english-analyzer,`english`>> _analyzer_ would
-convert this sentence:
+For instance, at index time the built-in <<english-analyzer,`english`>> _analyzer_
+will first convert the sentence:
 
 [source,text]
 ------
 "The QUICK brown foxes jumped over the lazy dog!"
 ------
 
-into these terms, which would be added to the inverted index.
+into distinct tokens. It will then lowercase each token, remove frequent
+stopwords ("the") and reduce the terms to their word stems (foxes -> fox,
+jumped -> jump, lazy -> lazi). In the end, the following terms will be added
+to the inverted index:
 
 [source,text]
 ------
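
The steps the new wording describes (tokenize, lowercase, remove stopwords, stem) can be sketched in a few lines of Python. This is only a toy illustration, not the `english` analyzer itself: the names `analyze`, `toy_stem` and `STOPWORDS` are hypothetical, the stopword set is a small assumed subset, and the suffix rules are a hand-rolled stand-in for a real stemming algorithm, chosen only to reproduce the three stems mentioned in the diff.

```python
import re

# Assumed, deliberately tiny stopword list; the real analyzer ships its own.
STOPWORDS = {"a", "an", "and", "of", "the", "to"}


def toy_stem(token):
    """Very rough suffix stripping (NOT a real stemmer such as Porter)."""
    if len(token) > 4 and token.endswith("es"):
        return token[:-2]            # foxes -> fox
    if len(token) > 4 and token.endswith("ed"):
        return token[:-2]            # jumped -> jump
    if len(token) > 2 and token.endswith("y") and token[-2] not in "aeiou":
        return token[:-1] + "i"      # lazy -> lazi
    return token


def analyze(text):
    """Tokenize, lowercase, drop stopwords, then stem each remaining token."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)        # split into distinct tokens
    tokens = [t.lower() for t in tokens]               # lowercase each token
    tokens = [t for t in tokens if t not in STOPWORDS] # remove frequent stopwords
    return [toy_stem(t) for t in tokens]               # reduce to word stems


print(analyze("The QUICK brown foxes jumped over the lazy dog!"))
# ['quick', 'brown', 'fox', 'jump', 'over', 'lazi', 'dog']
```

To see the terms the real analyzer produces, the same sentence can be sent to the `_analyze` API of a running cluster with `"analyzer": "english"`; "lazi" there is a stem, not a typo.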