[Docs] Add clarification to analysis example (#31826)

There have been at least two PRs trying to fix the spelling of "lazi" because it
isn't very clear from the example that the english analyzer will stem each token
in the example. This adds a short description of the analysis process to make
this clearer.

Relates to #31797
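
As a rough illustration of the analysis steps this change describes (tokenize, lowercase, remove stopwords, stem), here is a minimal Python sketch. It is not Elasticsearch's implementation: the names `toy_stem` and `toy_english_analyze` are hypothetical, the stopword list contains only "the", and the suffix rules are hand-written assumptions that cover just the words in the example sentence rather than a real stemming algorithm.

```python
import re

# Assumption: a one-word stopword list, enough for the example sentence only.
STOPWORDS = {"the"}


def toy_stem(token):
    """Hypothetical, heavily simplified stemmer; covers only this example."""
    if token.endswith("es") and len(token) > 4:
        return token[:-2]          # foxes -> fox
    if token.endswith("ed") and len(token) > 4:
        return token[:-2]          # jumped -> jump
    if token.endswith("y") and len(token) > 3:
        return token[:-1] + "i"    # lazy -> lazi
    return token


def toy_english_analyze(text):
    """Tokenize, lowercase, drop stopwords, then stem each remaining token."""
    tokens = re.findall(r"\w+", text)                    # split into word tokens
    tokens = [t.lower() for t in tokens]                 # lowercase each token
    tokens = [t for t in tokens if t not in STOPWORDS]   # remove stopwords
    return [toy_stem(t) for t in tokens]                 # reduce to toy stems


print(toy_english_analyze("The QUICK brown foxes jumped over the lazy dog!"))
```

The stem "lazi" is intentional: a stem only has to be identical for related word forms at index and search time, so it need not be a dictionary word.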
Christoph Büscher 2018-07-06 14:36:58 +02:00 committed by GitHub
parent 03adbf2a39
commit 3c11c7c261
1 changed file with 6 additions and 3 deletions

@@ -13,15 +13,18 @@ defined per index.
 [float]
 == Index time analysis
-For instance at index time, the built-in <<english-analyzer,`english`>> _analyzer_ would
-convert this sentence:
+For instance, at index time the built-in <<english-analyzer,`english`>> _analyzer_
+will first convert the sentence:
 [source,text]
 ------
 "The QUICK brown foxes jumped over the lazy dog!"
 ------
-into these terms, which would be added to the inverted index.
+into distinct tokens. It will then lowercase each token, remove frequent
+stopwords ("the") and reduce the terms to their word stems (foxes -> fox,
+jumped -> jump, lazy -> lazi). In the end, the following terms will be added
+to the inverted index:
 [source,text]
 ------