[DOCS] Add a lowercase email example to keyword tokenizer docs (#53257)

James Rodewig 2020-03-30 09:06:04 -04:00 committed by GitHub
parent 374e76d7cd
commit 21f362a2a8
1 changed files with 48 additions and 0 deletions

@@ -44,6 +44,54 @@ The above sentence would produce the following term:
[ New York ]
---------------------------
[discrete]
[[analysis-keyword-tokenizer-token-filters]]
=== Combine with token filters
You can combine the `keyword` tokenizer with token filters to normalize
structured data, such as product IDs or email addresses.
For example, the following <<indices-analyze,analyze API>> request uses the
`keyword` tokenizer and <<analysis-lowercase-tokenfilter,`lowercase`>> filter to
convert an email address to lowercase.
[source,console]
---------------------------
POST _analyze
{
  "tokenizer": "keyword",
  "filter": [ "lowercase" ],
  "text": "john.SMITH@example.COM"
}
---------------------------
/////////////////////
[source,console-result]
----------------------------
{
  "tokens": [
    {
      "token": "john.smith@example.com",
      "start_offset": 0,
      "end_offset": 22,
      "type": "word",
      "position": 0
    }
  ]
}
----------------------------
/////////////////////
The request produces the following token:
[source,text]
---------------------------
[ john.smith@example.com ]
---------------------------
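To see why the response contains exactly one token, the following Python sketch
mimics the two analysis steps for non-empty ASCII input. It is an illustration
only, not Elasticsearch code: `analyze_keyword_lowercase` is a hypothetical
helper, and Python's `str.lower()` only approximates the `lowercase` filter for
non-ASCII text.

```python
def analyze_keyword_lowercase(text: str) -> list:
    """Approximate the `keyword` tokenizer plus `lowercase` filter.

    Hypothetical helper for illustration; assumes non-empty ASCII input.
    """
    # Step 1, `keyword` tokenizer: the entire input becomes a single token.
    token = text
    # Step 2, `lowercase` filter: only the token text changes. The offsets
    # still describe the token's span in the original input.
    token = token.lower()
    return [
        {
            "token": token,
            "start_offset": 0,
            "end_offset": len(text),
            "type": "word",
            "position": 0,
        }
    ]


tokens = analyze_keyword_lowercase("john.SMITH@example.COM")
print(tokens[0]["token"])
```

Because the tokenizer never splits the input, `end_offset` equals the length of
the original text, which is 22 characters for this address.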
[float]
=== Configuration