---
layout: default
title: Text analyzers
nav_order: 190
has_children: true
permalink: /analyzers/text-analyzers/
redirect_from:
- /opensearch/query-dsl/text-analyzers/
---
# Optimizing text for searches with text analyzers
OpenSearch applies text analysis to `text` fields during indexing and searching. By default, OpenSearch uses the standard analyzer for text analysis. To optimize unstructured text for search, you can convert it into structured text with our text analyzers.
## Text analyzers
OpenSearch provides several text analyzers to convert your unstructured text into the format that works best for your searches.
OpenSearch supports the following text analyzers:
- **Standard analyzer** Parses strings into terms at word boundaries according to the Unicode text segmentation algorithm. It removes most, but not all, punctuation and converts strings to lowercase. You can remove stop words if you enable that option, but it does not remove stop words by default.
- **Simple analyzer** Converts strings to lowercase and removes non-letter characters when it splits a string into tokens on any non-letter character.
- **Whitespace analyzer** Splits strings into terms at whitespace characters. It does not convert strings to lowercase.
- **Stop analyzer** Converts strings to lowercase and removes non-letter characters by splitting strings into tokens at each non-letter character. It also removes stop words (for example, "but" or "this") from strings.
- **Keyword analyzer** Receives a string as input and outputs the entire string as one term.
- **Pattern analyzer** Splits strings into terms using regular expressions and supports converting strings to lowercase. It also supports removing stop words.
- **Language analyzer** Provides analyzers specific to multiple languages.
- **Fingerprint analyzer** Creates a fingerprint to use as a duplicate detector.
The full specialized text analyzers reference is in progress and will be published soon.
{: .note }
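
To compare how the analyzers listed above tokenize the same input, you can send sample text to the `_analyze` API. The following request is a minimal sketch; the sample text is only an illustration. With the `whitespace` analyzer, the response should contain the terms `A`, `brief`, `history`, `of`, and `Time` with their original capitalization. Changing the `analyzer` value to `standard` should return the same terms in lowercase:

```json
GET _analyze
{
  "analyzer": "whitespace",
  "text": "A brief history of Time"
}
```
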
## How to use text analyzers
To use a text analyzer, specify its name in the `analyzer` field: `standard`, `simple`, `whitespace`, `stop`, `keyword`, `pattern`, `fingerprint`, or the name of a language analyzer (for example, `english`).
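
The `analyzer` field can also be set on a `text` field in an index mapping so that the analyzer is applied when documents are indexed. The following request is a sketch under assumptions: the index name `books` and the field name `title` are hypothetical and are not part of this page:

```json
PUT /books
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "simple"
      }
    }
  }
}
```

Unless a separate search analyzer is configured, the analyzer set in the mapping is also used to analyze query text for that field.
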
Each analyzer consists of zero or more character filters, exactly one tokenizer, and zero or more token filters, applied in that order. Character filters preprocess the string before the tokenizer splits it into tokens, and token filters then modify, add, or remove tokens. Different analyzers combine different character filters, tokenizers, and token filters.
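
The three building blocks map directly onto the settings of a custom analyzer. The following request is an illustrative sketch, not a recommended configuration: the index name `my-index` and the analyzer name `my_custom_analyzer` are hypothetical, while `html_strip`, `standard`, `lowercase`, and `stop` are built-in components:

```json
PUT /my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  }
}
```

In this sketch, the character filter strips HTML markup first, the tokenizer splits the remaining text into terms, and the token filters then lowercase the terms and remove stop words.
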
#### Example: Specify the standard analyzer in a simple query
```json
GET _search
{
  "query": {
    "match": {
      "title": {
        "query": "A brief history of Time",
        "analyzer": "standard"
      }
    }
  }
}
```
## Analyzer options
Option | Valid values | Description
:--- | :--- | :---
`analyzer` | `standard`, `simple`, `whitespace`, `stop`, `keyword`, `pattern`, `fingerprint`, or a language analyzer name | The analyzer you want to use for the query. Different analyzers have different character filters, tokenizers, and token filters. The `stop` analyzer, for example, removes stop words (for example, "an," "but," "this") from the query string. For a full list of acceptable language values, see [Language analyzer]({{site.url}}{{site.baseurl}}/query-dsl/analyzers/language-analyzers/).
`quote_analyzer` | String | The name of the analyzer to apply to quoted phrases in the query string of a `query_string` query, independently of the analyzer set in the `analyzer` option. For example, `"quote_analyzer": "standard"` analyzes quoted phrases with the standard analyzer.
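
As a sketch of how the two options can be combined, the following `query_string` request analyzes the unquoted part of the query with the `stop` analyzer and the quoted phrase with the `standard` analyzer. The field name `title` and the query text are illustrative assumptions:

```json
GET _search
{
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "\"brief history\" time",
      "analyzer": "stop",
      "quote_analyzer": "standard"
    }
  }
}
```
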
<!-- This is a list of the 7 individual new pages we need to write
If you want to select one of the text analyzers, see [Text analyzers reference]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/specialized-analyzers).
## Specialized text analyzers
1. Standard analyzer
1. Simple
1. Whitespace
1. Stop
1. Keyword
1. Pattern
1. Language
1. Fingerprint
-->