OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-09 14:34:43 +00:00


Christoph Büscher 4ffa050735 Allow custom characters in token_chars of ngram tokenizers (#49250)

Currently the `token_chars` setting in both `edgeNGram` and `ngram` tokenizers
only allows for a list of predefined character classes, which might not fit
every use case. For example, including underscore "_" in a token would currently
require the `punctuation` class which comes with a lot of other characters.
This change adds an additional "custom" option to the `token_chars` setting,
which requires an additional `custom_token_chars` setting to be present and
which will be interpreted as a set of characters to include in a token.

Closes #25894
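
To make the effect of the new option concrete, here is a minimal Python sketch of how the settings described above might look and how they change token boundaries. It is an illustration under stated assumptions, not the tokenizer's actual Java implementation: the tokenizer name `my_tokenizer`, the `min_gram`/`max_gram` values, and the helper functions `is_token_char` and `split_tokens` are all hypothetical, and only the `letter`, `digit`, and `custom` classes are modeled.

```python
# Hypothetical index settings using the "custom" option from this change.
# "my_tokenizer" and the gram sizes are illustrative choices.
SETTINGS = {
    "analysis": {
        "tokenizer": {
            "my_tokenizer": {
                "type": "ngram",
                "min_gram": 2,
                "max_gram": 3,
                "token_chars": ["letter", "digit", "custom"],
                "custom_token_chars": "_",
            }
        }
    }
}


def is_token_char(ch, token_chars, custom_token_chars=""):
    """Simplified model: decide whether ch is kept inside a token."""
    if "letter" in token_chars and ch.isalpha():
        return True
    if "digit" in token_chars and ch.isdigit():
        return True
    if "custom" in token_chars and ch in custom_token_chars:
        return True
    return False


def split_tokens(text, token_chars, custom_token_chars=""):
    """Split text on every character that is not a token character."""
    # Mirror the requirement that "custom" needs custom_token_chars.
    if "custom" in token_chars and not custom_token_chars:
        raise ValueError("'custom' requires custom_token_chars to be set")
    tokens, current = [], []
    for ch in text:
        if is_token_char(ch, token_chars, custom_token_chars):
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    # Without "custom", the underscore splits the token.
    print(split_tokens("foo_bar-baz", ["letter", "digit"]))
    # With "custom" and "_", the underscore stays inside the token.
    print(split_tokens("foo_bar-baz", ["letter", "digit", "custom"], "_"))
```

In this sketch the underscore is kept inside a token without pulling in the rest of the `punctuation` class, so the hyphen still acts as a separator. The n-gram generation step itself is omitted; the sketch only models which characters are allowed inside a token.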

2019-11-20 10:37:12 +01:00

src

Allow custom characters in token_chars of ngram tokenizers (#49250)

2019-11-20 10:37:12 +01:00

build.gradle

Apply 2-space indent to all gradle scripts (#49071)

2019-11-14 11:01:23 +00:00