OpenSearch

Commit Graph

Author	SHA1	Message	Date
James Rodewig	988e8c8fc6	[DOCS] Swap `[float]` for `[discrete]` (#60134 ) Changes instances of `[float]` in our docs for `[discrete]`. Asciidoctor prefers the `[discrete]` tag for floating headings: https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks	2020-07-23 12:42:33 -04:00
James Rodewig	ab29162ab3	[DOCS] Fix tokenizer page titles (#58361 ) (#58598 ) Changes the titles for tokenizer pages to sentence case. Also moves the 'Path hierarchy tokenizer examples' page within the 'Path hierarchy tokenizer' page and adds a related redirect.	2020-06-26 09:24:41 -04:00
Andrei Balici	19a336e8d3	Add `max_token_length` setting to the CharGroupTokenizer (#56860 ) Adds `max_token_length` option to the CharGroupTokenizer. Updates documentation as well to reflect the changes. Closes #56676	2020-05-20 14:28:40 +02:00
James Rodewig	b59ecde041	[DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353 ) (#46502 )	2019-09-09 13:38:14 -04:00
James Rodewig	bb7bff5e30	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295 ) (#46418 )	2019-09-06 09:22:08 -04:00
Josh Soref	edb48321ba	[DOCS] Various spelling corrections (#37046 )	2019-01-07 14:44:12 +01:00
Itamar Syn-Hershko	5f172b6795	[Feature] Adding a char_group tokenizer (#24186 ) === Char Group Tokenizer The `char_group` tokenizer breaks text into terms whenever it encounters a character which is in a defined set. It is mostly useful for cases where a simple custom tokenization is desired, and the overhead of use of the <<analysis-pattern-tokenizer, `pattern` tokenizer>> is not acceptable. === Configuration The `char_group` tokenizer accepts one parameter: `tokenize_on_chars`:: A string containing a list of characters to tokenize the string on. Whenever a character from this list is encountered, a new token is started. Also supports escaped values like `\\n` and `\\f`, and in addition `\\s` to represent whitespace, `\\d` to represent digits and `\\w` to represent letters. Defaults to an empty list. === Example output ```The 2 QUICK Brown-Foxes jumped over the lazy dog's bone for $2``` When the configuration `\\s-:<>` is used for `tokenize_on_chars`, the above sentence would produce the following terms: ```[ The, 2, QUICK, Brown, Foxes, jumped, over, the, lazy, dog's, bone, for, $2 ]```	2018-05-22 16:26:31 +02:00

7 Commits