OpenSearch

Commit Graph

Author	SHA1	Message	Date
Clinton Gormley	5da9e5dcbc	Docs: Improved tokenizer docs (#18356 ) * Docs: Improved tokenizer docs Added descriptions and runnable examples * Addressed Nik's comments * Added TESTRESPONSEs for all tokenizer examples * Added TESTRESPONSEs for all analyzer examples too * Added docs, examples, and TESTRESPONSES for character filters * Skipping two tests: One interprets "$1" as a stack variable - same problem exists with the REST tests The other because the "took" value is always different * Fixed tests with "took" * Fixed failing tests and removed preserve_original from fingerprint analyzer	2016-05-19 19:42:23 +02:00
Zachary Tong	5ee5cc25cc	Move AsciiFolding earlier in FingerprintAnalyzer filter chain Rearranges the FingerprintAnalyzer so that AsciiFolding comes earlier in the chain (after lowercasing, before stop removal, for maximum deduping power) Closes #18266	2016-05-12 09:34:15 -04:00
Clinton Gormley	97a41ee973	First pass at improving analyzer docs (#18269 ) * Docs: First pass at improving analyzer docs I've rewritten the intro to analyzers plus the docs for all analyzers to provide working examples. I've also removed: * analyzer aliases (see #18244) * analyzer versions (see #18267) * snowball analyzer (see #8690) Next steps will be tokenizers, token filters, char filters * Fixed two typos	2016-05-11 14:17:56 +02:00
Zachary Tong	80288ad60c	Add `fingerprint` token filter and `fingerprint` analyzer Adds a `fingerprint` token filter which uses Lucene's FingerprintFilter, and a `fingerprint` analyzer that combines the Fingerprint filter with lowercasing, stop word removal and asciifolding. Closes #13325	2016-04-20 16:10:56 -04:00

4 Commits