Commit Graph

3 Commits

Author SHA1 Message Date
Zachary Tong 5ee5cc25cc Move AsciiFolding earlier in FingerprintAnalyzer filter chain
Rearranges the FingerprintAnalyzer so that AsciiFolding comes earlier in the chain (after lowercasing, before stop removal, for maximum deduping power)

Closes 
2016-05-12 09:34:15 -04:00
Clinton Gormley 97a41ee973 First pass at improving analyzer docs ()
* Docs: First pass at improving analyzer docs

I've rewritten the intro to analyzers plus the docs
for all analyzers to provide working examples.

I've also removed:

* analyzer aliases (see )
* analyzer versions (see )
* snowball analyzer (see )

Next steps will be tokenizers, token filters, char filters

* Fixed two typos
2016-05-11 14:17:56 +02:00
Zachary Tong 80288ad60c Add `fingerprint` token filter and `fingerprint` analyzer
Adds a `fingerprint` token filter which uses Lucene's FingerprintFilter,
and a `fingerprint` analyzer that combines the Fingerprint filter with
lowercasing, stop word removal and asciifolding.

Closes 
2016-04-20 16:10:56 -04:00