* Docs: Improved tokenizer docs
Added descriptions and runnable examples
* Addressed Nik's comments
* Added TESTRESPONSEs for all tokenizer examples
* Added TESTRESPONSEs for all analyzer examples too
* Added docs, examples, and TESTRESPONSES for character filters
* Skipping two tests:
One interprets "$1" as a stack variable - same problem exists with the REST tests
The other because the "took" value is always different
* Fixed tests with "took"
* Fixed failing tests and removed preserve_original from fingerprint analyzer
* Docs: First pass at improving analyzer docs
I've rewritten the intro to analyzers plus the docs
for all analyzers to provide working examples.
I've also removed:
* analyzer aliases (see #18244)
* analyzer versions (see #18267)
* snowball analyzer (see #8690)
Next steps will be tokenizers, token filters, char filters
* Fixed two typos
The 'default' / 'standard' analyzer can be a trappy default sicne it filters
english stopwords by default. Yet a default should not be dedicated to a certain language
since elasticsearch is used in many different scenarios where a standard analysis chain
with specialization to english full-text might be rather counter productive.
This commit changes the 'standard' analyzer to use an empty stopword list for indices
that are created from 1.0.0.Beta1 version onwards but will maintain backwards compatibiliy
for older indices.
Closes#3775