OpenSearch/docs
Adrien Grand 9ea25df649 Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function to compute the shard a document
should go to. Unfortunately, this hash function is not very sophisticated, and
some inputs hit adversarial cases, such as numeric ids on 33 shards.
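
As a rough illustration of the routing described above, here is a minimal
Python sketch. It assumes the classic djb2 variant (seed 5381, multiplier 33),
32-bit signed wraparound as with a Java int, and a plain modulo to pick the
shard. The names djb2_hash and shard_for are hypothetical and are not taken
from the codebase.

```python
def djb2_hash(key: str) -> int:
    """Classic djb2 (assumed variant): start at 5381, then hash = hash * 33 + char code."""
    h = 5381
    for ch in key:
        # Keep only the low 32 bits to mimic fixed-width integer overflow.
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed value.
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_for(doc_id: str, num_shards: int) -> int:
    """Hypothetical helper: map a document id to a shard number."""
    # Python's % returns a non-negative result for a positive divisor.
    return djb2_hash(doc_id) % num_shards


if __name__ == "__main__":
    for doc_id in ("1", "42", "1000"):
        print(doc_id, "->", shard_for(doc_id, 5))
```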

Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.

Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.

5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB:     [20000, 20000, 20000, 20000, 20000]

3 shards:
Murmur3: [33185, 33347, 33468]
DJB:     [30100, 30000, 39900]

33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB:     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]

Even if djb2 looks ideal in some cases (5 shards), its hashes follow regular
patterns that skew the distribution for some shard counts (e.g. 3, or even
worse, 33).
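
One way to see where the pattern comes from: djb2 multiplies the running hash
by 33 at every step, so as long as no 32-bit overflow has occurred, the hash
modulo 33 depends only on the last character of the id. The sketch below
assumes the classic djb2 variant (seed 5381, multiplier 33), 32-bit signed
wraparound, and ids 0 to 99999 rendered as decimal strings; shard_counts and
last_char_rule_holds are hypothetical helper names. Under those assumptions
the counts should reproduce the DJB rows listed above.

```python
def djb2_hash(key: str) -> int:
    """Classic djb2 (assumed variant) with 32-bit signed wraparound."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_counts(num_docs: int, num_shards: int) -> list:
    """Count how many of the ids 0..num_docs-1 land on each shard."""
    counts = [0] * num_shards
    for doc_id in range(num_docs):
        counts[djb2_hash(str(doc_id)) % num_shards] += 1
    return counts


def last_char_rule_holds(limit: int = 1000) -> bool:
    """Check that, for short ids (no overflow yet), the shard on 33 shards
    is fully determined by the last character of the id."""
    return all(
        djb2_hash(str(i)) % 33 == ord(str(i)[-1]) % 33
        for i in range(limit)
    )


if __name__ == "__main__":
    for n in (5, 3, 33):
        print(n, "shards:", shard_counts(100000, n))
    print("last-character rule holds for ids < 1000:", last_char_rule_holds())
```

With only ten possible last digits, numeric ids can reach only a small set of
residues modulo 33, which is why so many shards stay empty.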

Some tests have been modified because they relied on implementation details of
the routing hash function.

Close #7954
2014-11-04 16:32:42 +01:00
community Docs: Add elastics-rb to the list of community clients 2014-11-02 13:55:21 +01:00
groovy-api Updated groovy docs to point to the new groovy repo 2014-05-14 12:08:02 +02:00
java-api MLT Field Query: remove it from master 2014-10-29 10:19:00 +01:00
javascript added doc page for the JavaScript client, and listed it in the clients list. 2013-12-17 15:26:29 -07:00
perl Docs: Updated Perl client page to mention async client 2014-10-29 14:48:56 +01:00
python [DOCS] adding a note on python client versioning schema 2014-02-11 03:43:53 +01:00
reference Switch to murmurhash3 to route documents to shards. 2014-11-04 16:32:42 +01:00
resiliency Docs: Updated the resiliency docs to point to the DiscoveryWithServiceDisruptions class 2014-10-02 21:08:32 +02:00
river [DOCS] Fixed typo 2013-10-05 17:10:30 +02:00
ruby [DOC] Added comprehensive documentation for the Ruby and Rails integrations 2014-07-10 11:21:27 +02:00
README.md [DOCS] various docs fixes 2014-01-23 10:52:13 +01:00

README.md

The Elasticsearch docs are in AsciiDoc format and can be built using the Elasticsearch documentation build process.

See: https://github.com/elasticsearch/docs