2013-06-28 11:17:23 -04:00
|
|
|
---
|
|
|
|
"Refresh":
|
2013-07-01 09:58:23 -04:00
|
|
|
|
2013-06-28 11:17:23 -04:00
|
|
|
- do:
|
|
|
|
indices.create:
|
|
|
|
index: test_1
|
|
|
|
body:
|
|
|
|
settings:
|
2014-02-10 04:58:13 -05:00
|
|
|
refresh_interval: -1
|
2014-04-07 12:59:25 -04:00
|
|
|
number_of_shards: 5
|
2014-02-10 04:58:13 -05:00
|
|
|
number_of_replicas: 0
|
2013-06-28 11:17:23 -04:00
|
|
|
- do:
|
|
|
|
cluster.health:
|
2014-02-10 04:58:13 -05:00
|
|
|
wait_for_status: green
|
2013-06-28 11:17:23 -04:00
|
|
|
|
|
|
|
- do:
|
|
|
|
index:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
|
|
|
id: 1
|
|
|
|
body: { foo: bar }
|
|
|
|
refresh: 1
|
|
|
|
|
Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.
Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.
Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.
5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB: [20000, 20000, 20000, 20000, 20000]
3 shards:
Murmur3: [33185, 33347, 33468]
DJB: [30100, 30000, 39900]
33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]
Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).
Some tests have been modified because they relied on implementation details of
the routing hash function.
Close #7954
2014-10-01 18:34:05 -04:00
|
|
|
# If you wonder why this document get 3 as an id instead of 2, it is because the
|
|
|
|
# current routing algorithm would route 1 and 2 to the same shard while we need
|
|
|
|
# them to be different for this test to pass
|
2013-06-28 11:17:23 -04:00
|
|
|
- do:
|
|
|
|
index:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.
Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.
Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.
5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB: [20000, 20000, 20000, 20000, 20000]
3 shards:
Murmur3: [33185, 33347, 33468]
DJB: [30100, 30000, 39900]
33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]
Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).
Some tests have been modified because they relied on implementation details of
the routing hash function.
Close #7954
2014-10-01 18:34:05 -04:00
|
|
|
id: 3
|
2013-06-28 11:17:23 -04:00
|
|
|
body: { foo: bar }
|
|
|
|
refresh: 1
|
|
|
|
|
|
|
|
- do:
|
|
|
|
search:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
|
|
|
body:
|
Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.
Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.
Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.
5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB: [20000, 20000, 20000, 20000, 20000]
3 shards:
Murmur3: [33185, 33347, 33468]
DJB: [30100, 30000, 39900]
33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]
Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).
Some tests have been modified because they relied on implementation details of
the routing hash function.
Close #7954
2014-10-01 18:34:05 -04:00
|
|
|
query: { terms: { _id: [1,3] }}
|
2013-06-28 11:17:23 -04:00
|
|
|
|
|
|
|
- match: { hits.total: 2 }
|
|
|
|
|
|
|
|
- do:
|
|
|
|
delete:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
|
|
|
id: 1
|
|
|
|
|
|
|
|
- do:
|
|
|
|
search:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
|
|
|
body:
|
Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.
Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.
Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.
5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB: [20000, 20000, 20000, 20000, 20000]
3 shards:
Murmur3: [33185, 33347, 33468]
DJB: [30100, 30000, 39900]
33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]
Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).
Some tests have been modified because they relied on implementation details of
the routing hash function.
Close #7954
2014-10-01 18:34:05 -04:00
|
|
|
query: { terms: { _id: [1,3] }}
|
2013-06-28 11:17:23 -04:00
|
|
|
|
|
|
|
- match: { hits.total: 2 }
|
|
|
|
|
|
|
|
- do:
|
|
|
|
delete:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.
Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.
Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.
5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB: [20000, 20000, 20000, 20000, 20000]
3 shards:
Murmur3: [33185, 33347, 33468]
DJB: [30100, 30000, 39900]
33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]
Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).
Some tests have been modified because they relied on implementation details of
the routing hash function.
Close #7954
2014-10-01 18:34:05 -04:00
|
|
|
id: 3
|
2013-06-28 11:17:23 -04:00
|
|
|
refresh: 1
|
|
|
|
|
2014-02-10 04:58:13 -05:00
|
|
|
# If a replica shard where doc 1 is located gets initialized at this point, doc 1
|
|
|
|
# won't be found by the following search as the shard gets automatically refreshed
|
|
|
|
# right before getting started. This is why this test only works with 0 replicas.
|
|
|
|
|
2013-06-28 11:17:23 -04:00
|
|
|
- do:
|
|
|
|
search:
|
|
|
|
index: test_1
|
|
|
|
type: test
|
|
|
|
body:
|
Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.
Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.
Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.
5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB: [20000, 20000, 20000, 20000, 20000]
3 shards:
Murmur3: [33185, 33347, 33468]
DJB: [30100, 30000, 39900]
33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]
Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).
Some tests have been modified because they relied on implementation details of
the routing hash function.
Close #7954
2014-10-01 18:34:05 -04:00
|
|
|
query: { terms: { _id: [1,3] }}
|
2013-06-28 11:17:23 -04:00
|
|
|
|
|
|
|
- match: { hits.total: 1 }
|