OpenSearch/rest-api-spec
Adrien Grand 9ea25df649 Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.

Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.

Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.

5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB:     [20000, 20000, 20000, 20000, 20000]

3 shards:
Murmur3: [33185, 33347, 33468]
DJB:     [30100, 30000, 39900]

33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB:     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]

Even if djb2 looks ideal in some cases (5 shards), the fact that the
distribution of its hashes has some patterns can raise issues with some shard
counts (eg. 3, or even worse 33).

Some tests have been modified because they relied on implementation details of
the routing hash function.

Close #7954
2014-11-04 16:32:42 +01:00
..
api Added missing percolate API parameters (percolate_routing, percolate_preference) to the REST API Spec 2014-10-28 19:34:34 +01:00
test Switch to murmurhash3 to route documents to shards. 2014-11-04 16:32:42 +01:00
utils [UTIL] Fixed an error for the `--output` parameter in `thor api:spec:generate` 2013-12-05 17:36:54 +01:00
.gitignore Initial commit (blank repository) 2013-05-23 17:56:22 +02:00
LICENSE.txt [SETUP] Added README and LICENSE 2013-05-24 12:06:08 +02:00
README.markdown Update README with full command to generate spec 2013-12-11 18:33:45 +00:00

README.markdown

Elasticsearch REST API JSON specification

This repository contains a collection of JSON files which describe the Elasticsearch HTTP API.

Their purpose is to formalize and standardize the API, to facilitate development of libraries and integrations.

Example for the "Create Index" API:

{
  "indices.create": {
    "documentation": "http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index/",
    "methods": ["PUT", "POST"],
    "url": {
      "path": "/{index}",
      "paths": ["/{index}"],
      "parts": {
        "index": {
          "type" : "string",
          "required" : true,
          "description" : "The name of the index"
        }
      },
      "params": {
        "timeout": {
          "type" : "time",
          "description" : "Explicit operation timeout"
        }
      }
    },
    "body": {
      "description" : "The configuration for the index (`settings` and `mappings`)"
    }
  }
}

The specification contains:

  • The name of the API (indices.create), which usually corresponds to the client calls
  • Link to the documentation at http://elasticsearch.org
  • List of HTTP methods for the endpoint
  • URL specification: path, parts, parameters
  • Whether body is allowed for the endpoint or not and its description

The methods and url.paths elements list all possible HTTP methods and URLs for the endpoint; it is the responsibility of the developer to use this information for a sensible API on the target platform.

Utilities

The repository contains some utilities in the utils directory:

  • The thor api:generate:spec will generate the basic JSON specification from Java source code
  • The thor api:generate:code generates Ruby source code and tests from the specs, and can be extended to generate assets in another programming language

Run bundle install and then thor list in the utils folder.

The full command to generate the api spec is:

thor api:spec:generate --output=myfolder --elasticsearch=/path/to/es

License

This software is licensed under the Apache 2 license.