603 Commits

Author SHA1 Message Date
Adrien Grand 9ea25df649 Switch to murmurhash3 to route documents to shards.
We currently use the djb2 hash function in order to compute the shard a
document should go to. Unfortunately this hash function is not very
sophisticated and you can sometimes hit adversarial cases, such as numeric ids
on 33 shards.

Murmur3 generates hashes with a better distribution, which should avoid the
adversarial cases.

Here are some examples of how 100000 incremental ids are distributed to shards
using either djb2 or murmur3.

5 shards:
Murmur3: [19933, 19964, 19940, 20030, 20133]
DJB:     [20000, 20000, 20000, 20000, 20000]

3 shards:
Murmur3: [33185, 33347, 33468]
DJB:     [30100, 30000, 39900]

33 shards:
Murmur3: [2999, 3096, 2930, 2986, 3070, 3093, 3023, 3052, 3112, 2940, 3036, 2985, 3031, 3048, 3127, 2961, 2901, 3105, 3041, 3130, 3013, 3035, 3031, 3019, 3008, 3022, 3111, 3086, 3016, 2996, 3075, 2945, 2977]
DJB:     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 900, 900, 900, 900, 1000, 1000, 10000, 10000, 10000, 10000, 9100, 9100, 9100, 9100, 9000, 9000, 0, 0, 0, 0, 0, 0]

Even if djb2 looks ideal in some cases (5 shards), its hashes follow regular
patterns, and those patterns can produce badly skewed distributions with some
shard counts (e.g. 3, or even worse, 33).
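
The skew can be reproduced with a short Python sketch. It assumes the ids are
the decimal strings "0" through "99999", emulates Java's 32-bit signed overflow,
and picks the shard with a floor modulo. The names `djb2`, `murmur3_32`,
`murmur3_route` and `distribution` are illustrative and not taken from the
Elasticsearch code base. The murmur3 part further assumes seed 0 and
UTF-16LE-encoded ids, so its counts may differ from the figures above.

```python
from collections import Counter

MASK = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret the low 32 bits of x as a signed (Java-style) int."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def djb2(key: str) -> int:
    """djb2: hash = hash * 33 + char, wrapping at 32 bits."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & MASK
    return to_signed(h)


def rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit variant, returned as a signed int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK
    n = len(data)
    full = n - (n % 4)
    # Body: mix one little-endian 4-byte block at a time.
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = rotl32((k * c1) & MASK, 15)
        k = (k * c2) & MASK
        h = rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK
    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[full:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = rotl32((k * c1) & MASK, 15)
        k = (k * c2) & MASK
        h ^= k
    # Finalization: force the bits to avalanche.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return to_signed(h)


def murmur3_route(key: str) -> int:
    """Assumption: ids are hashed as UTF-16LE bytes with seed 0."""
    return murmur3_32(key.encode("utf-16-le"))


def distribution(hash_fn, num_shards: int, num_docs: int = 100_000) -> list:
    """Count how many incremental ids land on each shard."""
    # Python's % is a floor modulo, so negative hashes map to valid shards.
    counts = Counter(hash_fn(str(i)) % num_shards for i in range(num_docs))
    return [counts[s] for s in range(num_shards)]


if __name__ == "__main__":
    for shards in (5, 3, 33):
        print(shards, "shards")
        print("  Murmur3:", distribution(murmur3_route, shards))
        print("  DJB:    ", distribution(djb2, shards))
```

Under these assumptions the djb2 counts match the figures listed above (for
example `[30100, 30000, 39900]` on 3 shards). With ids this short, the djb2
hash modulo 33 depends only on the last digit and on the length of the id,
which is why most of the 33 shards stay empty.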

Some tests have been modified because they relied on implementation details of
the routing hash function.

Close #7954
2014-11-04 16:32:42 +01:00
Martijn Laarman 006acfe8bf Added missing percolate API parameters (percolate_routing, percolate_preference) to the REST API Spec
Closes #7173
2014-10-28 19:34:34 +01:00
Clinton Gormley b94b3b2bcd Tests: Refixed the put_alias tests
The missing/null index parameter is in the URL, not the body.
So it should be required, and throw a param error if not provided.

Relates #7863
2014-10-28 13:39:05 +01:00
Clinton Gormley f9b5906871 Test: Change missing/null alias test to catch a request error, not a param error
Relates: #7863
2014-10-28 11:43:10 +01:00
Spencer 055e766be7 Updated paths to be in line with #8240 2014-10-27 23:08:19 -07:00
tlrx 8c864cf3f6 Cat Recovery API: Reverting changes introduced with commit e1c75bae87
Adding these 2 headers to the CAT Recovery API made the CI tests hang for a long time.

Related to #8041
2014-10-27 20:49:58 +01:00
Zachary Tong f5b2dfd052 Aliases: Throw exception if index is null or missing when creating an alias
Fixes a bug where alias creation would allow `null` for index name, which thereby
applied the alias to _all_ indices.  This patch makes the validator throw an
exception if the index is null.

```bash
POST /_aliases
{
   "actions": [
      {
         "add": {
            "alias": "empty-alias",
            "index": null
         }
      }
   ]
}
```
```json
{
   "error": "ActionRequestValidationException[Validation Failed: 1: Alias action [add]: [index] may not be null;]",
   "status": 400
}
```

This bug wasn't caught by the existing tests because the old test for
nullness only validated against a cluster which had zero indices.
The null index is translated into "_all", and since there are no
indices, the request fails because the index doesn't exist.
So the test passes.

However, as soon as you add an index, "_all" resolves and you get the
situation described in the original bug report:  null index is
accepted by the alias, resolves to "_all" and gets applied to everything.
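
The interaction can be sketched with a toy model in Python. The helpers
`resolve_indices`, `add_alias_unvalidated` and `add_alias_validated` are
hypothetical and are not Elasticsearch code; they only mimic the
null-to-"_all" translation described above.

```python
class IndexMissingError(Exception):
    """Raised when an index expression matches no index."""


def resolve_indices(index, existing):
    """Toy resolver: a null index is treated like "_all"."""
    if index is None or index == "_all":
        if not existing:
            raise IndexMissingError("_all")
        return sorted(existing)
    if index not in existing:
        raise IndexMissingError(index)
    return [index]


def add_alias_unvalidated(alias, index, existing):
    """Old behaviour: resolve first, with no null check."""
    return {name: alias for name in resolve_indices(index, existing)}


def add_alias_validated(alias, index, existing):
    """Patched behaviour: reject a null index before resolving."""
    if index is None:
        raise ValueError("Alias action [add]: [index] may not be null")
    return add_alias_unvalidated(alias, index, existing)


if __name__ == "__main__":
    # Empty cluster: the old code fails for an unrelated reason,
    # so a test that only expects "some error" passes.
    try:
        add_alias_unvalidated("empty-alias", None, set())
    except IndexMissingError as exc:
        print("empty cluster:", repr(exc))
    # One index or more: the alias is silently applied everywhere.
    print(add_alias_unvalidated("empty-alias", None, {"logs", "users"}))
```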

The REST tests, on the other hand, explicitly tested this bug as a real feature
and therefore passed.  The REST tests were modified to expect the new behavior.

Fixes #7863
2014-10-27 14:39:01 -04:00
tlrx 96e62b3c1b [TESTS] Fix wrong assertion in test introduced by #8128 2014-10-27 11:11:54 +01:00
tlrx e1c75bae87 Cat API: Add node name to _cat/recovery
Add source_node and target_node fields to the recovery cat API. Also fixed and updated the documentation, which was incomplete regarding field names.

Closes #8041
2014-10-27 09:47:26 +01:00
Alex Ksikes c13f5f21de Term Vectors: support for distributed frequencies
Adds distributed frequencies support for the Term Vectors API. A new parameter
called `dfs` is introduced which defaults to `false`.

Closes #8144
2014-10-23 13:59:59 +02:00
Adrien Grand f4ee3f25e4 Mappings: Store _timestamp by default.
Storing `_timestamp` by default means that under the default configuration, you
would have all the information you need in order to reindex into a different
index.

Close #8139
2014-10-20 12:17:26 +02:00
Spencer 249a145a5c Update cat.nodes regex
Support for multi-word node names like 'Henry "Hank" McCoy'
2014-10-15 14:33:41 -07:00
Chris Earle 04926954e2 Updating test to support Windows file descriptor count (-1) 2014-10-15 14:02:32 -05:00
Chris Earle 29c5aaa1d3 Fixing test to allow decimal numbers for load. 2014-10-15 11:54:12 -05:00
Chris Earle 2d8a140ed8 Add file descriptor details to cat/nodes
cat/nodes currently does not report any details related to file descriptors. This adds the current number in use, the maximum number available, and their ratio (percentage) to cat/nodes as hidden-by-default metrics. In addition, this also adds current heap usage (as a non-percentage of its max) and RAM usage (as a non-percentage of its max) to allow tools to provide more granularity.

Closes #7652
2014-10-15 10:18:41 -05:00
Clinton Gormley dfcc0f97f0 Spec: Removed flush and max_num_segments from indices.upgrade 2014-10-11 17:19:06 +02:00
Clinton Gormley 1e47f02891 Rest: Added missing parameters to indices.upgrade 2014-10-11 16:40:45 +02:00
Clinton Gormley 20a901964c Spec: Added human flag to indices.get_upgrade 2014-10-10 17:26:36 +02:00
Karel Minarik b1d4cec7ab [SPEC] Separated the "Upgrade Index" API into two methods
* `get_upgrade` => `GET _upgrade`  -- Return the status
* `upgrade`     => `POST _upgrade` -- Perform the operation

Original specification part of c021f22523.

Related: #7884, #7922
2014-10-09 16:19:58 +02:00
Ryan Ernst c06c10bbb0 Remove deprecations from master (follow up to #7922) 2014-10-07 08:35:11 -07:00
Ryan Ernst c021f22523 Add Upgrade API
This commit does the following:
* Add the new API at the rest layer, being backed by the optimize API
  with upgrade flag, and segments api to find upgrade status.
* Add `upgrade` flag to optimize API, and deprecate `force` flag (will
  remove in master)
* Add test for both synchronous and async upgrade

closes #7884
closes #7922
2014-10-07 08:09:50 -07:00
Igor Motov 555bfcb02b [SNAPSHOT] Add repository validation
Fixes #7096
2014-10-07 10:50:16 -04:00
David Pilato f0052a58d6 Admin: show open and closed indices in _cat/indices
When asking for `GET /_cat/indices?v`, you can now retrieve closed indices in addition to open ones.

```
health status index              pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .marvel-2014.05.21   1   1       8792            0     21.7mb         21.7mb
       close  test
yellow open   .marvel-2014.05.22   1   1       3871            0     10.7mb         10.7mb
red    open   .marvel-2014.05.27   1   1
```

Closes #7907.
Closes #7936.
2014-10-03 13:16:14 +02:00
Alex Ksikes c4830cf862 Term Vectors: support for realtime
By default term vectors are now realtime, as opposed to previously near
realtime. If they are not found in the index, they will be generated on the
fly. The document is fetched from the transaction log and treated as an
artificial document. Set the `realtime` parameter to `false` to disable this
functionality. This change also makes the MLT query realtime when fetching
documents, as it was before it switched from the multi get API to the
mtv API.

Closes #7846
2014-10-03 09:26:47 +02:00
Honza Král 3aa4ac9caa [TEST] Make mlt rest tests pass on a single node cluster 2014-09-30 22:42:51 +02:00
Alex Ksikes b118558962 MLT Query: Support for artificial documents
Previously, the only way to specify a document not present in the index was to
use `like_text`. This would usually lead to complex queries made of multiple
MLT queries per document field. This commit adds the ability to the MLT query
to directly specify documents not present in the index (artificial documents).
The syntax is similar to the Percolator API or to the Multi Term Vector API.

Closes #7725
2014-09-29 15:49:13 +02:00
Spencer 19f5a86c21 Update indices.get_mapping.json 2014-09-25 09:53:27 -07:00
javanna 07ca08dbed [TEST] improved regular scroll REST test
Added sort clause by field and checked docs returned each time
2014-09-25 13:02:16 +02:00
javanna dc1ef7e670 [TEST] improved regular scroll REST test
The intermediate document that gets indexed has now a non existing id, and we make sure it doesn't get returned.
2014-09-25 11:46:05 +02:00
javanna f52375198b [TEST] add regular scroll REST test
Closes #7860
2014-09-25 11:03:12 +02:00
Brian Murphy 8e742c2096 Indexed Scripts/Templates : Cleanup
This contains several cleanups to the indexed scripts.
Remove the unused FetchSourceContext from the Get request.
Add lang,_version,_id to the REST GET API.
Removes the routing from GetIndexedScriptRequest since the script index is a single shard that is replicated across all nodes.
Fix backward compatible template file reference
Before 1.3.0 on disk scripts could be referenced by requesting
````
_search/template

{
  "template" : "ondiskscript"
}
````
This was broken in 1.3.0 by requiring
````
{
  "template" :
  {
    "file" : "ondiskscript"
  }
}
````
This commit restores the previous behavior.
Remove support for preference, realtime and refresh
These parameters don't make sense anymore for indexed scripts as we always force the preference to _local and
always refresh after a Put to the indexed scripts index.

Closes #7568
Closes #7559
Closes #7647
Closes #7567
2014-09-19 11:59:08 +01:00
javanna 7e0481d906 More Like This API: remove unused search_query_hint parameter
Closes #7691
2014-09-11 17:34:54 +02:00
Clinton Gormley 269b91c688 Spec: Fixed the docs URL for indices.get and indices.exists 2014-09-11 16:39:46 +02:00
Honza Král 480b90cfd6 [API] Fix minor issues with indices.get definition and tests
mark index param as required
make body match json, not string containing json
2014-09-11 14:36:11 +02:00
Colin Goodheart-Smithe 5fe782b784 Indices API: Added GET Index API
Returns information about settings, aliases, warmers, and mappings. Basically returns the IndexMetadata. This new endpoint replaces the /{index}/_alias|_aliases|_mapping|_mappings|_settings|_warmer|_warmers and /_alias|_aliases|_mapping|_mappings|_settings|_warmer|_warmers endpoints whilst maintaining the same response formats.  The only exception to this is on the /_alias|_aliases|_warmer|_warmers endpoint which will now return a section for 'aliases' or 'warmers' even if no aliases or warmers exist. This backwards compatibility change is documented in the reference docs.

Closes #4069
2014-09-11 11:19:21 +01:00
Boaz Leskes 4f8ddd97bf [Rest] reroute API response didn't filter metadata
By default the reroute API should return the new cluster state, excluding the metadata. However, it was wrongly using an old parameter (filter_metadata) and thus failed to do so. This commit restores that behavior by wiring it to the correct `metric` parameter. We also add an enum representing the possible metrics, to avoid similar mistakes in the future.

Closes #7520
Closes #7523
2014-09-10 14:48:06 +02:00
Martijn van Groningen 52f1ab6e16 Core: Added the `index.query.parse.allow_unmapped_fields` setting to fail queries if they refer to unmapped fields.
The percolator and filters in aliases by default enforce strict query parsing.

Closes #7335
2014-09-09 15:00:47 +02:00
javanna a857798e1c Indexed scripts: make sure headers are handed over to internal requests and streamline versioning support
The get, put and delete indexed script apis map to get, index and delete api and internally create those corresponding requests. We need to make sure that the original headers are handed over to the new request by passing the original request in the constructor when creating the new one.

Also streamlined the support for version and version_type in the REST layer since the parameters were not consistently parsed and set to the internal java API requests.

Modified the REST delete template and delete script actions to make use of a client instead of using the `ScriptService` directly.

Closes #7569
2014-09-04 16:00:32 +02:00
Boaz Leskes 0e6bb1f28b [Rest] Add the cluster name to the "/" endpoint
The root endpoint returns basic information about this node, like its name, ES version, etc. The cluster name is an important piece of information that belongs in that list.

Closes #7524
2014-09-01 10:05:11 +02:00
Simon Willnauer 91b8498cec [TEST] Port can have more or less than 4 digits 2014-08-29 08:57:35 +02:00
Clinton Gormley ce0b2ade9c Test: Fixed the regex in cat.indices/10_basic.yaml
And renamed the file from .yml to .yaml
2014-08-25 16:13:16 +02:00
Simon Willnauer 6950c38a04 Tests: Improve test coverage.
Close #7428
2014-08-25 11:56:38 +02:00
Alex Ksikes 62ef4a30dc Term vector API: return 'found: false' for docs between index and refresh
Closes #7121
2014-08-21 09:58:49 +02:00
Colin Goodheart-Smithe fb651f7755 [TEST] fix for wildcard_expansion REST tests
removed timeout from wait_for_green and set number of replicas to 0 so client tests can pass on single node.
2014-08-18 14:36:43 +01:00
Colin Goodheart-Smithe 925a4ba28b [TEST] fix to wildcard_expansion tests to wait for green status 2014-08-18 10:18:12 +01:00
Colin Goodheart-Smithe f4d75f0212 REST API: Allows all options for expand_wildcards parameter
This change means that the default settings for expand_wildcards are only applied if the expand_wildcards parameter is not specified, rather than being set upfront. It also adds the none and all options to the parameter to allow the user to specify no expansion and expansion to all indices (equivalent to 'open,closed').
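
The parameter semantics described above can be sketched in Python as follows. The function `parse_expand_wildcards` and its `default` argument are hypothetical names used for illustration only; the actual parsing lives in the Java REST layer and may differ in detail.

```python
def parse_expand_wildcards(value, default=("open",)):
    """Return the set of index states that wildcards should expand to.

    value is the raw query-string parameter, or None if it was not given.
    The default is applied only when the parameter is absent.
    """
    if value is None:
        return set(default)
    states = set()
    for token in value.split(","):
        token = token.strip()
        if token in ("open", "closed"):
            states.add(token)
        elif token == "all":
            # "all" is shorthand for "open,closed".
            states.update(("open", "closed"))
        elif token == "none":
            # "none" disables wildcard expansion entirely.
            states.clear()
        else:
            raise ValueError(f"No valid expand wildcard value [{token}]")
    return states


if __name__ == "__main__":
    print(parse_expand_wildcards(None))      # default applies
    print(parse_expand_wildcards("closed"))  # default is not merged in
    print(parse_expand_wildcards("all"))
    print(parse_expand_wildcards("none"))
```

Because the default is consulted only when the parameter is missing, `expand_wildcards=closed` yields closed indices alone instead of being merged with a pre-set `open`.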

Closes #7258
2014-08-15 12:50:11 +01:00
javanna 6d3bcc4451 Java API: add index, type and id to ExplainResponse
Index, type and id were returned as part of the REST explain API response, but not through the Java API. That info was read out of the request, relying on the fact that the index would get overridden with the concrete one within that same request.

Closes #7201
2014-08-08 12:52:03 +02:00
Clinton Gormley 9d65db4dba Test: Trimmed trailing whitespace to make valid YAML 2014-08-06 15:05:46 +02:00
Clinton Gormley 11f8edd74a REST spec: Added missing query_cache param to clear_cache, nodes.stats and indices.stats
Relates to #7167 and #7161
2014-08-06 13:32:33 +02:00
Shay Banon e6e2781ee7 [Query Cache] Add a request level flag to control query cache
A request-level flag, unset by default, controls the query cache. When it is not set, the index-level setting applies; when it is explicitly set, it overrides the index-level setting.
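
The precedence rule amounts to a tri-state flag. A minimal Python sketch, in which `use_query_cache` and its argument names are hypothetical and not part of the Elasticsearch API:

```python
from typing import Optional


def use_query_cache(request_flag: Optional[bool], index_setting: bool) -> bool:
    """Decide whether a request uses the query cache.

    request_flag is None when the request does not set the flag, in which
    case the index-level setting decides; otherwise the request wins.
    """
    return index_setting if request_flag is None else request_flag


if __name__ == "__main__":
    print(use_query_cache(None, True))    # falls back to the index setting
    print(use_query_cache(False, True))   # request overrides the index setting
```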
closes #7167
2014-08-05 18:28:49 +02:00