Commit Graph

971 Commits

Author SHA1 Message Date
Adrien Grand 068c788ec8 Disable fielddata on text fields by defaults. #17386
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
2016-03-30 14:35:32 +02:00
javanna 8fc9dbbb99 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 14:27:04 +02:00
Clinton Gormley b87beeb05f Rename update-by-query REST tests to update_by_query 2016-03-29 13:13:49 +02:00
Clinton Gormley 647437ce56 REST: The body is required in the reindex API 2016-03-29 11:45:20 +02:00
Clinton Gormley 97606850e8 Renamed update-by-query REST spec to update_by_query 2016-03-29 11:45:20 +02:00
javanna de5cbda8e7 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 10:48:47 +02:00
Lee Hinman 80ab366de4 Add API to explain why a shard is or isn't assigned
This adds a new `/_cluster/allocation/explain` API that explains why a
shard can or cannot be allocated to nodes in the cluster. Additionally,
it will show where the master *desires* to put the shard, according to
the `ShardsAllocator`.

It looks like this:

```
GET /_cluster/allocation/explain?pretty
{
  "index": "only-foo",
  "shard": 0,
  "primary": false
}
```

Though, you can optionally send an empty body, which means "explain the
allocation for the first unassigned shard you find".

The output when a shard is unassigned looks like this:

```
{
  "shard" : {
    "index" : "only-foo",
    "index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
    "id" : 0,
    "primary" : false
  },
  "assigned" : false,
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2016-03-22T20:04:23.620Z"
  },
  "nodes" : {
    "V-Spi0AyRZ6ZvKbaI3691w" : {
      "node_name" : "Susan Storm",
      "node_attributes" : {
        "bar" : "baz"
      },
      "final_decision" : "NO",
      "weight" : 0.06666675,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    },
    "Qc6VL8c5RWaw1qXZ0Rg57g" : {
      "node_name" : "Slipstream",
      "node_attributes" : {
        "bar" : "baz",
        "foo" : "bar"
      },
      "final_decision" : "NO",
      "weight" : -1.3833332,
      "decisions" : [ {
        "decider" : "same_shard",
        "decision" : "NO",
        "explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
      } ]
    },
    "PzdyMZGXQdGhqTJHF_hGgA" : {
      "node_name" : "The Symbiote",
      "node_attributes" : { },
      "final_decision" : "NO",
      "weight" : 2.3166666,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    }
  }
}
```

And when the shard *is* assigned, the output looks like:

```
{
  "shard" : {
    "index" : "only-foo",
    "index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
    "id" : 0,
    "primary" : true
  },
  "assigned" : true,
  "assigned_node_id" : "Qc6VL8c5RWaw1qXZ0Rg57g",
  "nodes" : {
    "V-Spi0AyRZ6ZvKbaI3691w" : {
      "node_name" : "Susan Storm",
      "node_attributes" : {
        "bar" : "baz"
      },
      "final_decision" : "NO",
      "weight" : 1.4499999,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    },
    "Qc6VL8c5RWaw1qXZ0Rg57g" : {
      "node_name" : "Slipstream",
      "node_attributes" : {
        "bar" : "baz",
        "foo" : "bar"
      },
      "final_decision" : "CURRENTLY_ASSIGNED",
      "weight" : 0.0,
      "decisions" : [ {
        "decider" : "same_shard",
        "decision" : "NO",
        "explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
      } ]
    },
    "PzdyMZGXQdGhqTJHF_hGgA" : {
      "node_name" : "The Symbiote",
      "node_attributes" : { },
      "final_decision" : "NO",
      "weight" : 3.6999998,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    }
  }
}
```

Only "NO" decisions are returned by default, but all decisions can be
shown by specifying the `?include_yes_decisions=true` parameter in the
request.

Resolves #14593
2016-03-28 15:21:02 -06:00
Nik Everett 0e6141e675 Replace is_true: took with took >= 0
This prevents tests from failing on machines that can finish the request
less than half a millisecond.
2016-03-28 13:03:48 -04:00
javanna a685148268 [TEST] expand REST tests to check for roles in nodes info, nodes stats and tasks list response 2016-03-25 22:53:21 +01:00
javanna a9f4982c40 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-25 20:16:40 +01:00
Clinton Gormley 30d78f4be0 In cat.snapshots, repository is required
Closes #17216
2016-03-25 14:23:52 +01:00
javanna 27d4994aff Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-24 18:10:11 +01:00
Areek Zillur e16e113691 Remove suggest threadpool
In #17198, we removed suggest transport action, which
used the `suggest` threadpool to execute requests. Now
`suggest` threadpool is unused and suggest requests are
executed on the `search` threadpool.
2016-03-23 18:01:45 -04:00
Areek Zillur 91dd9b3301 Merge suggest stats into search stats 2016-03-23 16:37:56 -04:00
Honza Král b139f4e0bf [TEST] Move yaml test requiring yaml, add skip:yaml
Clients don't ship with yaml (de)serializer by default so this test must
be optionally skipped
2016-03-23 14:50:23 +01:00
javanna 030453d320 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-23 11:25:34 +01:00
Honza Král f8e84f0bbb [TEST] fix incorrect indent in ingest/70_bulk.yaml 2016-03-22 20:53:23 +01:00
Honza Král ca4b8667bb [TEST] Move yaml test requiring header, add skip:headers 2016-03-22 20:53:23 +01:00
Nik Everett da96b6e41d [reindex] Add thottling support
The throttle is applied when starting the next scroll request so that its
timeout can include the throttle time.
2016-03-22 12:34:14 -04:00
javanna eebd0cfccd Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-22 10:34:40 +01:00
Simon Willnauer 7f16a1d9a7 Improve upgrade experience of node level index settings
In 5.0 we don't allow index settings to be specified on the node level ie.
in yaml files or via commandline argument. This can cause problems during
upgrade if this was used extensively. For instance if analyzers where
specified on a node level this might cause the index to be closed when
imported (see #17187). In such a case all indices relying on this
must be updated via `PUT /${index}/_settings`. Yet, this API has slightly
different semantics since it overrides existing settings. To make this less
painful this change adds a `preserve_existing` parameter on that API to ensure
we have the same semantics as if the setting was applied on the node level.

This change also adds a better error message and a change to the migration guide
to ensure upgrades are smooth if index settings are specified on the node level.

If a index setting is detected this change fails the node startup and prints a message
like this:
```
*************************************************************************************
Found index level settings on node level configuration.

Since elasticsearch 5.x index level settings can NOT be set on the nodes
configuration like the elasticsearch.yaml, in system properties or command line
arguments.In order to upgrade all indices the settings must be updated via the
/${index}/_settings API. Unless all settings are dynamic all indices must be closed
in order to apply the upgradeIndices created in the future should use index templates
to set default values.

Please ensure all required values are updated on all indices by executing:

curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
  "index.number_of_shards" : "1",
  "index.query.default_field" : "main_field",
  "index.translog.durability" : "async",
  "index.ttl.disable_purge" : "true"
}'
*************************************************************************************
```
2016-03-21 20:12:18 +01:00
javanna 4077a2c9e1 adapt cluster stats REST tests after merge 2016-03-21 18:24:12 +01:00
javanna bf390a935e Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-21 17:18:23 +01:00
Martijn van Groningen e3b7e5d75a percolator: Replace percolate api with the new percolator query
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.

The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontext
in a binary doc values field. This make sure that the query rewrite happens only during
indexing (some queries for example fetch shapes, terms in remote indices) and
the speed up the loading of the queries in the percolator query cache.

Because the percolator now works inside the search infrastructure a number of features
(sorting fields, pagination, fetch features) are available out of the box.

The following feature requests are automatically implemented via this refactoring:

Closes #10741
Closes #7297
Closes #13176
Closes #13978
Closes #11264
Closes #10741
Closes #4317
2016-03-21 12:21:50 +01:00
Clinton Gormley 0543d46c1d Fixed regex in cat.recovery REST tes
The time column should accept integer ms or floating point seconds
2016-03-16 17:22:00 +01:00
Simon Willnauer 121e7c8ca4 Add infrastructure to run REST tests on a multi-version cluster
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.

Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.

Relates to #14406
Closes #17072
2016-03-13 10:52:39 +01:00
Jason Tedor f465d98eb3 Add raw recovery progress to cat recovery API
This commit adds fields bytes_recovered and files_recovered to the cat
recovery API. These fields, respectively, indicate the total number of
bytes and files recovered. Additionally, for consistency, some totals
fields and translog recovery fields have been renamed.

Closes #17064
2016-03-11 08:27:09 -05:00
Nik Everett b8d931d23c [reindex] Timeout if sub-requests timeout
Sadly, it isn't easy to simulate a timeout during an integration test, you
just have to cause one. Groovy's sleep should do the job.
2016-03-10 13:05:23 -05:00
Martijn van Groningen 0bbb84c19a test: 'Test bulk request with default pipeline' may get run first and then the total ingest count for pipeline1 is 2. 2016-03-10 15:18:08 +01:00
Martijn van Groningen 2fa33d5c47 Added ingest statistics to node stats API
The ingest stats include the following statistics:
* `ingest.total.count`- The total number of document ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested.
* `ingest.total.failed` - The total number ingest preprocessing operations failed during the lifetime of this node

Also these stats are returned on a per pipeline basis.
2016-03-10 13:21:43 +01:00
Nik Everett 6d0efae713 Teach list tasks api to wait for tasks to finish
_wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.

Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.

Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.

Closes #16906
2016-03-08 11:53:57 -05:00
Jun Ohtani 071d578953 Analysis : Allow string explain param in JSON
Move some test methods from AnalylzeActionIT to RestAnalyzeActionTest
Allow string explain param if it can parse
Fix wrong param name in rest-api-spec

Closes #16925
2016-03-08 16:19:02 +09:00
Martijn van Groningen 82d01e4315 Added ingest info to node info API, which contains a list of available processors.
Internally the put pipeline API uses this information in node info API to validate if all specified processors in a pipeline exist on all nodes in the cluster.
2016-03-07 14:44:50 +01:00
javanna 9c4a5bbe7e adapt cluster stats api to node.client setting removal
The cluster stats api now returns counts for each node role. The `master_data`, `master_only`, `data_only` and `client` fields have been removed from the response in favour of `master`, `data`, `ingest` and `coordinating_only`. The same node can have multiple roles, hence contribute to multiple roles counts. Every node is implicitly a coordinating node, so whenever a node has no explicit roles, it will be counted as coordinating only.
2016-03-05 10:55:19 +01:00
javanna f786e9866c adapt _cat/nodes to node.client removal
_cat/nodes used to return `c` for client node or `d` for data node as part of the node.role column. This commit changes it to return `m` for master eligible, `d` for data and/or `i` for ingest. A node with no explicit roles will be a coordinating only node and marked with `-`. A node can obviously have multiple roles. The master column has been adapted to return only whether a node is the current master (`*`) or not (`-`).
2016-03-05 10:55:19 +01:00
Nik Everett 4d6cb34417 [reindex] Add ingest support 2016-03-04 10:05:13 -05:00
Clinton Gormley 30669f63e8 Document required settings when running the REST test suite 2016-03-04 13:50:40 +01:00
Simon Willnauer 5008694ba1 Remove support for legacy checksums
Elasticsearch 5.0 doesn't support indices wiht legacy checksums anymore.
The last time we write legacy checksums was in 1.3.0 which was based
on lucene 4.9 already which means that all files have CRC32 checksums.
All indices that Elasticsearch can read today must be written with
lucene version >= 4.8 anyway so we can drop this layer of backwards
compatibility entirely.

Since we are close to upgrading to Lucene 6.0 we should get rid of this
in a more contiained change than the lucene upgrade.
2016-03-03 22:58:18 +01:00
Adrien Grand fc0cc4a6bb Fix field_stats tests to use text/keyword instead of string. 2016-03-03 16:24:02 +01:00
Clinton Gormley 6b27de3f8c Fixed REST test to not rely on dynamic mapping 2016-03-03 14:38:10 +01:00
Clinton Gormley ce7fccb287 Fixed bad YAML in REST tests 2016-03-03 14:38:06 +01:00
Martijn van Groningen 75387001df Added `ingest_took` to bulk response to indicate how much time was spent on ingest preprocessing.
The `ingest_took` is separate from `took`, which keeps track how much time is spent on indexing/deleting/updating.
The `ingest_took` is only visible in the rest response if at least for one bulk item has ingest enabled.
2016-03-01 18:24:26 +01:00
Nik Everett c7c8bb357a Merge pull request #16861 from nik9000/reindex_is_ready
Reindex required some parsing changes for search requests to support
differing defaults from the regular search api.
2016-03-01 10:02:48 -05:00
Spencer 3f80feb899 [REST_API_SPEC] remove invalid use of catch: param
`catch: param` is designed to catch errors generated by client-side validation logic when users don't supply valid parameters to an API request. This test though is testing the server-side validation of pipeline aggregations, and so a "param" catch is invalid. Instead we will just test for a parse_exception error type using a regex.
2016-02-29 09:27:36 -07:00
Nik Everett c38119bae9 Merge branch 'master' into feature/reindex 2016-02-26 16:59:54 -05:00
Igor Motov d6af669776 Combine node name and task id into single string task id
This commit changes the URL for task operations from `/_tasks/{nodeId}/{taskId}` to `/_tasks/{taskId}`, where `{taskId}` has a form of nodeid:id
2016-02-24 12:44:12 -08:00
Simon Willnauer 354aae2fec Merge pull request #16770 from s1monw/http_on_cat
Expose http address in cat/nodes and cat/nodeattrs APIs

We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too since otherwise json parsing is required to get this information
2016-02-22 14:20:33 -08:00
David Pilato a0a6eff0d0 Fix test for [cat/recovery] Make recovery time a TimeValue()
Related to #16743
2016-02-22 13:37:11 -08:00
Simon Willnauer 3c15200f6f Expose http address in cat/nodes and cat/nodeattrs APIs
We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too since otherwise json parsing is required to get this information
2016-02-22 13:22:54 -08:00
Lee Hinman 99052c3fef Limit the accepted length of the _id
Elasticsearch should reject ids that are this long, to ensure a document
always remains retrievable for clients that impose a maximum URI length

Closes #16034
2016-02-22 12:34:18 -07:00