The wildcard cat API REST tests relied on bulk.max and bulk.min in
the thread_pool response. However, because the thread pool types are
randomized in InternalTestCluster, the min/max values were not guaranteed
to exist (the cached thread pool type is unbounded and thus does not have a
max value).
To prevent this, the test has been removed. The cat nodes test, which
always returns stats about the heap, is now used for wildcard testing
instead.
The tests for the recently added wildcard feature were
relying on the iteration order of the hashmap being used, which is
not guaranteed.
The implementation now ensures that the header fields are
parsed in the order they have been added.
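A minimal sketch of the ordering guarantee, written in Python rather than the project's Java: wildcard patterns are expanded against headers kept in insertion order, so the resulting column order is deterministic. The helper name `expand_headers` and the sample header list are illustrative assumptions, not the actual implementation.

```python
from fnmatch import fnmatchcase

def expand_headers(requested, available):
    """Expand wildcard patterns against `available`, keeping the order in
    which the headers were added (hypothetical helper, illustration only)."""
    expanded = []
    for pattern in requested:
        # `available` is a list, so iteration order is stable across runs,
        # unlike the iteration order of a plain hash map.
        for name in available:
            if fnmatchcase(name, pattern) and name not in expanded:
                expanded.append(name)
    return expanded

# Sample header names, assumed for the example.
available = ["heap.current", "heap.percent", "heap.max", "ram.current"]
print(expand_headers(["heap.*"], available))
```

A test can then assert on the exact column order instead of on a set of columns.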
This change adds a new "filter_path" parameter that can be used to filter and reduce the responses returned by the REST API of elasticsearch.
For example, returning only the shards that failed to be optimized:
```
curl -XPOST 'localhost:9200/beer/_optimize?filter_path=_shards.failed'
{"_shards":{"failed":0}}
```
It supports multiple filters (separated by a comma):
```
curl -XGET 'localhost:9200/_mapping?pretty&filter_path=*.mappings.*.properties.name,*.mappings.*.properties.title'
```
It also supports the YAML response format. Here it returns only the `_id` field of a newly indexed document:
```
curl -XPOST 'localhost:9200/library/book?filter_path=_id' -d '---hello:\n world: 1\n'
---
_id: "AU0j64-b-stVfkvus5-A"
```
It also supports wildcards. Here it returns only the host name of every node in the cluster:
```
curl -XGET 'http://localhost:9200/_nodes/stats?filter_path=nodes.*.host*'
{"nodes":{"lvJHed8uQQu4brS-SXKsNA":{"host":"portable"}}}
```
And "**" can be used to include sub fields without knowing the exact path. Here it returns only the Lucene version of every segment:
```
curl 'http://localhost:9200/_segments?pretty&filter_path=indices.**.version'
{
  "indices" : {
    "beer" : {
      "shards" : {
        "0" : [ {
          "segments" : {
            "_0" : {
              "version" : "5.2.0"
            },
            "_1" : {
              "version" : "5.2.0"
            }
          }
        } ]
      }
    }
  }
}
```
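The matching rules shown above (comma-separated filters, `*` within one path segment, `**` across nested levels) can be sketched in Python on a plain nested dict. This is only an illustration of the semantics as described here, not the actual implementation: the function names are made up, `**` is assumed to also match zero levels, and details such as a trailing `**` or `null` values are not handled.

```python
from fnmatch import fnmatchcase

_DROP = object()  # sentinel: nothing under this node matched

def _filter(node, patterns):
    # `patterns` holds the remaining segments of every filter still alive
    # here; an empty segment list means that filter has fully matched.
    if any(not p for p in patterns):
        return node
    if isinstance(node, list):
        kept = [r for r in (_filter(i, patterns) for i in node) if r is not _DROP]
        return kept if kept else _DROP
    if not isinstance(node, dict):
        return _DROP
    result = {}
    for key, value in node.items():
        alive = []
        for p in patterns:
            head, rest = p[0], p[1:]
            if head == "**":
                alive.append(p)  # "**" swallows this key and stays active
                if rest and fnmatchcase(key, rest[0]):
                    alive.append(rest[1:])  # "**" stops, next segment matches key
            elif fnmatchcase(key, head):
                alive.append(rest)
        if alive:
            sub = _filter(value, alive)
            if sub is not _DROP:
                result[key] = sub
    return result if result else _DROP

def filter_path(doc, spec):
    """Apply a comma-separated list of dotted filters to `doc`."""
    patterns = [p.split(".") for p in spec.split(",") if p]
    result = _filter(doc, patterns)
    return {} if result is _DROP else result

# Sample response, trimmed down and extended with a made-up `num_docs` field.
segments = {"indices": {"beer": {"shards": {"0": [{"segments": {
    "_0": {"version": "5.2.0", "num_docs": 3},
    "_1": {"version": "5.2.0", "num_docs": 1}}}]}}}}
print(filter_path(segments, "indices.**.version"))
```

All filters are evaluated in a single pass over the document, so fields selected by different filters end up merged in one response.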
Note that elasticsearch sometimes returns the raw value of a field directly, like the _source field. To filter _source fields, consider combining the existing _source parameter (see the Get API for more details) with the filter_path parameter, like this:
```
curl -XGET 'localhost:9200/_search?pretty&filter_path=hits.hits._source&_source=title'
{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}
```
Mapping conflicts should not be ignored. If I read the history correctly, this
option was added when a mapping update to an existing field was considered a
conflict, even if the new mapping was exactly the same. Now that mapping updates
are smart enough to detect conflicting options, we don't need an option to
ignore conflicts.
There are currently small differences between the search api and the count, exists, validate query and explain apis when it comes to reading query_string parameters: `analyze_wildcard`, `lowercase_expanded_terms` and `lenient` are only read by the search api and ignored by all the other mentioned apis. Unified the code to fix this and make sure it doesn't happen again. Also shared some code for printing out the query as part of the SearchSourceBuilder conversion to ToXContent.
Extended REST spec to include all the supported params (some that were already supported weren't listed), and added REST tests (also some basic tests for count and search_exists which weren't tested at all).
Closes #11057
Removes the More Like This API; users should now use the More Like This query.
The MLT API tests were converted to their query equivalent. Also some clean
ups in MLT tests.
Closes #10736
Closes #11003
This removes Elasticsearch's filter cache and uses Lucene's instead. It has some
implications:
- custom cache keys (`_cache_key`) are unsupported
- decisions are made internally and can't be overridden by users (`_cache`)
- not only filters can be cached but also all queries that do not need scores
- parent/child queries can now be cached, however cached entries are only
valid for the current top-level reader so in practice it will likely only
be used on read-only indices
- the cache deduplicates filters, which plays nicer with large keys (e.g. `terms`)
- better stats: we already had ram usage and evictions, but now also hit count,
miss count, lookup count, number of cached doc id sets and current number of
doc id sets in the cache
- dynamically changing the filter cache size is not supported anymore
Internally, an important change is that it removes the NoCacheFilter infrastructure
in favour of having Query.rewrite specialize the query for the current reader so
that it will only be cached on this reader (look for IndexCacheableQuery).
Note that consuming filters with the query API (createWeight/scorer) instead of
the filter API (getDocIdSet) is important for parent/child queries because
otherwise a QueryWrapperFilter(ParentQuery) would run the wrapped query per
segment, while relations might span segments.
Remove the ability to specify the search types ‘query_and_fetch’ and
‘dfs_query_and_fetch’ from the REST API.
- Adds REST tests
- Updates REST API spec to remove ‘query_and_fetch’ and
‘dfs_query_and_fetch’ as options
- Removes documentation for these options
Closes #9606
The current delete-by-query implementation is dangerous: it unexpectedly refreshes,
which can quickly cause an unhealthy index (segment explosion). It
can also delete different documents on primary vs replicas, causing
inconsistent replicas.
For 2.0 we will replace this with an optional plugin that does a
scan/scroll search and then issues bulk delete requests.
Closes #10859
This commit adds support for structured errors / failures / exceptions
on the elasticsearch REST layer. Exceptions are rendered with at least
a `type` and a `reason` corresponding to the exception name and the message.
Some exceptions, like the ones associated with an index or a shard, will have
additional information about the index or the shard the exception was
triggered on, respectively.
Each rendered response will also contain a list of root causes, which is a list
of distinct shard level errors returned for the request. Root causes are the lowest
level elasticsearch exceptions found per shard response and are intended to be displayed
to the user to indicate the source of the exception.
Shard level responses are by default grouped by their type and reason to reduce the number
of duplicates returned. Yet, the same exception returned from different indices will not be
grouped.
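The grouping rule described above can be sketched as follows, in Python for brevity. The dict layout (`index`, `shard`, `type`, `reason`) and the function name are assumptions made for this example; only `type` and `reason` are named in this commit message.

```python
def group_shard_failures(failures):
    """Collapse shard failures that share index, type and reason, keeping
    the first failure of each group (hypothetical helper)."""
    grouped = {}
    for failure in failures:
        # The index is part of the key, so the same exception coming from
        # two different indices is reported twice.
        key = (failure["index"], failure["type"], failure["reason"])
        grouped.setdefault(key, failure)
    return list(grouped.values())

# Made-up sample data: two shards of one index fail identically,
# a shard of another index fails with the same type and reason.
failures = [
    {"index": "beer", "shard": 0, "type": "parse_exception", "reason": "bad query"},
    {"index": "beer", "shard": 1, "type": "parse_exception", "reason": "bad query"},
    {"index": "wine", "shard": 0, "type": "parse_exception", "reason": "bad query"},
]
for cause in group_shard_failures(failures):
    print(cause["index"], cause["shard"], cause["type"])
```

With this sample input the two `beer` failures collapse into one entry, while the `wine` failure stays separate.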
Closes #3303
This commit splits the current ClusterBlockLevel.METADATA into two distinct blocks, ClusterBlockLevel.METADATA_READ and ClusterBlockLevel.METADATA_WRITE. This makes it possible to distinguish between
an operation that modifies the index or cluster metadata and an operation that does not change any metadata.
Before this commit, many operations were blocked when the cluster was read-only: Cluster Stats, Get Mappings, Get Snapshot, Get Index Settings, etc. Now those operations are allowed even when
the cluster or the index is read-only.
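As a rough illustration of the split (a Python sketch, not the Java code of this commit), a read-only block can cover metadata writes without covering metadata reads. The `READ` and `WRITE` levels, the composition of the read-only block and the operation names are assumptions made for the example.

```python
from enum import Enum

class ClusterBlockLevel(Enum):
    READ = "read"
    WRITE = "write"
    METADATA_READ = "metadata_read"
    METADATA_WRITE = "metadata_write"

# Assumption: a read-only block covers data writes and metadata writes only.
READ_ONLY_BLOCK = frozenset({ClusterBlockLevel.WRITE, ClusterBlockLevel.METADATA_WRITE})

# Hypothetical mapping from an operation name to the level it requires.
OPERATION_LEVEL = {
    "get_mappings": ClusterBlockLevel.METADATA_READ,
    "get_index_settings": ClusterBlockLevel.METADATA_READ,
    "put_mapping": ClusterBlockLevel.METADATA_WRITE,
}

def is_allowed(operation, blocked_levels):
    """Return True if the operation's required level is not blocked."""
    return OPERATION_LEVEL[operation] not in blocked_levels

for op in OPERATION_LEVEL:
    print(op, is_allowed(op, READ_ONLY_BLOCK))
```

With a single `METADATA` level, both kinds of operations would have had to share one entry in the block, which is why reads were rejected on a read-only cluster.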
Related to #8102, #2833
Closes #3703
Closes #5855
Closes #10521
Closes #10522
The field stats api returns field level statistics such as the lowest and highest values and the number of documents that have at least one value for a field.
An api like this can be useful to explore a data set you don't know much about. For example, you can figure out what the lowest and highest response times are, so that you can create a histogram or range aggregation with sane settings.
This api doesn't run a search to figure these statistics out, but rather uses the Lucene index to look them up (using the Terms class in Lucene). So finding out these stats for fields is cheap and quick.
The min/max values are based on the type of the field: for a numeric field the min/max are numbers, for a date field they are dates, and for other fields they are term based.
Closes #10523
Also changed the stash logger to no longer log all stashed values under debug (it now uses trace), and to dump the stash content upon failure (under info, as XContent).
Extends ShardStats with commit specific information. We currently expose commit id, generation and the user data map.
The information is also retrievable via the REST API by using `GET _stats?level=shards`.
Closes #10687
In Lucene 5.1 lots of filters got deprecated in favour of equivalent queries.
Additionally, random-access to filters is now replaced with approximations on
scorers. This commit
- replaces the deprecated NumericRangeFilter, PrefixFilter, TermFilter and
TermsFilter with NumericRangeQuery, PrefixQuery, TermQuery and TermsQuery,
wrapped in a QueryWrapperFilter
- replaces XBooleanFilter, AndFilter and OrFilter with a BooleanQuery in a
QueryWrapperFilter
- removes DocIdSets.isBroken: the new two-phase iteration API will now help
execute slow filters efficiently
- replaces FilterCachingPolicy with QueryCachingPolicy
Close #8960
This commit changes dynamic mappings updates so that they are synchronous on the
entire cluster and their validity is checked by the master node. There are some
important consequences of this commit:
- a failing index request on a non-existing type does not implicitly create
the type anymore
- dynamic mappings updates cannot create inconsistent mappings on different
shards
- indexing requests that introduce new fields might induce latency spikes
because of the overhead to update the mappings on the master node
Close #8688
This option defaults to false, because it is also important to upgrade
the "merely old" segments since many Lucene improvements happen within
minor releases.
But you can pass true to do the minimal work necessary to upgrade to
the next major Elasticsearch release.
The HTTP GET upgrade request now also breaks out how many bytes of
ancient segments need upgrading.
Closes #10213
Closes #10540
Conflicts:
dev-tools/create_bwc_index.py
rest-api-spec/api/indices.upgrade.json
src/main/java/org/elasticsearch/action/admin/indices/optimize/OptimizeRequest.java
src/main/java/org/elasticsearch/action/admin/indices/optimize/ShardOptimizeRequest.java
src/main/java/org/elasticsearch/action/admin/indices/optimize/TransportOptimizeAction.java
src/main/java/org/elasticsearch/index/engine/InternalEngine.java
src/test/java/org/elasticsearch/bwcompat/StaticIndexBackwardCompatibilityTest.java
src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
src/test/java/org/elasticsearch/rest/action/admin/indices/upgrade/UpgradeReallyOldIndexTest.java
For backwards compatibility reasons routing_nodes were previously printed out when routing_table was requested, together with the actual routing_table. Now they are printed out only when requested through the `routing_nodes` flag.
Relates to #10412
Closes #10486
The cluster state api returns both the routing_table and routing_nodes sections whenever routing_table is requested. That is pretty much the same info, just grouped differently. This commit makes it possible to request the two separately. Yet, routing_table still returns both for backwards compatibility reasons.
Closes #10352
Closes #10412
Align the get indexed scripts and get search template apis with our get api, which returns a response body with a `found` boolean flag when the document is not found. Also, always return metadata info.
Closes #7325
Closes #10396
RoutingTable's activePrimaryShardsGrouped(), allActiveShardsGrouped() and
allAssignedShardsGrouped() methods treated an empty index array input
parameter as meaning "all" indices and expanded it to the routing map's
key set. However, the expansion of index names is now already done in
MetaData#concreteIndices(). Returning an empty index name list here
when a wildcard pattern didn't match any index name could lead to
problems like #9081, for example with the recovery endpoint, because
the RoutingTable still expanded this list of names to "_all".
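The difference can be sketched in a few lines of Python. The function names and the dict-based routing table are made up for the illustration; the real methods live in the Java RoutingTable class.

```python
def shards_grouped_old(routing_table, indices):
    """Old behaviour (sketch): an empty index list silently means 'all'."""
    if not indices:
        indices = list(routing_table)  # expands to every index in the table
    return {name: routing_table[name] for name in indices if name in routing_table}

def shards_grouped_new(routing_table, indices):
    """New behaviour (sketch): the caller has already resolved the names,
    so an empty list stays empty."""
    return {name: routing_table[name] for name in indices if name in routing_table}

# Hypothetical routing table: index name -> shard ids.
routing_table = {"beer": [0, 1], "wine": [0]}
resolved = []  # e.g. a wildcard pattern that matched no index

print(shards_grouped_old(routing_table, resolved))
print(shards_grouped_new(routing_table, resolved))
```

With the old behaviour a request that matched no index ends up operating on every index, which is the kind of surprise reported in #9081.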
Closes #9081
Closes #10148
This commit brings the benefits of the `count` search type to search requests
that have a `size` of 0:
- a single round-trip to shards (no fetch phase)
- ability to use the query cache
Since `count` now provides no benefits over `query_then_fetch`, it has been
deprecated.
Close #7630
Deleting a type from an index is inherently dangerous because
the type can be recreated with new mappings which may conflict
with existing segments still using the old mappings. This
removes the ability to delete a type (similar to how deleting
fields within a type is not allowed, for the same reason).
closes #8877
closes #10231