* [Analysis] Parse synonyms with the same analysis chain
Synonym Token Filter / Synonym Graph Filter tokenize synonyms with whatever tokenizer and token filters appear before it in the chain.
Close#7199
This commit removes the global caching of the field query and replaces it with
a caching per field. Each field can use a different `highlight_query` and the rewriting of
some queries (prefix, automaton, ...) depends on the targeted field so the query used for highlighting
must be unique per field.
There might be a small performance penalty when highlighting multiple fields since the query needs to be rewritten
once per highlighted field with this change.
Fixes#25171
This commit changes the BWC versions on the get mapping 404s now that
this API returning 404s when a type is missing is supported since 5.5.0.
Relates #23192
Get mappings HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
mappings HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.
Relates #23192
This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. First of which being the Grok endpoint that retrieves
Grok Patterns for users to retrieve all the built-in patterns.
Example usage: Kibana Grok Autocomplete!
When we disabled `_all` by default for indices created in 6.0, we missed adding
a layer that would handle the situation where `_all` was not enabled in 5.x and
then the cluster was updated to 6.0, this means that when the cluster was
updated the `_all` field would be disabled for 5.x indices and field values
would not be added to the `_all` field.
This adds a compatibility layer for 5.x indices where we treat the default
enabled value for the `_all` field to be `true` if unset on 5.x indices.
Resolves#25068
Previously this would output:
```
GET /test-1/_mappings
{ }
```
And after this change:
```
GET /test-1/_mappings
{
"test-1": {
"mappings": {}
}
}
```
To bring parity back to the REST output after #24723.
Relates to #25090
Previously in #24723 we changed the `_alias` API to not go through the
`RestGetIndicesAction` endpoint, instead creating a `RestGetAliasesAction` that
did the same thing.
This changes the formatting so that it matches the old formatting of the
endpoint, before:
```
GET /test-1/_alias
{ }
```
And after this change:
```
GET /test-1/_alias
{
"test-1": {
"aliases": {}
}
}
```
This is related to #25090
This commit refactors the query phase in order to be able
to automatically detect queries that can be early terminated.
If the index sort matches the query sort, the top docs collection is early terminated
on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector.
This change also adds a new parameter to the search request called `track_total_hits`.
It indicates if the total number of hits that match the query should be tracked.
If false, queries sorted by the index sort will not try to compute this information and
and will limit the collection to the first N documents per segment.
Aggregations are not impacted and will continue to see every document
even when the index sort matches the query sort and `track_total_hits` is false.
Relates #6720
This change skips rest tests that use mutlitple types if the cluster
is a pure 6.x cluster. This allows all indics to be created with a version
less than 6.0 and that means we can safely use the `mapping.single_type` setting.
Relates to #24961
Previous work modified the status code on the get aliases API when an
alias is missing so that these requests 404 now. This change was also
backported to 5.5 so we can adjust the skips to skip everything before
5.5.0.
Previously the HEAD and GET aliases endpoints were misaigned in
behavior. The HEAD verb would 404 if any aliases are missing while the
GET verb would not if any aliases existed. When HEAD was aligned with
GET, this broke the previous usage of HEAD to serve as an existence
check for aliases. It is the behavior of GET that is problematic here
though, if any alias is missing the request should 404. This commit
addresses this by modifying the behavior of GET to behave in this
way. This fixes the behavior for HEAD to also 404 when aliases are
missing.
Relates #25043
REST handlers that require a body will throw an an ElasticsearchParseException "request body required".
REST handlers that require a body OR source param will throw an ElasticsearchParseException "request body or source param required".
Replaced asserts in BulkRequest parsing code with a more descriptive IllegalArgumentException if the line contains an empty object.
Updated bulk REST test to verify an empty action line is rejected properly.
Updated BulkRequestTests with randomized testing for an empty action line.
Used try-with-resouces for XContentParser in AbstractBulkByQueryRestHandler.
This removes the parsing of things like `GET /idx/_aliases,_mappings`, instead,
a user must choose between retriving all index metadata with `GET /idx`, or only
a specific form such as `GET /idx/_settings`.
Relates to (and is a prerequisite of) #24437
* Adds nodes usage API to monitor usages of actions
The nodes usage API has 2 main endpoints
/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.
At the moment only one type of usage statistics is available, the REST
actions usage. This records the number of times each REST action class is
called and when the nodes usage api is called will return a map of rest
action class name to long representing the number of times each of the action
classes has been called.
Still to do:
* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation
* Rafactors UsageService so counts are done by the handlers
* Fixing up docs tests
* Adds a name to all rest actions
* Addresses review comments
This commit adds an optional `context` url parameter to the put stored
script request. When a context is specified, the script is compiled
against that context before storing, as a validation the script will
work when used in that context.
* SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended used with `sampler` agg to limit expense of tokenizing docs and takes optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results.
Closes#23674
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
* Add parent-join module
This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.
Relates #20257
This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key.
Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR).
Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time.
Closes#23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.
Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all)
provides information about snapshots in the repository, including the
snapshot state, number of shards snapshotted, failures, etc. In order
to provide information about each snapshot in the repository, the call
must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for
every snapshot. In cloud-based repositories, this can be expensive,
both from a cost and performance perspective. Sometimes, all the user
wants is to retrieve all the names/uuids of each snapshot, and the
indices that went into each snapshot, without any of the other status
information about the snapshot. This minimal information can be
retrieved from the repository index blob (`index-N`) without needing to
read each snapshot metadata blob.
This commit enhances the get snapshots API with an optional `verbose`
parameter. If `verbose` is set to false on the request, then the get
snapshots API will only retrieve the minimal information about each
snapshot (the name, uuid, and indices in the snapshot), and only read
this information from the repository index blob, thereby giving users
the option to retrieve the snapshots in a repository in a more
cost-effective and efficient manner.
Closes#24288
Now that indices have a single type by default, we can move to the next step
and identify documents using their `_id` rather than the `_uid`.
One notable change in this commit is that I made deletions implicitly create
types. This helps with the live version map in the case that documents are
deleted before the first type is introduced. Otherwise there would be no way
to differenciate `DELETE index/foo/1` followed by `PUT index/foo/1` from
`DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are
different if versioning is involved.
If a node in version >= 5.3 acts as a coordinating node during a scroll request that targets a single shard, the scroll may return the same documents over and over iff the targeted shard is hosted by a node with a version <= 5.3.
The nodes in this version will advance the scroll only if the search_type has been set to `query_and_fetch` though this search type has been removed in 5.3.
This change handles this situation by adding the removed search_type in the request that targets a node in version <= 5.3.
If a field caps request contains a field name that doesn't exist in all indices
the response will be partial and we hide an NPE. The NPE is now fixed but we still
have the problem that we don't pass on errors on the shard level to the user. This will
be fixed in a followup.
`_search_shards`API today only returns aliases names if there is an alias
filter associated with one of them. Now it can be useful to see which aliases
have been expanded for an index given the index expressions. This change also includes non-filtering aliases even without a filtering alias being present.
This method has to do with how the transport action may or may not resolve wildcards expressions to aliases names. It is only needed in TransportIndicesAliasesAction and for this reason it should be a private method in it rather than part of a request class which is also part of the Java API and later in the high level REST client.
The alias parameter was documented as a list in our rest-spec, yet only the first value out of a list was getting read and processed. This commit adds support for multiple aliases to _cat/aliases
Closes#23661