The create request now requires that an ID be present.
Currently the clients hard code a create method, but
we should just add a create REST spec so this method
can be autogenerated.
Today we don't parse alias filters on the coordinating node, we only forward
the alias patters to executing node and resolve it late. This has several problems
like requests that go through filtered aliases are never cached if they use date math,
since the parsing happens very late in the process even without rewriting. It also used
to be processed on every shard while we can only do it once per index on the coordinating node.
Another nice side-effect is that we are never prone to cluster-state updates that change an alias,
all nodes will execute the exact same alias filter since they are process based on the same
cluster state.
* Adding built-in sorting capability to _cat apis.
Closes#16975
* addressing pr comments
* changing value types back to original implementation and fixing cosmetic issues
* Changing compareTo, hashCode of value types to a better implementation
* Changed value compareTos to use Double.compare instead of if statements + fixed some failed unit tests
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).
This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.
Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.
Relates #20722
This commit changes the default behavior of `_flush` to block if other flushes are ongoing.
This also removes the use of `FlushNotAllowedException` and instead simply return immediately
by skipping the flush. Users should be aware if they set this option that the flush might or might
not flush everything to disk ie. no transactional behavior of some sort.
Closes#20569
Adds a cat api endpoint: /_cat/templates and its more specific version, /_cat/templates/{name}.
It looks something like:
$ curl "localhost:9200/_cat/templates?v"
name template order version
sushi_california_roll *avocado* 1 1
pizza_hawaiian *pineapples* 1
pizza_pepperoni *pepperoni* 1
The specified version (only allows * globs) looks like:
$ curl "localhost:9200/_cat/templates/pizza*"
name template order version
pizza_hawaiian *pineapples* 1
pizza_pepperoni *pepperoni* 1
Partially specified columns:
$ curl "localhost:9200/_cat/templates/pizza*?v=true&h=name,template"
name template
pizza_hawaiian *pineapples*
pizza_pepperoni *pepperoni*
The help text:
$ curl "localhost:9200/_cat/templates/pizza*?help"
name | n | template name
template | t | template pattern string
order | o | template application order number
version | v | version
Closes#20467
We can now run templates using `explain` and/or `profile` parameters.
Which is interesting when you have defined a complicated profile but want to debug it in an easier way than running the full query again.
You can use `explain` parameter when running a template:
```js
GET /_search/template
{
"file": "my_template",
"params": {
"status": [ "pending", "published" ]
},
"explain": true
}
```
You can use `profile` parameter when running a template:
```js
GET /_search/template
{
"file": "my_template",
"params": {
"status": [ "pending", "published" ]
},
"profile": true
}
```
The command-line arguments for Elasticsearch must now be specified using
-E. This commit fixes the usage of command-line arguments in the REST
API spec README.
The only repository we can be sure is safe to clean is `fs` so we clean
any snapshots in those repositories after each test. Other repositories
like url and azure tend to throw exceptions rather than let us fetch
their contents during the REST test. So we clean what we can....
Closes#18159
The refresh description should indicate that the affected shards are
refreshed as opposed to the entire index.
This was raised as a discrepancy on
discuss (https://discuss.elastic.co/t/refresh-parameter-of-index-api/59008/2)
on the .NET client that originates from code generated from the rest api
spec. The description has been updated in master but should be updated
for the 2.4.0 release.
We put the rest api spec into a jar for upload to maven, so that we can
use within external rest tests. This change adds making a pom for maven
(as well as producing sources and javadoc jars, even though they will be
empty, because maven central requires them).
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.
The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain
The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain
The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.
Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.
Fixes#20155
This commit adds a health status parameter to the cat indices API for
filtering on indices that match the specified status (green|yellow|red).
Relates #20393
Add docs to template support for _msearch
Relates to #10885
Relates to #15674
* Reference those docs from the rest api spec for _msearch/template support.
memory are much less than the total memory, the percentage
returned could be 0%. The yaml tests check that the free/used
percentage are valid values by asserting `is_true`, but it
turns out that `is_true` returns false if the value is
assigned but it is 0 or even the string "0". This commit
changes the assertion in the yaml test to ensure the value
is greater than or equal to 0 instead.
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
Resolves#19769
This adds a version field to Templates, which is itself is unused by Elasticsearch, but exists for users to better manage their own templates. Like description, it's optional.
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
Resolves#19769
Currently it does not because our parsers do not support big integers/decimals
(on purpose) but we do not have to ask our parser for the number type, we can
just ask the jackson parser for a number representation of the value with the
right type.
Note that I did not add similar tests for big decimals because Jackson seems to
never return big decimals, even for decimal values that are out of the range of
values that can be represented by doubles.
Closes#11508
The mem section was buggy in cluster stats and removed. It is now added back with the same structure as in node stats, containing total memory, available memory, used memory and percentages. All the values are the sum of all the nodes across the cluster (or at least the ones that we were able to get the values from).
If elasticsearch controls the ID values as well as the documents
version we can optimize the code that adds / appends the documents
to the index. Essentially we an skip the version lookup for all
documents unless the same document is delivered more than once.
On the lucene level we can simply call IndexWriter#addDocument instead
of #updateDocument but on the Engine level we need to ensure that we deoptimize
the case once we see the same document more than once.
This is done as follows:
1. Mark every request with a timestamp. This is done once on the first node that
receives a request and is fixed for this request. This can be even the
machine local time (see why later). The important part is that retry
requests will have the same value as the original one.
2. In the engine we make sure we keep the highest seen time stamp of "retry" requests.
This is updated while the retry request has its doc id lock. Call this `maxUnsafeAutoIdTimestamp`
3. When the engine runs an "optimized" request comes, it compares it's timestamp with the
current `maxUnsafeAutoIdTimestamp` (but doesn't update it). If the the request
timestamp is higher it is safe to execute it as optimized (no retry request with the same
timestamp has been run before). If not we fall back to "non-optimzed" mode and run the request as a retry one
and update the `maxUnsafeAutoIdTimestamp` unless it's been updated already to a higher value
Relates to #19813
* Params improvements to Cluster Health API wait for shards
Previously, the cluster health API used a strictly numeric value
for `wait_for_active_shards`. However, with the introduction of
ActiveShardCount and the removal of write consistency level for
replication operations, `wait_for_active_shards` is used for
write operations to represent values for ActiveShardCount. This
commit moves the cluster health API's usage of `wait_for_active_shards`
to be consistent with its usage in the write operation APIs.
This commit also changes `wait_for_relocating_shards` from a
numeric value to a simple boolean value `wait_for_no_relocating_shards`
to set whether the cluster health operation should wait for
all relocating shards to complete relocation.
* Addresses code review comments
* Don't be lenient if `wait_for_relocating_shards` is set
While removing an index isn't actually an alias action, if we add
an alias action that deletes an index then we can delete and index
and add an alias with the same name as the index atomically, in
the same cluster state update.
Closes#20064
This commit adds the support for exclusion filter to the response filtering (filter_path) feature. It changes the XContentBuilder APIs so that it now accepts two types of filters: inclusive and exclusive. Filters are no more String arrays but sets of String instead.
Adds an explicit recoverySource field to ShardRouting that characterizes the type of recovery to perform:
- fresh empty shard copy
- existing local shard copy
- recover from peer (primary)
- recover from snapshot
- recover from other local shards on same node (shrink index action)
This makes GET operations more consistent with `_search` operations which expect
`(stored_)fields` to work on stored fields and source filtering to work on the
`_source` field. This is now possible thanks to the fact that GET operations
do not read from the translog anymore (#20102) and also allows to get rid of
`FieldMapper#isGenerated`.
The `_termvectors` API (and thus more_like_this too) was relying on the fact
that GET operations would extract fields from either stored fields or the source
so the logic to do this that used to exist in `ShardGetService` has been moved
to `TermVectorsService`. It would be nice that term vectors do not rely on this,
but this does not seem to be a low hanging fruit.
The network types in use on a cluster can be useful information to have,
so this commit adds aggregate metrics for the network types in use in a
cluster to the cluster stats.
Relates #20144
This change adds a special field named _none_ that allows to disable the retrieval of the stored fields in a search request or in a TopHitsAggregation.
To completely disable stored fields retrieval (including disabling metadata fields retrieval such as _id or _type) use _none_ like this:
````
POST _search
{
"stored_fields": "_none_"
}
````
Today we do a lot of accounting inside the engine to maintain locations
of documents inside the transaction log. This is only needed to ensure
we can return the documents source from the engine if it hasn't been refreshed.
Aside of the added complexity to be able to read from the currently writing translog,
maintainance of pointers into the translog this also caused inconsistencies like different values
of the `_ttl` field if it was read from the tlog or not. TermVectors are totally different if
the document is fetched from the tranlog since copy fields are ignored etc.
This chance will simply call `refresh` if the documents latest version is not in the index. This
streamlines the semantics of the `_get` API and allows for more optimizations inside the engine
and on the transaction log. Note: `_refresh` is only called iff the requested document is not refreshed
yet but has recently been updated or added.
#Relates to #19787