By default, tasks are grouped by node. However, task execution in elasticsearch can be quite complex and an individual task that runs on a coordinating node can have many subtasks running on other nodes in the cluster. This commit makes it possible to list task grouped by common parents instead of by node. When this option is enabled all subtask are grouped under the coordinating node task that started all subtasks in the group. To group tasks by common parents, use the following syntax:
GET /tasks?group_by=parents
This allows the user to update the reindex throttle on the fly, with changes
that speed up the throttling being applied immediately and changes that
slow down the throttling being applied during the next batch. This means
that if a user throttles reindex in such a way that it tries to sleep for
16 years and then realizes that they've done something wrong then they
can change the throttle and reindex will wake up again. We don't apply
slow downs immediately so we never get in danger of losing the scan context.
Also, if reindex is canceled while it is sleeping (how it honor throttling)
then it'll immediately wake up and cancel itself.
The old HighlightParseElement was only left because it was still
used in tests and some places in InnerHits. This PR removes it
and replaces the tests that checked that the original parse element
and the rafactored highlighter code produce the same output with
new tests that compare builder input to the SearchContextHighlight
that is created.
This commit adds the new `action.search.shard_count.limit` setting which
configures the maximum number of shards that can be queried in a single search
request. It has a default value of 1000.
Today we don't invoke `IndexingOperationListeners` when we are running
a recovery form store or replaying translog from remote. This is problematic since
the actual code path for indexing is different between normal indexing and recovery.
An important detail is left out on recovery since we implemented the `IndexingMemoryController`
as an `IndexingOperationListener` we might never flush the `IndexWriter` of a recovering shard
which can lead to `OOMs` on node startup / recovery.
Both top level and inline inner hits are now covered by InnerHitBuilder.
Although there are differences between top level and inline inner hits,
they now make use of the same builder logic.
The parsing of top level inner hits slightly changed to be more readable.
Before the nested path or parent/child type had to be specified as encapsuting
json object, now these settings are simple fields. Before this was required
to allow streaming parsing of inner hits without missing contextual information.
Once some issues are fixed with inline inner hits (around multi level hierachy of inner hits),
top level inner hits will be deprecated and removed in the next major version.
This commit introduces SearchOperationListeneres which allow to hook
into search operation lifecycle and execute operations like slow-logs
and statistic collection in a transparent way. SearchOperationListenrs
can be registered on the IndexModule just like IndexingOperationListeners.
The main consumers (slow log) have already been moved out of IndexService
into IndexModule which reduces the dependency on IndexService as well as
IndexShard and makes slowlogging transparent.
Closes#17398
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
Remove ability to specify arbitrary node attributes with `node.` prefix
Today the basic node settings like `node.data` and `node.master` can't really be fully validated
since we allow to specify custom user attributes on the node level. We have to, in order to
support that, add a wildcard setting for `node.*` to let these setting pass validation.
Instead we should require a more contraint prefix like `node.attr.` that defines a namespace
that is reserved for user attributes.
This commit adds a new namespace for attributes in `node.attr`.
Closes#17280
Today the basic node settings like `node.data` and `node.master` can't really be fully validated
since we allow to specify custom user attributes on the node level. We have to, in order to
support that, add a wildcard setting for `node.*` to let these setting pass validation.
Instead we should require a more contraint prefix like `node.attr.` that defines a namespace
that is reserved for user attributes.
This commit adds a new namespace for attributes in `node.attr`.
Closes#17280
This adds a `created` flag to `IndexingOperationListener#postIndex` to
easily differentiate between updates and creates on the listener level.
Closes#17333
Merges #17340
When we test we add `-Djna.nosys=true` to the system properties but
we don't add it to system properties when running the naming conventions
test. This was causing the build to fail on a newly minted Ubuntu 15.10
machine, presumably because I made the mistake of installing maven using
the system package manager.
With restriction for the total number of fields introduced in #17357 this test can fail if a large number of records is randomly selected for indexing.