The old HighlightParseElement was only left because it was still
used in tests and some places in InnerHits. This PR removes it
and replaces the tests that checked that the original parse element
and the rafactored highlighter code produce the same output with
new tests that compare builder input to the SearchContextHighlight
that is created.
This commit adds the new `action.search.shard_count.limit` setting which
configures the maximum number of shards that can be queried in a single search
request. It has a default value of 1000.
Today we don't invoke `IndexingOperationListeners` when we are running
a recovery form store or replaying translog from remote. This is problematic since
the actual code path for indexing is different between normal indexing and recovery.
An important detail is left out on recovery since we implemented the `IndexingMemoryController`
as an `IndexingOperationListener` we might never flush the `IndexWriter` of a recovering shard
which can lead to `OOMs` on node startup / recovery.
Both top level and inline inner hits are now covered by InnerHitBuilder.
Although there are differences between top level and inline inner hits,
they now make use of the same builder logic.
The parsing of top level inner hits slightly changed to be more readable.
Before the nested path or parent/child type had to be specified as encapsuting
json object, now these settings are simple fields. Before this was required
to allow streaming parsing of inner hits without missing contextual information.
Once some issues are fixed with inline inner hits (around multi level hierachy of inner hits),
top level inner hits will be deprecated and removed in the next major version.
This commit introduces SearchOperationListeneres which allow to hook
into search operation lifecycle and execute operations like slow-logs
and statistic collection in a transparent way. SearchOperationListenrs
can be registered on the IndexModule just like IndexingOperationListeners.
The main consumers (slow log) have already been moved out of IndexService
into IndexModule which reduces the dependency on IndexService as well as
IndexShard and makes slowlogging transparent.
Closes#17398
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
Remove ability to specify arbitrary node attributes with `node.` prefix
Today the basic node settings like `node.data` and `node.master` can't really be fully validated
since we allow to specify custom user attributes on the node level. We have to, in order to
support that, add a wildcard setting for `node.*` to let these setting pass validation.
Instead we should require a more contraint prefix like `node.attr.` that defines a namespace
that is reserved for user attributes.
This commit adds a new namespace for attributes in `node.attr`.
Closes#17280
Today the basic node settings like `node.data` and `node.master` can't really be fully validated
since we allow to specify custom user attributes on the node level. We have to, in order to
support that, add a wildcard setting for `node.*` to let these setting pass validation.
Instead we should require a more contraint prefix like `node.attr.` that defines a namespace
that is reserved for user attributes.
This commit adds a new namespace for attributes in `node.attr`.
Closes#17280
This adds a `created` flag to `IndexingOperationListener#postIndex` to
easily differentiate between updates and creates on the listener level.
Closes#17333
Merges #17340
With restriction for the total number of fields introduced in #17357 this test can fail if a large number of records is randomly selected for indexing.
This is to prevent mapping explosion when dynamic keys such as UUID are used as field names. index.mapping.total_fields.limit specifies the total number of fields an index can have. An exception will be thrown when the limit is reached. The default limit is 1000. Value 0 means no limit. This setting is runtime adjustable
Closes#11443
Transport client was replacing the address of the nodes connecting to with the ones received from the liveness api rather keeping the original listed nodes. Written a test for that.
In extreme cases a local primary shard can be replaced with a replica while a replication request is in flight and the primary action is applied to the shard (via `acquirePrimaryOperationLock()). #17044 changed the exception used in that method to something that isn't recognized as `TransportActions.isShardNotAvailableException`, causing the operation to fail immediately instead of retrying. This commit fixes this by check the primary flag before
acquiring the lock. This is safe to do as an IndexShard will never be demoted once a primary.
Closes#17358
If a terms aggregation was ordered by a metric nested in a single bucket aggregator which did not collect any documents (e.g. a filters aggregation which did not match in that term bucket) an ArrayOutOfBoundsException would be thrown when the ordering code tried to retrieve the value for the metric. This fix fixes all numeric metric aggregators so they return their default value when a bucket ordinal is requested which was not collected.
Closes#17225
For geo distance sort parsing: Disallow anything but
VALUE_STRING as geo hash, disallow resetting field
name for geo fields.
Also make error message for wrong lat/lon values more
verbose by including the affected field name.
If you add a string field to a document, it will have the following default
mapping:
```
{
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
```
Today, if you call /index/type/_search instead of /index/_search, elasticsearch
will automatically insert a type filter to only match documents of the given
type. This commit uses a new TypeQuery instead of a TermQuery for this filter,
which rewrites to a MatchAllDocsQuery when all documents of a shard match the
filtered type. This is helpful because BooleanQuery has a special rewrite rule
to remove MatchAllDocsQuery as FILTER clauses. So for instance if your query is
`+body:"quick fox" #_type:my_type`, it will be rewritten to
`+body:"quick fox" #*:*` which is then rewritten to `body:"quick fox"`.
This adds a new `/_cluster/allocation/explain` API that explains why a
shard can or cannot be allocated to nodes in the cluster. Additionally,
it will show where the master *desires* to put the shard, according to
the `ShardsAllocator`.
It looks like this:
```
GET /_cluster/allocation/explain?pretty
{
"index": "only-foo",
"shard": 0,
"primary": false
}
```
Though, you can optionally send an empty body, which means "explain the
allocation for the first unassigned shard you find".
The output when a shard is unassigned looks like this:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : false
},
"assigned" : false,
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2016-03-22T20:04:23.620Z"
},
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 0.06666675,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "NO",
"weight" : -1.3833332,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 2.3166666,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
And when the shard *is* assigned, the output looks like:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : true
},
"assigned" : true,
"assigned_node_id" : "Qc6VL8c5RWaw1qXZ0Rg57g",
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 1.4499999,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "CURRENTLY_ASSIGNED",
"weight" : 0.0,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 3.6999998,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
Only "NO" decisions are returned by default, but all decisions can be
shown by specifying the `?include_yes_decisions=true` parameter in the
request.
Resolves#14593
Today we might run into a rejected execution exception when we shutdown the node while handling a transport exception. The exception is run in a seperate thread but that thread might not be able to execute due to the shutdown. Today we barf and fill the logs with large exception. This commit catches this exception and logs it as debug logging instead.
Extends changes made in 8652cd8
This commit fixes a bug that was causing the result of TransportTasksAction#filterNodeIds to be ignored and as a result the tasks actions were executed on all nodes.
Node roles are now serialized as well, they are not part of the node attributes anymore. DiscoveryNodeService takes care of dividing settings into attributes and roles. DiscoveryNode always requires to pass in attributes and roles separately.