We started passing down the root document's _source when processing
nested hits, to avoid reloading and reparsing the root source for each hit.
Unfortunately the approach did not work when there are multiple layers of
`inner_hits`. In this case, the second-layer inner hit received its immediate
parent's source instead of the root source. This parent source is filtered to
just contain the parts corresponding to the nested document, but the source
parsing logic is designed to always operate on the top-level root source. This
caused failures when loading the second-layer inner hits.
This PR makes sure to always pass the root document's _source when processing
inner hits, even if there are multiple layers.
This PR fixes two bugs that can arise when _source is disabled and we fetch nested documents:
* Fix exception when highlighting `inner_hits` with disabled _source.
* Fix exception in nested `top_hits` with disabled _source.
* Add more tests for highlighting `inner_hits`.
In general, we can't guarantee that a match_all query will return documents in
the order they were indexed. This PR adds an ID to each document to avoid
relying on document order.
This commit fixes the cat tasks api parameter specification and the
handler so that the parameters are consumed during request preparation.
Closes#59493
Backport of #66272
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.
Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.
The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.
Fixes#65533.
Now that we're consistently using `cat_match` to filter which shards we
run on we can get this confusing case:
1. You have a search with, say, a range and a sub-agg.
2. That search has a query that `can_match` can recognize will match no
docs. On *any* shard.
3. So we dutifully run it on a single shard so it can produce the
"empty" aggs.
4. The shard we pick happens to not have the target of the range mapped.
5. This kicks in the special range aggregator that doesn't collect any
documents.
6. Before this commit, that range aggregator *also* never produced any
sub-aggs.
So, without this change, it was quite possible for a search that
happened to match no documents to "throw away" the sub-aggs of a range
and a few other aggs.
We've had this problem for a long, long time but it is more confusing
now because `can_match` is really kicking in and causing us to see cases
where it looks like you are targeting a lot of shards but you really are
only targeting a couple. It used to be that to get the "no sub-aggs"
behavior you had to explicitly target only shards that didn't map the
target field of the `range` agg. And, like, in that case it isn't too
bad because you targeted a sort of degenerate shard. But now that
`can_match` is doing its thing you can end up with the confusing steps
above. It took me several hours to track down what what happening I know
how the individual pieces of all of this works. It took four hours to
figure out how they fit together in this case....
Anyway! This replaces all the aggregator implementations that throw out
the sub-aggregators with ones that keep them. I think this'll be less
confusing in the future.
Closes#64142
In 7.x we can't just by default generate this setting as it might not be
supported by data nodes that are assigned shards for an older version in mixed version
clusters.
Closes#64152
This commit adjusts the defaults for the tiered data roles so that they
are enabled by default, or if the node has the legacy data role. This
ensures that the default experience is that the tiered data roles are
enabled.
To fully specifiy the behavior for the tiered data roles then:
- starting a new node with the defaults: enabled
- starting a new node with node.roles configured: enabled if and only
if the tiered data roles are explicitly configured, independently
of the node having the data role
- starting a new node with node.data enabled: enabled unless the
tiered data roles are explicitly disabled
- starting a new node with node.data disabled: disabled unless the
tiered data roles are explicitly enabled
This PR implements value fetching for the following field types:
* `text` phrase and prefix subfields
* `search_as_you_type`, plus its subfields
* `token_count`, which is implemented by fetching doc values
Supporting these types helps ensure that retrieving all fields through
`"fields": ["*"]` doesn't fail because of unsupported value fetchers.
As a result of this, we can remove a chunk of code from TypeParsers as well. Tests
for search/index mode analyzers have moved into their own file. This commit also
rationalises the serialization checks for parameters into a single SerializerCheck
interface that takes the values includeDefaults, isConfigured and the value
itself.
Relates to #62988
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.
This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.
While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.
Relates to #59332
This adds two extra bits of info to the profiler:
1. Count of the number of different types of collectors. This lets us figure
out if we're using the optimization for segment ordinals. It adds a few
more similar counters just for good measure.
2. Profiles the `getLeafCollector` and `postCollection` methods. These are
non-trivial for some aggregations, like cardinality.
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing
* The `top_hits` aggregation
Addresses #61949.
This commit introduces a new API that manages point-in-times in x-pack
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.
A point in time must be opened before being used in search requests. The
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.
```
POST /my_index/_pit?keep_alive=1m
```
The response from the above request includes a `id`, which should be
passed to the `id` of the `pit` parameter of search requests.
```
POST /_search
{
"query": {
"match" : {
"title" : "elasticsearch"
}
},
"pit": {
"id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
"keep_alive": "1m"
}
}
```
Point-in-times are automatically closed when the `keep_alive` is
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.
```
DELETE /_pit
{
"id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```
#### Notable works in this change:
- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966
Relates #46523
Relates #26472
Co-authored-by: Jim Ferenczi <jimczi@apache.org>
Add test for item-level error when no write index defined for an alia…
Co-authored-by: Jake Landis <jake.landis@elastic.co>
Co-authored-by: bellengao <gbl_long@163.com>
The `global_ordinals` implementation of `terms` had a bug when
`min_doc_count: 0` that'd cause sub-aggregations to have array index out
of bounds exceptions. Ooops. My fault. This fixes the bug by assigning
ordinals to those buckets.
Closes#62084
We have a couple of yaml tests that index documents under a 'test' type, while they could omit it. We do want to still test that specifying the type is still allowed in 7.x but we already have specific tests for that, and other tests should use the endpoint that don't require specifying a type.
Previously we didn't retain the requested fields when performing a shallow copy
of the search source. This meant that when a search was rewritten, we could drop
the requested fields and fail to return them in the response.
* Some progress on failing runtime fields tests (bring #61098 to 7.x)
This breaks apart the a test for the `terms` aggregation into one that
work for runtime fields and one that doesn't.
Breaks up an integration test into one that runtime fields can run and
one that runtime fields have to skip. This is because runtime fields
don't have global ords and we assert things *about* global ords in the
test we have to skip.
Allows nanosecond resolution in search_after (#60328)
This fixes `search_after` to properly parse string formatted dates that
have nanosecond resolution.
Closes#52424
Adds a full list of supported aggregations to the node info API. This list
will be used in transform tests and telemetry mapping tests that will be added
as follow-up PRs.
Fixes#59774
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.
Addresses #49028 and #55363.
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.
This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
In #54716 I removed pipeline aggregators from the aggregation result
tree and caused us to read them from the request. This saves a bunch of
round trip bytes, which is neat. But there was a bug in the backwards
compatibility logic. You see, we still have to give the pipeline
aggregations to nodes older than 7.8 over the wire because that is how
they know what pipelines to run. They have the pipelines in the request
but they don't read them. They use the ones in the response tree.
Anyway, we had a bug where we were never sending pipelines defined two
levels down. So while you are upgrading the pipeline wouldn't run.
Sometimes. If the data node of the "first" result was post-7.8 and the
coordinating node was pre-7.8.
This fixes the bug.