Commit Graph

5340 Commits

Author SHA1 Message Date
David Turner 06d5d360f9 Tidy up fillInStackTrace implementations (#62555)
Removes the unnecessary `synchronized` introduced in #62433 and adjusts
the others to return `this` not `null` as required by the parent
method's Javadocs.
2020-09-18 08:29:48 +01:00
Ignacio Vera 6a3d731be1
Only call reduce on a single InternalAggregation when needed (#62525) (#62594)
Adds a new abstract method in InternalAggregation that flags the framework if it needs to reduce on a single InternalAggregation.
2020-09-18 08:43:58 +02:00
Nhat Nguyen 0127b71901 Adjust keep alive assertion in ShardSearchRequest (#62582)
Relates #62184
2020-09-17 16:09:54 -04:00
Lee Hinman 9bb7ce0b22
[7.x] Allocate new indices on "hot" or "content" tier depending on data stream inclusion (#62338) (#62557)
Backports the following commits to 7.x:

    Allocate new indices on "hot" or "content" tier depending on data stream inclusion (#62338)
2020-09-17 13:29:23 -06:00
Martijn van Groningen 5f643433c6
Prohibit the usage of create index api in namespaces managed by data stream templates (#62574)
Backport of #62527 to 7.x branch.

This commit adds validation that prohibits the creation of regular indices
in the namespace of templates with data streams enabled.

It shouldn't be possible to create ordinary indices when the name of the index
matches with a composable index template that enables data streams. Auto creation
has logic that creates data streams instead of regular indices. However validation
logic for the create index api was missing.
2020-09-17 20:10:42 +02:00
Jim Ferenczi df93b31b15
Faster sequential access for stored fields (#62509) (#62573)
Faster sequential access for stored fields

Spinoff of #61806
Today retrieving stored fields at search time is optimized for random access.
So we make no effort to keep state in order to not decompress the same data
multiple times because two documents might be in the same compressed block.
This strategy is acceptable when retrieving a top N sorted by score since
there is no guarantee that documents will be on the same block.
However, we have some use cases where the document to retrieve might be
completely sequential:

Scrolls or normal search sorted by document id.
Queries on Runtime fields that extract from _source.
This commit exposes a sequential stored fields reader in the
custom leaf reader that we use at search time.
That allows to leverage the merge instances of stored fields readers that
are optimized for sequential access.
This change focuses on the fetch phase for now and leverages the merge instances
for stored fields only if all documents to retrieve are adjacent.
Applying the same logic in the source lookup of runtime fields should
be trivial but will be done in a follow up.

The speedup on queries sorted by doc id is significant.
I played with the scroll task of the http_logs rally track
on my laptop and had the following result:

|                                                        Metric |   Task |    Baseline |   Contender |     Diff |    Unit |
|--------------------------------------------------------------:|-------:|------------:|------------:|---------:|--------:|
|                                            Total Young Gen GC |        |       0.199 |       0.231 |    0.032 |       s |
|                                              Total Old Gen GC |        |           0 |           0 |        0 |       s |
|                                                    Store size |        |     17.9704 |     17.9704 |        0 |      GB |
|                                                 Translog size |        | 2.04891e-06 | 2.04891e-06 |        0 |      GB |
|                                        Heap used for segments |        |    0.820332 |    0.820332 |        0 |      MB |
|                                      Heap used for doc values |        |    0.113979 |    0.113979 |        0 |      MB |
|                                           Heap used for terms |        |     0.37973 |     0.37973 |        0 |      MB |
|                                           Heap used for norms |        |     0.03302 |     0.03302 |        0 |      MB |
|                                          Heap used for points |        |           0 |           0 |        0 |      MB |
|                                   Heap used for stored fields |        |    0.293602 |    0.293602 |        0 |      MB |
|                                                 Segment count |        |         541 |         541 |        0 |         |
|                                                Min Throughput | scroll |     12.7872 |     12.8747 |  0.08758 | pages/s |
|                                             Median Throughput | scroll |     12.9679 |     13.0556 |  0.08776 | pages/s |
|                                                Max Throughput | scroll |     13.4001 |     13.5705 |  0.17046 | pages/s |
|                                       50th percentile latency | scroll |     524.966 |     251.396 |  -273.57 |      ms |
|                                       90th percentile latency | scroll |     577.593 |     271.066 | -306.527 |      ms |
|                                      100th percentile latency | scroll |      664.73 |     272.734 | -391.997 |      ms |
|                                  50th percentile service time | scroll |     522.387 |     248.776 | -273.612 |      ms |
|                                  90th percentile service time | scroll |     573.118 |      267.79 | -305.328 |      ms |
|                                 100th percentile service time | scroll |     660.642 |     268.963 | -391.678 |      ms |
|                                                    error rate | scroll |           0 |           0 |        0 |       % |
Closes #62024
2020-09-17 19:58:18 +02:00
Alan Woodward 5421a743a7 Move SearchLookup into FetchContext (#62549)
FetchSubPhase#getProcessor currently takes a SearchLookup parameter. This
however is only needed by a couple of subphases, and will almost certainly change in
future as we want to simplify how fetch phases retrieve values for individual hits.

To future-proof against further signature changes, this commit moves the SearchLookup
reference into FetchContext instead.
2020-09-17 17:39:02 +01:00
Alan Woodward e3e3aef3d8 Load version metadata even when stored fields are disabled (#62533)
Currently we throw an error if stored fields are disabled, but hit version metadata is
requested on a search. This doesn't make much sense, as the version information
is stored in docvalues and so has no connection with stored fields.

This commit removes the link between the two, allowing version metadata to be loaded
even when stored fields are disabled in a request.

Fixes #62456
2020-09-17 17:39:02 +01:00
Alan Woodward 91e2330529 Warn on badly-formed null values for date and IP field mappers (#62487)
In #57666 we changed when null_value was parsed for ip and date fields. Previously,
the null value was stored as a string, and parsed into a date or InetAddress whenever
a document containing a null value was encountered. Now, the values are parsed when
the mappings are built, which means that bad values are detected up front; if you try and
add a mapping with a badly-parsed ip or date for a null_value, the mapping will be
rejected.

This causes problems for upgrades in the case when you have a badly-formed null_value
in a pre-7.9 cluster. This commit fixes the upgrade case by changing the logic to only
logging a warning on the badly formed value, replicating the earlier behaviour.

Fixes #62363
2020-09-17 16:38:08 +01:00
Ignacio Vera 901000891a
Fix test error in InternalCardinalityTests#testEqualsAndHashcode (#62542) (#62554)
Make sure the the new HLL++ is different to the original one
2020-09-17 17:09:13 +02:00
Alan Woodward 63afc61b08 Introduce FetchContext (#62357)
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
2020-09-17 09:57:43 +01:00
Adrien Grand e0a4a94985
Speed up merging when source is disabled. (#62443) (#62474)
The CodecReader wrapper we use to remove the `_recovery_source` field
doesn't override `StoredFieldsreader#getMergeInstance`, which has the
undesired side-effect of preventing the wrapped stored fields reader
from optimizing merging.
2020-09-17 10:53:31 +02:00
David Turner 62dcc5b1ae Suppress stack in VersionConflictEngineException (#62433)
`VersionConflictEngineException` is thrown on the hot path for updates,
but stack traces are expensive to compute and transport and rarely
useful for this kind of exception. This commit avoids computing the
stack trace for these exceptions.
2020-09-17 09:40:07 +01:00
Armin Braun 5112c17319
Add WARN Logging on Slow Transport Message Handling (#62444) (#62521)
Add simple WARN logging on slow inbound TCP messages.
2020-09-17 10:12:20 +02:00
David Turner 14aec44cd8 Log if recovery affected by disconnect (#62437)
Today we only emit `DEBUG` logs if the source disconnects from the
target during a recovery. This deserves to be noisier by default since
it should be rare and may help users identify other problems with their
network or with their shard movements.

This commit promotes this message to `INFO`. There's no need for `WARN`
since these days we will normally resume the recovery where it left off.
2020-09-17 08:22:40 +01:00
Ignacio Vera 2d3ca9c155
Introduce a sparse HyperLogLogPlusPlus class for cloning and serializing low cardinality buckets (#62480) (#62520)
Reduces the memory footprint of an HLL++ structure that uses Linear counting when cloning or deserialising the data structure.
2020-09-17 08:54:50 +02:00
Julie Tibshirani e1da558206 Remove unused test search context for significant_terms. 2020-09-16 14:27:11 -07:00
Jay Modi 5da922064f
LocalNodeMasterListener is a regular listener (#62485)
This commit makes the LocalNodeMasterListener interface extend the
ClusterStateListener interface and use a default implementation for
detecting whether the local node master status changed.

Backport of #62422
2020-09-16 11:42:53 -06:00
Tanguy Leroux 8a2e9e66d4
Wait for relocations and disk threshold monitor in DiskThresholdDeciderIT (#62358) (#62467)
Closes #62326
2020-09-16 17:40:20 +02:00
Armin Braun f6a8599cf8
Don't Start Redundant ConsistentSettingsService (#62283) (#62428)
The consistent settings service is only used in tests so far. No need to start it
unless it's actually used.
2020-09-16 09:43:04 +02:00
Ignacio Vera f3ed641fc7
Adds bucketOrd back to cardinality algorithms (#62389) (#62427) 2020-09-16 08:41:57 +02:00
Nik Everett 24a24d050a
Implement fields fetch for runtime fields (backport of #61995) (#62416)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 20:24:10 -04:00
Nik Everett 0a7f335215
Speed up writeVInt (backport of #62345) (#62419)
This speeds up `StreamOutput#writeVInt` quite a bit which is nice
because it is *very* commonly called when serializing aggregations. Well,
when serializing anything. All "collections" serialize their size as a
vint. Anyway, I was examining the serialization speeds of `StringTerms`
and this saves about 30% of the write time for that. I expect it'll be
useful other places.
2020-09-15 17:14:08 -04:00
Nik Everett 771a8893a6
Add more debugging information for cardinality agg (#62317) (#62397)
This adds two extra bits of info to the profiler:
1. Count of the number of different types of collectors. This lets us figure
   out if we're using the optimization for segment ordinals. It adds a few
   more similar counters just for good measure.
2. Profiles the `getLeafCollector` and `postCollection` methods. These are
   non-trivial for some aggregations, like cardinality.
2020-09-15 13:21:11 -04:00
Armin Braun ffbc64bd10
Log WARN on Response Deserialization Failure (#62368) (#62388)
We never see this exception in the logs even though it's pretty severe.
All we might see is an exception about a transport message not having been read fully
from the logic that follows this code.
Technically we should probably bubble up the exception but that's a bigger change
and needs some carefully reasoning, this change for the time being at least simplifies
tracking down deserialization issues in responses.
2020-09-15 18:27:39 +02:00
Alan Woodward f89fa421e2 Remove unnecessary IndexSearcher field on HitContext (#62378)
FastVectorHighlighter uses the top-level reader to rewrite queries against, which
it gets via an IndexSearcher field on HitContext. However, we can already access
this top-level reader via HitContext's existing LeafReaderContext field.

This commit removes the unnecessary field and constructor parameter, and
changes the implementation of topLevelReader to go via ReaderUtils and
the leaf reader context.
2020-09-15 15:46:14 +01:00
Christoph Büscher 0ca9829867 Muting CoordinatorTests#testLogsMessagesIfPublicationDelayed 2020-09-15 15:40:51 +02:00
Albert Zaharovits aeed1c05b0
Ensure authz operation overrides transient authz headers (#61621)
AuthorizationService#authorize uses the thread context to carry the result of the
authorisation as transient headers. The listener argument to the `authorize` method
must necessarily observe the header values. This PR makes it so that
the authorisation transient headers (`_indices_permissions` and `_authz_info`, but
NOT `_originating_action_name`) of the child action override the ones of the parent action.

Co-authored-by: Tim Vernum tim@adjective.org
2020-09-15 16:37:38 +03:00
Armin Braun eae6a3b18e
Fix testMappingVersionAfterDynamicMappingUpdate (#62352) (#62360)
There is a race in this test where the index request will return
once the dynamic mapping update has been observed by the cluster
state observer internally used by the indexing but not hit all
state appliers and thus isn't showing up as the applied state returned
by `clusterService.state()` yet.
2020-09-15 11:59:22 +02:00
Alan Woodward a68f7077c7 Rationalise fetch phase exceptions (#62230)
We have a special FetchPhaseExecutionException which contains some useful
information about which shard and doc a fetch phase has failed in. However, this
is not used in many places - currently only the ExplainPhase and the highlighters
throw one, and the FetchPhase itself catches IOExceptions and just passes them
to the ExceptionsHelper with no extra context.

This commit changes FetchPhase to throw FetchPhaseExecutionException if it
encounters problems in any of its subphases, and removes the special handling
from the explain and highlight phases. It also removes the need to pass shard ids
around when building HitContext objects.
2020-09-15 09:28:19 +01:00
Alan Woodward 8089210815 Some small cleanups in TermVectorsService (#62292)
We removed the use of aggregated stats from term vectors back in #16452, but there is
a bunch of dead code left here which can be stripped out.
2020-09-15 09:01:49 +01:00
Ignacio Vera 3536f7f7c2
Initialize BitArray storage as number of bits (#62327) (#62354) 2020-09-15 08:34:22 +02:00
Armin Braun c81a076f5a
Improve Efficiency of ClusterApplierService Iteration (#62282) (#62350)
The complexity of removing a timeout listener was `O(n)` which
means that in case of many queued up CS update tasks (such as in the
case of an avalanche of dynamic mapping updates) we're dealing with
quadratic complexity for timing out N tasks which was observed to be
an issue in practice.

This PR makes the complexity of timing out a task `O(1)` and generally
simplifies the iteration logic of listeners and applies to be a little
more efficient and inline better.
2020-09-15 05:59:48 +02:00
Julie Tibshirani f56ce4f39b
Fix failure in InnerHitBuilderTests around 'fields' option. (#62344)
The case InnerHitBuilderTests#testEqualsAndHashcode creates a copy of the object
by serializing + deserializing it, then applies a modification. If the 'fields'
list is empty, then deserializing it results in Collections.emptyList. Because
this is immutable, then modifying it can throw an UnsupportedOperationException.

This PR takes the same approach as for docvalue_fields, where we create a new
list instead of trying to add to an empty one.
2020-09-14 15:39:03 -07:00
Julie Tibshirani 4a19bdb2ea
Support the 'fields' option in inner_hits and top_hits. (#62337)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 11:51:45 -07:00
David Turner 9acd2fd1fd Minor cleanups to BytesReferenceStreamInput (#62302)
Followup to #61681:

- reuse the current iterator in `reset()` if possible
- simply some integer-overflow-avoidance in `skip()`
- clarify some comments
- address some IntelliJ warnings
2020-09-14 17:02:27 +01:00
Christoph Büscher e2eada2498
Fix disabling `allow_leading_wildcard` (#62300) (#62318)
Disabling the `query_string` queries `allow_leading_wildcard` parameter didn't
work after a change probably introduced in #60959 because the various field types
`wildcardQuery` don't check the leading characters like
QueryParserBase#getWildcardQuery does. This PR adds the missing check also
before calling the field types wildcard generating method.

Closes #62267
2020-09-14 17:13:17 +02:00
Alan Woodward 5358cee29c Cut over more mapping tests to MapperServiceTestCase (#62312)
Shaves a few more seconds off the build.
2020-09-14 16:00:37 +01:00
Armin Braun 95766da345
Save Some Allocations when Working with ClusterState (#62060) (#62303)
Just a number of obvious spots where we were allocating
duplicate empty structures or otherwise inefficient that I
found while investigating snapshot cluster state update performance.
2020-09-14 15:09:54 +02:00
Armin Braun 875af1c976
Remove Dead Variable in BlobStoreIndexShardSnapshots. (#62285) (#62295)
This was never used.

Co-authored-by: Howard <danielhuang@tencent.com>
2020-09-14 13:40:39 +02:00
Luca Cavanna 53bf057a53 [TEST] avoid double null check in TransportSearchActionTests 2020-09-11 10:10:09 +02:00
Nhat Nguyen aafb2cb812 Support point in time cross cluster search (#61827)
This commit integrates point in time into cross cluster search.

Relates #61062
Closes #61790
2020-09-10 19:25:48 -04:00
Nhat Nguyen 808c8689ac Always include the matching node when resolving point in time (#61658)
If shards are relocated to new nodes, then searches with a point in time
will fail, although a pit keeps search contexts open. This commit solves
this problem by reducing info used by SearchShardIterator and always
including the matching nodes when resolving a point in time.

Closes #61627
2020-09-10 19:25:48 -04:00
Nhat Nguyen 035f0638f4 Support point in time in async_search (#61560)
This commit integrates point in time into async search and
ensures that it works correctly with security enabled.

Relates #61062
2020-09-10 19:25:48 -04:00
Nhat Nguyen 063a6d047c Release search context when scroll keep_alive is too large (#62179)
Previously, we close related search contexts if the keep_alive of a scroll is too large. 
But we accidentally change this behavior in #62061.
2020-09-10 19:25:48 -04:00
Nhat Nguyen 2eb1e8bc84 Make keep alive of point in time optional in search (#62184)
A search request should not be required to extend the keep_alive of a point in time. 
This change makes that parameter optional.
2020-09-10 19:25:48 -04:00
Jim Ferenczi 3fc35aa76e Shard Search Scroll failures consistency (#62061)
Today some uncaught shard failures such as RejectedExecutionException skips the release of shard context
and let subsequent scroll requests access the same shard context again. Depending on how the other shards advanced,
this behavior can lead to missing data since scrolls always move forward.
In order to avoid hidden data loss, this commit ensures that we always release the context of shard search scroll requests whenever a failure
occurs locally. The shard search context will no longer exist in subsequent scroll requests which will lead to consistent shard failures
in the responses.
This change also modifies the retry tests of the reindex feature. Reindex retries scroll search request that contains a shard failure and
move on whenever the failure disappears. That is not compatible with how scrolls work and can lead to missing data as explained above.
That means that reindex will now report scroll failures when search rejection happen during the operation instead of skipping document
silently.
Finally this change removes an old TODO that was fulfilled with #61062.
2020-09-10 19:25:48 -04:00
Jim Ferenczi 4d528e91a1 Ensure validation of the reader context is executed first (#61831)
This change makes sure that reader context is validated (`SearchOperationListener#validateReaderContext)
before any other operation and that it is correctly recycled or removed at the end of the operation.
This commit also fixes a race condition bug that would allocate the security reader for scrolls more than once.

Relates #61446

Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
2020-09-10 19:25:48 -04:00
Luca Cavanna 44bd4a6004 Fix point in time toXContent impl (#62080)
PointInTimeBuilder is a ToXContentObject yet it does not print out a whole object (it is rather a fragment). Also, when it is printed out as part of SearchSourceBuilder, an error is thrown because pit should be wrapped into its own object.

This commit fixes this and adds tests for it.
2020-09-10 19:25:47 -04:00
Nhat Nguyen 3d69b5c41e Introduce point in time APIs in x-pack basic (#61062)
This commit introduces a new API that manages point-in-times in x-pack
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.

A point in time must be opened before being used in search requests. The
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes a `id`, which should be
passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
            "id":  "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
            "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` is
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
2020-09-10 19:25:47 -04:00
Armin Braun e0a81f7d14
Speed up Version Checks (#62216) (#62253)
The `fromId` method would show up in profiling and JIT analysis as not-inlinable because it's too large
in the contexts it's used in in many cases and was consuming a surprising amount of cycles for computing the
min compat versions.

-> extract cold path from `fromId` to make JIT happy and cache minimumg compatible versions to fields.
2020-09-10 22:57:06 +02:00
Armin Braun 25db5acb0d
Simplify TimeValue Serialization (#62023) (#62248)
This can be done without map lookups => less code and much smaller methods => better inlining potentially.
2020-09-10 20:16:21 +02:00
Armin Braun 7b941a18e9
Optimize Snapshot Shard Status Update Handling (#62070) (#62219)
Avoiding a number of noop updates that were observed to cause trouble (as in needless noop CS publishing) which can become an issue when working with a large number of concurrent snapshot operations.
Also this sets up some simplifications made in the clone snapshot branch.
2020-09-10 16:29:16 +02:00
Ignacio Vera c8981ea93d
upgrade to lucene-8.7.0-snapshot-b313618cc1d (#62213) (#62222) 2020-09-10 16:23:18 +02:00
Igor Motov b6bff56a56
Fix hard_bounds interval handling (#62129) (#62188)
The hard bounds were incorrectly scaled for intervals, which was
causing incorrect buckets to show up or no buckets at all for
interval other than 1.

Closes #62126
2020-09-09 15:42:12 -04:00
Nik Everett 1104d65465
Fix bug with terms' min_doc_count (#62130) (#62177)
The `global_ordinals` implementation of `terms` had a bug when
`min_doc_count: 0` that'd cause sub-aggregations to have array index out
of bounds exceptions. Ooops. My fault. This fixes the bug by assigning
ordinals to those buckets.

Closes #62084
2020-09-09 13:04:51 -04:00
Armin Braun 6710104673
Fix Creating NOOP Tasks on SNAPSHOT Pool (#62152) (#62157)
Fixing a few spots where NOOP tasks on the snapshot pool were created needlessly.
Especially when it comes to mixed master+data nodes and concurrent snapshots these
hurt delete operation performance needlessly.
2020-09-09 14:05:17 +02:00
Luca Cavanna fbf0967e20 QueryPhaseResultConsumer to call notifyPartialReduce (#62083)
As part of #60275 QueryPhaseResultConsumer ended up calling SearchProgressListener#onPartialReduce directly instead of notifyPartialReduce. That means we don't catch exceptions that may occur while executing the progress listener callback.

This commit fixes the call and adds a test for this scenario.
2020-09-09 13:44:07 +02:00
Luca Cavanna ad83261348 Print out search request as part of async search task description (#62057)
Currently, the async search task is the task that will be running through the whole execution of an async search. While the submit async search task prints out the search as part of its description, async search task doesn't while it should.

With this commit we address that while also making sure that the description highlights that the task is originated from an async search.

Also, we streamline the way the description is printed out by SearchTask so that it does not get forgotten in the future.
2020-09-09 13:44:07 +02:00
Rory Hunter b7fd7cf154
Write deprecation logs to a data stream (#61966)
Backport of #58924.

Closes #46106. Introduce a mechanism for writing deprecation logs to a data stream
as well as to disk.
2020-09-09 12:16:28 +01:00
Armin Braun ed4984a32e
Remove Redundant Stream Wrapping from Compression (#62017) (#62132)
In many cases we don't need a `StreamInput` or `StreamOutput`
wrapper around these streams so I this commit adjusts the API
to just normal streams and adds the wrapping where necessary.
2020-09-09 03:27:38 +02:00
Nik Everett b8e9a7125f
Speed up empty highlighting many fields (backport of #61860) (#62122)
Kibana often highlights *everything* like this:
```
POST /_search
{
  "query": ...,
  "size": 500,
  "highlight": {
    "fields": {
      "*": { ... }
    }
  }
}
```

This can get slow when there are hundreds of mapped fields. I tested
this locally and unscientifically and it took a request from 20ms to
150ms when there are 100 fields. I've seen clusters with 2000 fields
where simple search go from 500ms to 1500ms just by turning on this sort
of highlighting. Even when the query is just a `range` that and the
fields are all numbers and stuff so it won't highlight anything.

This speeds up the `unified` highlighter in this case in a few ways:
1. Build the highlighting infrastructure once field rather than once pre
   document per field. This cuts out a *ton* of work analyzing the query
   over and over and over again.
2. Bail out of the highlighter before loading values if we can't produce
   any results.

Combined these take that local 150ms case down to 65ms. This is unlikely
to be really useful when there are only a few fetched docs and only a
few fields, but we often end up having many fields with many fetched
docs.
2020-09-08 15:49:50 -04:00
Alan Woodward 28fd4a2ae8 Convert RangeFieldMapper to parametrized form (#62058)
This also adds the ability to define a serialization check on Parameters, used
in this case to only serialize format and locale parameters if the mapper is a
date range.
2020-09-08 18:44:13 +01:00
Alan Woodward 5f05eef7e3 Convert some more mapping tests to MapperServiceTestCase (#62089)
We don't need to extend ESSingleNodeTestCase for all these tests.
2020-09-08 17:51:40 +01:00
Tim Brooks 075271758e
Keep checkpoint file channel open across fsyncs (#61744)
Currently we open and close the checkpoint file channel for every fsync.
This file channel can be kept open for the lifecycle of a translog
writer. This avoids the overhead of opening the file, checking file
permissions, and closing the file on every fsync.
2020-09-08 08:54:53 -06:00
Francisco Fernández Castaño 2bb5716b3d
Add repositories metering API (#62088)
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.

In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.

Backport of #60371
2020-09-08 14:01:04 +02:00
Armin Braun ebd1569028
Fix testMasterFailOverWithQueuedDeletes (#62062) (#62078)
Fixing very rare corner case where the delete retry is slow.

Closes #62031
2020-09-08 10:35:06 +02:00
Nhat Nguyen bb0a583990
Allow enabling soft-deletes on restore from snapshot (#62018)
Closes #61969
2020-09-07 09:45:36 -04:00
Alan Woodward cbc9578cbd Remove SearchPhase interface (#62050)
The interface is never used as an abstraction - implementations are are called directly,
and most of them don't need to implement the preProcess method.
2020-09-07 13:45:43 +01:00
David Turner 3389d5ccb2 Introduce integ tests for high disk watermark (#60460)
An important goal of the disk threshold decider is to ensure that nodes
use less disk space than the high watermark, and to take action if a
node ever exceeds this watermark. Today we do not have any
integration-style tests of this high-level behaviour. This commit
introduces a small test harness that can adjust the apparent size of the
disk and verify that the disk threshold decider moves shards around in
response.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2020-09-07 14:39:39 +02:00
Armin Braun 395538f508
Improve Snapshot State Machine Performance (#62000) (#62049)
Just a few random things to optimize motivated by somewhat sub-standard performance
for large snapshot cluster states with many concurrent snapshots observed in production.
2020-09-07 13:25:40 +02:00
Jim Ferenczi fa8e76abb1
Improve reduction of terms aggregations (#61779) (#62028)
Today, the terms aggregation reduces multiple aggregations at once using a map
to group same buckets together. This operation can be costly since it requires
to lookup every bucket in a global map with no particular order.
This commit changes how term buckets are sorted by shards and partial reduces in
order to be able to reduce results using a merge-sort strategy.
For bwc, results are merged with the legacy code if any of the aggregations use
a different sort (if it was returned by a node in prior versions).

Relates #51857
2020-09-07 13:13:20 +02:00
Alan Woodward a295b0aa86 Fix null_value parsing for data_nanos field mapper (#61994)
The null_value parameter for date fields is always parsed using DateFormatter.parseMillis,
which is incorrect for nanosecond resolution fields. This commit changes the parsing logic
to always use DateFieldType.parse() to parse the null value.
2020-09-07 10:58:54 +01:00
Alan Woodward 1799c0c583 Convert completion, binary, boolean tests to MapperTestCase (#62004)
Also fixes a metadata serialization bug in CompletionFieldMapper.
2020-09-07 10:48:20 +01:00
Luca Cavanna 0c8b438577
Add support for runtime fields (#61776)
This commit includes the work that has been done on the runtime fields feature branch until now. The high level tasks are listed in #59332. The tasks that have not yet been completed can be worked on after merging the feature branch.

We are adding a new x-pack plugin called runtime-fields that plugs in a custom mapper which allows to define runtime fields based on a script.
The changes included in this commit that were made outside of the x-pack/plugin/runtime-fields directory are minimal and revolve around 1) making the ScriptService available while parsing index mappings so that the scripts associated to runtime fields can be compiled 2) sharing code to manipulate ranges etc. as it can be reused in runtime fields.

Co-authored-by: Nik Everett <nik9000@gmail.com>
2020-09-07 09:14:53 +02:00
Howard b26584dff8 Remove unused deciders in BalancedShardsAllocator (#62026) 2020-09-07 00:04:16 -04:00
Armin Braun 1e3edbbe74
Simplify BytesReference StreamInput (#61681) (#62014)
Flattening both streams into a single stream here saves a few objects and some indirection.
Also, removed the redundant `offset` field which added nothing but complexity by forcing the
incrementation of two counters on every read.
2020-09-05 10:45:52 +02:00
Ryan Ernst 6d3b691048
Add snapshot only test modules (#61954)
This commit adds external test modules. These are modules meant for
external systems to test edge cases in elasticsearch, but only within
snapshots. They are not meant to be used in production, so protections
are also added from their accidental inclusion in release builds.

Note that this commit does not actually add any new modules, it only
adds the infrastructure for the new modules, under
`test/external-modules`.
2020-09-04 16:35:18 -07:00
Yannick Welsch 6d08b55d4e Simplify searchable snapshot shard allocation (#61911)
Simplifies allocation for snapshot-backed shards by always making the recovery source "from snapshot" for those
snapshot-backed shards (instead of "recover from local or from empty store"). Also let's the balancer pick a node which
to allocate the snapshot-backed shard to (which takes number of shards on each node into account unlike the current
implementation which just picks whatever node we are allowed to allocate to, with no notion of "balancing" at all).
2020-09-04 15:45:00 +02:00
Alan Woodward 66bb1eea98 Improve error messages on bad [format] and [null_value] params for date mapper (#61932)
Currently, if an incorrectly formatted date is passed as a null_value for a date field mapper
configuration, you get a vague error:

Failed to parse mapping [_doc]: cannot parse empty date
Similarly, if you pass an incorrect format, you get the error:

Failed to parse mapping [_doc]: Invalid format [...]
This commit improves both these errors by including the mapper name and parameter that
are misconfigured.

Fixes #61712
2020-09-04 14:13:28 +01:00
Ignacio Vera 31c026f25c
upgrade to Lucene-8.7.0-snapshot-61ea26a (#61957) (#61974) 2020-09-04 13:46:20 +02:00
Nik Everett 3d23dcd742
Use standard bit set impl in cardinality (#61816) (#61930)
This replaces a specialized bit set implementation used in cardinality
with our standard `BitArray` which works exactly the same way. Its also
tracked by `BigArrays` which is great!
2020-09-03 12:37:30 -04:00
Nik Everett 3934e14bc0
Fixup vwhisto test (#60936) (#61928)
This test assumed some random bounds that turned out not to hold in some
cases.

Closes #60673
2020-09-03 12:37:17 -04:00
Alan Woodward 48870c60c7 Don't spin up a whole node to unit test some data structures (#61923)
BytesRefHashTests and LongObjectHashMapTests currently extend ESSingleNodeTestCase,
which builds an entire node just to run some unit tests over entirely in-memory data
structures. This commit converts them both to extend ESTestCase.
2020-09-03 17:19:42 +01:00
Alan Woodward 3a1e0edf0a Convert DateFieldMapperTests to MapperTestCase (#61920) 2020-09-03 16:04:02 +01:00
Martijn Laarman cfa54c08bd [7.x] Version bump 7.9.1 release 2020-09-03 16:41:58 +02:00
Alan Woodward e2f006eeb4
Merge FetchSubPhase hitsExecute and hitExecute methods (#60907) (#61893)
FetchSubPhase has two 'execute' methods, one which takes all hits to be examined,
and one which takes a single HitContext. It's not obvious which one should be implemented
by a given sub-phase, or if implementing both is a possibility; nor is it obvious that we first
run the hitExecute methods of all subphases, and then subsequently call all the
hitsExecute methods.

This commit reworks FetchSubPhase to replace these two variants with a processor class,
`FetchSubPhaseProcessor`, that is returned from a single `getProcessor` method.  This
processor class has two methods, `setNextReader()` and `process`.  FetchPhase collects
processors from all its subphases (if a subphase does not need to execute on the current
search context, it can return `null` from `getProcessor`).  It then sorts its hits by docid, and
groups them by lucene leaf reader.  For each reader group, it calls `setNextReader()` on
all non-null processors, and then passes each doc id to `process()`.

Implementations of fetch sub phases can divide their concerns into per-request, per-reader
and per-document sections, and no longer need to worry about sorting docs or dealing with
reader slices.

FetchSubPhase now provides a FetchSubPhaseExecutor that exposes two methods,
setNextReader(LeafReaderContext) and execute(HitContext). The parent FetchPhase collects all
these executors together (if a phase should not be executed, then it returns null here); then
it sorts hits, and groups them by reader; for each reader it calls setNextReader, and then
execute for each hit in turn. Individual sub phases no longer need to concern themselves with
sorting docs or keeping track of readers; global structures can be built in
getExecutor(SearchContext), per-reader structures in setNextReader and per-doc in execute.
2020-09-03 12:20:55 +01:00
Alan Woodward af01ccee93
Add specific test for serializing all mapping parameter values (#61844) (#61877)
This commit adds a test to MapperTestCase that explicitly checks that a mapper can
serialize all its default values, and that this serialization can then be re-parsed. Note that
the test is disabled for non-parametrized mappers as their serialization may in some cases
output parameters that are not accepted. Gradually moving all mappers to parametrized
form will address this.

The commit also contains a fix to keyword mappers, which were not correctly serializing
the similarity parameter; this partially addresses #61563. It also enables `null` as a
value for `null_value` on `scaled_float`, as a follow-up to #61798
2020-09-03 09:20:26 +01:00
Nik Everett c19f67ce30
Support longs in BitArray (backport of #61867) (#61871)
We frequently use `long`s with `BitArray` in aggs and right now we have
to assert that the `long` fits in an `int`. This adds support for `long`
to `BitArray` so we don't need those assertions.
2020-09-02 17:24:31 -04:00
Henning Andersen 867d5f1c68
Search memory leak (#61788) (#61862)
Search could leak memory if global ordinals were calculated as part of
a search with low level cancellation enabled. QueryPhase registers a
cancellation on the reader that is never removed, which ends up being
referenced from the global ordinals cache entry. This keeps an indirect
reference to the search context. A significant leak can occur when a
heavy aggregation (cardinality for instance) is used and a failure occurs
during search, in particular if the pages backing the hyperlog++ structure
are not recycled when it is closed.

This commit also fixes an issue with an unclosed resource and request
breaker adjustment in the cardinality aggregation.
2020-09-02 18:51:14 +02:00
Jim Ferenczi a0e4331c49 Cleanup usages of QueryPhaseResultConsumer (#61713)
This commit generalizes how QueryPhaseResultConsumer is initialized.
The query phase always uses this consumer so it doesn't need to be hidden behind
an abstract class.
2020-09-02 14:41:02 +02:00
Alan Woodward d59343b4ba
Allow [null] values in [null_value] (#61798) (#61807)
Several field mappers have a null_value parameter, that allows you to specify a placeholder
value to insert into a document if the incoming value for that field is null. The default value
for this is always null, meaning "add no placeholder". However, we explicitly bar users from
setting this parameter directly to null (done in #7978, in order to fix an NPE).

This exclusion means that if a mapper is serialized with include_defaults, then we either need
to special-case null_value to ensure that it is not output when it holds the default value, or
we find that the resulting serialized form cannot be used to create a mapping. This stops us
doing some useful generic testing of mappers.

This commit permits null as a parameter value for null_value, and changes the tests to check
that it is a) permissible and b) applied without throwing errors. As part of the testing changes,
a new base class MapperServiceTestCase is refactored from MapperTestCase, holding
the various helper methods related to building mappings but not the single-mapper specific
abstract methods.

Closes #58823
2020-09-02 10:42:19 +01:00
Igor Motov 48e53cca94
Fix wrong NaN comparison (#61795) (#61811)
Fixes wrong NaN comparison in error message generator in GeoPolygonDecomposer and PolygonBuilder.

Supersedes #48207

Co-authored-by: Pedro Luiz Cabral Salomon Prado <pedroprado010@users.noreply.github.com>
2020-09-01 15:50:38 -04:00
Tim Brooks e573fa9abc
Add data.path fast path for FilePermission (#61302)
The recursive data.path FilePermission check is an extremely hot
codepath in Elasticsearch. Unfortunately the FilePermission check in
Java is extremely allocation heavy. As it iterates through different
file permissions, it allocates byte arrays for each Path component that
must be compared. This PR improves the situation by adding the recursive
data.path FilePermission it its own PermissionsCollection object which
is checked first.
2020-09-01 12:03:22 -06:00
Armin Braun 28710c985d
Dry up Settings from Map Construction (#61778) (#61803)
We used the same hack all over the place. At least drying it up to a single place.

Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
2020-09-01 19:46:10 +02:00
Tanguy Leroux 6e944d9e21
Throws IndexNotFoundException in TransportGetAction for unknown System indices (#61785) (#61791)
The change #57936 introduced a dedicated thread pool for reads in system indices. 
It also introduced a potential NPE in the case the index to read in not yet present in 
the cluster state. This commit fixes that bug by using the getIndexSafe() instead of 
just index() method when retrieving the index's metadata so that an INFE is thrown 
if the index does not exist.
2020-09-01 17:41:57 +02:00
Dan Hermann 88a448f1cd
Fix wrong result when executing bulk requests with and without pipeline (#60818) (#61777) 2020-09-01 07:05:25 -05:00
Armin Braun 3fd25bfa87
Fix Concurrent Snapshot Create+Delete + Delete Index (#61770) (#61773)
We had a bug here were we put a `null` value into the shard
assignment mapping when reassigning work after a snapshot delete
had gone through. This only affects partial snaphots but essentially
dead-locks the snapshot process.

Closes #61762
2020-09-01 13:20:25 +02:00
Tanguy Leroux 787dfda4c1
Prevent snapshots to be mounted as system indices (#61517) (#61727)
System indices can be snapshotted and are therefore potential candidates 
to be mounted as searchable snapshot indices. As of today nothing 
prevents a snapshot to be mounted under an index name starting with . 
and this can lead to conflicting situations because searchable snapshot 
indices are read-only and Elasticsearch expects some system indices 
to be writable; because searchable snapshot indices will soon use an 
internal system index (#60522) to speed up recoveries and we should
prevent the system index to be itself a searchable snapshot index 
(leading to some deadlock situation for recovery).

This commit introduces a changes to prevent snapshots to be mounted 
as a system index.
2020-09-01 11:13:28 +02:00
Boice Huang 8fdd3d158b Remove redundant symbol in msearch tests (#61353) 2020-09-01 10:58:22 +02:00