Commit Graph

5449 Commits

Author SHA1 Message Date
Nhat Nguyen 38c8a55df8
Better UUID for reader context (#62799)
We can use a single and stronger UUID for all reader contexts
created by the same SearchService.

Backport of #62715
2020-09-23 12:50:18 -04:00
Julie Tibshirani 7ba0c95191 Mute ClusterHealthIT.testHealthOnMasterFailover while we await a fix. 2020-09-23 09:17:45 -07:00
Alan Woodward 7984e4e89f
Fix test bug in SpanMultiTermQueryBuilderTests (#62833)
This test checks to see if the index has been created before version 6.4, in which
case index prefixes are unavailable and so it expects to see a span multi-term
wrapper. However, the production code doesn't bother with checking for versions,
because if the field in question is configured with index_prefixes then it knows that
it must have been created post 6.4 (you can't merge in a new index_prefixes
configuration).

This commit alters the test to remove the random version checks, as we know we
will always have a prefix field available in this scenario.

Fixes #58199
2020-09-23 17:02:12 +01:00
Martijn van Groningen 0baefc8ddc
Always validate that only a create op is allowed in bulk api for data streams (#62820)
Backport #62766 to 7.x branch.

The bulk api cache the resolved concrete indices when resolving the user provided
index name into the actual index name. The validation that prevents write ops other
than create from being executed in a data stream was only performed if the result
wasn't cached. In case of cached resolvings, the validation never occurs.

The validation would be skipped for all bulk items for a data stream after a create
operation for that same data stream. This commit ensures that the validation is always
performed for all bulk items (whether the concrete index resolution has been cached or
not cached).

Closes #62762
2020-09-23 16:27:54 +02:00
Armin Braun a754fd8020
Fix CoordinatorTests.testLogsMessagesIfPublicationDelayed (#62815) (#62822)
We need to account for an addional `DEFAULT_DELAY_VARIABILITY` timeout for
the lag detector task to be executed after its scheduled.

Closes #62383
2020-09-23 14:23:28 +02:00
Christoph Büscher 29074e7055
Add case insensitive prefix and wildcard to 'version' field (#62754) (#62782)
This change adds support for the recently introduced case insensitivity flag for
wildcard and prefix queries. Since version field values are encoded differently we
need to adapt our own AutomatonQuery variation to add both cases if case insensitivity
is turned on.
2020-09-23 11:48:34 +02:00
Ignacio Vera 81645ec2cc
nextSetBit should check if the underlaying array contains the current word (#62805) (#62812)
This is a recent addition and it is missing a check as the underlaying array can be smaller that the numBits capacity.
2020-09-23 11:17:26 +02:00
Luca Cavanna 862fab06d3
Share same existsQuery impl throughout mappers (#57607)
Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers.

There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available.

This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method.

At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.
2020-09-23 11:00:53 +02:00
Luca Cavanna 5ca86d541c
Move stored flag from TextSearchInfo to MappedFieldType (#62717) (#62770) 2020-09-23 09:40:34 +02:00
Nhat Nguyen 663b85b98f Make keep alive optional in PointInTimeBuilder (#62720)
Remove the keepAlive parameter from the constructor of PointInTimeBuilder
as it's optional.
2020-09-22 18:52:54 -04:00
Jay Modi cb1dc5260f
Dedicated threadpool for system index writes (#62792)
This commit adds a dedicated threadpool for system index write
operations. The dedicated resources for system index writes serves as
a means to ensure that user activity does not block important system
operations from occurring such as the management of users and roles.

Backport of #61655
2020-09-22 15:31:38 -06:00
Benjamin Trent 77bfb32635
[7.x] [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694) (#62784)
* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694)

* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls
 global parameters, outside of the global index, are ignored for internal callers in certain cases.
If the interal caller is adding requests via the following methods:
```
- BulkRequest#add(IndexRequest)
- BulkRequest#add(UpdateRequest)
- BulkRequest#add(DocWriteRequest)
- BulkRequest#add(DocWriteRequest[])
```
It is better to specifically set the desired parameters on the requests before they are added
to the bulk request object.

This commit addresses this issue for the ML plugin

* unmuting test
2020-09-22 15:07:08 -04:00
Rory Hunter 3f856d1c81 Prioritise recovery of system index shards (#62640)
Closes #61660. When ordering shard for recovery, ensure system index shards are
ordered first so that their recovery will be started first.

Note that I rewrote PriorityComparatorTests to use IndexMetadata instead of its
local IndexMeta POJO.
2020-09-22 15:48:27 +01:00
markharwood a0df0fb074
Search - add case insensitive flag for "term" family of queries #61596 (#62661)
Backport of fe9145f

Closes #61546
2020-09-22 13:56:51 +01:00
Armin Braun 0d5250c99b
Add Trace Logging to File Restore (#62755) (#62761)
Requested by the performance team and generally potentially useful
to log each file at `TRACE` like we do for snapshot create.
2020-09-22 14:44:40 +02:00
Amogh Mishra bc6bea5924 Remove node from cluster when node locks broken (#61400)
In #52680 we introduced a mechanism that will allow nodes to remove
themselves from the cluster if they locally determine themselves to be
unhealthy. The only check today is that their data paths are all
empirically writeable. This commit extends this check to consider a
failure of `NodeEnvironment#assertEnvIsLocked()` to be an indication of
unhealthiness.

Closes #58373
2020-09-22 10:08:41 +01:00
Armin Braun aa0dc56412
Ensure MockRepository is Unblocked on Node Close (#62711) (#62748)
`RepositoriesService#doClose` was never called which lead to
mock repositories not unblocking until the `ThreadPool` interrupts
all threads. Thus stopping a node that is blocked on a mock repository operation wastes `10s`
in each test that does it (which is quite a few as it turns out).
2020-09-22 11:00:18 +02:00
Armin Braun 4bdbc39e9f
Fix testQueuedSnapshotOperationsAndBrokenRepoOnMasterFailOverMultiple (#62713) (#62747)
There's possible retries here that work out if both the snapshot and the delete
operation are retried when master shuts down and hits the unlikely case of the retried delete
executing before the retried snapshot, making both operations pass.

Closes #62686
2020-09-22 10:42:11 +02:00
Luca Cavanna 9ae29713fd
Dense vector field type minor fixes (#62631)
The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. Same for the sparse vector field (only in 7.x).

This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.
2020-09-22 10:40:51 +02:00
Ignacio Vera 265387f348
override needsScore() on ValueCountAggregator (#62683) (#62745) 2020-09-22 08:47:16 +02:00
Yang Wang 897d2e8a02
Fix ccs permission for search with a scroll id (#62053) (#62695)
CCS with remote indices only does not require any privileges on the local cluster.
This PR ensures that search with scroll follow the permission model.
2020-09-22 11:49:40 +10:00
Jim Ferenczi 1fc78d430b Fix terms aggregation ordering after the final reduce (#62732)
This commit ensures that the final order of the terms aggregations
is registered correctly after the final reduce.
This bug was introduced in #62028 which is not released yet so this PR is marked
as a non-issue.
This issue was discovered when running a terms aggregation under an auto-date
histogram. In such a case, the auto-date histogram may run multiple final reduce
to merge buckets together. This change makes sure that running multiple final reduces
doesn't create duplicates but it doesn't fix the fact that the final reduce may prune
the list of terms prematurely. This other bug is tracked separately in #62731.
2020-09-22 00:03:04 +02:00
Nhat Nguyen f9f4d87437 Remove invalid assertion in SearchService (#62675)
This assertion does not always hold because there can be a race between
`putReaderContext` and `afterIndexRemoved` when an index is deleted.

Closes #62624
2020-09-21 16:29:00 -04:00
Ignacio Vera cadd5dc53f
Fix bug when initializing HyperLogLogPlusPlusSparse (#62602) (#62702)
This is a follow up of #62480 where we are oversizing one array when initialising. In addition it prevents a possible CircuitBreaker leak during initialisation.
2020-09-21 17:30:40 +02:00
Armin Braun 13e28b85ff
Speed up RepositoryData Serialization (#62684) (#62703)
Make serializing `RepositoryData` a little faster and split up/document the code for it a little
as well given how massive this method has gotten at this point.
2020-09-21 17:29:56 +02:00
Dan Hermann a06339ffae
Fix NPE when deleting multiple backing indices on a data stream (#62274) (#62708) 2020-09-21 10:26:47 -05:00
Alan Woodward 1dde4983f6 Convert ConstantKeywordFieldMapper to parametrized form (#62688)
As part of the conversion, adds the ability to customize merge validation - in this case, we
allow an update to the constant value if it is currently set to null, but refuse further
updates once it has been set once.

This commit also converts ParametrizedMapperTests to use MapperServiceTestCase.
2020-09-21 15:22:56 +01:00
Henning Andersen 0c4cfe4c44 Cardinality request breaker leak (#62685)
If HyperLogLogPlusPlus failed during construction, it would
not release already allocated resources, causing the request
circuit breaker to not be adjusted down.

Closes #62439
2020-09-21 15:54:04 +02:00
Christoph Büscher 803f78ef05
Add field type for version strings (#59773) (#62692)
This PR adds a new 'version' field type that allows indexing string values
representing software versions similar to the ones defined in the Semantic
Versioning definition (semver.org). The field behaves very similar to a
'keyword' field but allows efficient sorting and range queries that take into
accound the special ordering needed for version strings. For example, the main
version parts are sorted numerically (ie 2.0.0 < 11.0.0) whereas this wouldn't
be possible with 'keyword' fields today.

Valid version values are similar to the Semantic Versioning definition, with the
notable exception that in addition to the "main" version consiting of
major.minor.patch, we allow less or more than three numeric identifiers, i.e.
"1.2" or "1.4.6.123.12" are treated as valid too.

Relates to #48878
2020-09-21 14:25:42 +02:00
Alan Woodward 178b25fc4b Fix standard filter BWC check to allow for cacheing bug (#62649)
The `standard` tokenfilter was removed by #33310, and should have been
unuseable in any indexes created since 7.0. However, a cacheing bug fixed
by #51092 meant that it was still possible in certain circumstances to create
indexes referencing the standard filter in versions up to 7.5.2. Our checks
in AnalysisModule still refer to 7.0.0, however, meaning that a cluster that
contains one of these rogue indexes cannot be upgraded.

This commit adjusts the AnalysisModule checks so that we only refuse to
build a mapping referring to standard filter if the index created version is
7.6 or later.

Fixes #62644
2020-09-21 10:12:55 +01:00
Henning Andersen 9a77f41e55 Fix cluster health when closing (#61709)
When master shuts down it's cluster service, a waiting health request
would fail rather than fail over to a new master.
2020-09-19 10:02:36 +02:00
Luca Cavanna 00272ea877
Remove cache key renderer argument from IndicesRequestCache (#62534)
In the context of of a recurring test failure tracked by #32827, we added trace logging and an extra cache key renderer argument to IndicesRequestCache#getOrCompute (see #39475 and #34180).

We addressed the issue with #54071, but the extra argument was left behind, with a NORELEASE comment saying it should be removed.

With this commit, we remove the extra cache key rendered argument and the corresponding log lines which are not so useful without it.

Closes #55837
2020-09-19 00:24:02 +02:00
Lee Hinman 4a08928c47
[7.x] Add index.routing.allocation.include._tier_preference setting (#62589) (#62667)
This commit adds the `index.routing.allocation.prefer._tier` setting to the
`DataTierAllocationDecider`. This special-purpose allocation setting lets a user specify a
preference-based list of tiers for an index to be assigned to. For example, if the setting were set
to:

```
"index.routing.allocation.prefer._tier": "data_hot,data_warm,data_content"
```

If the cluster contains any nodes with the `data_hot` role, the decider will only allow them to be
allocated on the `data_hot` node(s). If there are no `data_hot` nodes, but there are `data_warm` and
`data_content` nodes, then the index will be allowed to be allocated on `data_warm` nodes.

This allows us to specify an index's preference for tier(s) without causing the index to be
unassigned if no nodes of a preferred tier are available.

Subsequent work will change the ILM migration to make additional use of this setting.

Relates to #60848
2020-09-18 15:41:36 -06:00
Christos Soulios 6a298970fd
[7.x] Allow metadata fields in the _source (#62616)
Backports #61590 to 7.x

    So far we don't allow metadata fields in the document _source. However, in the case of the _doc_count field mapper (#58339) we want to be able to set

    This PR adds a method to the metadata field parsers that exposes if the field can be included in the document source or not.
    This way each metadata field can configure if it can be included in the document _source
2020-09-18 19:56:41 +03:00
Alan Woodward 17aabaed15 Fix warning on boost docs and warning message on non-implementing fieldmappers 2020-09-18 16:45:08 +01:00
Alan Woodward 43ace5f80d Emit deprecation warnings when boosts are defined in mappings (#62623)
We removed index-time boosting back in 5x, and we no longer document the 'boost'
parameter on any of our mapping types. However, it is still possible to define an
index-time boost on a field mapper for a surprisingly large number of field types, and
they even have an effect (sometimes, on some queries).

As a first step in finally removing all traces of index time boosting, this comment emits
a deprecation warning whenever a boost parameter is found on a mapping definition.
2020-09-18 15:40:53 +01:00
Igor Motov 260c11d89e
Add an additional cancellation check to the fetch phase (#62577) (#62587)
In #62357 we introduced an additional optimization that allows us to skip the
most of the fetch phase early if no results are found. This change caused
some cancellation test failures that were relying on definitive cancellation
during the fetch phase. This commit adds an additional quick cancellation
check at the very beginning of the fetch phase to make cancellation process
more deterministic.

Fixes #62530
2020-09-18 10:00:36 -04:00
Ignacio Vera 18a52f7477
Use BitArray instead of FixedBitSet for collecting ordinals in Cardinality Aggregator (#62600) (#62619)
Changes the way we collecting ordinals in the Cardinality aggregation from Lucene FixedBitSet to BitArray. The benefit is that BitArray is tracked by our Circuit breakers so it is safer.
2020-09-18 14:16:31 +02:00
Tanguy Leroux 9f5e95505b
Also abort ongoing file restores when snapshot restore is aborted (#62441) (#62607)
Today when a snapshot restore is aborted (for example when the index is 
explicitly deleted) while the restoration of the files from the repository has 
already started the file restores are not interrupted. It means that Elasticsearch 
will continue to read the files from the repository and will continue to write 
them to disk until all files are restored; the store will then be closed and 
files will be deleted from disk at some point but this can take a while. This 
will also take some slots in the SNAPSHOT thread pool too. The Recovery 
API won't show any files actively being recovered, the only notable 
indicator would be the active threads in the SNAPSHOT thread pool.

This commit adds a check before reading a file to restore and before 
writing bytes on disk so that a closing store can be detected more 
quickly and the file recovery process aborted. This way the file 
restores just stops and for most of the repository implementations
 it means that no more bytes are read (see #62370 for S3), finishing 
threads in the SNAPSHOT thread pool more quickly too.
2020-09-18 14:04:58 +02:00
Armin Braun 73d19271a9
Fix Races in testQueuedSnapshotOperationsAndBrokenRepoOnMasterFailOverMultipleRepos (#62431) (#62614)
This test (in-part) verifies that snapshot creation is not
retried on master fail-over once a snaphot has been started already.
Unless we wait for the snapshot creation to show up in the cluster
state before failing the master node though, we could run into a
race where the snapshot wasn't yet in the cluster state and a retry goes through
successfully.
2020-09-18 12:20:23 +02:00
Przemyslaw Gomulka d87268a264
Round up parsers should be based on a list of parsers backport(#62290) (#62604)
a dateformatter can be created with a list of parsers which are iterated
during parsing and the first one that passes will return a parsed date.
DateMathParser should do the same, when created based on a list of
non-rounding parsers it should also iterate over all of them - it is at
the moment only taking first element

closing #62207
2020-09-18 12:03:20 +02:00
Adrien Grand 4de8579455
Upgrade to lucene-8.7.0-snapshot-830bd186a8d. (#62596) 2020-09-18 09:51:34 +02:00
David Turner 06d5d360f9 Tidy up fillInStackTrace implementations (#62555)
Removes the unnecessary `synchronized` introduced in #62433 and adjusts
the others to return `this` not `null` as required by the parent
method's Javadocs.
2020-09-18 08:29:48 +01:00
Ignacio Vera 6a3d731be1
Only call reduce on a single InternalAggregation when needed (#62525) (#62594)
Adds a new abstract method in InternalAggregation that flags the framework if it needs to reduce on a single InternalAggregation.
2020-09-18 08:43:58 +02:00
Nhat Nguyen 0127b71901 Adjust keep alive assertion in ShardSearchRequest (#62582)
Relates #62184
2020-09-17 16:09:54 -04:00
Lee Hinman 9bb7ce0b22
[7.x] Allocate new indices on "hot" or "content" tier depending on data stream inclusion (#62338) (#62557)
Backports the following commits to 7.x:

    Allocate new indices on "hot" or "content" tier depending on data stream inclusion (#62338)
2020-09-17 13:29:23 -06:00
Martijn van Groningen 5f643433c6
Prohibit the usage of create index api in namespaces managed by data stream templates (#62574)
Backport of #62527 to 7.x branch.

This commit adds validation that prohibits the creation of regular indices
in the namespace of templates with data streams enabled.

It shouldn't be possible to create ordinary indices when the name of the index
matches with a composable index template that enables data streams. Auto creation
has logic that creates data streams instead of regular indices. However validation
logic for the create index api was missing.
2020-09-17 20:10:42 +02:00
Jim Ferenczi df93b31b15
Faster sequential access for stored fields (#62509) (#62573)
Faster sequential access for stored fields

Spinoff of #61806
Today retrieving stored fields at search time is optimized for random access.
So we make no effort to keep state in order to not decompress the same data
multiple times because two documents might be in the same compressed block.
This strategy is acceptable when retrieving a top N sorted by score since
there is no guarantee that documents will be on the same block.
However, we have some use cases where the document to retrieve might be
completely sequential:

Scrolls or normal search sorted by document id.
Queries on Runtime fields that extract from _source.
This commit exposes a sequential stored fields reader in the
custom leaf reader that we use at search time.
That allows to leverage the merge instances of stored fields readers that
are optimized for sequential access.
This change focuses on the fetch phase for now and leverages the merge instances
for stored fields only if all documents to retrieve are adjacent.
Applying the same logic in the source lookup of runtime fields should
be trivial but will be done in a follow up.

The speedup on queries sorted by doc id is significant.
I played with the scroll task of the http_logs rally track
on my laptop and had the following result:

|                                                        Metric |   Task |    Baseline |   Contender |     Diff |    Unit |
|--------------------------------------------------------------:|-------:|------------:|------------:|---------:|--------:|
|                                            Total Young Gen GC |        |       0.199 |       0.231 |    0.032 |       s |
|                                              Total Old Gen GC |        |           0 |           0 |        0 |       s |
|                                                    Store size |        |     17.9704 |     17.9704 |        0 |      GB |
|                                                 Translog size |        | 2.04891e-06 | 2.04891e-06 |        0 |      GB |
|                                        Heap used for segments |        |    0.820332 |    0.820332 |        0 |      MB |
|                                      Heap used for doc values |        |    0.113979 |    0.113979 |        0 |      MB |
|                                           Heap used for terms |        |     0.37973 |     0.37973 |        0 |      MB |
|                                           Heap used for norms |        |     0.03302 |     0.03302 |        0 |      MB |
|                                          Heap used for points |        |           0 |           0 |        0 |      MB |
|                                   Heap used for stored fields |        |    0.293602 |    0.293602 |        0 |      MB |
|                                                 Segment count |        |         541 |         541 |        0 |         |
|                                                Min Throughput | scroll |     12.7872 |     12.8747 |  0.08758 | pages/s |
|                                             Median Throughput | scroll |     12.9679 |     13.0556 |  0.08776 | pages/s |
|                                                Max Throughput | scroll |     13.4001 |     13.5705 |  0.17046 | pages/s |
|                                       50th percentile latency | scroll |     524.966 |     251.396 |  -273.57 |      ms |
|                                       90th percentile latency | scroll |     577.593 |     271.066 | -306.527 |      ms |
|                                      100th percentile latency | scroll |      664.73 |     272.734 | -391.997 |      ms |
|                                  50th percentile service time | scroll |     522.387 |     248.776 | -273.612 |      ms |
|                                  90th percentile service time | scroll |     573.118 |      267.79 | -305.328 |      ms |
|                                 100th percentile service time | scroll |     660.642 |     268.963 | -391.678 |      ms |
|                                                    error rate | scroll |           0 |           0 |        0 |       % |
Closes #62024
2020-09-17 19:58:18 +02:00
Alan Woodward 5421a743a7 Move SearchLookup into FetchContext (#62549)
FetchSubPhase#getProcessor currently takes a SearchLookup parameter. This
however is only needed by a couple of subphases, and will almost certainly change in
future as we want to simplify how fetch phases retrieve values for individual hits.

To future-proof against further signature changes, this commit moves the SearchLookup
reference into FetchContext instead.
2020-09-17 17:39:02 +01:00
Alan Woodward e3e3aef3d8 Load version metadata even when stored fields are disabled (#62533)
Currently we throw an error if stored fields are disabled, but hit version metadata is
requested on a search. This doesn't make much sense, as the version information
is stored in docvalues and so has no connection with stored fields.

This commit removes the link between the two, allowing version metadata to be loaded
even when stored fields are disabled in a request.

Fixes #62456
2020-09-17 17:39:02 +01:00
Alan Woodward 91e2330529 Warn on badly-formed null values for date and IP field mappers (#62487)
In #57666 we changed when null_value was parsed for ip and date fields. Previously,
the null value was stored as a string, and parsed into a date or InetAddress whenever
a document containing a null value was encountered. Now, the values are parsed when
the mappings are built, which means that bad values are detected up front; if you try and
add a mapping with a badly-parsed ip or date for a null_value, the mapping will be
rejected.

This causes problems for upgrades in the case when you have a badly-formed null_value
in a pre-7.9 cluster. This commit fixes the upgrade case by changing the logic to only
logging a warning on the badly formed value, replicating the earlier behaviour.

Fixes #62363
2020-09-17 16:38:08 +01:00
Ignacio Vera 901000891a
Fix test error in InternalCardinalityTests#testEqualsAndHashcode (#62542) (#62554)
Make sure the the new HLL++ is different to the original one
2020-09-17 17:09:13 +02:00
Alan Woodward 63afc61b08 Introduce FetchContext (#62357)
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
2020-09-17 09:57:43 +01:00
Adrien Grand e0a4a94985
Speed up merging when source is disabled. (#62443) (#62474)
The CodecReader wrapper we use to remove the `_recovery_source` field
doesn't override `StoredFieldsreader#getMergeInstance`, which has the
undesired side-effect of preventing the wrapped stored fields reader
from optimizing merging.
2020-09-17 10:53:31 +02:00
David Turner 62dcc5b1ae Suppress stack in VersionConflictEngineException (#62433)
`VersionConflictEngineException` is thrown on the hot path for updates,
but stack traces are expensive to compute and transport and rarely
useful for this kind of exception. This commit avoids computing the
stack trace for these exceptions.
2020-09-17 09:40:07 +01:00
Adrien Grand 9a8225bbc1
Upgrade to lucene-8.7.0-snapshot-9cd3af50f80. (#62450) (#62476)
This new snapshot contains the following JIRAs that we're interested in:
 - [LUCENE-9525](https://issues.apache.org/jira/browse/LUCENE-9525)
Better handling of small documents. This should improve retrieval times
when documents are less than ~1kB.
 - [LUCENE-9510](https://issues.apache.org/jira/browse/LUCENE-9510)
Faster flushes when index sorting is enabled by not compressing the
temporary files that store stored fields and term vectors.
2020-09-17 10:28:20 +02:00
Armin Braun 5112c17319
Add WARN Logging on Slow Transport Message Handling (#62444) (#62521)
Add simple WARN logging on slow inbound TCP messages.
2020-09-17 10:12:20 +02:00
David Turner 14aec44cd8 Log if recovery affected by disconnect (#62437)
Today we only emit `DEBUG` logs if the source disconnects from the
target during a recovery. This deserves to be noisier by default since
it should be rare and may help users identify other problems with their
network or with their shard movements.

This commit promotes this message to `INFO`. There's no need for `WARN`
since these days we will normally resume the recovery where it left off.
2020-09-17 08:22:40 +01:00
Ignacio Vera 2d3ca9c155
Introduce a sparse HyperLogLogPlusPlus class for cloning and serializing low cardinality buckets (#62480) (#62520)
Reduces the memory footprint of an HLL++ structure that uses Linear counting when cloning or deserialising the data structure.
2020-09-17 08:54:50 +02:00
Julie Tibshirani e1da558206 Remove unused test search context for significant_terms. 2020-09-16 14:27:11 -07:00
Jay Modi 5da922064f
LocalNodeMasterListener is a regular listener (#62485)
This commit makes the LocalNodeMasterListener interface extend the
ClusterStateListener interface and use a default implementation for
detecting whether the local node master status changed.

Backport of #62422
2020-09-16 11:42:53 -06:00
Tanguy Leroux 8a2e9e66d4
Wait for relocations and disk threshold monitor in DiskThresholdDeciderIT (#62358) (#62467)
Closes #62326
2020-09-16 17:40:20 +02:00
Armin Braun f6a8599cf8
Don't Start Redundant ConsistentSettingsService (#62283) (#62428)
The consistent settings service is only used in tests so far. No need to start it
unless it's actually used.
2020-09-16 09:43:04 +02:00
Ignacio Vera f3ed641fc7
Adds bucketOrd back to cardinality algorithms (#62389) (#62427) 2020-09-16 08:41:57 +02:00
Nik Everett 24a24d050a
Implement fields fetch for runtime fields (backport of #61995) (#62416)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 20:24:10 -04:00
Nik Everett 0a7f335215
Speed up writeVInt (backport of #62345) (#62419)
This speeds up `StreamOutput#writeVInt` quite a bit which is nice
because it is *very* commonly called when serializing aggregations. Well,
when serializing anything. All "collections" serialize their size as a
vint. Anyway, I was examining the serialization speeds of `StringTerms`
and this saves about 30% of the write time for that. I expect it'll be
useful other places.
2020-09-15 17:14:08 -04:00
Nik Everett 771a8893a6
Add more debugging information for cardinality agg (#62317) (#62397)
This adds two extra bits of info to the profiler:
1. Count of the number of different types of collectors. This lets us figure
   out if we're using the optimization for segment ordinals. It adds a few
   more similar counters just for good measure.
2. Profiles the `getLeafCollector` and `postCollection` methods. These are
   non-trivial for some aggregations, like cardinality.
2020-09-15 13:21:11 -04:00
Armin Braun ffbc64bd10
Log WARN on Response Deserialization Failure (#62368) (#62388)
We never see this exception in the logs even though it's pretty severe.
All we might see is an exception about a transport message not having been read fully
from the logic that follows this code.
Technically we should probably bubble up the exception but that's a bigger change
and needs some carefully reasoning, this change for the time being at least simplifies
tracking down deserialization issues in responses.
2020-09-15 18:27:39 +02:00
Adrien Grand 6db8afefc2
Upgrade to lucene-8.7.0-snapshot-cdfdc1e0851. (#62376)
Upgrade to a new Lucene snapshot that (at least partially) addresses the
indexing rate regression when index sorting is enabled.

Backport of #62334.
2020-09-15 17:48:07 +02:00
Alan Woodward f89fa421e2 Remove unnecessary IndexSearcher field on HitContext (#62378)
FastVectorHighlighter uses the top-level reader to rewrite queries against, which
it gets via an IndexSearcher field on HitContext. However, we can already access
this top-level reader via HitContext's existing LeafReaderContext field.

This commit removes the unnecessary field and constructor parameter, and
changes the implementation of topLevelReader to go via ReaderUtils and
the leaf reader context.
2020-09-15 15:46:14 +01:00
Christoph Büscher 0ca9829867 Muting CoordinatorTests#testLogsMessagesIfPublicationDelayed 2020-09-15 15:40:51 +02:00
Albert Zaharovits aeed1c05b0
Ensure authz operation overrides transient authz headers (#61621)
AuthorizationService#authorize uses the thread context to carry the result of the
authorisation as transient headers. The listener argument to the `authorize` method
must necessarily observe the header values. This PR makes it so that
the authorisation transient headers (`_indices_permissions` and `_authz_info`, but
NOT `_originating_action_name`) of the child action override the ones of the parent action.

Co-authored-by: Tim Vernum tim@adjective.org
2020-09-15 16:37:38 +03:00
Armin Braun eae6a3b18e
Fix testMappingVersionAfterDynamicMappingUpdate (#62352) (#62360)
There is a race in this test where the index request will return
once the dynamic mapping update has been observed by the cluster
state observer internally used by the indexing but not hit all
state appliers and thus isn't showing up as the applied state returned
by `clusterService.state()` yet.
2020-09-15 11:59:22 +02:00
Alan Woodward a68f7077c7 Rationalise fetch phase exceptions (#62230)
We have a special FetchPhaseExecutionException which contains some useful
information about which shard and doc a fetch phase has failed in. However, this
is not used in many places - currently only the ExplainPhase and the highlighters
throw one, and the FetchPhase itself catches IOExceptions and just passes them
to the ExceptionsHelper with no extra context.

This commit changes FetchPhase to throw FetchPhaseExecutionException if it
encounters problems in any of its subphases, and removes the special handling
from the explain and highlight phases. It also removes the need to pass shard ids
around when building HitContext objects.
2020-09-15 09:28:19 +01:00
Alan Woodward 8089210815 Some small cleanups in TermVectorsService (#62292)
We removed the use of aggregated stats from term vectors back in #16452, but there is
a bunch of dead code left here which can be stripped out.
2020-09-15 09:01:49 +01:00
Ignacio Vera 3536f7f7c2
Initialize BitArray storage as number of bits (#62327) (#62354) 2020-09-15 08:34:22 +02:00
Armin Braun c81a076f5a
Improve Efficiency of ClusterApplierService Iteration (#62282) (#62350)
The complexity of removing a timeout listener was `O(n)` which
means that in case of many queued up CS update tasks (such as in the
case of an avalanche of dynamic mapping updates) we're dealing with
quadratic complexity for timing out N tasks which was observed to be
an issue in practice.

This PR makes the complexity of timing out a task `O(1)` and generally
simplifies the iteration logic of listeners and applies to be a little
more efficient and inline better.
2020-09-15 05:59:48 +02:00
Julie Tibshirani f56ce4f39b
Fix failure in InnerHitBuilderTests around 'fields' option. (#62344)
The case InnerHitBuilderTests#testEqualsAndHashcode creates a copy of the object
by serializing + deserializing it, then applies a modification. If the 'fields'
list is empty, then deserializing it results in Collections.emptyList. Because
this is immutable, then modifying it can throw an UnsupportedOperationException.

This PR takes the same approach as for docvalue_fields, where we create a new
list instead of trying to add to an empty one.
2020-09-14 15:39:03 -07:00
Julie Tibshirani 4a19bdb2ea
Support the 'fields' option in inner_hits and top_hits. (#62337)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 11:51:45 -07:00
David Turner 9acd2fd1fd Minor cleanups to BytesReferenceStreamInput (#62302)
Followup to #61681:

- reuse the current iterator in `reset()` if possible
- simply some integer-overflow-avoidance in `skip()`
- clarify some comments
- address some IntelliJ warnings
2020-09-14 17:02:27 +01:00
Christoph Büscher e2eada2498
Fix disabling `allow_leading_wildcard` (#62300) (#62318)
Disabling the `query_string` queries `allow_leading_wildcard` parameter didn't
work after a change probably introduced in #60959 because the various field types
`wildcardQuery` don't check the leading characters like
QueryParserBase#getWildcardQuery does. This PR adds the missing check also
before calling the field types wildcard generating method.

Closes #62267
2020-09-14 17:13:17 +02:00
Alan Woodward 5358cee29c Cut over more mapping tests to MapperServiceTestCase (#62312)
Shaves a few more seconds off the build.
2020-09-14 16:00:37 +01:00
Armin Braun 95766da345
Save Some Allocations when Working with ClusterState (#62060) (#62303)
Just a number of obvious spots where we were allocating
duplicate empty structures or otherwise inefficient that I
found while investigating snapshot cluster state update performance.
2020-09-14 15:09:54 +02:00
Armin Braun 875af1c976
Remove Dead Variable in BlobStoreIndexShardSnapshots. (#62285) (#62295)
This was never used.

Co-authored-by: Howard <danielhuang@tencent.com>
2020-09-14 13:40:39 +02:00
Luca Cavanna 53bf057a53 [TEST] avoid double null check in TransportSearchActionTests 2020-09-11 10:10:09 +02:00
Nhat Nguyen aafb2cb812 Support point in time cross cluster search (#61827)
This commit integrates point in time into cross cluster search.

Relates #61062
Closes #61790
2020-09-10 19:25:48 -04:00
Nhat Nguyen 808c8689ac Always include the matching node when resolving point in time (#61658)
If shards are relocated to new nodes, then searches with a point in time
will fail, although a pit keeps search contexts open. This commit solves
this problem by reducing info used by SearchShardIterator and always
including the matching nodes when resolving a point in time.

Closes #61627
2020-09-10 19:25:48 -04:00
Nhat Nguyen 035f0638f4 Support point in time in async_search (#61560)
This commit integrates point in time into async search and
ensures that it works correctly with security enabled.

Relates #61062
2020-09-10 19:25:48 -04:00
Nhat Nguyen 063a6d047c Release search context when scroll keep_alive is too large (#62179)
Previously, we close related search contexts if the keep_alive of a scroll is too large. 
But we accidentally change this behavior in #62061.
2020-09-10 19:25:48 -04:00
Nhat Nguyen 2eb1e8bc84 Make keep alive of point in time optional in search (#62184)
A search request should not be required to extend the keep_alive of a point in time. 
This change makes that parameter optional.
2020-09-10 19:25:48 -04:00
Jim Ferenczi 3fc35aa76e Shard Search Scroll failures consistency (#62061)
Today some uncaught shard failures such as RejectedExecutionException skips the release of shard context
and let subsequent scroll requests access the same shard context again. Depending on how the other shards advanced,
this behavior can lead to missing data since scrolls always move forward.
In order to avoid hidden data loss, this commit ensures that we always release the context of shard search scroll requests whenever a failure
occurs locally. The shard search context will no longer exist in subsequent scroll requests which will lead to consistent shard failures
in the responses.
This change also modifies the retry tests of the reindex feature. Reindex retries scroll search request that contains a shard failure and
move on whenever the failure disappears. That is not compatible with how scrolls work and can lead to missing data as explained above.
That means that reindex will now report scroll failures when search rejection happen during the operation instead of skipping document
silently.
Finally this change removes an old TODO that was fulfilled with #61062.
2020-09-10 19:25:48 -04:00
Jim Ferenczi 4d528e91a1 Ensure validation of the reader context is executed first (#61831)
This change makes sure that reader context is validated (`SearchOperationListener#validateReaderContext)
before any other operation and that it is correctly recycled or removed at the end of the operation.
This commit also fixes a race condition bug that would allocate the security reader for scrolls more than once.

Relates #61446

Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
2020-09-10 19:25:48 -04:00
Luca Cavanna 44bd4a6004 Fix point in time toXContent impl (#62080)
PointInTimeBuilder is a ToXContentObject yet it does not print out a whole object (it is rather a fragment). Also, when it is printed out as part of SearchSourceBuilder, an error is thrown because pit should be wrapped into its own object.

This commit fixes this and adds tests for it.
2020-09-10 19:25:47 -04:00
Nhat Nguyen 3d69b5c41e Introduce point in time APIs in x-pack basic (#61062)
This commit introduces a new API that manages point-in-times in x-pack
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.

A point in time must be opened before being used in search requests. The
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes a `id`, which should be
passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
            "id":  "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
            "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` is
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
2020-09-10 19:25:47 -04:00
Armin Braun e0a81f7d14
Speed up Version Checks (#62216) (#62253)
The `fromId` method would show up in profiling and JIT analysis as not-inlinable because it's too large
in the contexts it's used in in many cases and was consuming a surprising amount of cycles for computing the
min compat versions.

-> extract cold path from `fromId` to make JIT happy and cache minimumg compatible versions to fields.
2020-09-10 22:57:06 +02:00
Armin Braun 25db5acb0d
Simplify TimeValue Serialization (#62023) (#62248)
This can be done without map lookups => less code and much smaller methods => better inlining potentially.
2020-09-10 20:16:21 +02:00
Armin Braun 7b941a18e9
Optimize Snapshot Shard Status Update Handling (#62070) (#62219)
Avoiding a number of noop updates that were observed to cause trouble (as in needless noop CS publishing) which can become an issue when working with a large number of concurrent snapshot operations.
Also this sets up some simplifications made in the clone snapshot branch.
2020-09-10 16:29:16 +02:00
Ignacio Vera c8981ea93d
upgrade to lucene-8.7.0-snapshot-b313618cc1d (#62213) (#62222) 2020-09-10 16:23:18 +02:00
Igor Motov b6bff56a56
Fix hard_bounds interval handling (#62129) (#62188)
The hard bounds were incorrectly scaled for intervals, which was
causing incorrect buckets to show up or no buckets at all for
interval other than 1.

Closes #62126
2020-09-09 15:42:12 -04:00
Nik Everett 1104d65465
Fix bug with terms' min_doc_count (#62130) (#62177)
The `global_ordinals` implementation of `terms` had a bug when
`min_doc_count: 0` that'd cause sub-aggregations to have array index out
of bounds exceptions. Ooops. My fault. This fixes the bug by assigning
ordinals to those buckets.

Closes #62084
2020-09-09 13:04:51 -04:00