Commit Graph

3020 Commits

Author SHA1 Message Date
Martijn van Groningen e591d30918
fixed test compile issue 2019-05-27 10:17:00 +02:00
Martijn van Groningen 48a71459c0
Improve how internal representation of pipelines are updated (#42257)
If a single pipeline is updated then the internal representation of
all pipelines was updated. With this change, only the internal representation
of the pipelines that have been modified will be updated.

Prior to this change the IngestMetadata of the previous and current cluster
was used to determine whether the internal representation of pipelines
should be updated. If applying the previous cluster state change failed then
subsequent cluster state changes that have no changes to IngestMetadata
will not attempt to update the internal representation of the pipelines.

This commit, changes how the IngestService updates the internal representation
by keeping track of the underlying configuration and use that to detect
against the new IngestMetadata whether a pipeline configuration has been
changed and if so, then the internal pipeline representation will be updated.
2019-05-27 10:01:15 +02:00
Nhat Nguyen 85e60850af Add debug log for retention leases (#42557)
We need more information to understand why CcrRetentionLeaseIT is
failing. This commit adds some debug log to retention leases and enables
them in CcrRetentionLeaseIT.
2019-05-26 16:04:47 -04:00
Tanguy Leroux 6bec876682 Improve Close Index Response (#39687)
This changes the `CloseIndexResponse` so that it reports closing result
for each index. Shard failures or exception are also reported per index,
and the global acknowledgment flag is computed from the index results
only.

The response looks like:
```
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "indices" : {
    "docs" : {
      "closed" : true
    }
  }
}
```

The response reports shard failures like:
```
{
  "acknowledged" : false,
  "shards_acknowledged" : false,
  "indices" : {
    "docs-1" : {
      "closed" : true
    },
    "docs-2" : {
      "closed" : false,
      "shards" : {
        "1" : {
          "failures" : [
            {
              "shard" : 1,
              "index" : "docs-2",
              "status" : "BAD_REQUEST",
              "reason" : {
                "type" : "index_closed_exception",
                "reason" : "closed",
                "index_uuid" : "JFmQwr_aSPiZbkAH_KEF7A",
                "index" : "docs-2"
              }
            }
          ]
        }
      }
    },
    "docs-3" : {
      "closed" : true
    }
  }
}
```

Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
2019-05-24 21:57:55 -04:00
Julie Tibshirani 3a6c2525ca
Deprecate support for chained multi-fields. (#42330)
This PR contains a straight backport of #41926, and also updates the
migration documentation and deprecation info API for 7.x.
2019-05-24 15:55:06 -07:00
Jason Tedor f2cfd09289
Remove renewal in retention lease recovery test (#42536)
This commit removes the act of renewing some retention leases during a
retention lease recovery test. Having renewal does not add anything
extra to this test, but does allow for some situations where the test
can fail spuriously (i.e., in a way that does not indicate that
production code is broken).
2019-05-24 17:40:59 -05:00
Nhat Nguyen 74d771d8f6 Adjust load SplitIndexIT#testSplitIndexPrimaryTerm (#42477)
SplitIndexIT#testSplitIndexPrimaryTerm sometimes timeout due to
relocating many shards. This change adjusts loads and increases
the timeout.
2019-05-24 15:47:29 -04:00
Nhat Nguyen 02739d038c Mute accounting circuit breaker check after test (#42448)
If we close an engine while a refresh is happening, then we might leak
refCount of some SegmentReaders. We need to skip the ram accounting
circuit breaker check until we have a new Lucene snapshot which includes
the fix for LUCENE-8809.

This also adds a test to the engine but left it muted so we won't forget
to reenable this check.

Closes #30290
2019-05-24 15:42:12 -04:00
Nhat Nguyen 329d1307a5 Add test to verify force primary allocation on closed indices (#42458)
This change adds a test verifying that we can force primary allocation on closed indices.
2019-05-24 17:23:58 +02:00
Henning Andersen 075fd2a0ac Shard CLI tool always check shards (#41480)
The shard CLI tool would not do anything if a corruption marker was not
present. But a corruption marker is only added if a corruption is
detected during indexing/writing, not if a search or other read fails.

Changed the tool to always check shards regardless of corruption marker
presence.

Related to #41298
2019-05-24 16:49:37 +02:00
Marios Trivyzas 523b5bfdb5 Fix sorting on nested field with unmapped (#42451)
Previously sorting on a missing nested field would fail with an
Exception:
`[nested_field] failed to find nested object under path [nested_path]`
despite `unmapped_type` being set on the query.

Fixes: #33644

(cherry picked from commit 631142d5dd088a10de8dcd939b50a14301173283)
2019-05-24 15:47:41 +02:00
Christoph Büscher 12d5642e93 Small internal AnalysisRegistry changes (#42500)
Some internal refactorings to the AnalysisRegistry, spin-off from #40782.
2019-05-24 15:27:35 +02:00
David Turner a5b6ed8d1e Remove AwaitsFix of #41967 following #42504 2019-05-24 14:26:49 +01:00
David Turner 4d02ca1633 Drain master task queue when stabilising (#42504)
Today the default stabilisation time is calculated on the assumption that the
elected master has no pending tasks to process when it is elected, but this is
not a safe assumption to make. This can result in a cluster reaching the end of
its stabilisation time without having stabilised. Furthermore in #36943 we
increased the probability that each step in `runRandomly()` enqueues another
task, vastly increasing the chance that we hit such a situation.

This change extends the stabilisation process to allow time for all pending
tasks, plus a task that might currently be in flight.

Fixes #41967, in which the master entered the stabilisation phase with over 800
tasks to process.
2019-05-24 14:18:02 +01:00
weizijun 40348ab726 Use accurate total hits in IndexPrimaryRelocationIT
By default, we track total hits up to 10k but we might index more than
10k documents `testPrimaryRelocationWhileIndexing`. With this change, we
always request for the accurate total hits in the test.

> java.lang.AssertionError: Count is 10000+ hits but 11684 was expected.
2019-05-24 12:47:21 +02:00
Simon Willnauer 46ccfba808 Remove IndexStore and DirectoryService (#42446)
Both of these classes are basically a bloated wrapper around a simple
construct that can simply be a DirectoryFactory interface. This change
removes both classes and replaces them with a simple stateless interface
that creates a new `Directory` per shard. The concept of `index.store` is preserved
since it makes sense from a configuration perspective.
2019-05-24 12:14:56 +02:00
David Turner f864f6a740 Cluster state from API should always have a master (#42454)
Today the `TransportClusterStateAction` ignores the state passed by the
`TransportMasterNodeAction` and obtains its state from the cluster applier.
This might be inconsistent, showing a different node as the master or maybe
even having no master.

This change adjusts the action to use the passed-in state directly, and adds
tests showing that the state returned is consistent with our expectations even
if there is a concurrent master failover.

Fixes #38331
Relates #38432
2019-05-24 08:45:22 +01:00
David Turner 528f8cc073 Add stack traces to RetentionLeasesIT failures (#42425)
Today `RetentionLeaseIT` calls `fail(e.toString())` on some exceptions, losing
the stack trace that came with the exception. This commit adjusts this to
re-throw the exception wrapped in an `AssertionError` so we can see more
details about failures such as #41430.
2019-05-24 08:37:51 +01:00
David Turner c0974a9813 Add more logging to MockDiskUsagesIT (#42424)
This commit adds a log message containing the routing table, emitted on each
iteration of the failing assertBusy() in #40174. It also modernizes the code a
bit.
2019-05-24 08:28:10 +01:00
Jack Conradson 167f391cfd Bug fix to allow access to top level params in reduce script (#42096) 2019-05-23 16:00:39 -07:00
Ryan Ernst a49bafc194
Split document and metadata fields in GetResult (#38373) (#42456)
This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in #24422.
2019-05-23 14:01:07 -07:00
Jake Landis 2b22ceac04
Bulk processor concurrent requests (#41451) (#42438)
`org.elasticsearch.action.bulk.BulkProcessor` is a threadsafe class that
allows for simple semantics to deal with sending bulk requests. Once a
bulk reaches it's pre-defined size, documents, or flush interval it will
execute sending the bulk. One configurable option is the number of concurrent
outstanding bulk requests. That concurrency is implemented in
`org.elasticsearch.action.bulk.BulkRequestHandler` via a semaphore. However,
the only code that currently calls into this code is blocked by `synchronized`
methods. This results in the in-ability for the BulkProcessor to behave concurrently
despite supporting configurable amounts of concurrent requests.

This change removes the `synchronized` method in favor an explicit
lock around the non-thread safe parts of the method. The call into
`org.elasticsearch.action.bulk.BulkRequestHandler` is no longer blocking, which
allows `org.elasticsearch.action.bulk.BulkRequestHandler` to handle it's own concurrency.
2019-05-23 14:22:16 -05:00
Simon Willnauer 5a884dac03 Unguice Snapshot / Restore services (#42357)
This removes the @Inject annotations from the Snapshot/Restore infrastructure
classes and registers them manually in Node.java
2019-05-23 17:09:26 +02:00
Jim Ferenczi a497603219 Disable max score optimization for queries with unbounded max scores (#41361)
Lucene 8 has the ability to skip blocks of non-competitive documents.
However some queries don't track their maximum score (`script_score`, `span`, ...)
so they always return Float.POSITIVE_INFINITY as maximum score. This can slow down
some boolean queries if other clauses have bounded max scores. This commit disables
the max score optimization when we detect a mandatory scoring clause with unbounded
 max scores. Optional clauses are not checked since they can still skip documents
 when the unbounded clause is after the current document.
2019-05-23 16:53:57 +02:00
Yannick Welsch f57fdc57e9
Deprecate max_local_storage_nodes (#42426)
Allows this setting to be removed in 8.0, see #42428
2019-05-23 15:59:55 +02:00
Christoph Büscher 85ff9543b7 Prevent normalizer from not being closed on exception (#42375)
Currently AnalysisRegistry#processNormalizerFactory creates a normalizer and
only later checks whether it should be added to the normalizer map passed in. In
case we throw an exception it isn't closed. This can be prevented by moving the
check that throws the exception earlier.
2019-05-23 15:53:55 +02:00
markharwood c2c8d0e637 Test fix - results equality failed because of subtle scoring differences between replicas. (#42366)
Diverging merge policies means the segments and therefore scores are not the same.
Fixed the test by ensuring there are zero replicas.

Closes #32492
2019-05-23 12:00:57 +01:00
Jim Ferenczi b88e80ab89 Upgrade to Lucene 8.1.0 (#42214)
This commit upgrades to the GA release of Lucene 8.1.0
2019-05-23 11:46:45 +02:00
Jim Ferenczi 4ca5649a0d Upgrade to lucene 8.1.0-snapshot-e460356abe (#40952) 2019-05-23 11:45:33 +02:00
Marios Trivyzas 0777223bab
Allow `fields` to be set to `*` (#42301)
Allow for SimpleQueryString, QueryString and MultiMatchQuery
to set the `fields` parameter to the wildcard `*`. If so, set
the leniency to `true`, to achieve the same behaviour as from the
`"default_field" : "*" setting.

Furthermore,  check if `*` is in the list of the `default_field` but
not necessarily as the 1st element.

Closes: #39577
(cherry picked from commit e75ff0c748e6b68232c2b08e19ac4a4934918264)
2019-05-23 10:10:48 +02:00
Yannick Welsch a71d19e92a Ensure testAckedIndexing uses disruption index settings
AbstractDisruptionTestCase set a lower global checkpoint sync interval setting, but this was ignored by
testAckedIndexing, which has led to spurious test failures

Relates #41068, #38931
2019-05-22 19:13:14 +02:00
Jake Landis 496fee3333
bump to 7.3 (#42365) 2019-05-22 11:57:07 -05:00
Luca Cavanna c2af62455f Cut over SearchResponse and SearchTemplateResponse to Writeable (#41855)
Relates to #34389
2019-05-22 18:47:54 +02:00
Luca Cavanna 29c9bb9181 Clean up ShardId usage of Streamable (#41843)
ShardId already implements Writeable so there is no need for it to implement Streamable too. Also the readShardId static method can be
easily replaced with direct usages of the constructor that takes a
StreamInput as argument.
2019-05-22 18:47:54 +02:00
Luca Cavanna 96ba0b13e0 Cut over MultiSearchResponse to Writeable (#41844)
Relates to #34389
2019-05-22 18:47:54 +02:00
Luca Cavanna 1ded45b0a2 Cut over SearchPhaseResult to Writeable (#41853)
Relates to #34389
2019-05-22 18:47:54 +02:00
Luca Cavanna c85f285298 Move InternalAggregations to Writeable (#41841)
Relates to #34389
2019-05-22 18:47:54 +02:00
Luca Cavanna 39d4c7c26f Skip explain fetch sub phase when request holds only suggestions (#41739)
In case a search request holds only the suggest section, the query phase
is skipped and only the suggest phase is executed instead. There will
never be hits returned, and in case the explain flag is set to true, the
 explain sub phase throws a null pointer exception as the query is null.
 Usually a null query is replaced with a match all query as part of SearchContext#preProcess which is though skipped as well with suggest
 only searches. To address this, we skip the explain sub fetch phase
 for search requests that only requested suggestions.

Closes #31260
2019-05-22 18:47:54 +02:00
Luca Cavanna 3416cda8b1 Cut over ClusterSearchShardsGroup to Writeable (#41788) 2019-05-22 18:47:54 +02:00
Guillaume Darmont 3e231bbad6 StackOverflowError when calling BulkRequest#add (#41672)
Removing of payload in BulkRequest (#39843) had a side effect of making
`BulkRequest.add(DocWriteRequest<?>...)` (with varargs) recursive, thus
leading to StackOverflowError. This PR adds a small change in
RequestConvertersTests to show the error and the corresponding fix in
`BulkRequest`.

Fixes #41668
2019-05-22 11:22:14 -05:00
mushao999 d4b5933225 Fix alpha version error message (#40406) 2019-05-22 09:06:10 -07:00
Yannick Welsch eae58c477c Remove testNodeFailuresAreProcessedOnce
This test was not checking the thing it was supposed to anyway.
2019-05-22 14:52:01 +02:00
Yannick Welsch 250973af1d Fix testCannotJoinIfMasterLostDataFolder
Relates to #41047
2019-05-22 14:37:31 +02:00
Simon Willnauer a79cd77e5c Remove IndexShard dependency from Repository (#42213)
* Remove IndexShard dependency from Repository

In order to simplify repository testing especially for BlobStoreRepository
it's important to remove the dependency on IndexShard and reduce it to
Store and MapperService (in the snapshot case). This significantly reduces
the dependcy footprint for Repository and allows unittesting without starting
nodes or instantiate entire shard instances. This change deprecates the old
method signatures and adds a unittest for FileRepository to show the advantage
of this change.
In addition, the unittesting surfaced a bug where the internal file names that
are private to the repository were used in the recovery stats instead of the
target file names which makes it impossible to relate to the actual lucene files
in the recovery stats.

* don't delegate deprecated methods

* apply comments

* test
2019-05-22 14:27:11 +02:00
Ignacio Vera 3a20ff7e86
Fix TopHitsAggregationBuilder adding duplicate _score sort clauses (#42179) (#42343)
When using High Level Rest Client Java API to produce search query, using AggregationBuilders.topHits("th").sort("_score", SortOrder.DESC)
caused query to contain duplicate sort clauses.
2019-05-22 14:02:52 +02:00
Yannick Welsch f338005179 Revert "Mute MinimumMasterNodesIT.testThreeNodesNoMasterBlock()"
This reverts commit 448fc8444559be3145e4a7f65dec794ebbff7b81.
2019-05-22 13:22:09 +02:00
Yannick Welsch 0c7322ebf2 Avoid bubbling up failures from a shard that is recovering (#42287)
A shard that is undergoing peer recovery is subject to logging warnings of the form

org.elasticsearch.action.FailedNodeException: Failed node [XYZ]
...
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in ...

These failures are actually harmless, and expected to happen while a peer recovery is ongoing (i.e.
there is an IndexShard instance, but no proper IndexCommit just yet).
As these failures are currently bubbled up to the master, they cause unnecessary reroutes and
confusion amongst users due to being logged as warnings.

Closes  #40107
2019-05-22 12:26:15 +02:00
Yannick Welsch 770d8e9e39 Remove usage of max_local_storage_nodes in test infrastructure (#41652)
Moves the test infrastructure away from using node.max_local_storage_nodes, allowing us in a
follow-up PR to deprecate this setting in 7.x and to remove it in 8.0.

This also changes the behavior of InternalTestCluster so that starting up nodes will not automatically
reuse data folders of previously stopped nodes. If this behavior is desired, it needs to be explicitly
done by passing the data path from the stopped node to the new node that is started.
2019-05-22 11:04:55 +02:00
Yannick Welsch c9dedf180b Use comparator for Reconfigurator (#42283)
Simplifies the voting configuration reconfiguration logic by switching to an explicit Comparator for
the priorities. Does not make changes to the behavior of the component.
2019-05-22 10:04:51 +02:00
Nhat Nguyen bcbf1aff6b Peer recovery should flush at the end (#41660)
Flushing at the end of a peer recovery (if needed) can bring these
benefits:

1. Closing an index won't end up with the red state for a recovering
replica should always be ready for closing whether it performs the
verifying-before-close step or not.

2. Good opportunities to compact store (i.e., flushing and merging
Lucene, and trimming translog)

Closes #40024
Closes #39588
2019-05-21 22:45:17 -04:00