Commit Graph

3087 Commits

Author SHA1 Message Date
Alan Woodward 8e23e4518a Move construction of custom analyzers into AnalysisRegistry (#42940)
Both TransportAnalyzeAction and CategorizationAnalyzer have logic to build
custom analyzers for index-independent analysis. A lot of this code is duplicated,
and it requires the AnalysisRegistry to expose a number of internal provider
classes, as well as making some assumptions about when analysis components are
constructed.

This commit moves the build logic directly into AnalysisRegistry, reducing the
registry's API surface considerably.
2019-06-10 14:33:25 +01:00
Jim Ferenczi 39cb1abc9d Fix auto fuzziness in query_string query (#42897)
Setting `auto` after the fuzzy operator (e.g. `"query": "foo~auto"`) in the `query_string`
does not take the length of the term into account when computing the distance and always use
a max distance of 1. This change fixes this disrepancy by ensuring that the term is passed when
the fuzziness is computed.
2019-06-10 10:13:16 +02:00
Vigya Sharma 25218733e6 Allow routing commands with ?retry_failed=true (#42658)
We respect allocation deciders, including the `MaxRetryAllocationDecider`, when
executing reroute commands. If you specify `?retry_failed=true` then the retry
counter is reset, but today this does not happen until after trying to execute
the reroute commands. This means that if an allocation has repeatedly failed,
but you want to take control and assign a shard to a particular node to work
around the repeated failures, you cannot execute the routing command in the
same call to `POST /_cluster/reroute` as the one that resets the failure
counter.

This commit fixes this by resetting the failure counter first, meaning that you
can now explicitly allocate a repeatedly-failed shard like this:

```
POST /_cluster/reroute?retry_failed=true
{
  "commands": [
    {
      "allocate_replica": {
        "index": "blahblah",
        "shard": 2,
        "node": "node-4"
      }
    }
  ]
}
```

Fixes #39546
2019-06-10 08:31:05 +01:00
Jason Tedor 63bad28005
Do not allow modify aliases on followers (#43017)
Now that aliases are replicated by a follower from its leader, this
commit prevents directly modifying aliases on follower indices.
2019-06-09 22:53:54 -04:00
Nhat Nguyen 0ebcb21d2c Unmuted testRecoverBrokenIndexMetadata
These tests should be okay as we flush at the end of peer recovery.

Closes #40867
2019-06-09 10:26:57 -04:00
Nhat Nguyen afe65b5988 Fix assertion in ReadOnlyEngine (#43010)
We should execute the assertion before throwing an exception;
otherwise, it's a noop.
2019-06-09 10:26:56 -04:00
Jason Tedor 915d2f2daa
Refactor put mapping request validation for reuse (#43005)
This commit refactors put mapping request validation for reuse. The
concrete case that we are after here is the ability to apply effectively
the same framework to indices aliases requests. This commit refactors
the put mapping request validation framework to allow for that.
2019-06-09 10:19:04 -04:00
Nhat Nguyen 0a982fc57f Mute testLookupSeqNoByIdInLucene
Tracked at #42979
2019-06-08 00:30:12 -04:00
Jason Tedor b580677412
Fix put mapping request validators random test
This commit fixes a test bug in the request validators random test. In
particular, an assertion was not properly nested in a guard that would
ensure that was at least one failure.

Relates #43000
2019-06-07 17:47:51 -04:00
Jason Tedor d6fe4b648d
Fix possible NPE in put mapping validators (#43000)
When applying put mapping validators, we apply all the validators in the
collection. If a failure occurs, we collect that as a top-level
exception, and suppress any additional failures into the top-level
exception. However, if a request passes the validator after a top-level
exception has been collected, we would try to suppress a null exception
into the top-level exception. This is a violation of the
Throwable#addSuppressed API. This commit addresses this, and adds test
to cover the logic of collecting the failures when validating a put
mapping request.
2019-06-07 16:24:12 -04:00
David Turner 5bc0dfce94
Improve translog corruption detection (#42980)
Today we test for translog corruption by incrementing a byte by 1 somewhere in
a file, and verify that this leads to a `TranslogCorruptionException`.
However, we rely on _all_ corruptions leading to this exception in the
`RemoveCorruptedShardDataCommand`: this command fails if a translog file
corruption leads to a different kind of exception, and `EOFException` and
`NegativeArraySizeException` are both possible. This commit strengthens the
translog corruption detection tests by simulating the following:

- a random value is written
- the file is truncated

It also makes sure that we return a `TranslogCorruptionException` in all such
cases.

Fixes #42661
Backport of #42744
2019-06-07 20:28:02 +01:00
Jason Tedor 479a1eeff6
Drop dead code for socket permissions for transport (#42990)
This code has not been needed since the removal of tribe nodes, it was
left behind when those were dropped (note that regular transport
permissions are handled through transport profiles, even if they are not
explicitly in use).
2019-06-07 15:22:10 -04:00
markharwood 0719779a48
Search - enable low_level_cancellation by default. (#42291) (#42857)
Benchmarking on worst-case queries (max agg on match_all or popular-term query with large index) was not noticeably slower.

Closes #26258
2019-06-07 14:53:17 +01:00
Henning Andersen dea935ac31
Reindex max_docs parameter name (#42942)
Previously, a reindex request had two different size specifications in the body:
* Outer level, determining the maximum documents to process
* Inside the source element, determining the scroll/batch size.

The outer level size has now been renamed to max_docs to
avoid confusion and clarify its semantics, with backwards compatibility and
deprecation warnings for using size.
Similarly, the size parameter has been renamed to max_docs for
update/delete-by-query to keep the 3 interfaces consistent.

Finally, all 3 endpoints now support max_docs in both body and URL.

Relates #24344
2019-06-07 12:16:36 +02:00
David Turner 5929803413 Relax timeout in NodeConnectionsServiceTests (#42934)
Today we assert that the connection thread is blocked by the time the test gets
to the barrier, but in fact this is not a valid assertion. The following
`Thread.sleep()` will cause the test to fail reasonably often.

```diff
diff --git a/server/src/test/java/org/elasticsearch/cluster/NodeConnectionsServiceTests.java b/server/src/test/java/org/elasticsearch/cluster/NodeConnectionsServiceTests.java
index 193cde3180d..0e57211cec4 100644
--- a/server/src/test/java/org/elasticsearch/cluster/NodeConnectionsServiceTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/NodeConnectionsServiceTests.java
@@ -364,6 +364,7 @@ public class NodeConnectionsServiceTests extends ESTestCase {
             final CheckedRunnable<Exception> connectionBlock = nodeConnectionBlocks.get(node);
             if (connectionBlock != null) {
                 try {
+                    Thread.sleep(50);
                     connectionBlock.run();
                 } catch (Exception e) {
                     throw new AssertionError(e);
```

This change relaxes the test to allow some time for the connection thread to
hit the barrier.

Fixes #40170
2019-06-07 10:38:56 +01:00
henryptung 61b62125b8 Wire query cache into sorting nested-filter computation (#42906)
Don't use Lucene's default query cache when filtering in sort.

Closes #42813
2019-06-06 21:16:58 +02:00
Henning Andersen ca5dbf93a5 Fix concurrent search and index delete (#42621)
Changed order of listener invocation so that we notify before
registering search context and notify after unregistering same.

This ensures that count up/down like what we do in ShardSearchStats
works. Otherwise, we risk notifying onFreeScrollContext before notifying
onNewScrollContext (same for onFreeContext/onNewContext, but we
currently have no assertions failing in those).

Closes #28053
2019-06-06 20:10:43 +02:00
Simon Willnauer 7fcca55a3c [TEST] Remove unnecessary log line 2019-06-06 14:17:44 +02:00
Simon Willnauer 2582e1e8ad Fix `InternalEngineTests#testPruneAwayDeletedButRetainedIds`
The test failed because we had only a single document in the index
that got deleted such that some assertions that expected at least
one live doc failed.

Relates to: #40741
2019-06-06 14:16:24 +02:00
Yannick Welsch 9f7be70f7a Fix testPendingTasks (#42922)
Fixes a race in the test which can be reliably reproduced by adding Thread.sleep(100) to the end of
IndicesService.processPendingDeletes

Closes #18747
2019-06-06 14:15:48 +02:00
Yannick Welsch 72735be673 Fix NPE when rejecting bulk updates (#42923)
Single updates use a different internal code path than updates that are wrapped in a bulk request.
While working on a refactoring to bring both closer together I've noticed that bulk updates were
failing some of the tests that single updates passed. In particular, bulk updates cause
NullPointerExceptions to be thrown and listeners not being properly notified when being rejected
from the thread pool.
2019-06-06 14:15:48 +02:00
Simon Willnauer 2c3bd32aff Add a merge policy that prunes ID postings for soft-deleted but retained documents (#40741)
This change adds a merge policy that drops all _id postings for documents that
are marked as soft-deleted but retained across merges. This is usually unnecessary
unless soft-deletes are used with a retention policy since otherwise a merge would
remove deleted documents anyway.

Yet, this merge policy prevents extreme cases where a very large number of soft-deleted
documents are retained and are impacting update performance.
Note, using this merge policy will remove all lookup by ID capabilities for soft-deleted documents.
2019-06-06 13:41:46 +02:00
Gordon Brown 6eb4600e93
Add custom metadata to snapshots (#41281)
Adds a metadata field to snapshots which can be used to store arbitrary
key-value information. This may be useful for attaching a description of
why a snapshot was taken, tagging snapshots to make categorization
easier, or identifying the source of automatically-created snapshots.
2019-06-05 17:30:31 -06:00
Mark Vieira 1f4ff97d7d
Mute failing test
(cherry picked from commit 4952d4facf5949abdb9aae47dbe1ee18cf7eef99)
2019-06-05 13:47:18 -07:00
Przemyslaw Gomulka ab5bc83597
Deprecation info for joda-java migration on 7.x (#42659)
Some clusters might have been already migrated to version 7 without being warned about the joda-java migration changes.
Deprecation api on that version will give them guidance on what patterns need to be changed.
relates. This change is using the same logic like in 6.8 that is: verifying the pattern is from the incompatible set ('y'-Y', 'C', 'Z' etc), not from predifined set, not prefixed with 8. AND was also created in 6.x. Mappings created in 7.x are considered migrated and should not generate warnings

There is no pipeline check (present on 6.8) as it is impossible to verify when the pipeline was created, and therefore to make sure the format is depracated or not
#42010
2019-06-05 19:50:04 +02:00
Simon Willnauer d3524fdd06 Add back import after backport 2019-06-05 11:25:19 +02:00
Simon Willnauer 4dfaeb9046 Remove post Java 9 API usage after backport 2019-06-05 11:24:58 +02:00
Jim Ferenczi de0ea4bbf7 Deduplicate alias and concrete fields in query field expansion (#42328)
The full-text query parsers accept field pattern that are expanded using the mapping.
Alias field are also detected during the expansion but they are not deduplicated with the
concrete fields that are found from other patterns (or the same). This change ensures
that we deduplicate the target fields of the full-text query parsers in order to avoid
adding the same clause multiple times. Boolean queries are already able to deduplicate
clauses during rewrite but since we also use DisjunctionMaxQuery it is preferable to detect
 these duplicates early on.
2019-06-05 11:05:40 +02:00
Simon Willnauer 41a9f3ae3b Use reader attributes to control term dict memory useage (#42838)
This change makes use of the reader attributes added in LUCENE-8671
to ensure that `_id` fields are always on-heap for best update performance
and term dicts are generally off-heap on Read-Only engines.

Closes #38390
2019-06-05 11:01:06 +02:00
David Turner 955aee8a07 More logging in testRerouteOccursOnDiskPassingHighWatermark (#42864)
This test is failing because recoveries of these empty shards are not
completing in a reasonable time, but the reason for this is still obscure. This
commit adds yet more logging.

Relates #40174, #42424
2019-06-05 09:05:44 +01:00
Jason Tedor 78be3dde25
Enable testing against JDK 13 EA builds (#40829)
This commit adds JDK 13 to the CI rotation for testing. For now, we will
be testing against JDK 13 EA builds.
2019-06-04 20:54:24 -04:00
Jason Tedor 117df87b2b
Replicate aliases in cross-cluster replication (#42875)
This commit adds functionality so that aliases that are manipulated on
leader indices are replicated by the shard follow tasks to the follower
indices. Note that we ignore write indices. This is due to the fact that
follower indices do not receive direct writes so the concept is not
useful.

Relates #41815
2019-06-04 20:36:24 -04:00
Mark Vieira e44b8b1e2e
[Backport] Remove dependency substitutions 7.x (#42866)
* Remove unnecessary usage of Gradle dependency substitution rules (#42773)

(cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)
2019-06-04 13:50:23 -07:00
Andrey Ershov 6391f90616 Fix testNoMasterActionsWriteMasterBlock (#42798)
This commit performs the proper restore of network disruption.
Previously disruptionScheme.stopDisrupting() was called that does not
ensure that connectivity between cluster nodes is restored. The test
was checking that the cluster has green status, but it was not checking
that connectivity between nodes is restored.
Here we switch to internalCluster().clearDisruptionScheme(true) which
performs both checks before returning.

Closes #39688

(cherry picked from commit c8988d5cf5a85f9b28ce148dbf100aaa6682a757)
2019-06-04 17:24:03 +02:00
Alan Woodward df124f32db Refactor control flow in TransportAnalyzeAction (#42801)
The control flow in TransportAnalyzeAction is currently spread across two large
methods, and is quite difficult to follow. This commit tidies things up a bit, to make
it clearer when we use pre-defined analyzers and when we use custom built ones.
2019-06-04 14:52:46 +01:00
Yu 428beabc49 Remove "template" field in IndexTemplateMetaData (#42099)
Remove "template" field from XContent parsing in IndexTemplateMetaData
2019-06-03 12:43:11 -05:00
Armin Braun 00db9c1a2f
Make Connection Future Err. Handling more Resilient (#42781) (#42804)
* There were a number of possible (runtime-) exceptions that could be raised in the adjusted code and prevent resolving the listener
* Relates #42350
2019-06-03 19:29:36 +02:00
David Turner df0f0b3d40
Rename autoMinMasterNodes to autoManageMasterNodes (#42789)
Renames the `ClusterScope` attribute `autoMinMasterNodes` to reflect its
broader meaning since 7.0.

Backport of the relevant part of #42700 to `7.x`.
2019-06-03 12:12:07 +01:00
Alan Woodward 2129d06643 Create client-only AnalyzeRequest/AnalyzeResponse classes (#42197)
This commit clones the existing AnalyzeRequest/AnalyzeResponse classes
to the high-level rest client, and adjusts request converters to use these new
classes.

This is a prerequisite to removing the Streamable interface from the internal
server version of these classes.
2019-06-03 09:46:36 +01:00
Alan Woodward d0da30e5f4 Return NO_INTERVALS rather than null from empty TokenStream (#42750)
IntervalBuilder#analyzeText will currently return null if it is passed an
empty TokenStream, which can lead to a confusing NullPointerException
later on during querying. This commit changes the code to return
NO_INTERVALS instead.

Fixes #42587
2019-05-31 17:45:57 +01:00
Jason Tedor 61c6a26b31
Remove locale-dependent string checking
We were checking if an exception was caused by a specific reason "Not a
directory". Alas, this reason is locale-dependent and can fail on
systems that are not set to en_US.UTF-8. This commit addresses this by
deriving what the locale-dependent error message would be and using that
for comparison with the actual exception thrown.

Relates #41689
2019-05-31 12:08:38 -04:00
Jason Tedor 371cb9a8ce
Remove Log4j 1.2 API as a dependency (#42702)
We had this as a dependency for legacy dependencies that still needed
the Log4j 1.2 API. This appears to no longer be necessary, so this
commit removes this artifact as a dependency.

To remove this dependency, we had to fix a few places where we were
accidentally relying on Log4j 1.2 instead of Log4j 2 (easy to do, since
both APIs were on the compile-time classpath).

Finally, we can remove our custom Netty logger factory. This was needed
when we were on Log4j 1.2 and handled logging in our own unique
way. When we migrated to Log4j 2 we could have dropped this
dependency. However, even then Netty would still pick up Log4j 1.2 since
it was on the classpath, thus the advantage to removing this as a
dependency now.
2019-05-30 16:08:07 -04:00
Mark Vieira c1816354ed
[Backport] Improve build configuration time (#42674) 2019-05-30 10:29:42 -07:00
David Turner d14799f0a5 Prevent merging nodes' data paths (#42665)
Today Elasticsearch does not prevent you from reconfiguring a node's
`path.data` to point to data paths that previously belonged to more than one
node. There's no good reason to be able to do this, and the consequences can be
quietly disastrous. Furthermore, #42489 might result in a user trying to split
up a previously-shared collection of data paths by hand and there's definitely
scope for mixing the paths up across nodes when doing this.

This change adds a check during startup to ensure that each data path belongs
to the same node.
2019-05-30 18:08:55 +01:00
Marios Trivyzas ce30afcd01
Deprecate CommonTermsQuery and cutoff_frequency (#42619) (#42691)
Since the max_score optimization landed in Elasticsearch 7,
the CommonTermsQuery is redundant and slower. Moreover the
cutoff_frequency parameter for MatchQuery and MultiMatchQuery
is redundant.

Relates to #27096

(cherry picked from commit 04b74497314eeec076753a33b3b6cc11549646e8)
2019-05-30 18:04:47 +02:00
David Turner 86b1a07887 Log leader and handshake failures by default (#42342)
Today the `LeaderChecker` and `HandshakingTransportAddressConnector` do not log
anything above `DEBUG` level. However there are some situations where it is
appropriate for them to log at a higher level:

- if the low-level handshake succeeds but the high-level one fails then this
  indicates a config error that the user should resolve, and the exception
  will help them to do so.

- if leader checks fail repeatedly then we restart discovery, and the exception
  will help to determine what went wrong.

Resolves #42153
2019-05-30 08:14:19 +01:00
Igor Motov d2f9ccbe18 Geo: Refactor libs/geo parsers (#42549)
Refactors the WKT and GeoJSON parsers from an utility class into an
instantiatable objects. This is a preliminary step in
preparation for moving out coordinate validators from Geometry
constructors. This should allow us to make validators plugable.
2019-05-29 20:07:27 -04:00
Henning Andersen 53f5d313cd Use correct global checkpoint sync interval (#42642)
A disruption test case need to use a lower checkpoint sync interval
since they verify sequence numbers after the test waiting max 10 seconds
for it to stabilize.

Closes #42637
2019-05-29 08:15:53 +02:00
Adrien Grand 38f9e24411
Add 7.1.2 version constant. (#42648)
Relates to #42635
2019-05-28 23:14:10 +02:00
Jim Ferenczi 267e5a1110 fix javadoc of SearchRequestBuilder#setTrackTotalHits (#42219) 2019-05-28 22:12:16 +02:00