Commit Graph

1707 Commits

Author SHA1 Message Date
Armin Braun 433a506d06
SNAPSHOT: Improve Resilience SnapshotShardService (#36113)
* Resolve the index in the snapshotting thread
* Added test for routing table - snapshot state mismatch
2018-12-03 16:39:29 +01:00
Jim Ferenczi 74aca756b8
Remove the distinction between query and filter context in QueryBuilders (#35354)
When building a query Lucene distinguishes two cases, queries that require to produce a score and queries that only need to match. We cloned this mechanism in the QueryBuilders in order to be able to produce different queries based on whether they need to produce a score or not. However the only case in es that require this distinction is the BoolQueryBuilder that sets a different minimum_should_match when a `bool` query is built in a filter context..
This behavior doesn't seem right because it makes the matching of `should` clauses different when the score is not required.

Closes #35293
2018-12-03 11:49:11 +01:00
Luca Cavanna 0ebc17743a
Histogram aggs: add empty buckets only in the final reduce step (#35921)
Empty buckets don't need to be added when performing an incremental reduction step, they can be added later in the final reduction step. This will allow us to later remove the max buckets limit when performing non final reduction.
2018-11-30 20:33:09 +01:00
Tim Brooks ea7ea51050
Make `TcpTransport#openConnection` fully async (#36095)
This is a follow-up to #35144. That commit made the underlying
connection opening process in TcpTransport asynchronous. However the
method still blocked on the process being complete before returning.
This commit moves the blocking to the ConnectionManager level. This is
another step towards the top-level TransportService api being async.
2018-11-30 11:30:42 -07:00
Tim Brooks 26dcbcc8cc
Remove `MockTcpTransport` for ESIntegTestCase (#36089)
This commit removes the `MockTcpTransport` as a transport option for
`ESIntegTestCase`. It is the first step in replacing the usages of
`MockTcpTransport` with `MockNioTransport`.
2018-11-30 09:04:51 -07:00
Adrien Grand fa3d365ee8
Fix CompositeBytesReference#slice to not throw AIOOBE with legal offsets. (#35955)
CompositeBytesReference#slice has two bugs:
 - One that makes it fail if the reference is empty and an empty slice is
   created, this is #35950 and is fixed by special-casing empty-slices.
 - One performance bug that makes it always create a composite slice when
   creating a slice that ends on a boundary, this is fixed by computing `limit`
   as the index of the sub reference that holds the last element rather than
   the next element after the slice.

Closes #35950
2018-11-30 10:32:46 +01:00
Zachary Tong 61c2db5ebb Revert "Deprecate X-Pack centric rollup endpoints (#35962)"
This reverts commit b84f1f6a3a.
2018-11-29 12:58:23 -05:00
Zachary Tong 40c5445480 Revert "[TEST] Use deprecated form of rollup endpoint in mixed cluster (#36000)"
This reverts commit 85cdf4f913.
2018-11-29 12:56:25 -05:00
Tim Brooks c305f9dc03
Make keepalive pings bidirectional and optimizable (#35441)
This is related to #34405 and a follow-up to #34753. It makes a number
of changes to our current keepalive pings.

The ping interval configuration is moved to the ConnectionProfile.

The server channel now responds to pings. This makes the keepalive
pings bidirectional.

On the client-side, the pings can now be optimized away. What this
means is that if the channel has received a message or sent a message
since the last pinging round, the ping is not sent for this round.
2018-11-29 08:55:53 -07:00
Zachary Tong 85cdf4f913
[TEST] Use deprecated form of rollup endpoint in mixed cluster (#36000)
When wiping rollup jobs, if we are in a mixed cluster with < v7.0 nodes
we need to fall back to the deprecated endpoint because we may talk
to a 6.x node.
2018-11-29 07:37:33 -05:00
Jason Tedor b84f1f6a3a
Deprecate X-Pack centric rollup endpoints (#35962)
This commit is part of our plan to deprecate and ultimately remove the
use of _xpack in the REST APIs.
2018-11-27 20:34:17 -05:00
Tim Brooks cc1fa799c8
Remove `TcpChannel#setSoLinger` method (#35924)
This commit removes the dedicated `setSoLinger` method. This simplifies
the `TcpChannel` interface. This method has very little effect as the
SO_LINGER is not set prior to the channels being closed in the abstract
transport test case. We still will set SO_LINGER on the
`MockNioTransport`. However we can do this manually.
2018-11-27 09:08:14 -07:00
Christophe Bismuth b95a4db6e6 Throw a parsing exception when boost is set in span_or query (#28390) (#34112) 2018-11-26 12:15:59 -05:00
Jim Ferenczi e37a0ef844
Upgrade to lucene-8.0.0-snapshot-67cdd21996 (#35816) 2018-11-22 15:42:59 +01:00
Armin Braun 7a210342ab
TESTS: Remove Dead Code in Disruption Tests (#35768)
* Neither this class nor the constructor are used anywhere
2018-11-21 10:33:50 +01:00
Christoph Büscher 5847f8379c
Move ScoreAccessor to test-framework (#35766)
This class is only used by RandomScoreFunctionIT and the MockScriptEngine, so it
shouldn't be part of the server codebase.
2018-11-21 10:28:31 +01:00
Armin Braun 33c713ba60
TESTS: More Logging in LongGcDisruptionTests (#35702)
* The existing logging is not helpful enough to track down which threads hang, we need the hanging thread's stacktraces too
* Relates #35686
2018-11-20 15:36:01 +01:00
Simon Willnauer 29ef442841
Add a `_freeze` / `_unfreeze` API (#35592)
This commit adds a rest endpoint for freezing and unfreezing an index.
Among other cleanups mainly fixing an issue accessing package private APIs
from a plugin that got caught by integration tests this change also adds
documentation for frozen indices.
Note: frozen indices are marked as `beta` and available as a basic feature.

Relates to #34352
2018-11-20 08:03:24 +01:00
Gordon Brown b2057138a7
Remove AbstractComponent from AbstractLifecycleComponent (#35560)
AbstractLifecycleComponent now no longer extends AbstractComponent. In
order to accomplish this, many, many classes now instantiate their own
logger.
2018-11-19 09:51:32 -07:00
Arthur Gavlyukovskiy 022726011c Remove use of AbstractComponent in server (#35444)
Removed extending of AbstractComponent and changed logger usage to
explicit declaration. Abstract classes still have logger
declaration using this.getClass() in order to show implementation class
name in its logs.

See #34488
2018-11-16 16:10:32 -05:00
Jernej Klancic baf33b3162 Removes AbstractComponent from several classes (#35566)
Removes inhertiting from AbstractComponent for some classes (mostly
in the plugins module).

Relates to #34488
2018-11-16 20:50:18 +01:00
Lee Hinman ce35d049e9 [TEST] Fix ClusterApplierServiceTests.testClusterStateUpdateLogging
This changes the test to not use a `CountDownlatch`, instead adding an assertion
for the final logging message and waiting until the `MockAppender` has seen it
before proceeding.

Resolves #23739
2018-11-15 14:15:23 -07:00
Tanguy Leroux c9b4ef0dfd
Use RunOnce when appropriate (#35553)
This pull request replaces some blocks of code that must be run once 
and that are currently based on AtomicBoolean by the convient RunOnce 
class added in #35489.
2018-11-15 09:24:40 +01:00
Yannick Welsch 4cfdb0609e
Adapt InternalCluster#fullRestart to call onNodeStopped when all nodes are stopped (#35494)
Refactors and simplifies the logic around stopping nodes, making sure that for a full cluster restart
onNodeStopped is only called after the nodes are actually all stopped (and in particular not while
starting up some nodes again). This change also ensures that a closed node client is not being used
anymore (which required a small change to a test).

Relates to #35049
2018-11-14 13:24:56 +01:00
Zachary Tong c346a0f027
[Rollup] Add `wait_for_completion` option to StopRollupJob API (#34811)
This adds a `wait_for_completion` flag which allows the user to block 
the Stop API until the task has actually moved to a stopped state, 
instead of returning immediately.  If the flag is set, a `timeout` parameter
can be specified to determine how long (at max) to block the API
call.  If unspecified, the timeout is 30s.

If the timeout is exceeded before the job moves to STOPPED, a
timeout exception is thrown.  Note: this is just signifying that the API
call itself timed out.  The job will remain in STOPPING and evenutally
flip over to STOPPED in the background.

If the user asks the API to block, we move over the the generic
threadpool so that we don't hold up a networking thread.
2018-11-13 16:37:17 -05:00
Julie Tibshirani bc799e4a6f
Ignore warnings related to types deprecation in REST tests. (#35395) 2018-11-13 11:56:01 -08:00
Tim Brooks ba478827ad
Improve MockTcpTransport memory usage (#35402)
The MockTcpTransport is not friendly in regards to memory usage. It must
allocate multiple byte arrays for every message. This improves the
memory situation by failing fast if the message is improperly formatted.
Additionally, it uses reusable big arrays for at least half of the
allocated byte arrays.
2018-11-09 10:12:49 -07:00
Jim Ferenczi 7054e289fa
Add trace log of the request for the query and fetch phases (#34479)
This change adds a logger for the query and fetch phases that prints all requests
before their execution at the trace level. This will help debugging cases where an issue
occurs during the execution since only completed queries are logged by the slow logs.
2018-11-09 09:41:51 +01:00
Tim Brooks 93c2c604e5
Move compression config to ConnectionProfile (#35357)
This is related to #34483. It introduces a namespaced setting for
compression that allows users to configure compression on a per remote
cluster basis. The transport.tcp.compress remains as a fallback
setting. If transport.tcp.compress is set to true, then all requests
and responses are compressed. If it is set to false, only requests to
clusters based on the cluster.remote.cluster_name.transport.compress
setting are compressed. However, after this change regardless of any
local settings, responses will be compressed if the request that is
received was compressed.
2018-11-08 10:37:59 -07:00
Zachary Tong 54b445d74b [Test] Remove obsolete job/cluster cleanup code
Also makes sure the awaitBusy for job stoppage is checked, so
that we can fail if we timed out waiting for a job to stop.

Closes #35295
2018-11-08 10:23:23 -05:00
Simon Willnauer 0cc0fd2d15
Add a frozen engine implementation (#34357)
This change adds a `frozen` engine that allows lazily open a directory reader
on a read-only shard. The engine wraps general purpose searchers in a LazyDirectoryReader
that also allows to release and reset the underlying index readers after any and before
secondary search phases.

Relates to #34352
2018-11-07 20:23:35 +01:00
Alpar Torok 8a85b2eada
Remove build qualifier from server's Version (#35172)
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored  by the build and red from `META-INF`.
2018-11-07 14:01:05 +02:00
Tim Brooks f395b1eace
Open node connections asynchronously (#35144)
This is related to #29023. Additionally at other points we have
discussed a preference for removing the need to unnecessarily block
threads for opening new node connections. This commit lays the groudwork
for this by opening connections asynchronously at the transport level.
We still block, however, this work will make it possible to eventually
remove all blocking on new connections out of the TransportService
and Transport.
2018-11-06 17:58:20 -07:00
Nick Knize a5e1f4d3a2 Upgrade to lucene-8.0.0-snapshot-31d7dfe6b1 (#35224) 2018-11-06 11:55:23 +01:00
Boaz Leskes 28078642b3
Engine.newChangesSnapshot may cause unneeded refreshes if called concurrently (#35169)
When the engine is asked for historical operations, we check if some of the requested operations
are not yet refreshed and if so we refresh before returning the operations. The refresh check is
based on capturing the local checkpoint before each refresh and comparing that value to the one
requested when `newChangesSnapshot` was called. If the requested range is above the captured
local checkpoint we issue a refresh.

This can currently cause unneeded extra refreshes if the method is called concurrently which may cause unwanted degradation in indexing performance. This is especially relevant for CCR where we always ask for a range below the global checkpoint. That range is guaranteed to be below the local
checkpoint of the shard and one refresh is enough to serve multiple changes requests.

This commit fixes this by introducing a dedicated mutex to make sure the test for whether a refresh
is needed actually wait for concurrents for concurrent refreshes that were caused by another
change refresh. 

Note that this is not a big change in semantics as refreshes are serialized by lucene anyway. I also
opted not to keep the synchronization to the changes snapshot request only even if in theory we
can apply it to all refreshes, not matter where they come from.
2018-11-04 13:43:33 +01:00
Nhat Nguyen 855ab3fa1e
Add equals/hashCode to SeqNoStats (#35223)
This commit adds equals/hashCode to SeqNoStats so we can verify it wholly in tests.
2018-11-02 21:31:36 -04:00
Tim Brooks 0166388d74
Use single netty event loop group for transports (#35181)
Currently we create a new netty event loop group for client connections
and all server profiles. Each new group creates new threads for io
processing. This means 2 * num of processors new threads for each group.
A single group should be able to handle all io processing (for the
transports). This also brings the netty module inline with what we do
for nio.

Additionally, this PR renames the worker threads to be the same for
netty and nio.
2018-11-02 16:31:19 -06:00
Colin Goodheart-Smithe fc6e1f7f3f
Merge branch 'master' into index-lifecycle 2018-11-02 10:56:35 +00:00
Alpar Torok f22700812e
Introduce build qualifier parameter (#35155)
* Introduce property to set version qualifier

- VersionProperties.elasticsearch is now a string which can have qualifier
and snapshot too
- The Version class in the build no longer cares about snapshot and
qualifier.
2018-11-02 05:27:40 +02:00
Julie Tibshirani 746d94e299 Unmute AbstractQueryTestCase#testToQuery.
The RangeQueryBuilderTests#testToQuery failures were fixed in #34868 and #35145.
2018-11-01 12:06:36 -07:00
Tal Levy c3cf7dd305 Merge remote-tracking branch 'upstream/master' into index-lifecycle 2018-11-01 10:13:02 -07:00
Nik Everett e28509fbfe
Core: Less settings to AbstractComponent (#35140)
Stop passing `Settings` to `AbstractComponent`'s ctor. This allows us to
stop passing around `Settings` in a *ton* of places. While this change
touches many files, it touches them all in fairly small, mechanical
ways, doing a few things per file:
1. Drop the `super(settings);` line on everything that extends
`AbstractComponent`.
2. Drop the `settings` argument to the ctor if it is no longer used.
3. If the file doesn't use `logger` then drop `extends
AbstractComponent` from it.
4. Clean up all compilation failure caused by the `settings` removal
and drop any now unused `settings` isntances and method arguments.

I've intentionally *not* removed the `settings` argument from a few
files:
1. TransportAction
2. AbstractLifecycleComponent
3. BaseRestHandler

These files don't *need* `settings` either, but this change is large
enough as is.

Relates to #34488
2018-10-31 21:23:20 -04:00
Igor Motov b5e5e93c46
Fixes randomDateTimeZone method (#35145)
The randomDateTimeZone method shouldn't return deprecated timezones
this causes some tests to fail with deprecation warning.
2018-10-31 20:32:18 -04:00
Seong-hyun, Oh 9ef4788c13 Make XContentBuilder in AliasActions build `is_write_index` field (#35071)
Make XContentBuilder in AliasesActions build `is_write_index` field
2018-10-31 14:15:46 -07:00
Armin Braun e6f9f0666e
NETWORKING: MockTransportService Wait for Close (#35038)
* NETWORKING: MockTransportService Wait for Close

* Make `MockTransportService` wait `30s` for close listeners to run before failing the assertion
* Closes #34990
2018-10-31 21:33:49 +01:00
David Turner 0072c90e2a
Pre-populate unicast hosts files (#35136)
Today when ESIntegTestCase starts some nodes it writes out the unicast hosts
files each time a node starts its transport service. This does mean that a
number of nodes can start and perform their first pinging round without any
unicast hosts which, if the timing is unlucky and a lot of nodes are all
started at the same time, can lead to a split brain as in #35052.

Prior to #33554 this was unlikely to happen since the MockUncasedHostsProvider
would always have yielded the existing hosts, so the timing would have to have
been implausibly unlucky. Since #33554, however, it's more likely because the
race occurs between the start of the first round of pinging and the writing of
the unicast hosts file. It is realistic that new nodes will be configured with
the existing nodes from startup, so this change reinstates that behaviour.

Closes #35052.
2018-10-31 19:21:24 +00:00
Tal Levy d5d28420b6 Merge remote-tracking branch 'upstream/master' into index-lifecycle 2018-10-31 10:47:07 -07:00
Luca Cavanna 674225aaa1
[TEST] Enforce skip headers when needed (#34735)
The java yaml test runner supports sending request headers, yet not all clients support headers. This commit makes sure that we enforce adding a skip section with feature "headers" whenever headers are used in a do section as part of a test. That decreases the chance for new tests to break client builds due to the missing skip section.

Closes #34650
2018-10-31 13:07:02 +01:00
Tal Levy 5141084048
rename CRUD api REST path prefix _ilm to _ilm/policy (#35056)
This PR renames the CRUD APIS for ILM

GET _ilm/<policy>, _ilm -> _ilm/policy/<policy>, _ilm/policy
PUT _ilm/<policy> -> _ilm/policy/<policy>
DELETE _ilm/<policy> -> _ilm/policy/<policy>

closes #34929.
2018-10-30 16:19:05 -07:00
Nik Everett 086ada4c08
Core: Drop settings member from AbstractComponent (#35083)
Drops the `Settings` member from `AbstractComponent`, moving it from the
base class on to the classes that use it. For the most part this is a
mechanical change that doesn't drop `Settings` accesses. The one
exception to this is naming threads where it switches from an invocation
that passes `Settings` and extracts the node name to one that explicitly
passes the node name.

This change doesn't drop the `Settings` argument from
`AbstractComponent`'s ctor because this change is big enough as is.
We'll do that in a follow up change.
2018-10-30 16:10:38 -04:00