Commit Graph

874 Commits

Author SHA1 Message Date
Nik Everett e042c77301 Add tests for reducing top hits ()
Also adds many `equals` and `hashCode` implementations and moves
the failure printing in `MatchAssertion` into a common spot and
exposes it over `assertEqualsWithErrorMessageFromXContent` which
does an object equality test but then uses `toXContent` to print
the differences.

Relates to 
2017-01-27 20:54:11 -05:00
Nik Everett 2e48fb8294 Move delete by query helpers into core ()
This moves the building blocks for delete by query into core. This
should enabled two thigns:
1. Plugins other than reindex to implement "bulk by scroll" style
operations.
2. Plugins to directly call delete by query. Those plugins should
be careful to make sure that task cancellation still works, but
this should be possible.

Notes:
1. I've mostly just moved classes and moved around tests methods.
2. I haven't been super careful about cohesion between these core
classes and reindex. They are quite interconnected because I wanted
to make the change as mechanical as possible.

Closes 
2017-01-27 16:09:18 -05:00
Ryan Ernst aad51d44ab S3 repository: Add named configurations ()
* S3 repository: Add named configurations

This change implements named configurations for s3 repository as
proposed in . The access/secret key secure settings which were
added in  are reverted, and the only secure settings are those
with the new named configs. All other previously used settings for the
connection are deprecated.

closes 
2017-01-27 10:42:45 -08:00
Nik Everett 8abd4101eb Add tests for reducing top hits
Also adds many `equals` and `hashCode` implementations and moves
the failure printing in `MatchAssertion` into a common spot and
exposes it over `assertEqualsWithErrorMessageFromXContent` which
does an object equality test but then uses `toXContent` to print
the differences.

Relates to 
2017-01-27 12:32:17 -05:00
Jason Tedor 930282e161 Introduce sequence-number-based recovery
This commit introduces sequence-number-based recovery. When a replica
has fallen out of sync, rather than performing a file-based recovery we
first attempt to replay operations since the last local checkpoint on
the replica. To do this, at the start of recovery the replica tells the
primary what its local checkpoint is. The primary will then wait for all
operations between that local checkpoint and the current maximum
sequence number to complete; this is to ensure that there are no gaps in
the operations that will be replayed from the primary to the
replica. This is a best-effort attempt as we currently have no
guarantees on the primary that these operations will be available; if we
are not able to replay all operations in the desired range, we just
fallback to file-based recovery. Later work will strengthen the
guarantees.

Relates 
2017-01-27 08:16:38 -08:00
Jim Ferenczi e48bc2eed7 Add field collapsing for search request ()
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapse" search hits with same field value.
The distributed aspect is resolve using the two passes that the regular search uses. The first pass "collapse" the top hits, then the coordinating node merge/collapse the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category",
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit} options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
	"size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Simon Willnauer 27b5c2ad54 Pass `forceExecution` flag to transport interceptor ()
To effectively allow a plugin to intercept a transport handler it needs
to know if the handler must be executed even if there is a rejection on the
thread pool in the case the wrapper forks a thread to execute the actual handler.
2017-01-23 11:04:27 +01:00
Simon Willnauer 824beea89d Fix handling of document failure expcetion in InternalEngine ()
Today we try to be smart and make a generic decision if an exception should
be treated as a document failure but in some cases concurrency in the index writer
make this decision very difficult since we don't have a consistent state in the case
another thread is currently failing the IndexWriter/InternalEngine due to a tragic event.

This change simplifies the exception handling and makes specific decisions about document failures
rather than using a generic heuristic. This prevent exceptions to be treated as document failures
that should have failed the engine but backed out of failing since since some other thread has
already taken over the failure procedure but didn't finish yet.
2017-01-20 16:55:00 +01:00
Ryan Ernst c5b4bba30b S3 repository: Deprecate specifying credentials through env vars, sys props, and remove profile files ()
* S3 repository: Deprecate specifying credentials through env vars and sys props

This is a follow up to , where storing credentials secure way was
added.
2017-01-19 12:36:32 -08:00
Simon Willnauer 24e2847af2 Streamline foreign stored context restore and allow to perserve response headers ()
Today we do not preserve response headers if they are present on a transport protocol
response. While preserving these headers is not always desired, in the most cases we
should pass on these headers to have consistent results for depreciation headers etc.
yet, this hasn't been much of a problem since most of the deprecations are detected early
ie. on the coordinating node such that this bug wasn't uncovered until 

This commit allow to optionally preserve headers when a context is restored and also streamlines
the context restore since it leaked frequently into the callers thread context when the callers
context wasn't restored again.
2017-01-18 16:17:54 +01:00
Simon Willnauer 19f9cb307a Merge branch 'master' into feature/multi_cluster_search 2017-01-18 09:24:35 +01:00
Luca Cavanna bc5b604cbd [TEST] parse global parameters from _common.json ()
Replace the hardcoded global parameters in the yaml test suite with parameters parsed from the newly added _common.json file.

Relates to 
2017-01-17 16:13:09 +01:00
Ali Beyad e2977889b8 Allow comma delimited array settings to have a space after each entry ()
Previously, certain settings that could take multiple comma delimited
values would pick up incorrect values for all entries but the first if
each comma separated value was followed by a whitespace character.  For
example, the multi-value "A,B,C" would be correctly parsed as
["A", "B", "C"] but the multi-value "A, B, C" would be incorrectly parsed
as ["A", " B", " C"].

This commit allows a comma separated list to have whitespace characters
after each entry.  The specific settings that were affected by this are:

  cluster.routing.allocation.awareness.attributes
  index.routing.allocation.require.*
  index.routing.allocation.include.*
  index.routing.allocation.exclude.*
  cluster.routing.allocation.require.*
  cluster.routing.allocation.include.*
  cluster.routing.allocation.exclude.*
  http.cors.allow-methods
  http.cors.allow-headers

For the allocation filtering related settings, this commit also provides
validation of each specified entry if the filtering is done by _ip,
_host_ip, or _publish_ip, to ensure that each entry is a valid IP
address.

Closes 
2017-01-17 08:51:04 -06:00
Simon Willnauer 709cb9a39e Merge branch 'master' into feature/multi_cluster_search 2017-01-17 12:34:36 +01:00
Michael McCandless ebd38e2a6a Expose FlattenGraphTokenFilter ()
FlattenGraphTokenFilter is necessary for using graph-based token streams (e.g. the new SynonymGraphFilter) during indexing.
2017-01-16 16:53:32 -05:00
Simon Willnauer f30b1f82ee Remove HttpServer and HttpServerAdapter in favor of a simple dispatch method ()
Today we have quite some abstractions that are essentially providing a simple
dispatch method to the plugins defining a `HttpServerTransport`. This commit
removes `HttpServer` and `HttpServerAdaptor` and introduces a simple `Dispatcher` functional
interface that delegate to `RestController` by default.

Relates to 
2017-01-16 21:06:08 +01:00
Luca Cavanna 193111919c move ignore parameter support from yaml test client to low level rest client ()
All the language clients support a special ignore parameter that doesn't get passed to elasticsearch with the request, but used to indicate which error code should not lead to an exception if returned for a specific request.

Moving this to the low level REST client will allow the high level REST client to make use of it too, for instance so that it doesn't have to intercept ResponseExceptions when the get api returns a 404.
2017-01-16 18:54:44 +01:00
Simon Willnauer 895124e67e Merge branch 'master' into feature/multi_cluster_search 2017-01-16 13:20:45 +01:00
Simon Willnauer 5f0344a918 Pass ThreadContext to transport interceptors to allow header modification ()
TransportInterceptors are commonly used to enrich requests with headers etc.
which requires access the the thread context. This is not always easily possible
since threadpools are hard to access for instance if the interceptor is used on a transport client.

This commit passes on the thread context to all the interceptors for further consumption.

Closes 
2017-01-15 13:35:39 +01:00
Simon Willnauer 3f784a4424 Merge branch 'master' into feature/multi_cluster_search 2017-01-15 10:28:34 +01:00
Simon Willnauer 2dd0ec57b2 [TEST] Remove connection listener from all transports in AbstractSimpleTransportTestCase#testSendRandomRequests 2017-01-13 23:19:04 +01:00
Simon Willnauer 63e4552c0d Merge branch 'master' into feature/multi_cluster_search 2017-01-13 23:07:20 +01:00
Simon Willnauer 4c1ee018f6 Remove setLocalNode from ClusterService and TransportService ()
ClusterService and TransportService expect the local discovery node to be set
before they are started but this requires manual interaction and is error prone since
to work absolutely correct they should share the same instance (same ephemeral ID).

TransportService also has 2 modes of operation, mainly realted to transport client vs. internal
to a node. This change removes the mode where we don't maintain a local node and uses a dummy local
node in the transport client since we don't bind to any port in such a case.

Local discovery node instances are now managed by the node itself and only suppliers and factories that allow
creation only once are passed to TransportService and ClusterService.
2017-01-13 16:12:27 +01:00
Simon Willnauer d5fa84f869 Harder close and remove reference concurrency in MockTcpTransport ()
There was still small race in MockTcpTransport where channesl that are concurrently
closing are not yet removed from the reference tracking causing tests to fail. Compared to
the other races before this is a rather small windown and requires very very short test durations.
2017-01-13 16:04:05 +01:00
Simon Willnauer 6779ea9c2a Merge branch 'master' into feature/multi_cluster_search 2017-01-13 12:10:23 +01:00
Simon Willnauer acf2d2f86f Ensure new connections won't be opened if transport is closed or closing ()
Today there are several races / holes in TcpTransport and MockTcpTransport
that can allow connections to be opened and remain unclosed while the actual
transport implementation is closed. A recently added assertions in  exposes
these problems. This commit fixes several issues related to missed locks or channel
creations outside of a lock not checking if the resource is still open.
2017-01-12 20:27:09 +01:00
javanna 8072f168a3 Remove ParseFieldMatcher usages from QueryParseContext 2017-01-12 14:43:35 +01:00
Luca Cavanna 7674de9e1f Move human flag under always accepted query_string params ()
There are some parameters that are accepted by each and every api we expose. Those (pretty, source, error_trace and filter_path)  are not explicitly listed in the spec of every api, rather whitelisted in clients test runners so that they are always accepted. The `human` flag has been treated up until now as a parameter that's accepted by only some stats and info api, but that doesn't reflect reality as es core treats it exactly like `pretty` (relevant especially now that we validate params and throw exception when we find one that is not supported). Furthermore, the human flag has effect on every api that outputs a date, time, percentage or byte size field. For instance the tasks api outputs a date field although they don't have the human flag explicitly listed in their spec. There are other similar cases. This commit removes the human flag from the rest spec and makes it an always accepted query_string param.
2017-01-12 10:04:45 +01:00
Simon Willnauer 00781d24ce Merge branch 'master' into feature/multi_cluster_search 2017-01-11 23:40:46 +01:00
Simon Willnauer 8a0393f718 Move assertion for open channels under TcpTransport lock
TcpTransport has an actual mechanism to stop resources in subclasses.
Instead of overriding `doStop` subclasses should override `stopInternal`
that is executed under the connection lock guaranteeing that there is no
concurrency etc.

Relates to 
2017-01-11 23:37:12 +01:00
Ryan Ernst 8015fbbf25 Make s3 repository sensitive settings use secure settings ()
* Settings: Make s3 repository sensitive settings use secure settings

This change converts repository-s3 to use the new secure settings. In
order to support the multiple ways we allow aws creds to be configured,
it also moves the main methods for the keystore wrapper into a
SecureSettings interface, in order to allow settings prefixing to work.
2017-01-11 11:19:46 -08:00
Simon Willnauer d3124dd62b Merge branch 'master' into feature/multi_cluster_search 2017-01-11 17:03:30 +01:00
Simon Willnauer 6810125a8b Prevent open channel leaks if handshake times out or is interrupted ()
The low level TCP handshake can cause channel / connection leaks if it's interrupted
since the caller doesn't close the channel / connection if the handshake was not successful.
This commit fixes the channel leak and adds general test infrastructure to detect channel leaks
in the future.
2017-01-11 17:02:36 +01:00
Simon Willnauer 6d2d878068 Merge branch 'master' into feature/multi_cluster_search 2017-01-11 09:28:00 +01:00
Tanguy Leroux 2dcb05fca8 Add fromxcontent methods to index response ()
This commit adds the parsing fromXContent() methods to the IndexResponse class. The method is based on a ObjectParser because it is easier to use when parsing parent abstract classes like DocWriteResponse.

It also changes the ReplicationResponse.ShardInfo so that it now implements ToXContentObject. This way, the ShardInfo.fromXContent() method can be used by the IndexResponse's ObjectParser.
2017-01-10 20:25:32 +01:00
Yannick Welsch c35277e623 [TEST] Fix JSON generation of failure in InternalTestCluster
Relates to 
2017-01-10 17:53:04 +01:00
Boaz Leskes f387848f83 MockTransportService.doClose assertions should check openConnections under lock 2017-01-10 14:03:31 +01:00
Yannick Welsch 9fc1a735cc Keep NodeConnectionsService in sync with current nodes in the cluster state ()
The NodeConnectionsService currently determines which nodes to connect to / disconnect from by inspecting cluster state changes and connecting to added nodes / disconnecting from removed nodes. When a master steps down (for example due to another master-eligible node shutting down which brings the number of master-eligible nodes below minimum_master_master), and the connection to other existing nodes was dropped while pinging, however, the connection to these nodes is not re-established while publishing the first cluster state that establishes the node as master.
This commit changes the NodeConnectionsService connect / disconnect logic to always rely on the state that is to be / was published, looking not only at the added / removed nodes, but validating that exactly all nodes that are currently registered in NodeConnectionsService are connected (corresponds to a NOOP if the node is already connected).
2017-01-10 13:29:49 +01:00
Simon Willnauer 1ef98ede17 Merge branch 'master' into feature/multi_cluster_search 2017-01-09 12:09:23 +01:00
Nik Everett 12923ef896 Close and flush refresh listeners on shard close
Right now closing a shard looks like it strands refresh listeners,
causing tests like
`delete/50_refresh/refresh=wait_for waits until changes are visible in search`
to fail. Here is a build that fails:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+multi_cluster_search+multijob-darwin-compatibility/4/console

This attempts to fix the problem by implements `Closeable` on
`RefreshListeners` and rejecting listeners when closed. More importantly
the act of closing the instance flushes all pending listeners
so we shouldn't have any stranded listeners on close.

Because it was needed for testing, this also adds the number of
pending listeners to the `CommonStats` object and all API to which
that flows: `_cat/nodes`, `_cat/indices`, `_cat/shards`, and
`_nodes/stats`.
2017-01-06 20:03:32 -05:00
Ryan Ernst cd6e3f4cea Merge branch 'master' into keystore 2017-01-06 09:32:08 -08:00
Tim B b9c2c2f6f0 Move IfConfig.logIfNecessary call into bootstrap ()
This is related to . A logIfNecessary() call makes a call to
NetworkInterface.getInterfaceAddresses() requiring SocketPermission
connect privileges. By moving this to bootstrap the logging call can be
made before installing the SecurityManager.
2017-01-06 11:10:53 -06:00
Simon Willnauer 418ec62bfb Merge branch 'master' into feature/multi_cluster_search 2017-01-06 10:24:40 +01:00
Ryan Ernst eb596d7270 more renames 2017-01-06 01:03:45 -08:00
javanna d87a30647b remove ParseFieldMatcher usages from SearchAfterBuilder 2017-01-05 19:33:04 +01:00
Simon Willnauer 0183b0c5a8 More cleanups 2017-01-05 15:23:55 +01:00
Simon Willnauer 80bf01d3c0 Merge branch 'master' into feature/multi_cluster_search 2017-01-05 08:00:03 +01:00
Simon Willnauer a5daa5d3a2 Execute low level handshake in #openConnection ()
Today we execute the low level handshake on the TCP layer in #connectToNode.
If #openConnection is used directly, which is truly expert, no handshake is executed
which allows connecting to nodes that are not necessarily compatible. This change
moves the handshake to #openConnection to prevent bypassing this logic.
2017-01-05 07:32:53 +01:00
Tim B be22a250b6 Replace Socket, ServerSocket, and HttpServer usages in tests with mocksocket versions ()
This integrates the mocksocket jar with elasticsearch tests. Mocksocket wraps actions requiring SocketPermissions in doPrivilege blocks. This will eventually allow SocketPermissions to be assigned to the mocksocket jar opposed to the entire elasticsearch codebase.
2017-01-04 14:38:51 -06:00
Adrien Grand f8998fece5 Upgrade to lucene-6.4.0-snapshot-084f7a0. () 2017-01-04 19:03:52 +01:00