2252 Commits

Author SHA1 Message Date
Luca Cavanna
0ebc17743a
Histogram aggs: add empty buckets only in the final reduce step (#35921)
Empty buckets don't need to be added when performing an incremental reduction step, they can be added later in the final reduction step. This will allow us to later remove the max buckets limit when performing non final reduction.
2018-11-30 20:33:09 +01:00
Tim Brooks
ea7ea51050
Make TcpTransport#openConnection fully async (#36095)
This is a follow-up to #35144. That commit made the underlying
connection opening process in TcpTransport asynchronous. However the
method still blocked on the process being complete before returning.
This commit moves the blocking to the ConnectionManager level. This is
another step towards the top-level TransportService api being async.
2018-11-30 11:30:42 -07:00
Armin Braun
986bf52d1f
[Zen2] Allow Setting a List of Bootstrap Nodes to Wait for (#35847) 2018-11-30 18:53:08 +01:00
Tim Brooks
da100c5479
Remove Lifecycle from ConnectionManager (#36092)
Prior to #35441 `ConnectionManager` had a `Lifecycle` object to support
the ping runnable. After that commit, the connection amanger only needs
the existing `AtomicBoolean` to indicate if it is running.
2018-11-30 09:04:32 -07:00
Luca Cavanna
43ea498f2f [TEST] Reduce number of buckets created in InternalDateHistogramTests
New that we test with min_doc_count set to 0 as well, we may end up generating a lot more buckets. This commit adjusts the min bound and max bound, as well as the offset for each randomly generated agg instance so that we don't end up hitting the 10.000 max buckets limit.

Relates to #36064
2018-11-30 14:16:32 +01:00
Luca Cavanna
e3eb05b14b
[TEST] Increase InternalDateHistogramTests coverage (#36064)
In this test we were randomizing different values but minDocCount was hardcoded to 1. It's important to test other values, especially `0` as it's the default. The test needed some adapting in the way buckets are randomly generated: all aggs need to share the same interval, minDocCount and emptyBucketInfo. Also assertions need to take into account that more (or less) buckets are expected depending on minDocCount.
2018-11-30 11:21:46 +01:00
Adrien Grand
fa3d365ee8
Fix CompositeBytesReference#slice to not throw AIOOBE with legal offsets. (#35955)
CompositeBytesReference#slice has two bugs:
 - One that makes it fail if the reference is empty and an empty slice is
   created, this is #35950 and is fixed by special-casing empty-slices.
 - One performance bug that makes it always create a composite slice when
   creating a slice that ends on a boundary, this is fixed by computing `limit`
   as the index of the sub reference that holds the last element rather than
   the next element after the slice.

Closes #35950
2018-11-30 10:32:46 +01:00
Jake Landis
f8636e58f9
Support content type application/x-ndjson in DeprecationRestHandler (#36025)
org.elasticsearch.rest.RestController#hasContentType checks to see if the
RestHandler supports the `application/x-ndjson` Content-Type. DeprecationRestHandler
is a wrapper around the real RestHandler, and prior to this change
would always return `false` due to the interface's default supportsContentStream().
This prevents API's that use multi-line JSON from properly being deprecated
resulting in an HTTP 406 error.

This change ensures that the DeprecationRestHandler honors the
supportsContentStream() of the wrapped RestHandler.

Relates to #35958
2018-11-29 11:45:45 -06:00
Jim Ferenczi
8a7f3f75f3
Add support for rest_total_hits_as_int (#36051)
The support for rest_total_hits_as_int has already been merged to 6x
in #35848 so this change adds this new option to master. The plan was
to add this new option as part of #35848 but we've decided to wait a few
days before merging this breaking change so this commit just handles
the new option as a noop exactly like 6x for now. This will allow
users to migrate to this parameter before #35848 is merged.

Relates #33028
2018-11-29 18:36:16 +01:00
Tim Brooks
c305f9dc03
Make keepalive pings bidirectional and optimizable (#35441)
This is related to #34405 and a follow-up to #34753. It makes a number
of changes to our current keepalive pings.

The ping interval configuration is moved to the ConnectionProfile.

The server channel now responds to pings. This makes the keepalive
pings bidirectional.

On the client-side, the pings can now be optimized away. What this
means is that if the channel has received a message or sent a message
since the last pinging round, the ping is not sent for this round.
2018-11-29 08:55:53 -07:00
Jim Ferenczi
ecd29089a8
Cache the score of the parent document in the nested agg (#36019)
The nested agg can defer the collection of children if it is nested
under another aggregation. In such case accessing the score in the children
aggregation throws an error because the scorer has already advanced to the next
parent. This change fixes this error by caching the score of the parent in the
nested aggregation. Children aggregations that work on nested documents will be
able to access the _score. Also note that the _score in this case is always the
parent's score, there is no way to retrieve the score of a nested docs in aggregations.

Closes #35985
Closes #34555
2018-11-29 14:35:25 +01:00
Armin Braun
48dc6c3442
[Zen2] Implement Tombstone REST APIs (#36007)
* [Zen2] Implement Tombstone REST APIs

* Adds REST API for withdrawing votes and clearing vote withdrawls
* Tests added to Netty4 module since we need a real Network impl. for Http endpoints
2018-11-29 14:34:10 +01:00
David Turner
7f257187af
[Zen2] Update default for USE_ZEN2 to true (#35998)
Today the default for USE_ZEN2 is false and it is overridden in many places. By
defaulting it to true we can be sure that the only places in which Zen2 does
not work are those in which it is explicitly set to false.
2018-11-29 12:18:35 +00:00
David Turner
277ccba3bd
[Zen2] fix NodeJoinTests#testConcurrentJoining() (#36033)
Today we sometimes create a setup in which the node is a quorum on its own,
which allows it to win a pre-voting round and schedule an election essentially
at will, causing it to discard all the joins it just received and fail the
test. This change excludes this case, preventing stray elections from ruining
things.
2018-11-29 12:11:08 +00:00
David Turner
87408b04d4
[Zen2] Only elect master-eligible nodes (#35996)
Today any node can win an election. However, the whole point of
master-eligibility is that master-ineligible nodes should not be elected as the
leader; furthermore master-ineligible nodes do not have any outgoing STATE
channels so cannot publish cluster states, so their leadership is ineffective
and disruptive.

This change ensures that the elected leader is master-eligible by preventing
master-ineligible nodes from scheduling an election.
2018-11-29 12:10:43 +00:00
Alan Woodward
a646f85a99
Ensure TokenFilters only produce single tokens when parsing synonyms (#34331)
A number of tokenfilters can produce multiple tokens at the same position.  This
is a problem when using token chains to parse synonym files, as the SynonymMap
requires that there are no stacked tokens in its input.

This commit ensures that when used to parse synonyms, these tokenfilters either produce
a single version of their input token, or that they throw an error when mappings are 
generated.  In indexes created in elasticsearch 6.x deprecation warnings are emitted in place 
of the error. 

* asciifolding and cjk_bigram produce only the folded or bigrammed token
* decompounders, synonyms and keyword_repeat are skipped
* n-grams, word-delimiter-filter, multiplexer, fingerprint and phonetic throw errors

Fixes #34298
2018-11-29 10:35:38 +00:00
Tanguy Leroux
0967620641
ActiveShardCount should not fail when closing the index (#35936)
The ActiveShardCount is used by cluster state observers to wait for a 
given number of shards to be active before returning to the caller. The 
current implementation does not work when an index is closed while an 
observer is waiting on shards to be active. In this case, a NPE is thrown 
and the observer is never notified that the shards won't become active.

This commit fixes the ActiveShardCount.enoughShardsActive() so that it 
does not fail when an index is closed, similarly to what is done when an 
index is deleted.
2018-11-29 09:08:30 +01:00
Alpar Torok
e0a678f0c4
Remove version.qualified from MainResponse (#35412)
The fully qualified version will be returned as `version.number`
2018-11-29 08:41:39 +02:00
Ryan Ernst
afd42df15f
Core: Deguice RepositoriesService (#36016)
This commit moves the RepositoriesService to be created outside of
guice.
2018-11-28 21:20:44 -08:00
Jim Ferenczi
9ca3a06475
Remove custom QueryBuilder#analyzeGraphPhrase (#35983)
Now that https://issues.apache.org/jira/browse/LUCENE-8479 is fixed
we can remove the custom implementation of QueryBuilder#analyzeGraphPhrase
in the match QueryBuilder.
2018-11-28 20:15:27 +01:00
Luca Cavanna
4b85769d24
Increase InternalHistogramTests coverage (#36004)
In `InternalHistogramTests` we were randomizing different values but `minDocCount` was hardcoded to `1`. It's important to test other values, especially `0` as it's the default. To make this possible, the test needed some adapting in the way buckets are randomly generated: all aggs need to share the same `interval`, `minDocCount` and `emptyBucketInfo`. Also assertions need to take into account that more (or less) buckets are expected depending on `minDocCount`.

This was originated by #35921 and its need to test adding empty buckets as part of the reduce phase.

Also relates to #26856 as one more key comparison needed to use `Double.compare` to properly handle `NaN` values, which was triggered by the increased test coverage.
2018-11-28 20:06:40 +01:00
Nik Everett
0588dad80b
Tasks: Only require task permissions (#35667)
Right now using the `GET /_tasks/<taskid>` API and causing a task to opt
in to saving its result after being completed requires permissions on
the `.tasks` index. When we built this we thought that that was fine,
but we've since moved towards not leaking details like "persisting task
results after the task is completed is done by saving them into an index
named `.tasks`." A more modern way of doing this would be to save the
tasks into the index "under the hood" and to have APIs to manage the
saved tasks. This is the first step down that road: it drops the
requirement to have permissions to interact with the `.tasks` index when
fetching task statuses and when persisting statuses beyond the lifetime
of the task.

In particular, this moves the concept of the "origin" of an action into
a more prominent place in the Elasticsearch server. The origin of an
action is ignored by the server, but the security plugin uses the origin
to make requests on behalf of a user in such a way that the user need
not have permissions to perform these actions. It *can* be made to be
fairly precise. More specifically, we can create an internal user just
for the tasks API that just has permission to interact with the `.tasks`
index. This change doesn't do that, instead, it uses the ubiquitus
"xpack" user which has most permissions because it is simpler. Adding
the tasks user is something I'd like to get to in a follow up change.

Instead, the majority of this change is about moving the "origin"
concept from the security portion of x-pack into the server. This should
allow any code to use the origin. To keep the change managable I've also
opted to deprecate rather than remove the "origin" helpers in the
security code. Removing them is almost entirely mechanical and I'd like
to that in a follow up as well.

Relates to #35573
2018-11-28 09:28:27 -05:00
Christoph Büscher
51a7dc54ec
Fix custom AUTO issue with Fuzziness#toXContent (#35807)
Currently when a Fuzziness instance with custom AUTO distance values gets
written to XContent, the customized lower and upper distance values are ommited
and can consequently not be parsed back. This changes this to write the String
including the optional custom values when writing to XContent and fixes the
tests that should have caught this in the first place, e.g. by adding the custom
low and high distance values to the equality check.
2018-11-28 15:07:11 +01:00
Andrey Ershov
0b45fb98b9
[Zen2] Generate coordinationMetaData with different configs (#35991)
This PR fixes test failure, which is caused by equal randomly generated lastAcceptedConfiguration and lastCommittedConfguration.
2018-11-28 14:49:07 +01:00
Yannick Welsch
5f0c036183 Disable testDeleteCreateInOneBulk on Zen2
This test needs adaptation to run with Zen2
2018-11-28 13:34:48 +01:00
Christoph Büscher
2f547bac65
Remove deprecated methods from QueryStringQueryBuilder (#35912)
This change removes the deprecated useDisMax() and useAllFields() methods from
the QueryStringQueryBuilder and related tests. The disMax parameter has already
been a no-op since 6.0 and also the useAllFields has been deprecated since 6.0
and there is a direct replacement via defaultField.
2018-11-28 11:09:03 +01:00
Jeff Hajewski
49087f16f5 Adds deprecation logging to ScriptDocValues#getValues. (#34279)
`ScriptDocValues#getValues` was added for backwards compatibility but no
longer needed. Scripts using the syntax `doc['foo'].values` when
`doc['foo']` is a list should be using `doc['foo']` instead.

Closes #22919
2018-11-27 14:30:13 -05:00
Julie Tibshirani
25c416b12d
Deprecate types in search and multi search templates. (#35669)
This PR adds deprecation warnings to the relevant `Rest*Action` classes, plus tests in `Rest*ActionTests`. No updates to REST tests, the Java HLRC, or documentation were necessary, since they didn't make use of types.
2018-11-27 10:19:19 -08:00
Simon Willnauer
ad1f0dccd4
Validate metdata on _msearch (#35938)
MultiSearchRequests issues through `_msearch` now validate all keys
in the metadata section. Previously unknown keys were ignored
while now an exception is thrown.

Closes #35869
2018-11-27 17:08:24 +01:00
Tim Brooks
cc1fa799c8
Remove TcpChannel#setSoLinger method (#35924)
This commit removes the dedicated `setSoLinger` method. This simplifies
the `TcpChannel` interface. This method has very little effect as the
SO_LINGER is not set prior to the channels being closed in the abstract
transport test case. We still will set SO_LINGER on the
`MockNioTransport`. However we can do this manually.
2018-11-27 09:08:14 -07:00
Christophe Bismuth
adc0b560c0 Raise a 404 exception when document source is not found (#33384) (#34083)
This pull request makes the `RestGetSourceAction` return a `ResourceNotFoundException` with a proper JSON response when source or document itself is missing (see issue #33384).

Here is below a sample JSON output:

```
{
  "error": {
    "root_cause": [
      {
        "type": "resource_not_found_exception",
        "reason": "Source not found [index1]/[_doc]/[1]"
      }
    ],
    "type": "resource_not_found_exception",
    "reason": "Source not found [index1]/[_doc]/[1]"
  },
  "status": 404
}
```
2018-11-27 10:35:45 -05:00
Andrey Ershov
0e283f9670
[Zen2] PersistedState interface implementation (#35819)
Today GatewayMetaState is capable of atomically storing MetaData to
disk. We've also moved fields that are needed to be persisted in Zen2
from ClusterState to ClusterState.MetaData.CoordinationMetaData.

This commit implements PersistedState interface.

version and currentTerm are persisted as a part of Manifest.
GatewayMetaState now implements both ClusterStateApplier and
PersistedState interfaces. We started with two descendants
Zen1GatewayMetaState and Zen2GatewayMetaState, but it turned
out to be not easy to glue it.
GatewayMetaState now constructs previousClusterState (including
MetaData) and previousManifest inside the constructor so that all
PersistedState methods are usable as soon as GatewayMetaState
instance is constructed. Also, loadMetaData is renamed to
getMetaData, because it just returns
previousClusterState.metaData().
Sadly, we don't have access to localNode (obtained from 
TransportService in the constructor, so getLastAcceptedState
should be called, after setLocalNode method is invoked.
Currently, when deciding whether to write IndexMetaData to disk,
we're comparing current IndexMetaData version and received
IndexMetaData version. This is not safe in Zen2 if the term has changed.
So updateClusterState now accepts incremental write
method parameter. When it's set to false, we always write
IndexMetaData to disk.
Things that are not covered by GatewayMetaStateTests are covered
by GatewayMetaStatePersistedStateTests.
This commit also adds an option to use GatewayMetaState instead of
InMemoryPersistedState in TestZenDiscovery. However, by default
InMemoryPersistedState is used and only one test in PersistedStateIT
used GatewayMetaState. In order to use it for other tests, proper
state recovery should be implemented.
2018-11-27 15:04:52 +01:00
Martijn van Groningen
447e5d212a
Changed versions in serialization code after backporting #35535 2018-11-27 08:00:06 +01:00
Gordon Brown
119835decd
Always enforce cluster-wide shard limit (#34892)
This removes the option to run a cluster without enforcing the
cluster-wide shard limit, making strict enforcement the default and only
behavior.  The limit can still be adjusted as desired using the cluster
settings API.
2018-11-26 17:05:12 -07:00
Igor Motov
663563f64b
Geo: better handling of malformed geo_points (#35554)
Improves handling of malformed geo_points when `ignore_malformed` is
set to true

Closes #35419
2018-11-26 09:44:42 -10:00
Jim Ferenczi
900caa20ef
Handles exists query in composite aggs (#35758)
This commit adds the support for exists query in the sorted execution mode
of the `composite` aggregation. We'll execute the aggregation from the sorted
points and use early termination if the main query is an `exists` query over the
first source of the `composite` aggregation.
2018-11-26 19:08:14 +01:00
Christophe Bismuth
b95a4db6e6 Throw a parsing exception when boost is set in span_or query (#28390) (#34112) 2018-11-26 12:15:59 -05:00
Simon Willnauer
ca9b2b9931
Repsect indices options on _msearch (#35887)
Today we don't respect the indices options when they are passed
as request parameters to the `_msearch` endpoint. This is unintuitive
and doesn't cause any errors. This changes uses the top-level indices
options as the defaults for each sub search-request.

Closes #35851
2018-11-26 14:26:39 +01:00
Christophe Bismuth
04ebc63e34 RoutingMissingException in more like this (#33974)
More like this query allows to provide identifiers of documents to be retrieved as like/unlike items. 
It can happen that at retrieval time an error is thrown, for instance caused by missing routing value when `_routing` is set required in the mapping. 
Instead of ignoring such error and returning no documents for the query, the error should be re-thrown and returned to users. As part of this 
change also mget and mtermvectors are unified in the way they throw such exception like it happens in other places, so that a `RoutingMissingException` is raised.

Closes #29678
2018-11-26 13:57:57 +01:00
Tanguy Leroux
9bdbba23f8 [Tests] Fix IndexShardTests.testAcquirePrimaryAllOperationsPermits()
This test fails on CI because of an inappropriate assertion, which is
I think a leftover and has no real value.
2018-11-26 13:44:12 +01:00
Luca Cavanna
e44390ac20
InitialSearchPhase minor cleanups (#35864)
This commit simplifies  the throttling logic in InitialSearchPhase and removes some asserts from it. Also, a few formatting changes are applied to its code and surrounding classes.
2018-11-26 13:42:41 +01:00
David Turner
a68a46450b
[Zen2] Add lag detector (#35685)
A publication can succeed and complete before all nodes have applied the
published state and acknowledged it, thanks to the publication timeout; however
we need every node eventually either to apply the published state (or a later
state) or be removed from the cluster. This change introduces the LagDetector
which achieves this liveness property by removing any lagging nodes from the
cluster.
2018-11-26 10:52:49 +00:00
iverase
401b814d1a [CI] Muting method testOperationPermitOnReplicaShards in IndexShardTests
Relates to #35850
2018-11-26 09:33:23 +01:00
Martijn van Groningen
7624734f14
Added wait_for_metadata_version parameter to cluster state api. (#35535)
The `wait_for_metadata_version` parameter will instruct the cluster state
api to only return a cluster state until the metadata's version is equal or
greater than the version specified in `wait_for_metadata_version`. If  
the specified `wait_for_timeout` has expired then a timed out response 
is returned. (a response with no cluster state and wait for timed out flag set to true)
In  the case metadata's version is equal or higher than  `wait_for_metadata_version`
then the api will immediately return.

This feature is useful to avoid external components from constantly
polling the cluster state to whether somethings have changed in the
cluster state's metadata.
2018-11-26 08:50:08 +01:00
Simon Willnauer
4711c5cdf3
Always return false from refreshNeeded on ReadOnlyEngine (#35837)
Acquiring a searcher is unnecessary to determine if a refresh is
necessary since read-only engines never refresh.

Closes #35785
2018-11-24 09:25:42 +01:00
Simon Willnauer
e46e44ce38
Wrap can_match reader with ElasticsearchDirectoryReader (#35857)
Code that operates on-top of the engine requires all readers returned to be
unwrapped into ElasticsearchDirectoryReader. The special reader
the FrozenEngine uses wasn't wrapped.
2018-11-24 09:23:53 +01:00
Andrey Ershov
f47636b254
[Zen2] Introduce VotingTombstone class (#35832)
Today voting tombstones are stored in CoordinationMetaData as
Set<DiscoveryNode>.
DiscoveryNode is not a lightweight object and have a lot of fields.
It also has toXContent method, but no fromXContent method and the
output of toXContent is not enough to re-create DiscoveryNode
object.
And votingTombstone set should be persisted as a part of MetaData.
On the other hand, the only thing required from the tombstone is the
nodeId.
This PR adds VotingTombstone class for voting tombstones, which
consists of two fields for now - nodeId and nodeName. It could be
extended/shrank in the future if needed.
This PR also resolves TODO's related to the voting tombstones xcontent
story.
Example of CoordinationMetaData.toXContent with voting tombstones:

{
  "term": 1,
  "last_committed_config": [
    "fkwLdOBvXSlgRTBfgNAL",
    "tmQiPGHvUxXzPkkCDSJo",
    "HhOmtQBZAThpHIGWhxpz",
    "qZHWGpoDNPYRNIiqKsDl"
  ],
  "last_accepted_config": [
    "lhqacKmriwhHGFZcvqbx",
    "MYysmBuROkvJRlDcusyd"
  ],
  "voting_tombstones": [
    {
      "node_id": "McjbZbRkEz",
      "node_name": "pdKIWeNJUO"
    },
    {
      "node_id": "cpXkVibGwo",
      "node_name": "UnCvFgdVsc"
    },
    {
      "node_id": "EylRNOztbc",
      "node_name": "ohOhkbMWZX"
    }
  ]
}
2018-11-23 18:34:06 +01:00
Yannick Welsch
51d2e986c5 Remove BWC conditions after backport of #35731
This PR was backported to 6.x, so the extra BWC conditions are not needed anymore
2018-11-23 17:11:06 +01:00
Adrien Grand
5b370316a6
Remove some legacy code from when indices could have multiple types. (#35815)
This code is only necessary up to indices created with version 5.x while 7.0
only supports indices created with 6.x or 7.0.
2018-11-23 15:15:26 +01:00
Yannick Welsch
2970abfce9
Add read-only repository verification (#35731)
Adds a verification mode for read-only repositories. It also makes the extra bucket check on
repository creation obsolete, which fixes #35703.
2018-11-23 14:45:05 +01:00