1249 Commits

Author SHA1 Message Date
Jim Ferenczi f6d38d480a Integrate UnifiedHighlighter (#21621)
* Integrate UnifiedHighlighter

This change integrates the Lucene highlighter called "unified" in the list of supported highlighters for ES.
This highlighter can extract offsets from either postings, term vectors, or via re-analyzing text.
The best strategy is picked automatically at query time and depends on the field and the query to highlight.
2017-01-31 19:06:03 +01:00
Tim Brooks 719e75bb3f Add repository-url module and move URLRepository (#22752)
This is related to #22116. URLRepository requires SocketPermission
connect. This commit introduces a new module called "repository-url"
where URLRepository will reside. With the new module, permissions can
be removed from core.
2017-01-25 17:09:25 -06:00
Jim Ferenczi 868b12b548 Add BWC tests for field collapsing
Field collapsing is supported from version 5.3
2017-01-24 08:34:16 +01:00
Jim Ferenczi e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that collapses search hits with the same field value.
The distributed aspect is resolved using the two passes that the regular search uses. In the first pass each shard collapses its top hits, then the coordinating node merges and collapses the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category"
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
         "size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
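The two-pass collapsing described in e48bc2eed7 can be sketched roughly as follows. This is a toy model in Python, not the Lucene collector the commit adds: the `(key, score)` hit shape and the function names `collapse` and `search` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical hit shape: (collapse key, score).
Hit = Tuple[str, float]


def collapse(hits: List[Hit], size: int) -> List[Hit]:
    """Keep the best-scoring hit per collapse key, then the top `size` of those."""
    best: Dict[str, float] = {}
    for key, score in hits:
        if key not in best or score > best[key]:
            best[key] = score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:size]


def search(shards: List[List[Hit]], size: int) -> List[Hit]:
    # First pass: every shard collapses its own hits.
    per_shard = [collapse(shard, size) for shard in shards]
    # Second pass: the coordinating node merges the shard results and collapses again.
    merged = [hit for shard_hits in per_shard for hit in shard_hits]
    return collapse(merged, size)


shards = [
    [("books", 3.0), ("books", 1.0), ("music", 2.0)],
    [("books", 4.0), ("films", 2.5)],
]
print(search(shards, size=2))
```

The second collapse is needed because the same key can be the top hit on more than one shard.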
markharwood f01784205f New AdjacencyMatrix aggregation
Similar to the Filters aggregation but only supports "keyed" filter buckets and automatically "ANDs" pairs of filters to produce a form of adjacency matrix.
The intersection of buckets "A" and "B" is named "A&B" (the choice of separator is configurable). Empty intersection buckets are removed from the final results.

Closes #22169
2017-01-20 15:49:31 +00:00
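As a rough illustration of the bucket naming in f01784205f, the sketch below counts documents per named filter and per pair of filters, and drops empty intersections. It is a plain Python model under assumed names (`adjacency_matrix`, predicate-style filters), not the aggregation's actual implementation.

```python
from itertools import combinations
from typing import Callable, Dict, List


def adjacency_matrix(
    docs: List[dict],
    filters: Dict[str, Callable[[dict], bool]],
    separator: str = "&",
) -> Dict[str, int]:
    """Return doc counts for each keyed filter and each non-empty pairwise intersection."""
    # Set of matching document positions for every named filter.
    matches = {
        name: {i for i, doc in enumerate(docs) if predicate(doc)}
        for name, predicate in filters.items()
    }
    buckets = {name: len(ids) for name, ids in matches.items()}
    # "AND" each pair of filters; empty intersection buckets are left out.
    for a, b in combinations(sorted(matches), 2):
        count = len(matches[a] & matches[b])
        if count > 0:
            buckets[f"{a}{separator}{b}"] = count
    return buckets


docs = [{"to": ["A", "B"]}, {"to": ["A"]}, {"to": ["C"]}]
filters = {
    "A": lambda d: "A" in d["to"],
    "B": lambda d: "B" in d["to"],
    "C": lambda d: "C" in d["to"],
}
print(adjacency_matrix(docs, filters))
```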
Boaz Leskes 5d806bf93e Index creation and setting update may not return deprecation logging (#22702)
Those services validate their settings before submitting an AckedClusterStateUpdateTask to the cluster state service. An acked cluster state may be completed by a networking thread when the last ack is received. As such, it needs special care to make sure that thread context headers are handled correctly.
2017-01-20 10:14:13 +01:00
Daniel Mitterdorfer aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
2017-01-19 07:59:18 +01:00
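The rule in aece89d6a1 is small enough to state as code. The function below is a Python sketch of the described behaviour; the name `parse_boolean_strict` is an assumption, and the real change lives in Elasticsearch's Java code.

```python
def parse_boolean_strict(value: str) -> bool:
    """Accept exactly "true" or "false"; reject everything else."""
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(
        f"Failed to parse value [{value}] as only [true] or [false] are allowed."
    )


print(parse_boolean_strict("true"))
print(parse_boolean_strict("false"))
```

Under this rule, formerly lenient spellings such as `"on"`, `"1"` or `"yes"` raise an error instead of being coerced.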
Nicholas Knize 84e4f91253 Add geo_point to FieldStats
This commit adds a new GeoPoint class to FieldStats for computing field stats over geo_point field types.
2017-01-18 14:37:03 -06:00
Boaz Leskes 1227044ddd Add a deprecation notice to shadow replicas (#22647)
Relates to #22024

On top of documentation, the PR adds deprecation loggers and deals with the resulting warning headers.

The yaml test is set to exclude versions up to 6.0. This is needed to make sure bwc tests pass until this is backported to 5.2.0. Once that's done, I will change the yaml test version limits.
2017-01-18 12:28:09 +01:00
Greg Marzouka e0f8d88d5c Include global query string parameters in the REST spec
Closes #11638
2017-01-17 07:35:14 -05:00
Lee Hinman 2db01b6127 Merge remote-tracking branch 'dakrone/disable-all-by-default' 2017-01-12 10:17:51 -07:00
Luca Cavanna 7674de9e1f Move human flag under always accepted query_string params (#22562)
There are some parameters that are accepted by each and every API we expose. Those (pretty, source, error_trace and filter_path) are not explicitly listed in the spec of every API, but rather whitelisted in the clients' test runners so that they are always accepted. The `human` flag has been treated up until now as a parameter that's accepted by only some stats and info APIs, but that doesn't reflect reality, as ES core treats it exactly like `pretty` (relevant especially now that we validate params and throw an exception when we find one that is not supported). Furthermore, the human flag has an effect on every API that outputs a date, time, percentage or byte size field. For instance, the tasks API outputs a date field although it doesn't have the human flag explicitly listed in its spec. There are other similar cases. This commit removes the human flag from the rest spec and makes it an always accepted query_string param.
2017-01-12 10:04:45 +01:00
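The validation that 7674de9e1f refers to might look like the following in a client test runner. This is a hedged Python sketch: `ALWAYS_ACCEPTED` and `validate_params` are hypothetical names, and only the five parameter names come from the commit message.

```python
# Parameters every API accepts without listing them in its own spec.
ALWAYS_ACCEPTED = {"pretty", "source", "error_trace", "filter_path", "human"}


def validate_params(spec_params, request_params):
    """Raise if the request uses a parameter that is neither in the API spec nor always accepted."""
    unsupported = sorted(set(request_params) - set(spec_params) - ALWAYS_ACCEPTED)
    if unsupported:
        raise ValueError(f"unsupported parameters: {unsupported}")


# A hypothetical API whose spec lists only `detailed`.
validate_params({"detailed"}, {"detailed": "true", "human": "true"})
print("accepted")
```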
Lee Hinman 7a18bb50fc Disable _all by default
This change disables the _all meta field by default.

Now that we have the "all-fields" method of query execution, we can save both
indexing time and disk space by disabling it.

_all can no longer be configured for indices created after 6.0.

Relates to #20925 and #21341
Resolves #19784
2017-01-11 16:47:13 -07:00
Jim Ferenczi 433c822d4f Promote longs to doubles when a terms agg mixes decimal and non-decimal numbers (#22449)
* Promote longs to doubles when a terms agg mixes decimal and non-decimal number

This change makes the terms aggregation work when the buckets coming from different indices are a mix of decimal and non-decimal numbers. In this case non-decimal numbers (longs) are promoted to decimal numbers (doubles), which can result in a loss of precision for big numbers.

Fixes #22232
2017-01-10 11:50:56 +01:00
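The precision loss mentioned in 433c822d4f comes from the fact that a 64-bit double cannot represent every integer above 2^53. The sketch below uses Python, whose `float` is an IEEE 754 double; `merge_bucket_keys` is a hypothetical helper, not the reduce logic in Elasticsearch.

```python
def merge_bucket_keys(*per_index_keys):
    """Merge bucket keys, promoting integers to floats if any key is a float."""
    all_keys = [key for keys in per_index_keys for key in keys]
    if any(isinstance(key, float) for key in all_keys):
        all_keys = [float(key) for key in all_keys]
    return sorted(set(all_keys))


# One index maps the field as a long, the other as a double.
print(merge_bucket_keys([1, 2], [1.5, 2.0]))

# A long above 2**53 no longer round-trips once promoted.
big = 2**53 + 1
print(int(float(big)) == big)
```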
Martijn van Groningen cb2333dacd percolator: remove deprecated percolate and mpercolate apis 2017-01-10 11:18:27 +01:00
Karel Minarik 4f4b76cd41 [TEST] Fixed the incorrect indentation for the `skip` clauses in the REST tests
This patch fixes the incorrect indentation in the REST tests, which causes tests in language runners (e.g. Ruby, Python) to fail, since the skip clause is parsed as an empty value. The Java YAML parser is more lenient about whitespace, so it doesn't catch this.
2017-01-08 14:21:02 +01:00
Nik Everett 12923ef896 Close and flush refresh listeners on shard close
Right now closing a shard looks like it strands refresh listeners,
causing tests like
`delete/50_refresh/refresh=wait_for waits until changes are visible in search`
to fail. Here is a build that fails:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+multi_cluster_search+multijob-darwin-compatibility/4/console

This attempts to fix the problem by implementing `Closeable` on
`RefreshListeners` and rejecting listeners when closed. More importantly
the act of closing the instance flushes all pending listeners
so we shouldn't have any stranded listeners on close.

Because it was needed for testing, this also adds the number of
pending listeners to the `CommonStats` object and all APIs to which
that flows: `_cat/nodes`, `_cat/indices`, `_cat/shards`, and
`_nodes/stats`.
2017-01-06 20:03:32 -05:00
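The close behaviour described in 12923ef896 can be modelled as below. This is a Python sketch of the idea only: the real `RefreshListeners` is a Java class, and the method names `add`, `pending_count` and `close` here, as well as the choice of exception, are assumptions.

```python
class RefreshListeners:
    """Toy model: listeners wait for a refresh; closing flushes them and rejects new ones."""

    def __init__(self):
        self._pending = []
        self._closed = False

    def add(self, listener):
        # Reject listeners once closed so none can be stranded.
        if self._closed:
            raise RuntimeError("can't wait for refresh on a closed shard")
        self._pending.append(listener)

    def pending_count(self):
        # The number that would be surfaced through the stats APIs.
        return len(self._pending)

    def close(self):
        self._closed = True
        # Flush: notify every pending listener exactly once.
        pending, self._pending = self._pending, []
        for listener in pending:
            listener()


fired = []
listeners = RefreshListeners()
listeners.add(lambda: fired.append("first"))
listeners.close()
print(fired, listeners.pending_count())
```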
Ali Beyad a487b90498 [TEST] fix explain API rest test that assumes there is only a single
node in the cluster (incorrect assumption)
2017-01-06 11:07:40 -05:00
Ali Beyad 2f510b38c3 [TEST] explain API rest test may have shard allocation throttled 2017-01-04 14:34:00 -05:00
Ali Beyad 85b754f0e0 [TEST] 5.x snapshot build is working again, so update the backwards
compatibility tests for the allocation explain API to include 5.2.0
2017-01-04 12:07:17 -05:00
Jim Ferenczi 1f35d2532b Fix BWC layer with field_stats and geo_point 2017-01-04 13:14:09 +01:00
Jim Ferenczi 360ce532eb Implement stats for geo_point and geo_shape field (#22391)
Currently `geo_point` and `geo_shape` fields are treated as `text` fields by the field stats API and we
try to extract the min/max values with MultiFields.getTerms.
This is ok in master because a `geo_point` field is always a Point field, but it can cause problems in 5.x (and 2.x) because legacy
`geo_point` fields are indexed as terms.
As a result the min and max are extracted and then printed in the FieldStats output using BytesRef.utf8ToString,
which can throw an IndexOutOfBoundsException since they are not valid UTF-8 strings.
This change ensures that we never try to extract min/max information from a `geo_point` field.
It does not add a new type for geo points in the field stats API, so we'll continue to use `text` for this kind of field.
This PR is targeted to master even though we could commit this change only to 5.x. I think it's cleaner to have it in master too before we make any decision on
https://github.com/elastic/elasticsearch/pull/21947.

Fixes #22384
2017-01-04 10:42:22 +01:00
Ali Beyad 91917d6e91 [TEST] mute backwards compatibility tests for explain API until 5.2
snapshot builds can be published again
2017-01-02 18:26:19 -05:00
Ali Beyad 20ab4be59f Cluster Explain API uses the allocation process to explain shard allocation decisions (#22182)
This PR completes the refactoring of the cluster allocation explain API and improves it in the following two high-level ways:

 1. The explain API now uses the same allocators that the AllocationService uses to make shard allocation decisions. Prior to this PR, the explain API would run the deciders against each node for the shard in question, but this was not executed on the same code path as the allocators, and many of the scenarios in shard allocation were not captured due to not executing through the same code paths as the allocators.

 2. The APIs have changed, both on the Java and JSON level, to accurately capture the decisions made by the system. The APIs also now report on shard moving and rebalancing decisions, whereas the previous API did not report decisions for moving shards which cannot remain on their current node or rebalancing shards to form a more balanced cluster.

Note: this change affects plugin developers who may have a custom implementation of the ShardsAllocator interface. The method weighShards has been removed and no longer has any utility. In order to support the new explain API, however, a custom implementation of ShardsAllocator must now implement ShardAllocationDecision decideShardAllocation(ShardRouting shard, RoutingAllocation allocation), which provides a decision and explanation for allocating a single shard. For implementations that do not support explaining a single shard allocation via the cluster allocation explain API, this method can simply throw an UnsupportedOperationException.
2017-01-02 12:28:32 -06:00
Jim Ferenczi 02d4cbfeea Fix bwc integ test that tries to perform a term aggs on a scaled_float. This is broken when a node with version prior to 5.2.0 is used with another node > 5.2.0. This is because scaled_float fields are considered as longs in version < 5.2.0. This is fixed in 5.2.0 where scaled_float are recognized as doubles. 2016-12-27 21:52:27 +01:00
Jim Ferenczi e7444f7d77 Fix scaled_float numeric type in aggregations (#22351)
`scaled_float` should be used as DOUBLE in aggregations but currently they are used as LONG.
This change fixes this issue and adds a simple it test for it.

Fixes #22350
2016-12-27 09:23:22 +01:00
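To see why the numeric type matters in e7444f7d77, recall that a `scaled_float` is stored as a long equal to the value multiplied by its scaling factor. The Python sketch below uses hypothetical helper names and a scaling factor of 100 to contrast the two readings of the same stored value.

```python
SCALING_FACTOR = 100  # assumed mapping setting for this example


def encode(value: float, scaling_factor: int) -> int:
    """What gets stored: the value scaled up and rounded to a long."""
    return round(value * scaling_factor)


def read_as_long(stored: int) -> int:
    # Treating the field as LONG exposes the raw scaled number.
    return stored


def read_as_double(stored: int, scaling_factor: int) -> float:
    # Treating the field as DOUBLE divides the scaling factor back out.
    return stored / scaling_factor


stored = encode(2.5, SCALING_FACTOR)
print(read_as_long(stored))
print(read_as_double(stored, SCALING_FACTOR))
```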
Ali Beyad 8261bd358a Synchronize snapshot deletions on the cluster state (#22313)
Before, snapshot/restore would synchronize all operations on the cluster
state except for deleting snapshots.  This meant that only one
snapshot/restore operation would be allowed in the cluster at any given
time, except for deletions - there could be two or more snapshot
deletions running at the same time, or a deletion could be running,
unbeknownst to the rest of the cluster, and thus a snapshot or restore
would be allowed at the same time as the snapshot deletion was still in
progress.  This could cause any number of synchronization issues,
including the situation where a snapshot that was deleted could reappear
in the index-N file, even though its data was no longer present in the
repository.

This commit introduces a new custom type to the cluster state to
represent deletions in progress.  Now, another deletion cannot start if
a deletion is currently in progress.  Similarly, a snapshot or restore
cannot be started if a deletion is currently in progress.  In each case,
if attempting to run another snapshot/restore operation while a deletion
is in progress, a ConcurrentSnapshotExecutionException will be thrown.
This is the same exception thrown if trying to snapshot while another
snapshot is in progress, or restore while a snapshot is in progress.

Closes #19957
2016-12-25 19:00:20 -05:00
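The synchronization rule in 8261bd358a, where deletions are tracked alongside snapshots and restores, amounts to a single in-progress slot. The sketch below is a simplified Python model; `SnapshotCoordinator` and its methods are hypothetical, and only the exception name comes from the commit message.

```python
class ConcurrentSnapshotExecutionException(Exception):
    pass


class SnapshotCoordinator:
    """Toy cluster state that allows one snapshot, restore or deletion at a time."""

    def __init__(self):
        self.in_progress = None

    def begin(self, operation: str) -> None:
        if self.in_progress is not None:
            raise ConcurrentSnapshotExecutionException(
                f"cannot start {operation}: {self.in_progress} is in progress"
            )
        self.in_progress = operation

    def finish(self) -> None:
        self.in_progress = None


state = SnapshotCoordinator()
state.begin("delete")
try:
    state.begin("snapshot")
except ConcurrentSnapshotExecutionException as error:
    print(error)
state.finish()
state.begin("snapshot")
```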
Adrien Grand 70594a66c7 Only run the unmapped+missing tests on 5.2+. 2016-12-23 09:38:20 +01:00
Adrien Grand e39942fc02 `value_type` is useful regardless of scripting. (#22160)
Today we only expose `value_type` in scriptable aggregations, however it is
also useful with unmapped fields. I suspect we never noticed because
`value_type` was not documented (fixed) and most aggregations are scriptable.

Closes #20163
2016-12-22 14:35:12 +01:00
Boaz Leskes 6249f1092f x_refresh.yaml tests should use unique index names and doc ids to ease debugging
This is to make it easier to grep the node logs
2016-12-21 10:25:33 +01:00
Boaz Leskes b857b316b6 Add BWC layer to seq no infra and enable BWC tests (#22185)
Sequence BWC logic consists of two elements:

1) Wire level BWC using stream versions.
2) A change to the global checkpoint maintenance semantics.

For the sequence number infra to work with mixed-version clusters, we have to consider the situation where the primary is on an old node and replicas are on new ones (i.e., the replicas will receive operations without seq#) and also the reverse (i.e., the primary sends operations to a replica but the replica can't process the seq# and respond with a local checkpoint). A new primary with an old replica is rare because we do not allow a replica to recover from a new primary. However, it can occur if the old primary failed and a new replica was promoted, or during primary relocation where the source primary is treated as a replica until the master starts the target.

1) Old Primary & New Replica - this case is easy as it is taken care of by the wire level BWC. All incoming requests will have their seq# set to `UNASSIGNED_SEQ_NO`, which doesn't confuse the local checkpoint logic (keeping it at `NO_OPS_PERFORMED`)
2) New Primary & Old replica - this one is trickier as the global checkpoint service currently takes all in sync replicas into consideration for the global checkpoint calculation. In order to deal with old replicas, we change the semantics to say all *new node* in sync replicas. That means the replicas on old nodes don't count for the global checkpointing. In this state the seq# infra is not fully operational (you can't search on it, because copies may miss it) but it is maintained on shards that can support it. The old replicas will have to go through a file based recovery at some point and will get the seq# information at that point. There is still an edge case where a new primary fails and an old replica takes over. I'll discuss this one with @ywelsch as I prefer to avoid it completely.

This PR also re-enables the BWC tests which were disabled. As such it had to fix any BWC issue that had crept in. Most notably an issue with the removal of the `timestamp` field in #21670.

The commit also includes a fix for the default value of the seq number field in replicated write requests (it was 0 but should be -2), which surfaced some other minor bugs that are fixed as well.

Last, I added some debugging tools like saner node names and forcing replication requests to implement a `toString`
2016-12-19 13:08:24 +01:00
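The changed global checkpoint semantics in b857b316b6 can be illustrated with a small Python model. It assumes the usual rule that the global checkpoint is the minimum local checkpoint over the copies that count; the tuple layout, the function name and the fallback for an empty set are assumptions, not the actual service code.

```python
UNASSIGNED_SEQ_NO = -2   # seq# carried by operations coming from an old primary
NO_OPS_PERFORMED = -1    # local checkpoint before any sequenced operation


def global_checkpoint(copies):
    """copies: iterable of (local_checkpoint, in_sync, on_new_node) tuples."""
    # Only in-sync copies on new nodes take part; old-node replicas are ignored.
    eligible = [
        checkpoint
        for checkpoint, in_sync, on_new_node in copies
        if in_sync and on_new_node
    ]
    # Assumed fallback when no copy qualifies.
    return min(eligible) if eligible else NO_OPS_PERFORMED


copies = [
    (12, True, True),                 # new primary
    (9, True, True),                  # new in-sync replica
    (NO_OPS_PERFORMED, True, False),  # old replica, cannot report seq#
]
print(global_checkpoint(copies))
```

Without the new-node filter, the old replica would pin the global checkpoint at `NO_OPS_PERFORMED` for as long as it stayed in sync.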
Areek Zillur d44de0cecc Remove deprecated _suggest endpoint (#22203)
In #20305, _suggest endpoint was deprecated
in favour of using _search endpoint. This
commit removes the dedicated _suggest endpoint
entirely from master.
2016-12-16 12:06:02 -05:00
Masaru Hasegawa a0185c83a7 Merge pull request #21393 from masaruh/alias_boost
Resolve index names in indices_boost
2016-12-16 15:07:51 +09:00
Jason Tedor 43f71015a8 Add skip for include segment file sizes REST tests
This commit adds a skip for the include segment file sizes REST tests on
nodes less than or equal to version 5.1.1 as the stats APIs did not
correctly account for this parameter prior to version 5.1.2.

Relates #21879
2016-12-15 21:08:51 -05:00
Aaron Spiegel 80d3d790ae Fix handling of segment file sizes in stats API
This commit addresses an issue in the stats APIs where
include_segment_file_sizes was not being consumed leading to requests
containing this parameter being rejected.

Relates #21879
2016-12-15 07:29:11 -05:00
Areek Zillur cdd5fbe3a1 Deprecate _suggest endpoint in favour of _search (#20305)
* Replace _suggest endpoint to _search in docs

In 5.0, the _suggest endpoint is just sugar for _search
with suggestions specified. Users should move away from
using the _suggest endpoint, as it is marked as deprecated in 5.x and
will be removed in 6.0

* update docs to use _search endpoint instead of _suggest

* Add deprecation logging to RestSuggestAction

* Use search endpoint instead of suggest endpoint in rest tests
2016-12-14 21:49:53 -05:00
Simon Willnauer b7bcb5bb3a [TEST] Skip term / int partitioning tests in bwc tests pre 5.2.0 2016-12-13 22:20:44 +01:00
markharwood 4c6d17a176 Added tests for toXContent and fromXContent for IncludeExclude class.
New REST test revealed an issue with inconsistent hashing in partitioned
term tests which is also fixed in this change.

Closes #22102
2016-12-13 15:23:09 +00:00
Artur Nowosielski 726f5dccc0 Rewrite filter queries in FiltersAggregationBuilder (#22076)
Queries must be rewritten before the query phase executes; otherwise non-executable queries like the `wrapper` or `terms` query will fail, and queries that require resources like the script service can't access these services unless rewritten.

Relates to #21303
2016-12-11 14:37:12 +01:00
Masaru Hasegawa 3df2a086d4 Resolve index names in indices_boost
This change allows specifying alias/wildcard expressions in indices_boost.
It also adds another format for specifying indices_boost, which accepts an array of index name and boost pairs.
If an index is included in multiple aliases/wildcard expressions, the first match will be used.
With the new format, the old format is marked as deprecated.

Closes #4756
2016-12-11 21:41:49 +09:00
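The first-match rule from 3df2a086d4 is easiest to see with an ordered list of pairs. The Python sketch below stands in for index name resolution with a plain alias dictionary and `fnmatch` wildcards; `resolve_boosts` and its inputs are hypothetical.

```python
from fnmatch import fnmatchcase


def resolve_boosts(boosts, indices, aliases):
    """Map each concrete index to the boost of the first expression that matches it."""
    resolved = {}
    for expression, boost in boosts:
        if expression in aliases:
            targets = aliases[expression]
        else:
            targets = [name for name in indices if fnmatchcase(name, expression)]
        for index in targets:
            # setdefault keeps the boost from the earliest matching expression.
            resolved.setdefault(index, boost)
    return resolved


indices = ["logs-1", "logs-2", "metrics-1"]
aliases = {"recent": ["logs-2"]}
boosts = [("recent", 2.0), ("logs-*", 1.5)]
print(resolve_boosts(boosts, indices, aliases))
```

An ordered array makes "first match" well defined, which a JSON object with unordered keys would not.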
Nik Everett 7a74a41a0c Fix test for changed message
Message is only changed in 5.2.0 so we shouldn't assert on it
if we're running with any nodes less than that version.
2016-12-10 10:35:14 -05:00
Nik Everett ddade1b5ac Improve the error message if task and node isn't found (#22062)
Improves the error message returned when looking up a task that
belongs to a node that is no longer part of the cluster. The new
error message tells the user that the node isn't part of the cluster.
This is useful because if you start a task and the node goes down
there isn't a record of the task at all. This hints to the user that
the task might have died with the node.

Relates to #22027
2016-12-09 15:50:46 -05:00
Yannick Welsch fca4f92fee Fix BWC condition on REST test
Adds a missing skip section to a REST test that was forgotten in #21998
2016-12-09 19:05:00 +01:00
Yannick Welsch db0660a7ea Reject external versioning and explicit version numbers on create (#21998)
Fixes an issue where indexing requests with operation type "create" auto-convert external versioning to internal versioning and silently ignore the version number instead of failing with an error message.
2016-12-09 14:21:22 +01:00
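The stricter check from db0660a7ea could be expressed as below. This is a Python sketch with a hypothetical `validate_create` function and error messages; it only mirrors the rule that a "create" operation must fail rather than silently drop external versioning or an explicit version.

```python
def validate_create(version_type: str = "internal", version=None) -> None:
    """Reject create requests that carry external versioning or an explicit version."""
    if version_type != "internal":
        raise ValueError(
            f"create operations only support internal versioning, got [{version_type}]"
        )
    if version is not None:
        raise ValueError("create operations do not support explicit versions")


validate_create()  # a plain create passes
try:
    validate_create(version_type="external", version=5)
except ValueError as error:
    print(error)
```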
Jason Tedor 4aae017891 Skip IP range query REST test prior to 5.1.2
This commit adds a skip for the IP range query REST test on versions
prior to 5.1.2 due to an exclusive bug on the top end of the range.
2016-12-08 16:40:39 -05:00
Adrien Grand 8fe4bc1b74 Fix REST test for ip range aggregations.
Relates to #22018
2016-12-08 18:00:56 +01:00
Jason Tedor 7861a9dc3e Add skip for missing types REST test
This commit adds a skip for the missing types REST test. There was a bug
for an unclosed XContent object on versions prior to 5.0.3 which
is going to lead to different responses on versions prior to 5.0.3 and
versions on or after 5.0.3.
2016-12-01 22:10:47 -05:00
Ryan Ernst a6ad89bee0 Mappings: Fix get mapping when no indexes exist to not fail in response generation (#21924)
When there are no indexes, get mapping has a series of special cases.
Two of those expect the response object already started, and the other
two respond with an exception. Those two cases (types passed in but no
indexes and vice versa) would fail in their error response generation
because it did not expect an object to already be started in the json
generator. This change moves the object start to where it is needed for
the empty responses.

closes #21916
2016-12-01 16:57:12 -08:00
Yannick Welsch 590a6372ad Disable disk watermarks on REST tests (#21803)
REST tests use the default OOTB low/high disk watermarks of 85%/90%, which can make some tests fail if run on a machine with a fuller disk. This commit changes the watermarks in the same way as in IntegTestCase so that they're essentially ignored.
2016-11-25 19:52:52 +01:00
Luca Cavanna 720b165350 Search shards to print out aliases array together with alias filter (#21784)
With #21738 we added an indices section to the search shards API that returns the concrete indices hit by the request and, where applicable, the corresponding alias filter.

The Java API returns the AliasFilter object, which holds the filter itself and an array of aliases that pointed to the index in the original request. The REST layer doesn't print out the aliases array though. This commit adds the aliases array as well and tests for this.
2016-11-25 10:58:06 +01:00