7900 Commits

Author SHA1 Message Date
Simon Willnauer
489f38918d Fix incremental reduce randomization in base tests cases
We can and should randomly reduce down to a single result before
we passing the aggs to the final reduce. This commit changes the logic
to do that and ensures we don't trip the assertions the previous imple tripped.

Relates to #23253
2017-02-21 17:13:46 +01:00
Nik Everett
74c33823ab Comment 2017-02-21 10:43:29 -05:00
Nik Everett
0dee1f85e6 Remove closeAgg 2017-02-21 10:31:42 -05:00
Tanguy Leroux
3a0fc526bb UpdateRequest implements ToXContent (#23289)
This commit changes UpdateRequest so that it implements the ToXContentObject interface.
2017-02-21 15:20:15 +01:00
Jim Ferenczi
cc865cbc96 Add unit tests for stats and extended stats aggregations (#23287)
Add tests for InternalStats, InternalExtendedStats and StatsAggregator/ExtendedStatsAggregator

Relates #22278
2017-02-21 15:14:54 +01:00
Simon Willnauer
f933f80902 First step towards incremental reduction of query responses (#23253)
Today all query results are buffered up until we received responses of
all shards. This can hold on to a significant amount of memory if the number of
shards is large. This commit adds a first step towards incrementally reducing
aggregations results if a, per search request, configurable amount of responses
are received. If enough query results have been received and buffered all so-far
received aggregation responses will be reduced and released to be GCed.
2017-02-21 13:02:48 +01:00
Tanguy Leroux
39ed76c58b Add parsing method to bulk response (#23234)
This commit adds the `fromXContent()` parsing method to BulkResponse.
2017-02-21 10:49:40 +01:00
Tanguy Leroux
c88eb00b83 Add javadoc for DocWriteResponse.Builders (#23267) 2017-02-21 10:19:01 +01:00
Martin Scholz
24bf18b610 Upgrade HDRHistogram to 2.1.9 (#23254) 2017-02-21 08:50:26 +01:00
Martin Scholz
3e292d5245 Migrate TermsQuery to TermInSetQuery (#23229) 2017-02-21 08:49:43 +01:00
Jim Ferenczi
1ff5b318be Fix for IpRangeAggregatorTests#testRanges
Handle null from/to ranges.

Closes #23272
2017-02-20 21:16:14 +01:00
Jason Tedor
4c2bd5feab Introduce sequence-number-aware translog
Today, the relationship between Lucene and the translog is rather
simple: every document not in Lucene is guaranteed to be in the
translog. We need a stronger guarantee from the translog though, namely
that it can replay all operations after a certain sequence number. For
this to be possible, the translog has to made sequence-number aware. As
a first step, we introduce the min and max sequence numbers into the
translog so that each generation knows the possible range of operations
contained in the generation. This will enable future work to keep around
all generations containing operations after a certain sequence number
(e.g., the global checkpoint).

Relates #22822
2017-02-20 15:05:24 -05:00
Jason Tedor
15f5810774 Mark IP range aggregator test as awaits fix
This test reliably fails with the seed 4AC319F8A6B0329B.
2017-02-20 14:42:16 -05:00
Christoph Büscher
ea7deace5d Adding fromXContent to Suggest and Suggestion class (#23226)
A follow up to #23202, this adds parsing from xContent and tests to the four Suggestion implementations
and the top level suggest element to be used later when parsing the entire SearchResponse.
2017-02-20 15:45:10 +01:00
Christoph Büscher
ea9d51114c Tests: Add unit test for InternalChildren (#23261)
Relates to #22278
2017-02-20 14:02:56 +01:00
Jim Ferenczi
76d6b872dd Add unit tests for GeoBoundsAggregator/InternalGeoBounds (#23259)
* Add unit tests for GeoBoundsAggregator/InternalGeoBounds

Relates #22278
2017-02-20 12:04:30 +01:00
Jim Ferenczi
69b1463f7c Add unit tests for BinaryRangeAggregator/InternalBinaryRange (#23255)
* Add unit tests for BinaryRangeAggregator/InternalBinaryRange

Relates #22278
2017-02-20 11:55:48 +01:00
Tanguy Leroux
872412f645 [Tests] Cleans up DocWriteResponse parsing tests (#23233)
This commit cleans up some parsing tests added from the High Level Rest Client: IndexResponseTests, DeleteResponseTests, UpdateResponseTests, BulkItemResponseTests.

These tests are now more uniform with the others test-from-to-XContent tests we have, they now shuffle the XContent fields before parsing, the asserting method for parsed objects does not used a Map<String, Object> anymore, and buggy equals/hasCode methods in ShardInfo and ShardInfo.Failure have been removed.
2017-02-20 09:45:33 +01:00
Nik Everett
d9c37ce195 Adds unit test for sampler aggregation
Relates to #22278
2017-02-17 16:16:04 -05:00
Nik Everett
d1de9574ea Checkstyle: Fix link lengths in sampler aggregation 2017-02-17 15:03:57 -05:00
Jay Modi
b234644035 Enforce Content-Type requirement on the rest layer and remove deprecated methods (#23146)
This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport
requests and their usages.

While doing this, it turns out that there are many places where *Entity classes are used from the apache http client
libraries and many of these usages did not specify the content type. The methods that do not specify a content type
explicitly have been added to forbidden apis to prevent more of these from entering our code base.

Relates #19388
2017-02-17 14:45:41 -05:00
Adrien Grand
3bd1d46fc7 Add unit tests for terms aggregation objects. (#23149)
Relates #22278
2017-02-17 18:01:40 +01:00
javanna
578853f264 Remove stale comment about setting routing before parent
Order does not matter anymore since we merged #15371
2017-02-17 17:10:53 +01:00
Yuhao Bi
576e698613 Minor fix of _cat output (#23211) (#23213)
One line was missing a trailing "\n"
2017-02-17 10:46:20 +01:00
Jason Tedor
00a8b8799f Fix control group pattern
The file /proc/self/cgroup lists the control groups to which the process
belongs. This file is a colon separated list of three fields:
 1. a hierarchy ID number
 2. a comma-separated list of hierarchies
 3. the pathname of the control group in the hierarchy

The regex pattern for this contains a bug for the second field. It
allows one or two entries in the comma-separated list, but not
more. This commit fixes the pattern to allow one or more entires in the
comma-separated list.

Relates #23219
2017-02-16 15:31:18 -05:00
Christoph Büscher
268d15ec4c Adding fromXContent to Suggestion.Entry and subclasses (#23202)
This adds parsing from xContent to Suggestion.Entry and its subclasses for Terms-, Phrase-
and CompletionSuggestion.Entry.
2017-02-16 17:59:55 +01:00
markharwood
1cd1ff6010 Test fix - faulty assumptions about when exceptions are thrown in relation to number of failing shards. (#23205)
Search exceptions are thrown only when all shards report failure. Fix changes assertion logic to reflect this.

Closes #23203
2017-02-16 13:48:17 +00:00
Jason Tedor
0a5917d182 Fix get HEAD requests
Get HEAD requests incorrectly return a content-length header of 0. This
commit addresses this by removing the special handling for get HEAD
requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23186
2017-02-15 13:07:29 -05:00
Christoph Büscher
458ca09e70 Fix checkstyle issue with modifier order in DocWriteResponse 2017-02-15 17:53:39 +01:00
Tanguy Leroux
e8d669f50c Add parsing methods to BulkItemResponse (#22859)
This commit adds a parsing method to the BulkItemResponse class. In order to do that, the way DocWriteResponses are parsed has to be changed: ConstructingObjectParser/ObjectParser is removed in favor of a simpler and more readable way to parse these objects.

DocWriteResponse now provides the parseInnerToXContent() method that can be used by subclasses (IndexResponse, UpdateReponse and DeleteResponse) to parse the current token/field and potentially update a DocWriteResponseBuilder. The DocWriteResponseBuilder is a simple POJO used
to contain parsed values. It can be passed around from one parsing method to another parsing method. For example, this is what is done in IndexResponse: a IndexResponseBuilder is created in IndexResponse.fromXContent(), it get passed to IndexResponse.parseXContentFields() that
parses fields specific to IndexResponse (like "created") and updates the context, delegating to DocWriteResponse.parseInnerToXContent() the parsing of any other field. Once all XContent is parsed, IndexResponse.fromXContent() uses the method
IndexResponseBuilder.build() to create the new instance of IndexResponse.

This behavior allow to reuse parsing code among the class hierarchy while keeping the current behavior. It also allows other objects like BulkItemResponse to reuse the same parsing code to parse DocWriteResponses.

Finally, IndexResponseTests, UpdateResponseTests and DeleteResponseTests have been updated to introduce some random shuffling of fields before the XContent is parsed in order to ensure that the parsing code does not rely on field order.
2017-02-15 17:33:10 +01:00
Christoph Büscher
b963144254 Add xcontent parsing to completion suggestion option (#23071)
This adds parsing from xContent to the CompletionSuggestion.Entry.Option.
The completion suggestion option also inlines the xContent rendering of the
containes SearchHit, so in order to reuse the SearchHit parser this also changes
the way SearchHit is parsed from using a loop-based parser to using a
ConstructingObjectParser that creates an intermediate map representation and
then later uses this output to create either a single SearchHit or use it with
additional fields defined in the parser for the completion suggestion option.
2017-02-15 16:52:17 +01:00
Jim Ferenczi
3c26754f87 Add BWC index for new released version 5.2.1 2017-02-15 11:14:37 +01:00
Jim Ferenczi
f1aaa71a7f Create version constants for next bug fix version v5.2.2 2017-02-15 11:13:09 +01:00
Ryan Ernst
048c87d8a5 Improve setting deprecation message (#23156)
This change modifies the deprecation log message emitted when a setting
is found which is deprecated. The new message indicates docs for the
deprecated settings can be found in the breaking changes docs for the
next major version.

closes #22849
2017-02-14 21:33:13 -08:00
Jason Tedor
6ac1cb660b Cleanup RestGetIndicesAction.java
This commit is just a code cleanup of RestGetIndicesAction.java. For
example, we remove an unnecessary class, remove some unnecessary local
variables, and simplify some code flow.

Relates #23129
2017-02-14 16:51:27 -05:00
Jason Tedor
673754b1d5 Fix get source HEAD requests
Get source HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
source HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23151
2017-02-14 16:37:22 -05:00
Martijn van Groningen
cab43707dc [percolator] Removed old 2.x bwc logic. 2017-02-14 22:17:17 +01:00
Areek Zillur
e178dc5493 Add request version asserting during replica operation (#23167) 2017-02-14 15:40:55 -05:00
Simon Willnauer
a7a3729596 Add ExpandSearchPhase as a successor for the FetchSearchPhase (#23165)
Now that we have more flexible search phases we should move the rather
hacky integration of the collapse feature as a real search phase that can
be tested and used by itself. This commit adds a new ExpandSearchPhase
including a unittest for the phase. It's integrated into the fetch phase
as an optional successor.
2017-02-14 17:14:17 +01:00
Adrien Grand
8d6a41f671 Nested queries should avoid adding unnecessary filters when possible. (#23079)
When nested objects are present in the mappings, many queries get deoptimized
due to the need to exclude documents that are not in the right space. For
instance, a filter is applied to all queries that prevents them from matching
non-root documents (`+*:* -_type:__*`). Moreover, a filter is applied to all
child queries of `nested` queries in order to make sure that the child query
only matches child documents (`_type:__nested_path`), which is required by
`ToParentBlockJoinQuery` (the Lucene query behing Elasticsearch's `nested`
queries).

These additional filters slow down `nested` queries. In 1.7-, the cost was
somehow amortized by the fact that we cached filters very aggressively. However,
this has proven to be a significant source of slow downs since 2.0 for users
of `nested` mappings and queries, see #20797.

This change makes the filtering a bit smarter. For instance if the query is a
`match_all` query, then we need to exclude nested docs. However, if the query
is `foo: bar` then it may only match root documents since `foo` is a top-level
field, so no additional filtering is required.

Another improvement is to use a `FILTER` clause on all types rather than a
`MUST_NOT` clause on all nested paths when possible since `FILTER` clauses
are more efficient.

Here are some examples of queries and how they get rewritten:

```
"match_all": {}
```

This query gets rewritten to `ConstantScore(+*:* -_type:__*)` on master and
`ConstantScore(_type:AutomatonQuery {\norg.apache.lucene.util.automaton.Automaton@4371da44})`
with this change. The automaton is the complement of `_type:__*` so it matches
the same documents, but is faster since it is now a positive clause. Simplistic
performance testing on a 10M index where each root document has 5 nested
documents on average gave a latency of 420ms on master and 90ms with this change
applied.

```
"term": {
  "foo": {
    "value": "0"
  }
}
```

This query is rewritten to `+foo:0 #(ConstantScore(+*:* -_type:__*))^0.0` on
master and `foo:0` with this change: we do not need to filter nested docs out
since the query cannot match nested docs. While doing performance testing in
the same conditions as above, response times went from 250ms to 50ms.

```
"nested": {
  "path": "nested",
  "query": {
    "term": {
      "nested.foo": {
        "value": "0"
      }
    }
  }
}
```

This query is rewritten to
`+ToParentBlockJoinQuery (+nested.foo:0 #_type:__nested) #(ConstantScore(+*:* -_type:__*))^0.0`
on master and `ToParentBlockJoinQuery (nested.foo:0)` with this change. The
top-level filter (`-_type:__*`) could be removed since `nested` queries only
match documents of the parent space, as well as the child filter
(`#_type:__nested`) since the child query may only match nested docs since the
`nested` object has both `include_in_parent` and `include_in_root` set to
`false`. While doing performance testing in the same conditions as above,
response times went from 850ms to 270ms.
2017-02-14 16:05:19 +01:00
Adrien Grand
a969dad43e Integrate IndexOrDocValuesQuery. (#23119)
This gives Lucene the choice to use index/point-based queries or
doc-values-based queries depending on which one is more efficient. This commit
integrates this feature for:
 - long/integer/short/byte/double/float/half_float/scaled_float ranges,
 - date ranges,
 - geo bounding box queries,
 - geo distance queries.
2017-02-14 15:57:12 +01:00
Jun Ohtani
12bbe6e660 Merge pull request #23161 from johtani/support_keyword_to_analyze_api
[Analyze]Support Keyword type in Analyze API
2017-02-14 23:22:32 +09:00
Christoph Büscher
abc8cd6c5f Remove unused sourceAsBytes field in SearchHit 2017-02-14 14:08:38 +01:00
Simon Willnauer
aef0665ddb Detach SearchPhases from AbstractSearchAsyncAction (#23118)
Today all search phases are inner classes of AbstractSearchAsyncAction or one of it's
subclasses. This makes unit testing of these classes practically impossible. This commit
Extracts `DfsQueryPhase` and `FetchSearchPhase` or of the code that composes the actual
query execution types and moves most of the fan-out and collect code into an `InitialSearchPhase`
class that can be used to build initial search phases (phases that retry on shards). This will
make modification to these classes simpler and allows to easily compose or add new search phases
down the road if additional roundtrips are required.
2017-02-14 12:34:25 +01:00
Jun Ohtani
34ebb88650 [Analyze]Support Keyword type in Analyze API
Add comment and clarify
2017-02-14 17:56:36 +09:00
Jun Ohtani
4d823d69f4 [Analyze]Support Keyword type in Analyze API 2017-02-14 16:41:16 +09:00
Jason Tedor
5343b87502 Handle bad HTTP requests
When Netty decodes a bad HTTP request, it marks the decoder result on
the HTTP request as a failure, and reroutes the request to GET
/bad-request. This either leads to puzzling responses when a bad request
is sent to Elasticsearch (if an index named "bad-request" does not exist
then it produces an index not found exception and otherwise responds
with the index settings for the index named "bad-request"). This commit
addresses this by inspecting the decoder result on the HTTP request and
dispatching the request to a bad request handler preserving the initial
cause of the bad request and providing an error message to the client.

Relates #23153
2017-02-13 17:39:25 -05:00
Jay Modi
61e383813d Make the version of the remote node accessible on a transport channel (#23019)
This commit adds a new method to the TransportChannel that provides access to the version of the
remote node that the response is being sent on and that the request came from. This is helpful
for serialization of data attached as headers.
2017-02-13 15:15:57 -05:00
Lee Hinman
b42d47770c Fix total disk bytes returning negative value (#23093)
* Fix total disk bytes returning negative value

This adds a workaround for JDK-8162520 -
https://bugs.openjdk.java.net/browse/JDK-8162520

Some filesystems can be so large that they return a negative value for their
free/used/available disk bytes due to being larger than `Long.MAX_VALUE`.

This adds protection for our `FsProbe` implementation and adds a test that it
does the right thing.
2017-02-13 11:20:15 -07:00
jaymode
d8d03f45c2
Fix communication with 5.3.0 nodes
This commit fixes communication with 5.3.0 nodes to send XContentType to these nodes since #22691 was backported to the
5.3 branch.
2017-02-13 13:15:51 -05:00