7441 Commits

Author SHA1 Message Date
Nik Everett
8abd4101eb Add tests for reducing top hits
Also adds many `equals` and `hashCode` implementations and moves
the failure printing in `MatchAssertion` into a common spot and
exposes it over `assertEqualsWithErrorMessageFromXContent` which
does an object equality test but then uses `toXContent` to print
the differences.

Relates to #22278
2017-01-27 12:32:17 -05:00
Nik Everett
1baa884ab7 Fix TophitsAggregatorTests
It needs a DirectoryReader so it has to be careful.

Closes #22818
2017-01-26 14:08:30 -05:00
Nik Everett
f8c28711be Merge some equivalent interfaces (#22816)
Remove `FromXContent` and use `CheckedFunction` instead.
Remove `FromXContentWithContext` and use `ContentParser` instead.
2017-01-26 13:15:29 -05:00
Simon Willnauer
a475323aa1 Invalidate cached query results if query timed out (#22807)
Today we cache query results even if the query timed out. This is obviously
problematic since results are not complete. Yet, the decision if a query timed
out or not happens too late to simply not cache the result since if we'd just throw
an exception all currently waiting requests with the same request / cache key would
fail with the same exception without the option to access the result or to re-execute.
Instead, this change will allow the request to enter the cache but invalidates it immediately.
Concurrent requests might not get executed and instead return the timed-out result, which is not strictly
correct but very likely what they would get anyway, since identical requests will likely time out as well. As a side effect,
we won't hammer the node with concurrent slow searches but rather execute only one of them
and return the briefly cached result.
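
A minimal sketch of the idea described above, not the actual request cache code: the result is allowed into the cache so that concurrent callers waiting on the same key can share it, and the entry is removed right afterwards if the query timed out. The names `TimedOutAwareCache`, `Result`, and `getOrCompute` are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration; none of these names come from the Elasticsearch code base.
final class TimedOutAwareCache {

    static final class Result {
        final String payload;
        final boolean timedOut;

        Result(String payload, boolean timedOut) {
            this.payload = payload;
            this.timedOut = timedOut;
        }
    }

    private final ConcurrentHashMap<String, Result> cache = new ConcurrentHashMap<>();

    Result getOrCompute(String key, Function<String, Result> loader) {
        // Concurrent callers with the same key wait here and share one computation.
        Result result = cache.computeIfAbsent(key, loader);
        if (result.timedOut) {
            // Invalidate immediately so that later requests re-execute the query.
            cache.remove(key, result);
        }
        return result;
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

Letting the entry in and then dropping it avoids failing every waiting caller with an exception, which is the trade-off the message describes.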

Closes #22789
2017-01-26 16:45:29 +01:00
Tanguy Leroux
1fa2734566 [TEST] Fix ElasticsearchExceptionTests
Some test failures can happen in ElasticsearchExceptionTests; this commit fixes them.
2017-01-26 16:33:56 +01:00
Tanguy Leroux
be96278c95 Add parsing method for ElasticsearchException.generateThrowableXContent() (#22783)
The output of the ElasticsearchException.generateThrowableXContent() method can be parsed back by the ElasticsearchException.fromXContent() method.

This commit adds unit tests in the style of the to-and-from-xcontent tests we already have for other parsing methods. It also relaxes the strict parsing of ElasticsearchException.fromXContent() so that it does not throw an exception when custom metadata and headers are parsed, as long as they are either strings or arrays of strings. Every other type is ignored at parsing time.
2017-01-26 15:17:07 +01:00
Simon Willnauer
f128b7a7fe Improve connection closing in RemoteClusterConnection (#22804)
Some tests verify that all connections have been closed, but due to the
async / concurrent nature of `RemoteClusterConnection` there are situations
where we notify listeners that trigger tests to finish before we have actually
closed all connections. The race window is very small and has no impact on
code correctness. This commit documents and improves the way we close
connections to ensure tests won't fail with false positives.

Closes #22803
2017-01-26 13:58:26 +01:00
Simon Willnauer
281250dec9 Remove DFS_QUERY_AND_FETCH as a search type (#22787)
This commit removes the search type `dfs_query_and_fetch` without a
replacement. We haven't allowed this type via REST since 2.x
but still kept it around for no particular reason. There were no users
complaining about its unavailability. It should now be removed from the
codebase. `query_and_fetch` is still used internally to save a round trip
if there is only one shard, but it can't be used via the REST interface.
2017-01-26 09:14:44 +01:00
Tim Brooks
719e75bb3f Add repository-url module and move URLRepository (#22752)
This is related to #22116. URLRepository requires SocketPermission
connect. This commit introduces a new module called "repository-url"
where URLRepository will reside. With the new module, permissions can
be removed from core.
2017-01-25 17:09:25 -06:00
Nik Everett
d704a880e7 Add tests for top_hits aggregation (#22754)
Add unit tests for `TopHitsAggregator` and convert some snippets in
docs for `top_hits` aggregation to `// CONSOLE`.

Relates to #22278
Relates to #18160
2017-01-25 16:15:50 -05:00
Martijn van Groningen
f6ed39aa08 Merge branch 'pr/22772' 2017-01-25 17:15:24 +01:00
Martijn van Groningen
81e40e3139 [TEST] Added this for 93a28b0acfff5332e97616f53b5d52fe8b933306 submitted via #22772 2017-01-25 17:08:17 +01:00
Jason Tedor
cb822b4670 Fix typo in comment in OsProbe.java
This commit fixes a silly typo in a comment relating to cgroups in
OsProbe.java.
2017-01-25 06:30:51 -05:00
Colin Goodheart-Smithe
a9135cd636 RangeQuery WITHIN case now normalises query (#22431)
Prior to this change, when the range query was rewritten to an unbounded range (`[* TO *]`) it maintained the timezone and format for the query. This means that queries with different timezones and formats which are rewritten to unbounded range queries actually end up as different entries in the search request cache.

This is inefficient and unnecessary so this change nulls the timezone and format in the rewritten query so that regardless of the timezone or format the rewritten query will be the same.

Although this does not fix #22412 (since it deals with the WITHIN case rather than the INTERSECTS case), it is born from the same arguments.
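
A minimal sketch of why dropping the timezone and format matters for cache keys, assuming a simplified query object whose `equals` and `hashCode` decide cache identity. `RangeQuerySketch` and `rewriteWithin` are hypothetical names, not the real query builder API.

```java
import java.util.Objects;

// Hypothetical stand-in for a range query used as part of a cache key.
final class RangeQuerySketch {
    final String from;      // null means unbounded
    final String to;        // null means unbounded
    final String timeZone;
    final String format;

    RangeQuerySketch(String from, String to, String timeZone, String format) {
        this.from = from;
        this.to = to;
        this.timeZone = timeZone;
        this.format = format;
    }

    // Rewrite for the WITHIN case: the range becomes unbounded, and the
    // timezone and format are nulled so that equivalent rewrites compare equal.
    RangeQuerySketch rewriteWithin() {
        return new RangeQuerySketch(null, null, null, null);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RangeQuerySketch)) {
            return false;
        }
        RangeQuerySketch other = (RangeQuerySketch) o;
        return Objects.equals(from, other.from)
            && Objects.equals(to, other.to)
            && Objects.equals(timeZone, other.timeZone)
            && Objects.equals(format, other.format);
    }

    @Override
    public int hashCode() {
        return Objects.hash(from, to, timeZone, format);
    }
}
```

Two queries that differ only in timezone or format are unequal before the rewrite but share one cache entry after it.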
2017-01-25 10:37:15 +00:00
Boaz Leskes
ed94f75a15 Remove EngineClosedException
All usage has been removed in https://github.com/elastic/elasticsearch/pull/22631, which is back ported to 5.x. This means 6.x will never get it on the wire and we can remove it
2017-01-25 11:00:50 +01:00
javanna
5103b76610 update version checks in ElasticsearchException serialization code
5.3.0 is the first version that contains the split from headers to metadata, so the check is updated to reflect that. The check previously referenced a later version so that the change could be committed to master first and backported only afterwards. Otherwise master tests would have failed until the change was backported.
2017-01-24 20:40:17 +01:00
Lee Hinman
304296ea6a Fix BulkItemResponse serialization for 6.x <-> 5.3.x
Previously the behavior where the `OpType` byte was serialized was only in
master, but it was recently backported to 5.x, so the serialization version
checks need to be updated as well.
2017-01-24 12:04:11 -07:00
srgclr
93a28b0acf skip parentid if child document is an orphan
#22770
2017-01-24 17:49:53 +00:00
Luca Cavanna
47c0e13a3b Stop returning "es." internal exception headers as http response headers (#22703)
move "es." internal headers to separate metadata set in ElasticsearchException and stop returning them as response headers

Closes #17593

* [TEST] remove ESExceptionTests, move its methods to ElasticsearchExceptionTests or ExceptionSerializationTests
2017-01-24 16:12:45 +01:00
Jason Tedor
bcffc6fa49 Add hack for Docker cgroups
Docker cgroups are mounted in the wrong place (i.e., inconsistently with
/proc/self/cgroup). This commit adds an undocumented hack for working
around this, for now.

Relates #22757
2017-01-24 06:36:03 -05:00
Christoph Büscher
59aefe5a38 Include human readable responses in response parsing tests (#22717)
As a follow-up to #22649, this changes the recent tests for parsing parts of search
responses to randomly set the humanReadable() flag of the XContentBuilder that
is used to render the responses. This should help to test that we can parse back
those classes if the user specifies `?human=true` in the request URL.
2017-01-24 11:17:58 +01:00
Jim Ferenczi
b0c2a5da30 Remove unused field in CollapseBuilder 2017-01-24 09:26:23 +01:00
Jim Ferenczi
868b12b548 Add BWC tests for field collapsing
Field collapsing is supported from version 5.3
2017-01-24 08:34:16 +01:00
Nik Everett
2e399e5505 Rename constant
It deserves a new name after the cleanup in #22749
2017-01-23 16:30:42 -05:00
javanna
8065531236 [TEST] add test to verify that the SMILE format works within the _bulk api 2017-01-23 19:40:24 +01:00
Nik Everett
ee264c6957 Fix parsing for max_determinized_states (#22749)
There was a typo in the `ParseField` declaration. I know
we want to port these parsers to `ObjectParser` eventually
but I don't have the energy for that today and want to get
this fixed.

Closes #22722
2017-01-23 11:57:43 -05:00
Jim Ferenczi
e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapses" search hits with the same field value.
The distributed aspect is resolved using the two passes that the regular search uses. The first pass "collapses" the top hits, then the coordinating node merges/collapses the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category"
   }
}
```
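
The two-pass collapsing described above can be sketched roughly as follows. This is an in-memory illustration under simplifying assumptions (hits are plain objects, shards are lists), and `CollapseSketch`, `Hit`, `collapse`, and `search` are hypothetical names rather than the collector added by this commit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of collapsing top hits by a field value.
final class CollapseSketch {

    static final class Hit {
        final String id;
        final String key;    // value of the collapse field
        final double score;

        Hit(String id, String key, double score) {
            this.id = id;
            this.key = key;
            this.score = score;
        }
    }

    // Keep only the best-scoring hit per collapse key, up to `size` hits.
    static List<Hit> collapse(List<Hit> hits, int size) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        Set<String> seenKeys = new HashSet<>();
        List<Hit> top = new ArrayList<>();
        for (Hit hit : sorted) {
            if (top.size() == size) {
                break;
            }
            if (seenKeys.add(hit.key)) {
                top.add(hit);
            }
        }
        return top;
    }

    // First pass: collapse on each shard. Second pass: the coordinating
    // node merges the per-shard results and collapses them again.
    static List<Hit> search(List<List<Hit>> shards, int size) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> shard : shards) {
            merged.addAll(collapse(shard, size));
        }
        return collapse(merged, size);
    }
}
```

The second collapse is needed because the same field value can win on several shards.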

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
         "size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Tanguy Leroux
11164b394b Add unit tests for ValueCountAggregator and InternalValueCount (#22741)
Adds unit tests for the value count aggregator.

Relates #22278
2017-01-23 16:24:55 +01:00
Simon Willnauer
27b5c2ad54 Pass forceExecution flag to transport interceptor (#22739)
To effectively allow a plugin to intercept a transport handler it needs
to know if the handler must be executed even if there is a rejection on the
thread pool in the case the wrapper forks a thread to execute the actual handler.
2017-01-23 11:04:27 +01:00
Alexander Reelsen
6159ca28ae Version: Add missing releases from 2.x in Version.java (#22594) 2017-01-23 09:53:21 +01:00
Tim Brooks
a4ac29c005 Add single static instance of SpecialPermission (#22726)
This commit adds a SpecialPermission constant and uses that constant
as opposed to introducing new instances everywhere.

Additionally, this commit introduces a single static method to check that
the current code has permission. This avoids all the duplicated access
blocks that exist currently.
2017-01-21 12:03:52 -06:00
Simon Willnauer
3ad6d6ebcc Simplify InternalEngine#innerIndex (#22721)
Today `InternalEngine#innerIndex` is a pretty big method (> 150 SLoC). This
commit merges `#index` and `#innerIndex` and splits the result up into smaller, self-contained
methods.
2017-01-21 08:51:35 +01:00
Jim Ferenczi
8028578305 Upgrade to Lucene 6.4.0 (#22724)
* Upgrade to Lucene 6.4.0

`ValueSource`s are now converted to `DoubleValueSource`s using the Lucene adapter made for the migration to the new API in 6.4.0.
2017-01-21 04:48:01 +01:00
Igor Motov
cfb415de7b Fix broken TaskInfo.toString()
Related to #22387
2017-01-20 20:57:42 -05:00
Tim Brooks
d86f97c428 Add CheckedSupplier and CheckedRunnable to core (#22725)
Introduce CheckedSupplier and CheckedRunnable functional interfaces
into core. These offer a checked version of the Supplier and Runnable
interfaces for use with lambda apis.
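
A minimal sketch of what such checked functional interfaces can look like and how they pair with lambdas. The shapes below are assumptions for illustration, not copied from the commit, and the helpers `getUnchecked`, `runUnchecked`, and `readGreeting` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration; the interface shapes are assumed, not taken from core.
final class CheckedInterfacesSketch {

    // Like java.util.function.Supplier, but get() may throw a checked exception.
    @FunctionalInterface
    interface CheckedSupplier<R, E extends Exception> {
        R get() throws E;
    }

    // Like java.lang.Runnable, but run() may throw a checked exception.
    @FunctionalInterface
    interface CheckedRunnable<E extends Exception> {
        void run() throws E;
    }

    // Adapts a supplier that may throw IOException for unchecked call sites.
    static <R> R getUnchecked(CheckedSupplier<R, IOException> supplier) {
        try {
            return supplier.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same adaptation for a runnable that may throw IOException.
    static void runUnchecked(CheckedRunnable<IOException> runnable) {
        try {
            runnable.run();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Example of a method that declares a checked exception.
    static String readGreeting() throws IOException {
        return "hello";
    }
}
```

With these in place, a method reference such as `CheckedInterfacesSketch::readGreeting` can be passed directly even though it declares `IOException`, which a plain `Supplier` would reject.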
2017-01-20 19:17:44 -06:00
Ali Beyad
3bf06d1440 Fixes retrieval of the latest snapshot index blob (#22700)
This commit ensures that the index.latest blob is first examined to
determine the latest index-N blob id, before attempting to list all
index-N blobs and picking the blob with the highest N.

It also fixes the MockRepository#move so that tests are able to handle
non-atomic moves.  This is done by adding a special setting to the
MockRepository that requires the test to specify if it can handle
non-atomic moves.  If so, then the MockRepository#move operation will be
non-atomic to allow testing against such repositories.
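
The lookup order from the first paragraph of this message can be sketched as follows. A `Map` stands in for the repository's blob container, and the encoding of `index.latest` as an 8-byte big-endian long is an assumption made for this sketch. `LatestIndexBlobSketch` and `latestGeneration` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.Map;

// Hypothetical illustration; the Map stands in for a blob container.
final class LatestIndexBlobSketch {
    static final String LATEST_BLOB = "index.latest";
    static final String INDEX_PREFIX = "index-";

    // Returns the latest index-N generation, or -1 if none can be found.
    static long latestGeneration(Map<String, byte[]> blobs) {
        // Step 1: trust index.latest when it is present and well-formed.
        byte[] latest = blobs.get(LATEST_BLOB);
        if (latest != null && latest.length == Long.BYTES) {
            return ByteBuffer.wrap(latest).getLong();
        }
        // Step 2: fall back to listing index-N blobs and taking the highest N.
        long max = -1;
        for (String name : blobs.keySet()) {
            if (name.startsWith(INDEX_PREFIX)) {
                try {
                    max = Math.max(max, Long.parseLong(name.substring(INDEX_PREFIX.length())));
                } catch (NumberFormatException e) {
                    // Not an index-N blob; ignore it.
                }
            }
        }
        return max;
    }
}
```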
2017-01-20 17:00:46 -06:00
Jim Ferenczi
4ec4bad908 Fix script score function that combines _score and weight (#22713)
The weight factor function does not check if the delegate score function needs to access the score of the query.
This results in a _score equal to 0 for all score functions that set a weight.
This change modifies the WeightFactorFunction#needsScore to delegate the call to its underlying score function.
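
A minimal sketch of the delegation described above, using a simplified, hypothetical `ScoreFunction` interface rather than the real function score classes. The point is only that the weight wrapper reports the score requirement of the function it wraps.

```java
// Hypothetical illustration; these types are simplified stand-ins.
final class WeightDelegationSketch {

    interface ScoreFunction {
        double score(double queryScore);

        // Whether this function needs the query's _score to be computed.
        boolean needsScores();
    }

    // Stand-in for a script score function that reads _score.
    static final class ScriptLikeFunction implements ScoreFunction {
        @Override
        public double score(double queryScore) {
            return queryScore * 2;
        }

        @Override
        public boolean needsScores() {
            return true;
        }
    }

    static final class WeightFactor implements ScoreFunction {
        private final double weight;
        private final ScoreFunction delegate;

        WeightFactor(double weight, ScoreFunction delegate) {
            this.weight = weight;
            this.delegate = delegate;
        }

        @Override
        public double score(double queryScore) {
            return weight * delegate.score(queryScore);
        }

        @Override
        public boolean needsScores() {
            // Delegate instead of answering on the wrapper's behalf, so the
            // wrapped function still receives a real _score.
            return delegate.needsScores();
        }
    }
}
```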

Fix #21483
2017-01-20 19:50:57 +01:00
Nik Everett
6265ef1c1b Deguice rest handlers (#22575)
There are presently 7 ctor args used in any rest handlers:
* `Settings`: Every handler uses it to initialize a logger and
  some other strange things.
* `RestController`: Every handler registers itself with it.
* `ClusterSettings`: Used by `RestClusterGetSettingsAction` to
  render the default values for cluster settings.
* `IndexScopedSettings`: Used by `RestGetSettingsAction` to get
  the default values for index settings.
* `SettingsFilter`: Used by a few handlers to filter returned
  settings so we don't expose stuff like passwords.
* `IndexNameExpressionResolver`: Used by `_cat/indices` to
  filter the list of indices.
* `Supplier<DiscoveryNodes>`: Used to enrich the response
  by handlers that list tasks.

We probably want to reduce these arguments over time but
switching construction away from guice gives us tighter
control over the list of available arguments.

These parameters are passed to plugins using
`ActionPlugin#initRestHandlers` which is expected to build and
return the handlers immediately. This felt simpler than
returning a reference to the ctors given all the different
possible args.

Breaks java plugins by moving rest handlers off of guice.
2017-01-20 11:48:51 -05:00
Simon Willnauer
824beea89d Fix handling of document failure exception in InternalEngine (#22718)
Today we try to be smart and make a generic decision if an exception should
be treated as a document failure, but in some cases concurrency in the index writer
makes this decision very difficult since we don't have a consistent state in the case
another thread is currently failing the IndexWriter/InternalEngine due to a tragic event.

This change simplifies the exception handling and makes specific decisions about document failures
rather than using a generic heuristic. This prevents exceptions from being treated as document failures
when they should have failed the engine but backed out of failing it since some other thread had
already taken over the failure procedure but hadn't finished yet.
2017-01-20 16:55:00 +01:00
markharwood
f01784205f New AdjacencyMatrix aggregation
Similar to the Filters aggregation but only supports "keyed" filter buckets and automatically "ANDs" pairs of filters to produce a form of adjacency matrix.
The intersection of buckets "A" and "B" is named "A&B" (the choice of separator is configurable). Empty intersection buckets are removed from the final results.
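
A rough in-memory sketch of the bucketing described above, where each named filter is represented by the set of document IDs it matches. Including the single-filter buckets alongside the pairwise intersections is an assumption of this sketch, and `AdjacencySketch` and `buckets` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration; sets of doc IDs stand in for filter matches.
final class AdjacencySketch {

    // Returns bucket name -> doc count, skipping empty buckets.
    static Map<String, Integer> buckets(Map<String, Set<Integer>> filters, String separator) {
        // Sort the names so that pair keys are always built in the same order.
        List<String> names = new ArrayList<>(new TreeMap<>(filters).keySet());
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            Set<Integer> first = filters.get(names.get(i));
            if (!first.isEmpty()) {
                result.put(names.get(i), first.size());
            }
            for (int j = i + 1; j < names.size(); j++) {
                // "AND" the pair of filters by intersecting their matches.
                Set<Integer> both = new HashSet<>(first);
                both.retainAll(filters.get(names.get(j)));
                if (!both.isEmpty()) {
                    result.put(names.get(i) + separator + names.get(j), both.size());
                }
            }
        }
        return result;
    }
}
```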

Closes #22169
2017-01-20 15:49:31 +00:00
Tim Brooks
bc16162d21 Remove accept SocketPermissions from core (#22622)
This is related to #22116. Core no longer needs SocketPermission 
accept. This permission is relegated to the transport-netty4 module 
and (for tests) to the mocksocket jar.
2017-01-20 09:27:45 -06:00
Tanguy Leroux
239ed0c912 Add unit tests for DateHistogramAggregator (#22714)
Adds unit tests for the date histogram aggregator.

Relates #22278
2017-01-20 14:18:30 +01:00
Christoph Büscher
54105f3ddd Add parsing from xContent to ShardSearchFailure (#22699)
In preparation for being able to parse SearchResponse from its rest
representation, this adds fromXContent to ShardSearchFailure.
2017-01-20 12:49:54 +01:00
Yannick Welsch
1f0e0a2170 Close InputStream when receiving cluster state in PublishClusterStateAction (#22711)
Not closing the InputStream will leak native memory as the DeflateCompressor/Inflater won't be closed.
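
A minimal sketch of the pattern, using the JDK's `DeflaterOutputStream` and `InflaterInputStream` as stand-ins for the compressor used when publishing cluster state. The try-with-resources block closes the stream, which lets it release the native resources of the `Inflater` it created. `CloseStreamSketch` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration built only on JDK classes.
final class CloseStreamSketch {

    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the stream finishes the compressed data and frees the Deflater.
        try (OutputStream out = new DeflaterOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    static String decompress(byte[] compressed) throws IOException {
        // try-with-resources guarantees close(), even if reading fails.
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```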
2017-01-20 12:26:07 +01:00
Boaz Leskes
5d806bf93e Index creation and setting update may not return deprecation logging (#22702)
Those services validate their settings before submitting an AckedClusterStateUpdateTask to the cluster state service. An acked cluster state update may be completed by a networking thread when the last ack is received. As such it needs special care to make sure that thread context headers are handled correctly.
2017-01-20 10:14:13 +01:00
David Pilato
fc4dc5ef21 Fix comment 2017-01-20 10:13:13 +01:00
David Pilato
ad5b8def26 Merge branch 'pr/delete-from-xcontent' 2017-01-20 09:16:34 +01:00
Lee Hinman
eb8a41ef94 Add missing serialization BWC for disk usage estimates
Relates to #22081
2017-01-19 15:37:06 -07:00
Lee Hinman
4eb32e9d86 Expose disk usage estimates in nodes stats
This exposes the least and most used disk usage estimates within the "fs" nodes
stats output:

```json
GET /_nodes/stats/fs?pretty&human
{
  "nodes" : {
    "34fPVU0uQ_-wWitDzDXX_g" : {
      "fs" : {
        "timestamp" : 1481238723550,
        "total" : {
          "total" : "396.1gb",
          "total_in_bytes" : 425343254528,
          "free" : "140.6gb",
          "free_in_bytes" : 151068725248,
          "available" : "120.5gb",
          "available_in_bytes" : 129438912512
        },
        "least_usage_estimate" : {
          "path" : "/home/hinmanm/es/elasticsearch/distribution/build/cluster/run node0/elasticsearch-6.0.0-alpha1-SNAPSHOT/data/nodes/0",
          "total" : "396.1gb",
          "total_in_bytes" : 425343254528,
          "available" : "120.5gb",
          "available_in_bytes" : 129438633984,
          "used_disk_percent" : 69.56842912023208
        },
        "most_usage_estimate" : {
          "path" : "/home/hinmanm/es/elasticsearch/distribution/build/cluster/run node0/elasticsearch-6.0.0-alpha1-SNAPSHOT/data/nodes/0",
          "total" : "396.1gb",
          "total_in_bytes" : 425343254528,
          "available" : "120.5gb",
          "available_in_bytes" : 129438633984,
          "used_disk_percent" : 69.56842912023208
        },
        "data" : [{...}],
        "io_stats" : {...}
      }
    }
  }
}
```

Resolves #8686
Resolves #22081
2017-01-19 13:56:52 -07:00
Jason Tedor
9781b88a38 Fix deprecation logging for lenient booleans
This commit fixes an issue with deprecation logging for lenient
booleans. The underlying issue is that adding deprecation logging for
lenient booleans added a static deprecation logger to the Settings
class. However, the Settings class is initialized very early and in CLI
tools can be initialized before logging is initialized. This leads to
status logger error messages. Additionally, the deprecation logging for
a lot of the settings does not provide useful context (for example, in
the token filter factories, the deprecation logging only produces the
name of the setting, but gives no context which token filter factory it
comes from). This commit addresses both of these issues by changing the
call sites to push a deprecation logger through to the lenient boolean
parsing.

Relates #22696
2017-01-19 12:30:33 -05:00