Commit Graph

881 Commits

Author SHA1 Message Date
Christoph Büscher c541a0c60e Add skip versions for rank_eval yaml tests 2017-12-14 22:18:37 +01:00
Christoph Büscher bb14b8f7c5 Merge branch 'rankeval'
This commit adds a new module that provides an endpoint that can be used to
evaluate search ranking results.

Closes #19195
2017-12-14 16:45:03 +01:00
Nik Everett cc1a301b5e Packaging test: add guard for too many files
If you assert that a pattern of files exists but it matches more then
one file the "assert this file exists" code failed with a misleading
error message. This tests if the patter resolved to multiple files and
prints a better error message if it did.
2017-12-12 11:10:49 -05:00
Christoph Büscher 97b25f3b0c Merge branch 'master' into rankeval 2017-12-11 15:19:16 +01:00
Boaz Leskes 64dd21af11 remove await fix from FullClusterRestartIT.testRecovery 2017-12-08 16:45:45 +01:00
Christoph Büscher 52cb6c8ef2 Merge branch 'master' into rankeval 2017-12-07 14:22:46 +01:00
Boaz Leskes 83c8daa5e8 FullClusterRestartIT.testRecovery - add more info on failure 2017-12-05 13:00:42 +01:00
Lee Hinman d3721c48c3 [TEST] Add AwaitsFix for FullClusterRestartIT.testRecovery
See: #27649
2017-12-04 15:42:43 -07:00
Boaz Leskes 0b50b313d2 RecoveryIT.testRecoveryWithConcurrentIndexing should check for 110 docs in an upgraded cluster
Closes #27650
2017-12-04 18:06:02 +01:00
Boaz Leskes f58a3d0b96 testRelocationWithConcurrentIndexing: wait for green (on relevan index) and shard initialization to settle down before starting relocation 2017-12-04 13:18:42 +01:00
Boaz Leskes 1a976ea7a4 Cherry pick tests and seqNo recovery hardning from #27580 2017-12-04 13:15:40 +01:00
Christoph Büscher bbec33d35c Merge branch 'master' into rankeval 2017-12-04 12:57:19 +01:00
Jason Tedor cd67f6a8d7
Enable GC logs by default
For too long we have been groping around in the dark when faced with GC
issues because we rarely have GC logs at our disposal. This commit
enables GC logging by default out of the box.

Relates #27610
2017-12-03 08:33:21 -05:00
Christoph Büscher 35688f6441 Merge branch 'master' into rankeval 2017-11-29 15:24:06 +01:00
Jason Tedor 0519fa223c
Ensure logging is configured for CLI commands
Any CLI commands that depend on core Elasticsearch might touch classes
(directly or indirectly) that depends on logging. If they do this and
logging is not configured, Log4j will dump status error messages to the
console. As such, we need to ensure that any such CLI command configures
logging (with a trivial configuration that dumps log messages to the
console). Previously we did this in the base CLI command but with the
refactoring of this class out of core Elasticsearch, we no longer
configure logging there (since we did not want this class to depend on
settings and logging). However, this meant for some CLI commands (like
the plugin CLI) we were no longer configuring logging. This commit adds
base classes between the low-level command and multi-command classes
that ensure that logging is configured. Any CLI command that depends on
core Elasticsearch should use this infrastructure to ensure logging is
configured. There is one exception to this: Elasticsearch itself because
it takes reponsibility into its own hands for configuring logging from
Elasticsearch settings and log4j2.properties. We preserve this special
status.

Relates #27523
2017-11-25 11:40:08 -05:00
Christoph Büscher 5661b1c3df Merge branch 'master' into rankeval 2017-11-24 16:25:05 +01:00
David Turner 89ba8996c6 Consolidate version numbering semantics (#27397)
Fixes to the build system, particularly around BWC testing, and to make future
version bumps less painful.
2017-11-23 20:21:53 +00:00
Simon Willnauer fadbe0de08
Automatically prepare indices for splitting (#27451)
Today we require users to prepare their indices for split operations.
Yet, we can do this automatically when an index is created which would
make the split feature a much more appealing option since it doesn't have
any 3rd party prerequisites anymore.

This change automatically sets the number of routinng shards such that
an index is guaranteed to be able to split once into twice as many shards.
The number of routing shards is scaled towards the default shard limit per index
such that indices with a smaller amount of shards can be split more often than
larger ones. For instance an index with 1 or 2 shards can be split 10x
(until it approaches 1024 shards) while an index created with 128 shards can only
be split 3x by a factor of 2. Please note this is just a default value and users
can still prepare their indices with `index.number_of_routing_shards` for custom
splitting.

NOTE: this change has an impact on the document distribution since we are changing
the hash space. Documents are still uniformly distributed across all shards but since
we are artificually changing the number of buckets in the consistent hashign space
document might be hashed into different shards compared to previous versions.

This is a 7.0 only change.
2017-11-23 09:48:54 +01:00
javanna 3eeccb7791 Update version check for CCS optional remote clusters
also fixed the remote.info yaml test to clean up the registered remote cluster once the test is completed.

Relates to #27182
2017-11-21 16:52:45 +01:00
Christoph Büscher d979ccace9 Merge branch 'master' into rankeval 2017-11-21 14:11:02 +01:00
Christoph Büscher 3348d2317f Reworking javadocs, minor changes in some implementation classes 2017-11-21 14:09:04 +01:00
Christoph Büscher 0a6c6ac360 Remove usage of types in rank_eval endpoint 2017-11-21 14:07:41 +01:00
Luca Cavanna 29450de7b5
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.

This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.

Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.

The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:

"_clusters" : {
    "total" : 3,
    "successful" : 2,
    "skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.

The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 11:41:47 +01:00
Michael Basnight cb3e8f4763
Move the CLI into its own subproject (#27114)
Projects the depend on the CLI currently depend on core. This should not
always be the case. The EnvironmentAwareCommand will remain in :core,
but the rest of the CLI components have been moved into their own
subproject of :core, :core:cli.
2017-11-18 21:42:57 -06:00
Nhat Nguyen db688e1a17
Uses TransportMasterNodeAction to update shard snapshot status (#27165)
Currently, we are using a plain TransportRequestHandler to post snapshot
status messages to the master. However, it doesn't have a robust retry
mechanism as TransportMasterNodeAction. This change migrates from
TransportRequestHandler to TransportMasterNodeAction for the new
versions and keeps the current implementation for the old versions.

Closes #27151
2017-11-17 11:54:44 -05:00
Tanguy Leroux 0b5899c647 [Test] Change Elasticsearch startup timeout to 120s in packaging tests
When the vagrant box is very very slow, the elasticsearch service can
take more than 60 sec to start. This commit changes the timeout to 120.

closes #27372
2017-11-15 11:58:47 +01:00
Clinton Gormley 1caa5c8e32 Rest test fixes (#27354)
* REST: Rename ingest.processor.grok to ingest.processor_grok
* REST: Rename remote.info to cluster.remote_info
* REST: Fixed bad YAML comments
* REST: Force dummy scripts to be strings, not numbers
* REST: Fix bad YAML in search/110_field_collapsing.yml
* REST: Adjust percentile tests to work with Perl number handling
2017-11-14 11:14:14 +01:00
Tanguy Leroux 91a23de55e [Test] Fix POI version in packaging tests
POI version has not been updated in packaging tests in #25003.

Closes #27340
2017-11-13 14:20:10 +01:00
Martijn van Groningen 4f43fe70cb
test: Sort hits by _id instead of _doc and
cleanup tests by removing unneeded parameter and settings.
2017-11-10 12:11:51 +01:00
Martijn van Groningen b4048b4e7f
Use CoveringQuery to select percolate candidate matches and
extract all clauses from a conjunction query.

When clauses from a conjunction are extracted the number of clauses is
also stored in an internal doc values field (minimum_should_match field).
This field is used by the CoveringQuery and allows the percolator to
reduce the number of false positives when selecting candidate matches and
in certain cases be absolutely sure that a conjunction candidate match
will match and then skip MemoryIndex validation. This can greatly improve
performance.

Before this change only a single clause was extracted from a conjunction
query. The percolator tried to extract the clauses that was rarest in order
(based on term length) to attempt less candidate queries to be selected
in the first place. However this still method there is still a very high
chance that candidate query matches are false positives.

This change also removes the influencing query extraction added via #26081
as this is no longer needed because now all conjunction clauses are extracted.

https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extraction

Closes #26307
2017-11-10 07:44:42 +01:00
Yannick Welsch e04e5ab037 Increase logging on qa:mixed-cluster tests
Hopefully helps to figure out why the nodes have trouble starting up.
2017-11-09 15:18:53 +01:00
Jason Tedor d5451b2037
Die with dignity while merging
If an out of memory error is thrown while merging, today we quietly
rewrap it into a merge exception and the out of memory error is
lost. Instead, we need to rethrow out of memory errors, and in fact any
fatal error here, and let those go uncaught so that the node is torn
down. This commit causes this to be the case.

Relates #27265
2017-11-06 17:55:11 -05:00
Jason Tedor 766d29e7cf
Correctly encode warning headers
The warnings headers have a fairly limited set of valid characters
(cf. quoted-text in RFC 7230). While we have assertions that we adhere
to this set of valid characters ensuring that our warning messages do
not violate the specificaion, we were neglecting the possibility that
arbitrary user input would trickle into these warning headers. Thus,
missing here was tests for these situations and encoding of characters
that appear outside the set of valid characters. This commit addresses
this by encoding any characters in a deprecation message that are not
from the set of valid characters.

Relates #27269
2017-11-06 13:20:30 -05:00
David Roberts 749c3ec716
Remove the single argument Environment constructor (#27235)
Only tests should use the single argument Environment constructor.  To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.

Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns.  This Environment
is also available via dependency injection.
2017-11-04 13:25:09 +00:00
Martijn van Groningen 9e67cca987
build: Fix setting the incorrect bwc version in mixed cluster qa module
Prior to this change if the `bwcTest` task is run then it would create
task for each version, but each task in reality would use wireCompatVersions - 1
ES version. So we were not actually testing against 5.6.x versions in the
6.x and 6.0 branches.
2017-11-03 14:18:27 +01:00
Jason Tedor 8b4a92fbb7 Adjust assertions for sequence numbers BWC tests
This commit adjusts the assertions for the sequence number BWC tests to
account for the fact that sometimes these tests are run in
mixed-clusters with 5.6 nodes (that do not understand sequence numbers),
and sometimes these tests are run in mixed-cluster with 6.0+ nodes (that
all understood sequence numbers).

Relates #27251
2017-11-03 08:58:05 -04:00
Jason Tedor 77f87732ef Adjust .DS_Store test assertions on Windows
Windows handles trying to read a file that does not exist because a
component of the path is not a directory differently than other OS
handle this situation. This commit adjusts these assertions for Windows.
2017-10-25 22:36:53 -04:00
Jason Tedor 6722b9c4a2 Ignore .DS_Store files on macOS
Finder creates these files if you browse a directory there. These files
are really annoying, but it's an incredible pain for users that these
files are created unbeknownst to them, and then they get in the way of
Elasticsearch starting. This commit adds leniency on macOS only to skip
these files.

Relates #27108
2017-10-25 11:25:29 -04:00
Simon Willnauer 8dda827ff4 Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000)
Today all these API calls have a sideeffect of making documents visible
to search requests. While this is sometimes desired it's an unnecessary sideeffect
and now that we have an internal (engine-private) index reader (#26972) we artificially
add a refresh call for bwc. This change removes this sideeffect in 7.0.
2017-10-16 10:16:35 +02:00
Anton Pozhidaev cee9640c20 Update by Query is modified to accept short `script` parameter. (#26841)
Update by Query is modified to accept short `script` parameter.

Closes issue #24898
2017-10-11 21:57:46 +00:00
kel 2e36f19051 Add support for parsing inline script (#23824) (#26846)
* Add support for parsing inline script (#23824)

* Fix test
2017-10-11 09:15:37 -07:00
Martijn van Groningen 19dc629e6d
Test query builder bwc against previous supported versions instead of just the current version.
Relates to #25456
2017-10-09 13:22:01 +02:00
Yannick Welsch a4436195f8 Set minimum_master_nodes on rolling-upgrade test (#26911)
The rolling-upgrade test was only writing the "minimum_master_nodes" setting to the configuration file of the old nodes, but not the upgraded ones.

Also changes the value of "minimum_master_nodes" from "number_of_nodes" to "(number_of_nodes / 2) + 1".
2017-10-09 10:45:03 +02:00
Simon Willnauer cdd7c1e6c2 Return List instead of an array from settings (#26903)
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
2017-10-09 09:52:08 +02:00
Nhat bf4c3642b2 remove _primary and _replica shard preferences (#26791)
The shard preference _primary, _replica and its variants were useful
for the asynchronous replication. However, with the current impl, they
are no longer useful and should be removed.

Closes #26335
2017-10-08 11:03:06 -04:00
Boaz Leskes c342cdeab5 Setup debug logging for qa.full-cluster-restart 2017-10-07 23:37:09 +02:00
Boaz Leskes 2d409a912f full-cluster-restart tests: prevent shards from going inactive
FullClusterRestartIT.testRecovery relies on the translogs not being flushed
2017-10-05 10:08:10 +02:00
Boaz Leskes 2a04118e88 Promote common rest test utility methods to ESRestTestCase
We have duplicates in some classes and I was about to create one more.
2017-10-05 10:08:10 +02:00
Luca Cavanna 9b9cb81c41 Fix serialization errors when cross cluster search goes to a single shard (#26881)
The single shard optimization that we have in our search api changes the type of response returned by the query transport action name based on the shard search request. if the request goes to one shard, we will do query and fetch at the same time, hence the response will be different. The proxying layer used in cross cluster search was not aware of this distinction, which causes serialization issues every time a cross cluster search request goes to a single shard and goes through a gateway node which has to forward the shard request to a data node. The coordinating node would then expect a QueryFetchSearchResult while the gateway would return a QuerySearchResult.

Closes #26833
2017-10-04 22:39:14 +02:00
Simon Willnauer d1533e2397 Remove Settings#getAsMap() (#26845)
Since `#getAsMap` exposes internal representation we are trying to remove it
step by step. This commit is cleaning up some xcontent writing as well as
usage in tests
2017-10-04 01:21:38 -06:00