Commit Graph

27920 Commits

Author SHA1 Message Date
Jim Ferenczi 68deda6d03 FastVectorHighlighter should not cache the field query globally (#25197)
This commit removes the global caching of the field query and replaces it with
a caching per field. Each field can use a different `highlight_query` and the rewriting of
some queries (prefix, automaton, ...) depends on the targeted field so the query used for highlighting
must be unique per field.
There might be a small performance penalty when highlighting multiple fields since the query needs to be rewritten
once per highlighted field with this change.

Fixes #25171
2017-06-15 00:33:01 +02:00
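As a rough illustration of the per-field caching described in 68deda6d03 above, the sketch below keeps one rewritten query per field rather than a single global one. The class name, `toy_rewrite`, and the string-based "queries" are hypothetical stand-ins, not the actual FastVectorHighlighter code.

```python
from typing import Callable, Dict


class PerFieldQueryCache:
    """Caches one rewritten query per field (hypothetical sketch)."""

    def __init__(self, rewrite: Callable[[str, str], str]) -> None:
        self._rewrite = rewrite
        self._cache: Dict[str, str] = {}
        self.rewrite_calls = 0  # counts how often the rewrite actually ran

    def get(self, field: str, highlight_query: str) -> str:
        # Rewrite once per field; later lookups for that field reuse the result.
        if field not in self._cache:
            self.rewrite_calls += 1
            self._cache[field] = self._rewrite(field, highlight_query)
        return self._cache[field]


def toy_rewrite(field: str, query: str) -> str:
    # Stand-in for a field-dependent rewrite (prefix, automaton, ...).
    return f"{field}:{query}"
```

Highlighting two fields triggers two rewrites, which is the small performance penalty the commit message mentions.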
Lee Hinman 4a30e23365 Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes (#25223)
* Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes

This was a BWC layer where we explicitly set the `search_type` to
"query_and_fetch" when a single node is queried on pre-5.3 nodes. Since 6.0 no
longer needs to be compatible with 5.3 nodes, this can be removed.

* Fix indentation

* Remove unused QUERY_FETCH_ACTION_NAME constant
2017-06-14 15:42:29 -06:00
Zachary Tong 52719b2118 Add more missing AggregationBuilder getters (#25198)
* Add more missing AggregationBuilder getters

- getMetadata for all aggs
- various getters on TermsAggBuilder (without "get" prefix to maintain convention)
- Also makes InternalSum's ctor public, to follow suit of other metrics (min/max/avg/etc)
2017-06-14 14:31:01 -04:00
Nik Everett ce11b894b4 Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests (#25204)
Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests. That way they are easier to read.
2017-06-14 13:03:59 -04:00
Lee Hinman aa3134c093 Refactor TransportShardBulkAction.executeUpdateRequest and add tests
This splits `executeUpdateRequest` into separate parts and adds some unit tests
for the behavior in it. The actual behavior has not been changed.
2017-06-14 09:27:58 -06:00
Adrien Grand cadd31b3a8 Make sure range queries are correctly profiled. (#25108)
We introduced a new API for ranges in order to be able to decide whether points
or doc values would be more appropriate to execute a query, but since
`ProfileWeight` does not implement this API, the optimization is disabled when
profiling is enabled.
2017-06-14 16:31:16 +02:00
Jay Modi ed76b9a518 Test: allow setting socket timeout for rest client (#25221)
In #25201, a setting was added to allow setting the retry timeout for the rest client under the
impression that this would allow requests to go longer than 30s. However, there is also a socket
timeout that needs to be set to greater than 30s, which this change adds a setting for.
2017-06-14 08:21:56 -06:00
Boaz Leskes a0fcfc732d Migration docs for #25080 (#25218) 2017-06-14 14:06:53 +02:00
John Murphy c652b586c4 Remove `discovery.type` BWC layer from the EC2/Azure/GCE plugins #25080
Those plugins don't replace the discovery logic but rather only provide a custom unicast host provider for their respective platforms. In 5.1 we introduced the `discovery.zen.hosts_provider` setting to better reflect this. This PR removes the BWC code in those plugins, as it is no longer needed.

Fixes #24543
2017-06-14 13:52:48 +02:00
David Roberts a5658c0fea When stopping via systemd only kill the JVM, not its control group (#25195)
This prevents possible race conditions between the Elasticsearch JVM and
plugin native controller processes that can cause the Elasticsearch shutdown
to hang.  The problem can happen when the JVM and the controller process
receive a SIGTERM at almost the same time.

(There's an assumption here that Elasticsearch will continue to use other
mechanisms to kill native controller processes.)
2017-06-14 09:23:41 +01:00
Martijn van Groningen e333955557 Remove PrefixAnalyzer, because it is no longer used. 2017-06-14 08:59:10 +02:00
Ryan Ernst 9ec1fc7b02 Internal: Remove Strings.cleanPath (#25209)
This commit removes the cleanPath method, in favor of using java's
Path.normalize().
2017-06-13 21:09:45 -07:00
Ryan Ernst 1bd5cecc34 Docs: Add note about which secure settings are valid (#25212)
This commit adds a note to the docs to clarify that only some settings
can be used with the keystore.
2017-06-13 21:04:16 -07:00
Boaz Leskes 43f4ae5a7b Indices.rollover/10_basic should refresh to make the doc visible in lucene stats 2017-06-13 23:37:15 +02:00
Adis Nezirović 82897e2636 Port support for commercial GeoIP2 databases from Logstash. (#24889)
* Port support for commercial GeoIP2 databases from Logstash.

* Match GeoIP databases according to the database name suffix.

* Rename CITY/COUNTRY_DB_TYPE, since they are suffixes now.
2017-06-13 14:20:01 -07:00
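The suffix matching mentioned in 82897e2636 above could look roughly like the sketch below. The suffix strings `-City` and `-Country` are assumptions based on MaxMind database type names such as `GeoLite2-City` and `GeoIP2-Country`, and `database_kind` is a hypothetical helper, not the plugin's API.

```python
CITY_DB_SUFFIX = "-City"        # assumed suffix, not taken from the commit
COUNTRY_DB_SUFFIX = "-Country"  # assumed suffix, not taken from the commit


def database_kind(database_type: str) -> str:
    """Classify a GeoIP2 database by the suffix of its type name."""
    if database_type.endswith(CITY_DB_SUFFIX):
        return "city"
    if database_type.endswith(COUNTRY_DB_SUFFIX):
        return "country"
    raise ValueError(f"unsupported database type: {database_type}")
```

Matching on the suffix lets free (`GeoLite2-*`) and commercial (`GeoIP2-*`) databases share one code path.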
Lisa Cawley d181761aeb [DOCS] Add ML node to node.asciidoc (#24495)
* [DOCS] Add ML node to node.asciidoc

* [DOCS] Clarify ML node in node.asciidoc

* [DOCS] Add X-Pack icon for admonition blocks

* [DOCS] Formatting X-Pack blocks in node.asciidoc

* [DOCS] Add xpack icon images to node.asciidoc

* [DOCS] Add final xpack role attributes

* [DOCS] Remove unnecessary xpackicon image

* [DOCS] Add link to X-Pack node settings

* [DOCS] Fix path to X-Pack repository

* [DOCS] Add links to X-Pack node settings

* [DOCS] Fixed text for links to X-Pack node settings

* [DOCS] Change standalone node to dedicated node
2017-06-13 14:03:42 -07:00
Andy Bristol 48696ab544 expose simple pattern tokenizers (#25159)
Expose the experimental simplepattern and 
simplepatternsplit tokenizers in the common 
analysis plugin. They provide tokenization based 
on regular expressions, using Lucene's 
deterministic regex implementation that is usually 
faster than Java's and has protections against 
creating too-deep stacks during matching.

Both have a not-very-useful default pattern of the 
empty string because all tokenizer factories must 
be able to be instantiated at index creation time. 
They should always be configured by the user 
in practice.
2017-06-13 12:46:59 -07:00
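To illustrate the two tokenization styles named in 48696ab544 above, here is a minimal sketch: one function emits the regex matches as tokens, the other splits the text on matches. It uses Python's backtracking `re` module, not Lucene's deterministic regex implementation, so it shows only the behavior, not the performance or stack-depth characteristics; both function names are hypothetical.

```python
import re
from typing import List


def simple_pattern_tokens(text: str, pattern: str) -> List[str]:
    # Every non-empty match of the pattern becomes a token.
    return [m.group(0) for m in re.finditer(pattern, text) if m.group(0)]


def simple_pattern_split_tokens(text: str, pattern: str) -> List[str]:
    # The text between matches becomes the tokens; empty pieces are dropped.
    # Assumes the pattern has no capture groups (re.split would return them).
    return [piece for piece in re.split(pattern, text) if piece]
```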
Jay Modi 190242fb1b Test: add setting to change request timeout for rest client (#25201)
This commit adds a setting to change the request timeout for the rest client. This is useful as the
default timeout is 30s, which is also the same default for calls like cluster health. If both are
the same then the response from the cluster health api will not be received as the client usually
times out first making test failures harder to debug.

Relates #25185
2017-06-13 12:19:17 -06:00
Jason Tedor 8de6f4e608 Fix secure repository-hdfs tests on JDK 9
The secure repository-hdfs tests fail on JDK 9 because some Hadoop code
reaches into sun.security.krb5. This commit adds the necessary flags to
open the java.security.jgss module. Note that these flags are actually
needed at runtime as well when using secure repository-hdfs. For now we
will punt on how best to help users obtain this when running on JDK 9
with this plugin.

Relates #25205
2017-06-13 13:26:48 -04:00
Alexander Kazakov a7dafdaa05 Add target_field parameter to gsub, join, lowercase, sort, split, trim, uppercase (#24133)
Closes #23682 #23228
2017-06-13 09:40:44 -07:00
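A hedged sketch of what a `target_field` parameter (a7dafdaa05 above) means for a processor such as lowercase: when it is omitted, the source field is updated in place; otherwise the result is written to the target field. The function below is a hypothetical stand-in operating on a plain dict, not the ingest processor itself.

```python
from typing import Any, Dict, Optional


def lowercase_processor(doc: Dict[str, Any], field: str,
                        target_field: Optional[str] = None) -> Dict[str, Any]:
    """Lowercase doc[field], writing to target_field if one is given."""
    result = dict(doc)  # leave the caller's document untouched
    result[target_field or field] = doc[field].lower()
    return result
```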
Simon Willnauer bc7ec68e76 Add Cross Cluster Search support for scroll searches (#25094)
To complete the cross cluster search capabilities for all search types and
functions, this change adds cross cluster search support for scroll searches.
2017-06-13 17:22:49 +02:00
Boaz Leskes d3c97615c1 Adapt skip version in rest-api-spec/test/indices.rollover/20_max_doc_condition.yml
The relevant change was backported.
2017-06-13 14:46:15 +02:00
Sergey Galkin 1c95cbc4e8 Rollover max docs should only count primaries (#24977)
max_doc condition for index rollover should use document count only from primary shards 

Fixes #24217
2017-06-13 14:30:46 +02:00
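The fix in 1c95cbc4e8 above can be pictured with the sketch below, which sums document counts over primary shards only. `ShardStats` and both functions are hypothetical, and the `>=` comparison is an assumption about how the condition is evaluated.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ShardStats:
    primary: bool
    doc_count: int


def primary_doc_count(shards: Iterable[ShardStats]) -> int:
    # Replicas hold copies of the same documents, so they are skipped.
    return sum(s.doc_count for s in shards if s.primary)


def max_docs_condition_met(shards: Iterable[ShardStats], max_docs: int) -> bool:
    # Assumed comparison: the condition holds once the count reaches max_docs.
    return primary_doc_count(shards) >= max_docs
```

With one primary and one replica of 60 documents each, counting every shard would report 120 documents and trigger a `max_docs` of 100 too early.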
Simon Willnauer 01d7c217f6 Add remote cluster infrastructure to fetch discovery nodes. (#25123)
In order to add scroll support for cross cluster search we need
to resolve the nodes encoded in the scroll ID to send requests to the
corresponding nodes. This change adds the low level connection infrastructure
that also ensures that connections are re-established if the cluster is
disconnected due to a network failure or restarts.

Relates to #25094
2017-06-13 14:23:56 +02:00
Simon Willnauer 186c16ea41 Ensure pending transport handlers are invoked for all channel failures (#25150)
Today if a channel gets closed due to a disconnect we notify the response
handler that the connection is closed and the node is disconnected. Unfortunately
this is not a complete solution since it only works for published connections.
Connections that are unpublished, i.e. for discovery, can indefinitely hang since we
never invoke their handlers when we get a failure while a user is waiting for
the response. This change adds connection tracking to TcpTransport that ensures
we are notifying the corresponding connection if there is a failure on a channel.
2017-06-13 09:37:05 +02:00
Jason Tedor 99262e26a0 Use synchronized Wildfly shutdown
We need to use the variant of shutdown that blocks until the connection
to Wildfly is closed or we can get spurious build failures.
2017-06-12 21:38:58 -04:00
Russ Cam a0f50e8aa4 Supported Azure Storage account types (#25167)
* Supported Azure Storage account types

Add important note for Azure Storage account types

Relates #20844
2017-06-12 17:03:18 -07:00
Russ Cam f6821c41d8 Add half_float and scaled float (#22988)
to numeric datatypes
(cherry picked from commit 67ea06145a80d5ec52ba55d1f2e1e8287e1882b1)
2017-06-13 09:54:44 +10:00
Lisa Cawley 2f7de46b72 [DOC] Add X-Pack links to Elasticsearch Reference (#25164)
* [DOC] Add X-Pack links to Elasticsearch Reference

* [DOCS] Address alignment of attributes in Versions.asciidoc
2017-06-12 13:43:06 -07:00
Spencer 88591fecac [docs] include two cluster doc pages missing from index (#25180)
* [docs] include two cluster doc pages missing from index

* [rest-api-spec] update link to remote-info docs
2017-06-12 12:33:56 -07:00
Lee Hinman ee1113c902 Tweak AggregatorBase.addRequestCircuitBreakerBytes
This modifies a method Mark added to the AggregatorBase that allows aggregations
to add additional memory tracking for datastructures used during execution. If
an aggregation would like to reclaim circuit breaker reserved bytes by adding a
negative number, `addWithoutBreaking` should be used instead of
`addEstimateBytesAndMaybeBreak`.

Resolves #24511
2017-06-12 12:55:50 -06:00
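The rule described in ee1113c902 above can be sketched with a toy breaker. The method names mirror those in the commit message, but `ToyBreaker`, its limit handling, and the dispatch function are assumptions made for illustration only.

```python
class CircuitBreakingException(Exception):
    pass


class ToyBreaker:
    """Tracks reserved bytes against a fixed limit (hypothetical sketch)."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit = limit_bytes
        self.used = 0

    def add_estimate_bytes_and_maybe_break(self, n: int, label: str) -> None:
        # Reserving memory may trip the breaker.
        if self.used + n > self.limit:
            raise CircuitBreakingException(f"[{label}] would exceed {self.limit} bytes")
        self.used += n

    def add_without_breaking(self, n: int) -> None:
        # Adjusts the estimate without checking the limit.
        self.used += n


def add_request_circuit_breaker_bytes(breaker: ToyBreaker, n: int) -> None:
    if n > 0:
        breaker.add_estimate_bytes_and_maybe_break(n, "<agg>")
    else:
        # Releasing bytes (negative n) should never trip the breaker.
        breaker.add_without_breaking(n)
```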
Jason Tedor bb66f3b76b Explicitly reject duplicate data paths
Duplicate data paths already fail to work because we would attempt to
take out a node lock on the directory a second time which will fail
after the first lock attempt succeeds. However, how this failure
manifests is not apparent at all and is quite difficult to
debug. Instead, we should explicitly reject duplicate data paths to make
the failure cause more obvious.

Relates #25178
2017-06-12 12:55:19 -04:00
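An explicit duplicate check of the kind bb66f3b76b above describes might look like the following sketch. `check_data_paths` is a hypothetical helper; how paths are actually normalized and compared is an assumption here.

```python
import os
from typing import Iterable, List


def check_data_paths(paths: Iterable[str]) -> List[str]:
    """Return normalized data paths, rejecting duplicates up front."""
    seen = set()
    normalized_paths = []
    for path in paths:
        # abspath also normalizes, so "/data/a" and "/data/a/" compare equal.
        normalized = os.path.abspath(path)
        if normalized in seen:
            raise ValueError(f"duplicate data path: {path}")
        seen.add(normalized)
        normalized_paths.append(normalized)
    return normalized_paths
```

Failing at this point names the offending path directly instead of surfacing later as an opaque lock failure.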
Jason Tedor 982900eabf Do not swallow node lock failed exception
When attempting to obtain the node lock, if an exception is thrown it is
not logged. This makes debugging difficult. This commit causes such an
exception to be logged.

Relates #25176
2017-06-12 11:42:45 -04:00
James Baiera 2e29b69f6a Revert "Revert "Sense for VirtualBox and $HOME when deciding to turn on vagrant testing. (#24636)""
This reverts commit b9e2a1f989.
2017-06-12 09:41:35 -04:00
markharwood 518cda6637 Aggregations bug: Significant_text fails on arrays of text. (#25030)
* Aggregations bug: Significant_text fails on arrays of text.
The set of previously-seen tokens in a doc was allocated per JSON field string value rather than once per JSON document, meaning the number of docs containing a term could be over-counted, leading to exceptions from the checks in significance heuristics. Added a unit test for this scenario.

Closes #25029
2017-06-12 14:02:54 +01:00
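The counting bug fixed in 518cda6637 above comes down to where the seen-token set is created. The sketch below allocates it once per document, so a term repeated across the values of an array field is counted once for that document. The whitespace tokenization and the function name are simplifying assumptions.

```python
from collections import Counter
from typing import Iterable, List


def doc_frequencies(docs: Iterable[List[str]]) -> Counter:
    """Count, per token, how many documents contain it.

    Each document is the list of string values of one text field.
    """
    counts: Counter = Counter()
    for values in docs:
        seen = set()  # one set per document, not one per field value
        for value in values:
            for token in value.lower().split():
                if token not in seen:
                    seen.add(token)
                    counts[token] += 1
    return counts
```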
Jim Ferenczi 7ab3d5d04a Speed up sorted scroll when the index sort matches the search sort (#25138)
Sorted scroll search can use early termination when the index sort matches the scroll search sort.
The optimization can be done after the first query (which still needs to collect all documents)
by applying a query that only matches documents that are greater than the last doc retrieved in the previous request.
Since the index is sorted, retrieving the list of documents that are greater than the last doc
only requires a binary search on each segment.
This change introduces a new query called `SortedSearchAfterDocQuery` and applies it when possible.
Scrolls with this optimization will search all documents on the first request and then will early terminate each segment
after $size docs for any subsequent requests.

Relates #6720
2017-06-12 09:33:30 +02:00
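The idea behind 7ab3d5d04a above can be sketched for a single segment: because the segment is already sorted, the start of the next scroll page is found with a binary search, and collection stops after `size` hits. The tuple layout `(sort_value, doc_id)` and the function name are assumptions for illustration, not the `SortedSearchAfterDocQuery` implementation.

```python
from bisect import bisect_right
from typing import List, Tuple

Doc = Tuple[int, int]  # (sort_value, doc_id), hypothetical layout


def next_scroll_page(segment: List[Doc], last_doc: Doc, size: int) -> List[Doc]:
    """Return up to `size` docs that sort strictly after `last_doc`."""
    # Binary search for the first document greater than the last one returned.
    start = bisect_right(segment, last_doc)
    # Early termination: take only `size` docs instead of scanning the rest.
    return segment[start:start + size]
```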
Boaz Leskes f34136eda4 TranslogTests.testWithRandomException ignored a possible simulated OOM when trimming files 2017-06-12 08:32:55 +02:00
Boaz Leskes cfb5f6a5a6 Adapt TranslogTests.testWithRandomException to checkpoint syncing on trim
#25005 changed the translog to fsync the checkpoint before trimming a file. This changed the dynamics of potential failure modes, which requires a change to testWithRandomException: it's now possible that we had an exception but the translog was trimmed.

Closes #25133
2017-06-11 23:17:10 +02:00
Jason Tedor 725f6b6983 Change BWC versions on get mapping 404s
This commit changes the BWC versions on the get mapping 404s now that
this API returning 404s when a type is missing is supported since 5.5.0.

Relates #23192
2017-06-11 16:59:12 -04:00
Jason Tedor dcf57f296e Fix get mappings HEAD requests
Get mappings HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
mappings HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23192
2017-06-11 14:58:56 -04:00
Boaz Leskes 9b8754e4c2 TranslogTests#commit didn't allow for a concurrent closing of a view
The view closing will trim unneeded files but there is a small window where they may still be around.
2017-06-11 19:09:01 +02:00
Jason Tedor 7182577904 Fix handling of exceptions thrown on HEAD requests
Today when an exception is thrown handling a HEAD request, the body is
swallowed before the channel has a chance to see it. Yet, the channel is
where we compute the content length that would be returned as a header
in the response. This is a violation of the HTTP specification. This
commit addresses the issue. To address this issue, we remove the special
handling in bytes rest response for HEAD requests when an exception is
thrown. Instead, we let the upstream channel handle the special case, as
we already do today for the non-exceptional case.

Relates #25172
2017-06-10 23:44:18 -04:00
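A minimal sketch of the approach in 7182577904 above: the content length is computed from the full body, including an error body, and only then is the body dropped for HEAD requests. `build_response` and its tuple return value are hypothetical and do not reflect the actual REST channel API.

```python
from typing import Dict, Tuple


def build_response(method: str, status: int,
                   body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Build (status, headers, payload) with HEAD handled in one place."""
    # The header reflects the body a GET would have returned.
    headers = {"Content-Length": str(len(body))}
    # HEAD responses carry headers only, on success and on error alike.
    payload = b"" if method == "HEAD" else body
    return status, headers, payload
```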
Jason Tedor 7ed3d6e75b Fix comment formatting in EvilLoggerTests
The comments here were formatted inconsistently so this commit fixes
them.
2017-06-10 13:25:44 -04:00
Jason Tedor 5108fa7529 Remove unneeded weak reference from prefix logger
We have a custom logger implementation known as a prefix logger that is
used to write every message by the logger with a given prefix. This is
useful for node-level, index-level, and shard-level messages where we
want to log the node name, index name, and shard ID, respectively, if
possible. The mechanism that we employ is that of a marker. Log4j has a
built-in facility for managing these markers, but it's effectively a
memory leak because these markers are held in a map and can never be
released. This is problematic for us since indices and shards do not
necessarily have infinite life spans and so on a node where there are
many indices being created and destroyed, this infinite lifespan can be a
problem indeed. To solve this, we use our own cache of markers. This is
necessary to prevent too many instances of the marker for the same
prefix from being created (just think of all the shard-level components
that exist in the system), and to work around the effective leak in
Log4j. These markers are stored as weak references in a weak hash
map. It is these weak references that are unneeded. When a key is
removed from a weak hash map, the corresponding entry is placed on a
reference queue that is eventually cleared. This commit simplifies
prefix logger by removing this unnecessary weak reference wrapper.

Relates #22460
2017-06-10 13:20:45 -04:00
Jim Ferenczi 5cdbebec94 Test: remove failing test that relies on merge order 2017-06-10 11:55:41 +02:00
Jason Tedor a7a3af6f48 Log checkout so SHA is known
This commit changes the task type of the checkoutBwcBranch task to Exec
from LoggedExec so that the output of the checkout command is
shown. This enables us to see the SHA used for the checkout which can be
useful when debugging a BWC break.

Relates #25166
2017-06-09 22:06:51 -04:00
Russ Cam 3405badfb1 Add link to community Rust Client (#22897)
fix Flummi link
2017-06-09 14:50:51 -07:00
Chris Earle af7b479e12 "shard started" should show index and shard ID (#25157)
When the cluster state is updated with Shard Started entries, it simply adds "shard-started" as the source of the change.

This adds the index name and shard ID so that we can see who/what is spamming the changes when the index creation step has already left the cluster state.
2017-06-09 14:52:42 -04:00
Boaz Leskes b8fef3309c await fix testWithRandomException 2017-06-09 20:31:39 +02:00
Jason Tedor 8a45c3105f Change BWC versions on create index response
This commit changes the BWC versions on the create index response now
that the index name in the response is supported since 5.6.0.

Relates #25139
2017-06-09 13:52:08 -04:00