Commit Graph

29781 Commits

Author SHA1 Message Date
Lee Hinman 78c54c4560 Balance shards for an index more evenly across multiple data paths (#26654)
* Balance shards for an index more evenly across multiple data paths

When a node has multiple data paths configured, and is assigned all of the
shards for a particular index, it's possible now that all shards will be
assigned to the same path (see #16763).

This change keeps the same behavior around determining the "best" path for a
shard based on space, however, it enforces limits for the number of shards on a
path for an index from the single-node perspective. For example:

Assume you had a node with 4 data paths, where `/path1` has a tremendously high
amount of disk space available compared to the other paths. If you create an
index with 5 primary shards, the previous behavior would be to assign all 5
shards to `/path1`.

This change would enforce a limit of 2 shards to each data path for that
particular node, so you would end up with the following distribution:

- `/path1` - 2 shards (because it has the most usable space)
- `/path2` - 1 shard
- `/path3` - 1 shard
- `/path4` - 1 shard

Note, however, that this limit is only enforced at the local node level for
simplicity in implementation, so if you had multiple nodes, the "limit" for the
node is still 2, so assuming you had enough nodes that there was only 2 shards
for this index assigned to this node, they would still both be assigned to
`/path1`.

* Switch from ObjectLongHashMap to regular HashMap

* Remove unneeded Files.isDirectory check

* Skip iterating directories when not necessary

* Add message to assert

* Implement different (better) ranking for node paths

This is the method we discussed

* Remove unused pathHasEnoughSpace method

* Use findFirst instead of .get(0);

* Update for master merge to fix compilation

Settings.putArray -> Settings.putList
2017-10-17 05:49:24 -06:00
Jason Tedor 62bf3c11a9 Stop invoking non-existant syscall
Today when getting ready to enter seccomp, we do some probes to ensure
that we are really talking to seccomp, etc. One of these probes is pure
paranoia. The paranoia was driven by a kernel bug
(https://lkml.org/lkml/2014/7/20/222) that only impacted 32-bit x86
kernels wherein invoking a non-existant syscall was not returning ENOSYS
(as it should). This probe causes problems though, for example in
containers with syscall filters, invoking a non-existant syscall will
lead to the process being sent SIGSYS and terminated. We do not need
this paranoid, we do not support 32-bit, and our other probes give us
enough of a defense to ensure that we are talking to seccomp (and we
hardcode the seccomp syscall number for platforms that we
support). Given that this probe offers us little value, but does cause
problems in valid use-cases, this commit removes this paranoia.

Relates #27016
2017-10-17 11:34:44 +02:00
Jason Tedor 3664ede9b5 Remove unnecessary exception for engine constructor
The internal engine constructor declares a checked engine exception yet
this constructor does not actually throw this exception. This commit
removes this declaration from the internal engine constructor.

Relates #27022
2017-10-16 10:17:37 -04:00
Anton Pozhidaev 70668dddf3 Update docs about `script` parameter (#27010)
Added a description of short script form. Also removed references to the obsolete `script.default_lang`.
2017-10-16 05:04:43 -07:00
Simon Willnauer 8dda827ff4 Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000)
Today all these API calls have a sideeffect of making documents visible
to search requests. While this is sometimes desired it's an unnecessary sideeffect
and now that we have an internal (engine-private) index reader (#26972) we artificially
add a refresh call for bwc. This change removes this sideeffect in 7.0.
2017-10-16 10:16:35 +02:00
Tim Brooks 277637f42f Do not set SO_LINGER on server channels (#26997)
Right now we are attempting to set SO_LINGER to 0 on server channels
when we are stopping the tcp transport. This is not a supported socket
option and throws an exception. This also prevents the channels from
being closed.

This commit 1. doesn't set SO_LINGER for server channges, 2. checks
that it is a supported option in nio, and 3. changes the log message
to warn for server channel close exceptions.
2017-10-13 13:06:38 -06:00
olcbean bb013c60b5 Fix inconsistencies in the rest api specs for *_script (#26971) 2017-10-13 11:20:34 -07:00
olcbean 2310c8e2e2 fix inconsistencies in the rest api specs for cat.snapshots (#26996) 2017-10-13 10:57:23 -07:00
Jason Tedor 8eba1fa17c Add docs on full_id parameter in cat nodes API
This commit adds a note to the docs on the full_id parameter in the cat
nodes API. This is a useful parameter but was not previously documented
anywhere.

Relates #27009
2017-10-13 13:49:25 -04:00
Simon Willnauer cae1790492 [TEST] Add test that replicates versioned updates with random flushes 2017-10-13 11:54:51 +02:00
Simon Willnauer a517758432 Use internal searcher for all indexing related operations in the engine
The changes introduced in #26972 missed two places where an internal searcher should be used.
2017-10-13 11:41:00 +02:00
Jason Tedor a7895839a0 Reformat paragraph in template docs to 80 columns
This commit reformats a paragraph in the template docs to fit in 80
columns as for the rest of the doc, and as-is a standard that we loosely
adhere to.
2017-10-12 17:52:43 -04:00
Pius 1125bc635c Clarify settings and template on create index
This commit clarifies the interaction between settings specified in a
create index request, and those that would come from any templates that
apply to the create index request.

Relates #26994
2017-10-12 17:48:57 -04:00
Tim Brooks e40c597b67
Fix reference to TcpTransport in documentation 2017-10-12 13:27:24 -06:00
Simon Willnauer 047a916169 Allow Uid#decodeId to decode from a byte array slice (#26987)
Today we only allow to decode byte arrays where the data has a 0 offset
and the same length as the array. Allowing to decode stuff from a slice will
make decoding IDs cheaper if the the ID is for instance coming from a term dictionary
or BytesRef.

Relates to #26931
2017-10-12 20:19:14 +02:00
agent5566 93a47cf860 Fix a typo in the similarity docs (#26970) 2017-10-12 09:29:25 -07:00
Simon Willnauer 21eb9bdf6a Use separate searchers for "search visibility" vs "move indexing buffer to disk (#26972)
Today, when ES detects it's using too much heap vs the configured indexing
buffer (default 10% of JVM heap) it opens a new searcher to force Lucene to move
the bytes to disk, clear version map, etc.

But this has the unexpected side effect of making newly indexed/deleted
documents visible to future searches, which is not nice for users who are trying
to prevent that, e.g. #3593.

This is also an indirect spinoff from #26802 where we potentially pay a big
price on rebuilding caches etc. when updates / realtime-get is used. We are
refreshing the internal reader for realtime gets which causes for instance
global ords to be rebuild. I think we can gain quite a bit if we'd use a reader
that is only used for GETs and not for searches etc. that way we can also solve
problems of searchers being refreshed unexpectedly aside of replica recovery /
relocation.

Closes #15768
Closes #26912
2017-10-12 17:19:43 +02:00
Colin Goodheart-Smithe e1679bfe5e Create weights lazily in filter and filters aggregation (#26983)
Previous to this change the weights for the filter and filters aggregation were created in the `Filter(s)AggregatorFactory` which meant that they were created regardless of whether the aggregator actually collects any documents. This meant that for filters that are expensive to initialise, requests would not be quick when the query of the request was (or effectively was) a `match_none` query.

This change maintains a single Weight instance for each filter across parent buckets but passes a weight supplier to the aggregator instances which will create the weight on first call and then return that instance for subsequent calls.
2017-10-12 14:56:32 +01:00
Reto Schüttel ab94150a23 Use a dedicated ThreadGroup in rest sniffer (#26897)
This change adds a dedicated thread group, configures threads with a corresponding thread name and starts all threads as daemon threads.
2017-10-12 15:20:51 +02:00
Jason Tedor f81ee225ff Fire global checkpoint sync under system context
The global checkpoint sync action should fire under the system context
since it is not a user-facing management action.

Relates #26984
2017-10-12 09:18:58 -04:00
Anton Pozhidaev cee9640c20 Update by Query is modified to accept short `script` parameter. (#26841)
Update by Query is modified to accept short `script` parameter.

Closes issue #24898
2017-10-11 21:57:46 +00:00
Glen Smith f3942799bb Cat shards bytes (#26952)
* Add `bytes` to cat.shards API spec

* add bytes param to rest-api-spec of cat.segments
2017-10-11 11:03:59 -06:00
kel 2e36f19051 Add support for parsing inline script (#23824) (#26846)
* Add support for parsing inline script (#23824)

* Fix test
2017-10-11 09:15:37 -07:00
Alexander Kazakov 592ab043dd Change default value to true for transpositions parameter of fuzzy query (#26901) 2017-10-11 15:31:48 +02:00
Yannick Welsch d97b21d1da Adding unreleased 5.6.4 version number to Version.java 2017-10-11 08:40:30 +02:00
Tim Brooks 4a870d8c63 Rename TCPTransportTests to TcpTransportTests (#26954)
Our convention is to use lower case when naming things "Tcp". For
example, `TcpTransport`. This commit renames the outlier
(TcpTransportTests) to use lower case.
2017-10-10 20:15:14 -06:00
Nhat b63f718a0b Fix NPE for /_cat/indices when no primary shard (#26953)
When a node which contains the primary shard is unavailable, the primary
stats (and the total stats) of an `IndexStats` will be empty for a short
moment (while the primary shard is being relocated). However, we assume
that these stats are always non-empty when handling `_cat/indices` in
RestIndicesAction. This commit checks the content of these stats before
accessing.

Closes #26942
2017-10-10 17:04:55 -04:00
Deb Adair 875e582cc9 [DOCS] Fixed indentation of the definition list. 2017-10-10 12:08:21 -07:00
Jason Tedor 393e73612e Fix formatting in channel close test
This commit fixes the indentation in the transport test case for a
channel closing while connecting.
2017-10-10 13:39:45 -04:00
Jason Tedor 4c06b8f1d2 Check for closed connection while opening
While opening a connection to a node, a channel can subsequently
close. If this happens, a future callback whose purpose is to close all
other channels and disconnect from the node will fire. However, this
future will not be ready to close all the channels because the
connection will not be exposed to the future callback yet. Since this
callback is run once, we will never try to disconnect from this node
again and we will be left with a closed channel. This commit adds a
check that all channels are open before exposing the channel and throws
a general connection exception. In this case, the usual connection retry
logic will take over.

Relates #26932
2017-10-10 13:34:51 -04:00
Nicolas Sierra d6fc4affae Clarify systemd overrides
This commit clarifies how to apply an override to the systemd unit file
for Elasticsearch.

Relates #26950
2017-10-10 13:06:34 -04:00
Chris Earle dcc6b426ec [DOCS] Plugin Installation for Windows (#21671)
This shows an example of how to install a plugin on Windows, which is not as obvious at I would have expected.
2017-10-10 09:31:44 -06:00
Nik Everett 4a06dd919a Painless: add tests for cached boxing (#24163)
We had a TODO about adding tests around cached boxing. In #24077
I tracked down the uncached boxing tests and saw the TODO. Cached
boxing testing is a fairly small extension to that work.
2017-10-10 10:34:03 -04:00
Tanguy Leroux 6658ff0fd6 Don't detect source's XContentType in DocumentParser.parseDocument() (#26880)
DocumentParser.parseDocument() auto detects the XContentType of the
document to parse, but this information is already provided by SourceToParse.
2017-10-10 15:31:56 +02:00
olcbean c03f0c89af Fix handling of paths containing parentheses
This commit fixes an issue with the handling of paths containing
parentheses on Windows. When such a path is used as a component of
Elasticsearch home, then a later echo statement that is guarded by an if
will fail because the parentheses in the path will be confused with the
parentheses defining the if block. This commit fixes the issue by
protecting this echo statement by wrapping the possibly offending path
in quotes.

Relates #26916
2017-10-10 08:56:08 -04:00
Daniel Mitterdorfer e22844bd2a Allow only a fixed-size receive predictor (#26165)
With this commit we simplify our network layer by only allowing to define a
fixed receive predictor size instead of a minimum and maximum value. This also
means that the following (previously undocumented) settings are removed:

* http.netty.receive_predictor_min
* http.netty.receive_predictor_max

Using an adaptive sizing policy in the receive predictor is a very low-level
optimization. The implications on allocation behavior are extremely hard to grasp
(see our previous work in #23185) and adaptive sizing does not provide a lot of
benefits (see benchmarks in #26165 for more details).
2017-10-10 13:29:45 +02:00
vurple b3e9aa89dc Add Homebrew instructions to getting started
This commit adds instructions for installing Elasticsearch via Homebrew
to the Getting Started guide.

Relates #26847
2017-10-10 06:21:33 -04:00
Martijn van Groningen bba70205e3
ingest: Fix bug that prevent date_index_name processor from accepting timestamps specified as a json number
Closes #26890
2017-10-10 10:04:29 +02:00
Ryan Ernst 6b53dadcf9 Scripting: Fix expressions to temporarily support filter scripts (#26824)
This commit adds a hack converting 0.0 to false and non-zero to true for
expressions operating under a filter context.

closes #26429
2017-10-09 17:02:21 -07:00
Ryan Ernst a6ae6b5a9a Docs: Add note to contributing docs warning against tool based refactoring (#26936)
This commit adds a warning to deter contributers from creating PRs
generated by tools to do large refactors just for the sake of
refactoring.
2017-10-09 14:49:24 -07:00
hanbj 3ab27d16ad Fix thread context handling of headers overriding (#26068)
Previously collisions in headers between old and new contexts could be silently ignored, allowing the original context's headers to "win". This commit fixes the headers to require they are disjoint.
2017-10-09 14:41:09 -07:00
Boaz Leskes 84742690cd SearchWhileCreatingIndexIT: remove usage of _only_nodes
the only nodes preference was used as a replacement of `_primary` which was removed. Sadly, it's not the same as we also check that it makes sense - i.e., that the given node has a shard copy. Since the test uses indices with >1 shards, the primaries may be spread to multiple nodes. Using one (like it currently does) will fail for some primaries. Using all will probably end up hitting all nodes.

This commit removed the `_only_nodes` usage in favor a simple search

Relates to #26791
2017-10-09 19:37:19 +02:00
Martijn van Groningen 96823b0480
update Lucene version for 6.0-RC2 version 2017-10-09 15:27:06 +02:00
kel 1d4f70210f Calculate and cache result when advanceExact is called (#26920)
Cache final result instead of result of advanceExact.
Fix SortedNumericDoubleValues does not test MEDIAN mode
Replace deprecated random string generation method
2017-10-09 14:02:38 +02:00
Martijn van Groningen 19dc629e6d
Test query builder bwc against previous supported versions instead of just the current version.
Relates to #25456
2017-10-09 13:22:01 +02:00
Yannick Welsch a4436195f8 Set minimum_master_nodes on rolling-upgrade test (#26911)
The rolling-upgrade test was only writing the "minimum_master_nodes" setting to the configuration file of the old nodes, but not the upgraded ones.

Also changes the value of "minimum_master_nodes" from "number_of_nodes" to "(number_of_nodes / 2) + 1".
2017-10-09 10:45:03 +02:00
Simon Willnauer cdd7c1e6c2 Return List instead of an array from settings (#26903)
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
2017-10-09 09:52:08 +02:00
Nhat bf4c3642b2 remove _primary and _replica shard preferences (#26791)
The shard preference _primary, _replica and its variants were useful
for the asynchronous replication. However, with the current impl, they
are no longer useful and should be removed.

Closes #26335
2017-10-08 11:03:06 -04:00
shaulzorea 9db21cd23f fixing typo in datehistogram-aggregation.asciidoc (#26924) 2017-10-08 15:12:43 +02:00
Karel Minarik 6825cafaa6 [API] Added the `terminate_after` parameter to the REST spec for "Count" API
Closes #26895
2017-10-08 14:13:20 +02:00