Commit Graph

29140 Commits

Author SHA1 Message Date
Jim Ferenczi 792641a6e3 [Docs] #26541: add warning regarding the limit on the number of fields that can be queried at once in the multi_match query. 2017-10-30 18:03:56 +01:00
Dimitrios Athanasiou 3796471ac4 [Docs] Fix note in bucket_selector 2017-10-30 15:20:46 +00:00
Clarkie b1ce5cf836 [Docs] Fix indentation of examples (#27168) 2017-10-30 11:56:38 +01:00
Jim Ferenczi a4105c6b4a
[Docs] Clarify `span_not` query behavior for non-overlapping matches (#27150)
Closes #27134
2017-10-30 11:29:40 +01:00
Christoph Büscher 8e62314ce4
[Docs] Remove first person "I" from getting started (#27155)
Avoid first person style and consistently switch to an unpersonal style in the getting started docs.
2017-10-30 10:45:50 +01:00
Holger Bartnick aa03fb72b7 [Docs] Correct link target for datatype murmur3 (#27143) 2017-10-30 09:31:55 +01:00
Martijn van Groningen c406a91158
Fix division by zero in phrase suggester that causes assertion to fail 2017-10-30 09:04:56 +01:00
Nhat d01ad9367e Enable Docstats with totalSizeInBytes for 6.1.0
Relates https://github.com/elastic/elasticsearch/pull/27117
2017-10-28 14:54:53 -04:00
Nhat 07d270b45f
Adds average document size to DocsStats (#27117)
This change is required in order to support a size based check for the
index rollover.

The index size is estimated by sampling the existing segments only. We
prefer using segments to StoreStats because StoreStats is not reliable
if indexing or merging operations are in progress.

Relates #27004
2017-10-28 12:47:08 -04:00
Jack Conradson abaede2373
Upgrade Painless from ANTLR 4.5.1-1 to ANTLR 4.5.3. (#27153) 2017-10-27 11:07:49 -07:00
Honza Král 1dd6aeeb8d Exists template needs a template name (#25988) 2017-10-27 16:14:41 +02:00
Christoph Büscher b88dbe8f49 [Tests] Fix occasional test failure due to two random values being the same 2017-10-27 12:06:16 +02:00
Christoph Büscher 9253ea8aec Fix beidermorse phonetic token filter for unspecified `languageset` (#27112)
Currently, when we create a BeiderMorseFilter with an unspecified `languageset`,
the filter will not guess the language, which should be the default behaviour.
This change fixes this and adds a simple test for the cases with and without
provided `languageset` settings.

Closes #26771
2017-10-27 10:07:36 +02:00
Jim Ferenczi 6625ecfff4 Fix max score tracking with field collapsing (#27122)
This change makes sure that we track score when sort is set to relevancy only.
In this case we always track max score like normal search does.

Closes #23840
2017-10-27 09:18:34 +02:00
Jun Ohtani 77e11f6969 [Doc] Add Ingest CSV Processor Plugin to plugin as a community plugin (#27105)
* [Doc] Add Ingest CSV Processor Plugin to plugin as a community plugin
2017-10-27 16:16:02 +09:00
Clinton Gormley 0499dc0873 Removed the beta tag from cross-cluster search 2017-10-27 08:51:36 +02:00
olcbean 35a2cc1003 fixed typo in ConstructingObjectParse (#27129) 2017-10-26 13:14:56 -06:00
Jack Conradson dda5d1af29 Allow for the Painless Definition to have multiple instances (#27096) 2017-10-26 08:33:55 -07:00
Jim Ferenczi d1acf449f5 Apply missing request options to the expand phase (#27118)
* Apply missing request options to the expand phase

This change adds some missing options to the expand query that builds the inner hits for field collapsing.
The following options are now applied to the inner_hits query:
 * post_filters
 * preferences
 * routing

Closes #27079
Closes #26649
2017-10-26 17:01:57 +02:00
Simon Willnauer 1460a3feac Only pull SegmentReader once in getSegmentInfo (#27121) 2017-10-26 14:56:14 +02:00
Jason Tedor 0174d13ca2 Fix BWC for discovery stats
The new discovery stats were pushed to the 6.x branch (currently
versioned at 6.1.0) but master was not updated to reflect this. This
impacts the mixed-cluster BWC tests because a 6.1.0 node will be trying
to send a 7.0.0 node the new discovery stats but the 7.0.0 did not yet
understand that it should be reading these when talking to a 6.1.0
node. This commit addresses this, and changes the skip version on the
discovery stats REST tests.
2017-10-26 07:53:18 -04:00
Martijn van Groningen f1e944a675
docs: describe parent/child performances 2017-10-26 11:49:13 +02:00
Catalin Ursachi 8bf33241ed Add Delete Index API support to high-level REST client (#27019)
Relates to #25847
2017-10-26 09:52:46 +02:00
Jason Tedor 77f87732ef Adjust .DS_Store test assertions on Windows
Windows handles trying to read a file that does not exist because a
component of the path is not a directory differently than other OS
handle this situation. This commit adjusts these assertions for Windows.
2017-10-25 22:36:53 -04:00
Jason Tedor 17d6820a4b Emit settings deprecation logging on empty update
When executing a cluster settings update that leaves the cluster state
unchanged, we skip validation and this avoids deprecation logging for
deprecated settings in the cluster state. This commit addresses this by
running validation even if the settings are unchanged.

Relates #27017
2017-10-25 22:15:38 -04:00
Jason Tedor 9aae2f593a Avoid stack overflow on search phases
When a search is executing locally over many shards, we can stack
overflow during query phase execution. This happens due to callbacks
that occur after a phase completes for a shard and we move to the same
phase on another shard. If all the shards for the query are local to the
local node then we will never go async and these callbacks will end up
as recursive calls. With sufficiently many shards, this will end up as a
stack overflow. This commit addresses this by truncating the stack by
forking to another thread on the executor for the phase.

Relates #27069
2017-10-25 22:05:46 -04:00
Nhat b31028e8c4 bwc: do not check errmsg for put template request
Relates #27100
2017-10-25 21:43:20 -04:00
Ryan Ernst 2a8452b513 Reindex: Fix headers in reindex action (#26937)
The headers passed to reindex were skipped except for the last one. This
commit fixes the copying of the headers, as well as adds a base test
case for rest client builders to access the headers within the built
rest client.

relates #22976
2017-10-25 16:37:01 -07:00
Nhat adc195e30c Fix error message for a put index template request without index_patterns (#27102)
Just correct the error message from "Validation Failed: 1: pattern is
missing;" to "Validation Failed: 1: index_patterns is missing;".

Closes #27100
2017-10-25 18:54:40 -04:00
olcbean 981b7f4d39 Make yaml test runner stricter by enforcing `required` for paths and parameters (#27035)
Till now the yaml test runner was verifying that the provided path parts and parameters are supported.
With this PR, yaml test runner also checks that all required path parts and parameters are provided.
2017-10-25 19:36:42 +00:00
Armin Braun 6533b165d6 #25601 Add pipeline support for REST API bulk upsert (#27075) 2017-10-25 19:03:25 +02:00
Jason Tedor 6722b9c4a2 Ignore .DS_Store files on macOS
Finder creates these files if you browse a directory there. These files
are really annoying, but it's an incredible pain for users that these
files are created unbeknownst to them, and then they get in the way of
Elasticsearch starting. This commit adds leniency on macOS only to skip
these files.

Relates #27108
2017-10-25 11:25:29 -04:00
Loading Zhang 149e558dd5 Docs: Fix ingest geoip config location (#27110) 2017-10-25 07:16:42 -07:00
Jason Tedor cfa4646161 Adjust SHA-512 supported format on plugin install
This commit adjusts the format of the SHA-512 checksum files supported
by the plugin installer. In particular, we now require that the SHA-512
format be a single-line file containing the checksum followed by two
spaces followed by the filename. We continue to support the legacy
format for SHA-1.

Relates #27093
2017-10-25 07:53:33 -04:00
Luca Cavanna 5818ff6b56 Make ShardSearchTarget optional when parsing ShardSearchFailure (#27078)
Turns out that `ShardSearchTarget` is nullable, hence its fields may not be printed out as part of `ShardSearchFailure#toXContent`, in which case `fromXContent` cannot parse it back. We would previously try to create the object with all of its fields set to null, but `Index` complains about it in the constructor. Also made sure that this code path is covered by our unit tests in `ShardSearchFailureTests`.

Closes #27055
2017-10-25 13:26:06 +02:00
markwalkom 2b864156ca [Docs] Clarify mapping `index` option default (#27104) 2017-10-25 12:42:29 +02:00
Luca Cavanna 8caf7d4ff8 Decouple BulkProcessor from ThreadPool (#26727)
Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows to remove the `ThreadPool` dependency from `BulkProcessor`, which requires to provide settings that contain `node.name` and also needed log4j for logging. Instead, it needs now a `Scheduler` that is much lighter and gets automatically created and shut down on close.

Closes #26028
2017-10-25 10:30:23 +02:00
David Turner cc3364e4f8 Stats to record how often the ClusterState diff mechanism is used successfully (#26973)
It's believed that using diffs obsoletes the other mechanism for reusing the
bits of the ClusterState that didn't change between updates, but in fact we
don't know for sure how often the diff mechanism works successfully. The stats
collected here will tell us.
2017-10-25 07:35:25 +01:00
Lee Hinman 6bc7024f26 Tie-break shard path decision based on total number of shards on path (#27039)
Right now if the number of shards for a particular index is equal across the
data paths, we tie-break on space. This changes to tie-break first on the total
number of shards for each path, and then, if that is the same, on the usable
bytes.

Relates to #26654 (it's a follow-up)
2017-10-24 16:12:47 -06:00
Jason Tedor 7a792d2c1f Timed runnable should delegate to abstract runnable
If timed runnable wraps an abstract runnable, then it should delegate to
the abstract runnable otherwise force execution and handling rejections
is dropped on the floor. Thus, timed runnable should itself be an
abstract runnable delegating all methods to the wrapped runnable in
cases when it is an abstract runnable. This commit causes this to be the
case.

Relates #27095
2017-10-24 11:36:50 -04:00
Lee Hinman fcfbdf1f37 Expose adaptive replica selection stats in /_nodes/stats API
This exposes the collected metrics we store for ARS in the nodes stats, as well
as the computed rank of nodes. Each node exposes its perspective about the
cluster.

Here's an example output (with `?human`):

```json
...
"adaptive_selection" : {
  "_k6v1-wERxyUd5ke6s-D0g" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "7.8ms",
    "avg_service_time_ns" : 7896963,
    "avg_response_time" : "9ms",
    "avg_response_time_ns" : 9095598,
    "rank" : "9.1"
  },
  "VJiCUFoiTpySGmO00eWmtQ" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "1.3ms",
    "avg_service_time_ns" : 1330240,
    "avg_response_time" : "4.5ms",
    "avg_response_time_ns" : 4524154,
    "rank" : "4.5"
  },
  "DHNGTdzyT9iiaCpEUsIAKA" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "2.1ms",
    "avg_service_time_ns" : 2113164,
    "avg_response_time" : "6.3ms",
    "avg_response_time_ns" : 6375810,
    "rank" : "6.4"
  }
}
...
```
2017-10-24 08:58:42 -06:00
Tim Brooks a7fa5d3335 Remove dangerous `ByteBufStreamInput` methods (#27076)
This commit removes the `ByteBufStreamInput` `readBytesReference` and
`readBytesRef` methods. These methods are zero-copy which means that
they retain a reference to the underlying netty buffer. The problem is
that our `TcpTransport` is not designed to handle zero-copy. The netty
implementation sets the read index past the current message once it has
been deserialized, handled, and mostly likely dispatched to another
thread. This means that netty is free to release this buffer. So it is
unsafe to retain a reference to it without calling `retain`. And we
cannot call `retain` because we are not currently designed to handle
reference counting past the transport level.

This should not currently impact us as we wrap the `ByteBufStreamInput`
in `NamedWriteableAwareStreamInput` in the `TcpTransport`. This stream
essentially delegates to the underling stream. However, in the case of
`readBytesReference` and `readBytesRef` it leaves thw implementations
to the standard `StreamInput` methods. These methods call the read byte
array method which delegates to `ByteBufStreamInput`. The read byte
array method on `ByteBufStreamInput` copies so it is safe. The only
impact of this commit should be removing methods that could be dangerous
if they were eventually called due to some refactoring.
2017-10-24 08:51:14 -06:00
Jason Tedor d5efc30968 Blacklist Gradle 4.2 and above
An upstream Gradle change has broken us starting on version 4.2. This
commit blacklists these versions until we can either find a workaround,
or the upstream issue is addressed.

Relates #27087
2017-10-24 09:11:30 -04:00
David Turner cf2d0834f5 Remove duplicated test (#27091) 2017-10-24 11:52:01 +01:00
David Turner 559fc5a4de Update numbers to reflect 4-byte UTF-8-encoded characters (#27083)
You need 4 bytes for characters outside the BMP, which includes many emoji and
a bunch of less-common writing characters too.
2017-10-24 09:50:47 +01:00
Nhat bf557fd886 test: avoid generating duplicate multiple fields (#27080)
Multifields parser does not allow duplicate values, however the
MultiFieldTests may produce duplicate field values.

See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+release-tests/132/console.
2017-10-23 09:59:40 -04:00
Adrien Grand d0104c22a5 Reduce the default number of cached queries. (#26949)
Memory usage of queries can't be properly accounted, which can be an issue when
large queries are cached since the actual memory usage will be much higher than
what the cache thinks. This problem is very hard if not impossible to fix so as
a workaround I would like to decrease the maximum number of cached queries so
that this problem is less likely to cause trouble in practice.

For the record, this problem is more likely to occur in envirenments that have
small shards or don't give much memory to the JVM.

Closes #26938
2017-10-23 14:11:35 +02:00
Martijn van Groningen 93107f8466
removed unused import 2017-10-23 10:00:54 +02:00
Martijn van Groningen 141d1b62e9
ingest: date processor should not fail if timestamp is specified as json number
Closes #26967
2017-10-23 09:32:44 +02:00
Jason Tedor 35984a616e Keep cumulative elapsed scroll time in microseconds
Today we internally accumulate elapsed scroll time in nanoseconds. The
problem here is that this can reasonably overflow. For example, on a
system with scrolls that are open for ten minutes on average, after
sixteen million scrolls the largest value that can be represented by a
long will be executed. To address this, we switch to internally
representing scrolls using microseconds as this enables with the same
number of scrolls scrolls that are open for seven days on average, or
with the same average elapsed time sixteen billion scrolls which will
never happen (executing one scroll a second until sixteen billion have
executed would not occur until more than five-hundred years had
elapsed).

Relates #27068
2017-10-21 13:18:28 +02:00