Commit Graph

4431 Commits

Author SHA1 Message Date
Boaz Leskes 1e1f8e6376 Validate a joining node's version with version of existing cluster nodes (#25770)
When a node tries to join a cluster, it goes through a validation step to make sure the node is compatible with the cluster. Currently we validation that the node can read the cluster state and that it is compatible with the indexes of the cluster. This PR adds validation that the joining node's version is compatible with the versions of existing nodes. Concretely we check that:

1) The node's min compatible version is higher or equal to any node in the cluster (this prevents a too-new node from joining)
2) The node's version is higher or equal to the min compat version of all cluster nodes (this prevents a too old join where, for example, the master is on 5.6, there's another 6.0 node in the cluster and a 5.4 node tries to join).
3) The node's major version is at least as higher as the lowest node in the cluster. This is important as we use the minimum version in the cluster to stop executing bwc code for operations that require multiple nodes. If the nodes are already operating in "new cluster mode", we should prevent nodes from the previous major to join (even if they are wire level compatible). This does mean that if you have a very unlucky partition during the upgrade which partitions all old nodes which are also a minority / data nodes only, the may not be able to re-join the cluster. We feel this edge case risk is well worth the simplification it brings to BWC layers only going one way.

 Also, the node join validation can now selectively fail specific nodes (previously the entire batch was failed). This is an important preparation for a follow up PR where we plan to have a rejected joining node die with dignity.
2017-07-19 12:57:29 +02:00
Jason Tedor 3d3d99557d Expand migration note regarding default paths
This commit expands on the migration note regarding the removal of
default.path.data and default.path.logs to include a note that users
that were relying on the defaults (the common case for path.logs), and
they carry over their previous elasticsearch.yml configruation file,
then they must add explicit values for path.data and path.logs.
2017-07-19 13:40:42 +09:00
Deb Adair 23c810b334 [DOCS] Changes xrefs to cross doc links to enable building GS "mini-docs" 2017-07-18 13:52:38 -07:00
Deb Adair d9e55179f1 [DOCS] Adding index file for GS "mini book". 2017-07-18 13:44:08 -07:00
Christoph Büscher 43bfe06759 [Docs] Add sorting and source filtering section to client docs (#25767) 2017-07-18 16:58:46 +02:00
Clinton Gormley ff4a2519f2 Update experimental labels in the docs (#25727)
Relates https://github.com/elastic/elasticsearch/issues/19798

Removed experimental label from:
* Painless
* Diversified Sampler Agg
* Sampler Agg
* Significant Terms Agg
* Terms Agg document count error and execution_hint
* Cardinality Agg precision_threshold
* Pipeline Aggregations
* index.shard.check_on_startup
* index.store.type (added warning)
* Preloading data into the file system cache
* foreach ingest processor
* Field caps API
* Profile API

Added experimental label to:
* Moving Average Agg Prediction


Changed experimental to beta for:
* Adjacency matrix agg
* Normalizers
* Tasks API
* Index sorting

Labelled experimental in Lucene:
* ICU plugin custom rules file
* Flatten graph token filter
* Synonym graph token filter
* Word delimiter graph token filter
* Simple pattern tokenizer
* Simple pattern split tokenizer

Replaced experimental label with warning that details may change in the future:
* Analysis explain output format
* Segments verbose output format
* Percentile Agg compression and HDR Histogram
* Percentile Rank Agg HDR Histogram
2017-07-18 14:06:22 +02:00
Luca Cavanna 0d8b753325 IndexClosedException to return 400 rather than 403 (#25752)
403 can be confused with security. If an API doesn't support working against closed indices and closed indices are referred to in a request, that is a bad request, hence 400 is more appropriate.
2017-07-18 10:26:32 +02:00
Christoph Büscher a6e3d356ed Change parsing of numeric `to` and `from` parameters in `date_range` aggregation (#25376)
Currently the `to` and `from` parameter in the `date_range` aggregation is not
parsed with the correct date field format from the mappings or the aggregation
if the argument is numeric, but always treated as a long value specifying
`epoch_millis`. This leads to problems e.g. when the format is `epoch_second`,
but the `to` and `from` are currently treated as millis.

With this change, we interpret these parameters according to the `format` of the target field.
If the `format` in the mappings is not compatible with numeric input values,
a compatible `format` (e.g. `epoch_millis`, `epoch_second`) must be specified in
the `date_range` aggregation itself, otherwise an error is thrown.

#Closes #17920
2017-07-18 09:45:28 +02:00
Christoph Büscher 56b1250a34 [Docs] Adding highlighting section to high level client docs (#25751)
Adding a section about how to use highlighting in the SearchSourceBuilder and
how to retrieve highlighted fragments from the SearchResponse.
2017-07-17 19:30:58 +02:00
Simon Willnauer cb4eebcd6a Make `index` in TermsLookup mandatory (#25753)
This change removes the leniency of having a `null` index to fetch
terms from in 6.0 onwards. This feature will be deprecated in the 5.x series
and 6.0 nodes will require the index to be set.

Closes #25750
2017-07-17 18:50:30 +02:00
Clinton Gormley 25a89e613a Broke recipes into separate pages 2017-07-17 18:21:39 +02:00
Glen Smith e9dfb2a215 Fix another simulate example in ingest docs
When simulating an ingest pipeline against an existing pipeline, the
_source field is required to wrap each doc. This commit fixes another
example in the docs that is missing this.
    
Relates #25743, relates e3a0c11239
2017-07-17 15:17:42 +09:00
Glen Smith e3a0c11239 Fix simulate example in ingest docs
When simulating an ingest pipeline against an existing pipeline, the
_source field is required to wrap each doc. This commit fixes an example
in the docs that is missing this.

Relates #25742
2017-07-17 14:17:41 +09:00
Ryan Ernst 072402463b Scripting: Remove search template actions (#25717)
The dedicated search template put/get/delete actions are deprecated in
5.6. This commit removes them from 6.0.
2017-07-14 23:12:05 -07:00
javanna 2c38e93e96 [DOCS] Added note to high level client docs on version
The alpha2 docs is built out of master which may make users think that the high level client was already released as part of alpha2 which it was not. This note should clarify that the client will be released with 6.0.0-beta1
2017-07-15 07:50:25 +02:00
Ryan Ernst b1762d69b5 Setup: Change default heap to 1G (#25695)
This commit changes the default heap size to 1 GB. Experimenting with
elasticsearch is often done on laptops, and 1 GB is much friendlier to
laptop memory. It does put more pressure on the gc, but the tradeoff is
a smaller default footprint. Users running in production can (and
should) adjust the heap size as necessary for their usecase.
2017-07-14 09:38:08 -07:00
Christoph Büscher 5387ed00d2 [Docs] Adding suggestion sections to high level client docs (#25724)
This adds a section about how to add suggestions to the SearchSourceBuilder and
how to retrieve them from a SearchResponse.
2017-07-14 18:33:28 +02:00
Christoph Büscher f809a12493 [Docs] Adding aggregation sections to high level client docs (#25707)
This adds a section about how to add aggregations to the SearchSourceBuilder and how
to retrieve them from a SearchRepsonse to the documentation for the high level rest client.
2017-07-14 12:47:47 +02:00
Bodecker DellaMaria 4f0dc5bf32 Mark filtered query example as not to be used (#25661)
The Filtered Query has been deprecated in favour of the Bool Query with a filter context. However, this deleted page for the Filtered Query is often ranked highly in search results when searching for documentation on "filtered queries". Often people just copy the first code snippet they see, which in this case is the INCORRECT syntax (the correct syntax follows). I think reordering the examples would help avoid a lot of confusion (I have seen people make this same mistake 3 times now)

Adding a comment to indicate that the first example shouldn't be used
2017-07-14 11:45:21 +02:00
Martijn van Groningen c8777c4c2e
docs: Updated reference docs that `document_type` is deprecated 2017-07-14 11:07:46 +02:00
Antonio Matarrese afd9a1c1b1 [DOCS] Explain mapping explosion (#25654) 2017-07-14 09:47:41 +02:00
Neil Rickards 5189bd14f1 [Docs] Fix typo in pattern-tokenizer.asciidoc (#25626) 2017-07-13 18:43:48 +02:00
Jim Ferenczi fe383b7c27 More clarifications on the unified highlighter being the new default (#25668)
* More clarifications on the unified highlighter being the new default
2017-07-13 15:38:58 +02:00
Jim Ferenczi 13da3eb53e Refactor QueryStringQuery for 6.0 (#25646)
This change refactors the query_string query to analyze the query text around logical operators of the query string the same way than a match_query/multi_match_query.
It also adds a type parameter that can be used to change the way multi fields query are built the same way than a multi_match query does.

Now that these queries share the same behavior regarding text analysis, some parameters are obsolete and have been deprecated:

split_on_whitespace: This setting is now ignored with a deprecation notice
if it is used explicitely. With this PR The query_string always splits on logical operator.
It simplifies the understanding of the other parameters that can have different meanings
depending on the value of split_on_whitespace.

auto_generate_phrase_queries: This setting is now ignored with a deprecation notice
if it is used explicitely. This setting only makes sense when the parser splits on whitespace.

use_dismax: This setting is now ignored with a deprecation notice
if it is used explicitely. The tie_breaker parameter is sufficient to handle best_fields/most_fields.

Fixes #25574
2017-07-13 15:32:17 +02:00
Martijn van Groningen 02fad9ac8c
docs: updated java client api to take this into account too to take into account the p/c queries are in parent-join module
Closes #25624
2017-07-13 11:24:22 +02:00
Luca Cavanna ec66d655b5 Rename client artifacts (#25693)
It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts.

This commit renames:

- rest -> elasticsearch-rest-client
- sniffer -> elasticsearch-rest-client-sniffer
- rest-high-level -> elasticsearch-rest-high-level-client

A couple of small changes are also preparing the high level client for its first release.

Closes #20248
2017-07-13 09:44:25 +02:00
Deb Adair ded9f55263 [DOCS] Incorporated feedback on the highlighting changes. 2017-07-12 16:36:33 -07:00
Ryan Ernst 70b2897bdf Scripting: Deprecate stored search template apis (#25437)
This commit deprecates the PUT, GET and DELETE search template apis.
Instead, the stored script api should be used.

closes #24596
2017-07-12 16:07:28 -07:00
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
Jason Tedor 86e9438d3c Prevent excessive disk consumption by log files
This commit enables management of the main Elasticsearch log files
out-of-the-box by the following changes:
 - compress rolled logs
 - roll logs every 128 MB
 - maintain a sliding window of logs
 - remove the oldest logs maintaining no more than 2 GB of compressed
   logs on disk

Relates #25660
2017-07-12 15:52:00 -04:00
Jason Tedor 5a416b9922 Use config directory to find jvm.options
This commit removes the environment variable ES_JVM_OPTIONS that allows
the jvm.options file to sit separately from the rest of the config
directory. Instead, we use the CONF_DIR environment variable for custom
configuration location just as we do for the other configuration files.

Relates #25679
2017-07-12 15:29:13 -04:00
Christoph Büscher f3e7a1c4a4 Adding basic search request documentation for high level client (#25651) 2017-07-12 17:06:46 +02:00
Jack Conradson d2b4f7ac5a Disallow lang to be used with Stored Scripts (#25610)
Requests that execute a stored script will no longer be allowed to specify the lang of the script. This information is stored in the cluster state making only an id necessary to execute against. Putting a stored script will still require a lang.
2017-07-12 07:55:57 -07:00
Deb Adair b5e81132cf [DOCS] Reorganized the highlighting topic so it's less confusing. 2017-07-11 21:16:14 -07:00
Jason Tedor e165c405ac Add an underscore to flood stage setting
This is a minor nitty bikeshedding change that renames the suffix of the
disk flood stage setting to "flood_stage" from "floodstage".

Relates #25659
2017-07-11 22:02:00 -04:00
James Baiera 847378a43b Add another parent value option to join documentation (#25609)
Indexing a join field on a document requires a value of type "object" and two sub fields "name" 
and "parent". The "parent" field is only required on child documents, but the "name" field which 
denotes the name of the relation is always needed. Previously, only the short-hand version of the 
join field was documented. This adds documentation for the long-hand join field data, and 
explicitly points out that just specifying the name of the relation for the field value is a 
convenience shortcut.
2017-07-11 15:36:59 -04:00
Adrien Grand de99610c4e Remove reference to field-stats docs. 2017-07-11 18:38:25 +02:00
Simon Willnauer 98c91a3bd0 Limit the number of concurrent shard requests per search request (#25632)
This is a protection mechanism to prevent a single search request from
hitting a large number of shards in the cluster concurrently. If a search is
executed against all indices in the cluster this can easily overload the cluster
causing rejections etc. which is not necessarily desirable. Instead this PR adds
a per request limit of `max_concurrent_shard_requests` that throttles the number of
concurrent initial phase requests to `256` by default. This limit can be increased per request
and protects single search requests from overloading the cluster. Subsequent PRs can introduces
addiontional improvemetns ie. limiting this on a `_msearch` level, making defaults a factor of
the number of nodes or sort shards iters such that we gain the best concurrency across nodes.
2017-07-11 16:23:10 +02:00
Clinton Gormley bd7ddfa175 Removed field-stats docs 2017-07-11 15:15:25 +02:00
Clinton Gormley 92849c64db Fixed bad asciidoc file name 2017-07-11 12:47:52 +02:00
Clinton Gormley ddbbe9f7cc Tidied up the breaking changes docs 2017-07-11 12:40:14 +02:00
Herman Schaaf 977712f977 Change small typo in shards_allocation.asciidoc (#25643) 2017-07-11 11:25:49 +02:00
Tal Levy e04be73ad5 remove ingest.new_date_format (#25583) 2017-07-10 13:07:50 -07:00
Colin Goodheart-Smithe 3a5a54e83e Collapses package structure for some bucket aggs (#25579)
This change collapses some of the packages for the bucket aggregations into their parent packages. This was done for the following aggregations:
* The variants of the range aggregation (geo_distance, date and ip) were moved into the `o.e.s.a.bucket.range` package
* The `o.e.s.a.bucket.terms.support` package was removed and the classes were moved to `o.e.s.a.bucket.terms`
* The filter aggregation was moved to `o.e.s.a.bucket.filter`

Since this PR is already relatively large with only the above changes subsequent PRs will do similar operations on relevant metric and pipeline aggregations

Relates to #22868
2017-07-10 15:08:15 +01:00
Clinton Gormley e85871cfe9 Update cross-cluster-search.asciidoc
Increased the required min version of CCS in the docs to 5.5
2017-07-10 12:04:05 +02:00
DeDe Morton baa1858f56 [DOCS] Fix link (#25616) 2017-07-07 20:40:44 -07:00
DeDe Morton a4fedb213e Fix attribute reference on redirects page (#25614) 2017-07-07 20:15:42 -07:00
Jason Tedor 8148e25087 Fix disk allocator docs
This commit fixes the disk allocator docs which were broken due to the
inadvertent removal of some docs snippet markup.
2017-07-07 22:11:09 -04:00
Jason Tedor bc22c1c286 Add disk threshold settings validation
This commit adds cross-settings validation for the low/high/flood stage
disk watermark settings. This validation was enabled by the introduction
of multiple settings validation.

Relates #25600
2017-07-07 19:54:36 -04:00
matarrese 2eafbaf759 Document aggregating by day of the week (#25602)
Add documentation for aggregating by day of the week.

Closes #24660
2017-07-07 14:16:53 -04:00