I have seen this question a couple times already, most recently at
https://twitter.com/dimosr7/status/973872744965332993
I tried to keep the explanation as simple as I could, which is not always easy
as this is a matter of trade-offs.
By the time the master branch is released the deprecated url
parameters in the `/_cache/clear` API will have been deprecated
for a couple of minor releases. Since master will be the next
major release we are fine with removing these parameters.
Currently we have a fairly complicated logic in the engine constructor logic to deal with all the
various ways we want to mutate the lucene index and translog we're opening.
We can:
1) Create an empty index
2) Use the lucene but create a new translog
3) Use both
4) Force a new history uuid in all cases.
This leads complicated code flows which makes it harder and harder to make sure we cover all the
corner cases. This PR tries to take another approach. Constructing an InternalEngine always opens
things as they are and all needed modifications are done by static methods directly on the
directory, one at a time.
* Add a REST integration test that documents date_range support
Add a test case that exercises date_range aggregations using the missing
option.
Addresses #17597
* Test cleanup and correction
Adding a document with a null date to exercise `missing` option, update
test name to something reasonable.
* Update documentation to explain how the "missing" parameter works for
date_range aggregations.
* Wrap lines at 80 chars in docs.
* Change format of test to YAML for readability.
The current docs on [Indices APIs: PUT Mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html) suggests that a having number of different mapping types per index is still possible in elasticsearch versions > 6.0.0 although they have been [removed](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html). The console code has already been updated accordingly but notes (2) and (3) on the console code still name the `user` mapping type.
This PR updates the list with notes after the console code, as well as the first sentence of the docs
to avoid confusion. Also, I have removed the second command from the console code as it no
longer holds any value if the docs are solely on the `_doc` mapping.
The original example resulted in a 400 error due to the example being `-` separated instead of the default `.` separation.
```
failed to parse date field [2001-01-01] with format [YYYY.MM.dd]
```
Adds a usage example of the JLH score used in significant terms aggregation.
All other methods to calculate significance score have such an example
Closes#28513
Increase the default limit of `index.highlight.max_analyzed_offset` to 1M instead of previous 10K.
Enhance an error message when offset increased to include field name, index name and doc_id.
Relates to https://github.com/elastic/kibana/issues/16764
* Clarifies how the query_string splits textual part to build a query
Whitespaces are not considered as operators anymore in 6x but the documentation is not clear about it.
This commit changes the example in the documentation and adds a note regarding whitespaces and operators.
Closes#28719
Values for the network.host setting can often contain a colon which is a
character that is considered special by YAML (these arise in IPv6
addresses and some of the special tags like ":ipv4"). As such, these
values need to be quoted or a YAML parser will be unhappy with
them. This commit adds a note to the docs regarding this.
* Reject regex search if regex string is too long (#28344)
* Add docs
* Introduce index level setting `index.max_regex_length`
to control the maximum length of the regular expression
Closes#28344
Similarly to what has been done for s3 and azure, this commit removes
the repository settings `application_name` and `connect/read_timeout`
in favor of client settings. It introduce a GoogleCloudStorageClientSettings
class (similar to S3ClientSettings) and a bunch of unit tests for that,
it aligns the documentation to be more coherent with the S3 one, it
documents the connect/read timeouts that were not documented at all and
also adds a new client setting that allows to define a custom endpoint.
We previously specified the -server flag to force the JVM to use the
server JVM. This is the default on all the systems that we support when
using a 64-bit JVM (and we no longer support 32-bit JVMs). There was
some trouble with this flag for the Windows service since procrun did
not understand what to do with it; as such, we had to filter this flag
out in the service. When we migrated to parsing JVM options in Java (via
the JVM options parser) we simplified this situation and removed
specifying the -server flag. This commit removes a leftover statement
that we are forcing the server JVM.
Relates #28738
The node stats API enables filtlering the top-level stats for only
desired top-level stats. Yet, this was never enabled for adaptive
replica selection stats. This commit enables this. We also add setting
these stats on the request builder, and fix an inconsistent name in a
setter.
Relates #28721
The Windows service will use a private temporary directory under the
user that is performing the installation. In cases when the service will
run as a different user, operators need a method to set this temporary
directory elsewhere. We have such a mechanism, so this commit merely
adds a note to the documentation on how to utilize it.
Relates #28712
Elasticsearch 6.x indices do not allow multiple types index. Instead, they use "_doc" as default if created internally (Elasticsearch), or "doc" default if sent by Logstash.
Currently the Translog constructor is capable both of opening an existing translog and creating a
new one (deleting existing files). This PR separates these two into separate code paths. The
constructors opens files and a dedicated static methods creates an empty translog.
* Search option terminate_after does not handle post_filters and aggregations correctly
This change fixes the handling of the `terminate_after` option when post_filters (or min_score) are used.
`post_filter` should be applied before `terminate_after` in order to terminate the query when enough document are accepted
by the post_filters.
This commit also changes the type of exception thrown by `terminate_after` in order to ensure that multi collectors (aggregations)
do not try to continue the collection when enough documents have been collected.
Closes#28411
We do want to keep this functionality in the future and we provide support for it.
This change is a first step towards replacing the `synonym` token filter with `synonym_graph`.