Currently `fuzziness` is not supported for the `cross_fields` type
of the `multi_match` query since it complicates the logic that
blends the term queries that cross_fields uses internally. At the
moment using this combination is silently ignored, which can lead to
confusions. Instead we should throw an exception in this case.
The same is true for phrase and phrase_prefix type.
Closes#7764
This commit adds a test that a fixed executors rejected count behaves as
expected. In particular, we test that if we consume the executor, then
stuff the executor queue, further tasks will be rejected and the
rejected stats are updated appropriately. This test also asserts that if
we resize the queue the rejected count is reset to zero.
Relates #18301
This removes all the mentions of the sandbox from the script engine
services and permissions model. This means that the following settings
are no longer supported:
```yaml
script.inline: sandbox
script.stored: sandbox
```
Instead, only a `true` or `false` value can be specified.
Since this would otherwise break the default-allow parameter for
languages like expressions, painless, and mustache, all script engines
have been updated to have individual settings, for instance:
```yaml
script.engine.groovy.inline: true
```
Would enable all inline scripts for groovy. (they can still be
overridden on a per-operation basis).
Expressions, Painless, and Mustache all default to `true` for inline,
file, and stored scripts to preserve the old scripting behavior.
Resolves#17114
Currently terms on an ip address try to put their binary representation in the
json response. With this commit, they would return a formatted ip address:
```
"buckets": [
{
"key": "192.168.1.7",
"doc_count": 1
}
]
```
We add support to explicitly exclude specific transport actions
from the request size limit check.
We also exclude the following request types currently:
*MasterPingRequest
* PingRequest
After considerable discussion, we have elected to set the default min
heap to 256m and the default max heap to 2g. This is to balance the
desire for a good out-of-the-box performance experience (default max
heap of 2g) with a good out-of-the-box experience running on machines
with limited resources or running multiple instances on a single modern
developer laptop (default min heap of 1g).
Benchmarking with the geonames data set shows that Elasticsearch truly
needs 2g of heap or the heap is a bottleneck for indexing rates. If our
goal is to satisfy the out-of-the-box-experience, we should prioritize
the out-of-the-box performance experience. Thus, the default heap should
be 2g until smaller heaps are not the bottleneck for indexing small data
sets.
As most of our log messages are not sentences and do not end with
periods, this commit removes a period from the end of the min master
node bootstrap check log message.
Rearranges the FingerprintAnalyzer so that AsciiFolding comes earlier in the chain (after lowercasing, before stop removal, for maximum deduping power)
Closes#18266
This commit fixes the Debian package requires clause for bash. For the
RPM it is okay to specify the binary, but for the Debian package the
package name must be specified.
This commit ensures that if CORS is enabled, then Origin headers are
checked regardless of whether the request came from a browser or not.
In the past, we only proceeded with CORS checks if the User-Agent was a
browser.
(same name and UUID) exists in the cluster state. This resolves a
situation where if an index data folder was copied into a node's data
directory while the node is running and that index had a tombstone in
the cluster state, the index would still get imported.
Closes#18250Closes#18249
It will keep using the caching terms enum for keyword/text fields and falls back
to IndexSearcher.count for fields that do not use the inverted index for
searching (such as numbers and ip addresses). Note that this probably means that
significant terms aggregations on these fields will be less efficient than they
used to be. It should be ok under a sampler aggregation though.
This moves tests back to the state they were in before numbers started using
points, and also adds a new test that significant terms aggs fail if a field is
not indexed.
In the long term, we might want to follow the approach that Robert initially
proposed that consists in collecting all documents from the background filter in
order to compute frequencies using doc values. This would also mean that
significant terms aggregations do not require fields to be indexed anymore.