Currently `fuzziness` is not supported for the `cross_fields` type
of the `multi_match` query since it complicates the logic that
blends the term queries that cross_fields uses internally. At the
moment using this combination is silently ignored, which can lead to
confusions. Instead we should throw an exception in this case.
The same is true for phrase and phrase_prefix type.
Closes#7764
This commit adds a test that a fixed executors rejected count behaves as
expected. In particular, we test that if we consume the executor, then
stuff the executor queue, further tasks will be rejected and the
rejected stats are updated appropriately. This test also asserts that if
we resize the queue the rejected count is reset to zero.
Relates #18301
Currently terms on an ip address try to put their binary representation in the
json response. With this commit, they would return a formatted ip address:
```
"buckets": [
{
"key": "192.168.1.7",
"doc_count": 1
}
]
```
We add support to explicitly exclude specific transport actions
from the request size limit check.
We also exclude the following request types currently:
*MasterPingRequest
* PingRequest
As most of our log messages are not sentences and do not end with
periods, this commit removes a period from the end of the min master
node bootstrap check log message.
Rearranges the FingerprintAnalyzer so that AsciiFolding comes earlier in the chain (after lowercasing, before stop removal, for maximum deduping power)
Closes#18266
This commit fixes the Debian package requires clause for bash. For the
RPM it is okay to specify the binary, but for the Debian package the
package name must be specified.
This commit ensures that if CORS is enabled, then Origin headers are
checked regardless of whether the request came from a browser or not.
In the past, we only proceeded with CORS checks if the User-Agent was a
browser.
(same name and UUID) exists in the cluster state. This resolves a
situation where if an index data folder was copied into a node's data
directory while the node is running and that index had a tombstone in
the cluster state, the index would still get imported.
Closes#18250Closes#18249
It will keep using the caching terms enum for keyword/text fields and falls back
to IndexSearcher.count for fields that do not use the inverted index for
searching (such as numbers and ip addresses). Note that this probably means that
significant terms aggregations on these fields will be less efficient than they
used to be. It should be ok under a sampler aggregation though.
This moves tests back to the state they were in before numbers started using
points, and also adds a new test that significant terms aggs fail if a field is
not indexed.
In the long term, we might want to follow the approach that Robert initially
proposed that consists in collecting all documents from the background filter in
order to compute frequencies using doc values. This would also mean that
significant terms aggregations do not require fields to be indexed anymore.
Today we encourage users to set their minimum and maximum heap settings
equal to each other to prevent the heap from resizing. Yet, the default
heap settings do not start Elasticsearch in this fashion. This commit
addresses this discrepancy by setting the default min heap to '512m' and
the default max heap to the default min heap.
Relates #16334
This commit modifies two logging statements in the
IndexingMemoryController to log the key for the setting
indices.memory.index_buffer_size instead of the object.
Relates #18191
* Docs: First pass at improving analyzer docs
I've rewritten the intro to analyzers plus the docs
for all analyzers to provide working examples.
I've also removed:
* analyzer aliases (see #18244)
* analyzer versions (see #18267)
* snowball analyzer (see #8690)
Next steps will be tokenizers, token filters, char filters
* Fixed two typos