Requesting a million hits, or page 100,000 is always a bad idea, but users
may not be aware of this. This adds a per-index limit on the maximum size +
from that can be requested which defaults to 10,000.
This should not interfere with deep-scrolling.
Closes#9311
The documentation states that scrolls are automatically closed when all
documents are consumed, but this is not the case. I first tried to fix
the code to close scrolls automatically but this made REST tests fail
because clearing a scroll that is already closed returned a 4xx error
instead of a 2xx code, so this has probably been this way for a very long
time.
Just like specifying `?preference=_primary`, this adds the ability to
specify `?preference=_replica` or `?preference=_replica_first` on
requests that support it.
Resolves#12222
Field stats index constraints allows to omit all field stats for indices that don't match with the constraint. An index
constraint can exclude indices' field stats based on the `min_value` and `max_value` statistic. This option is only
useful if the `level` option is set to `indices`.
For example index constraints can be useful to find out the min and max value of a particular property of your data in
a time based scenario. The following request only returns field stats for the `answer_count` property for indices
holding questions created in the year 2014:
curl -XPOST 'http://localhost:9200/_field_stats?level=indices' -d '{
"fields" : ["answer_count"] <1>
"index_constraints" : { <2>
"creation_date" : { <3>
"min_value" : { <4>
"gte" : "2014-01-01T00:00:00.000Z",
},
"max_value" : {
"lt" : "2015-01-01T00:00:00.000Z"
}
}
}
}'
Closes#11187
In order to be more consistent with what they do, the query cache has been
renamed to request cache and the filter cache has been renamed to query
cache.
A known issue is that package/logger names do no longer match settings names,
please speak up if you think this is an issue.
Here are the settings for which I kept backward compatibility. Note that they
are a bit different from what was discussed on #11569 but putting `cache` before
the name of what is cached has the benefit of making these settings consistent
with the fielddata cache whose size is configured by
`indices.fielddata.cache.size`:
* index.cache.query.enable -> index.requests.cache.enable
* indices.cache.query.size -> indices.requests.cache.size
* indices.cache.filter.size -> indices.queries.cache.size
Close#11569
This change unifies the way scripts and templates are specified for all instances in the codebase. It builds on the Script class added previously and adds request building and parsing support as well as the ability to transfer script objects between nodes. It also adds a Template class which aims to provide the same functionality for template APIs
Closes#11091
The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queries. Changed the default to `true`, it can always be changed per request.
Closes#10627Closes#11067
Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't
make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably.
One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow
things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, term
s are pulled out of the automata for multi term queries).
Removing our fork means the following in terms of features:
- dropped support for require_field_match: the postings highlighter will only highlight fields that were queried
- some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery.
Closes#10625Closes#11077
Previously, collate feature would be executed on all shards of an index using the client,
this leads to a deadlock when concurrent collate requests are run from the _search API,
due to the fact that both the external request and internal collate requests use the
same search threadpool.
As phrase suggestions are generated from the terms of the local shard, in most cases the
generated suggestion, which does not yield a hit for the collate query on the local shard
would not yield a hit for collate query on non-local shards.
Instead of using the client for collating suggestions, collate query is executed against
the ContextIndexSearcher. This PR removes the ability to specify a preference for a collate
query, as the collate query is only run on the local shard.
closes#9377
Add methods to operate on multi-valued fields in the expressions language.
Note that users will still not be able to access individual values
within a multi-valued field.
The following methods will be included:
* min
* max
* avg
* median
* count
* sum
Additionally, changes have been made to MultiValueMode to support the
new median method.
closes#11105
Removes the More Like This API, users should now use the More Like This query.
The MLT API tests were converted to their query equivalent. Also some clean
ups in MLT tests.
Closes#10736Closes#11003