Native scripts are no longer documented and instead using a ScriptEngine
is recommended. This change adds a deprecation warning for removal in
6.0.
relates #19966
This PR adds a new thread pool type: `fixed_auto_queue_size`. This thread pool
behaves like a regular `fixed` threadpool, except that every
`auto_queue_frame_size` operations (default: 10,000) in the thread pool,
[Little's Law](https://en.wikipedia.org/wiki/Little's_law) is calculated and
used to adjust the pool's `queue_size` either up or down by 50. A minimum and
maximum is taken into account also. When the min and max are the same value, a
regular fixed executor is used instead.
The `SEARCH` threadpool is changed to use this new type of thread pool. However,
the min and max are both set to 1000, meaning auto adjustment is opt-in rather
than opt-out.
Resolves#3890
Moves the remaining preconfigured token figured into the analysis-common module. There were a couple of tests in core that depended on the pre-configured token filters so I had to touch them:
* `GetTermVectorsCheckDocFreqIT` depended on `type_as_payload` but didn't do anything important with it. I dropped the dependency. Then I moved the test to a single node test case because we're trying to cut down on the number of `ESIntegTestCase` subclasses.
* `AbstractTermVectorsTestCase` and its subclasses depended on `type_as_payload`. I dropped their usage of the token filter and added an integration test for the termvectors API that uses `type_as_payload` to the `analysis-common` module.
* `AnalysisModuleTests` expected a few pre-configured token filtes be registered by default. They aren't any more so I dropped this assertion. We assert that the `CommonAnalysisPlugin` registers these pre-built token filters in `CommonAnalysisFactoryTests`
* `SearchQueryIT` and `SuggestSearchIT` had tests that depended on the specific behavior of the token filters so I moved the tests to integration tests in `analysis-common`.
In scripts (at least some of the languages), the terms dictionary and
postings can be access with the special _index variable. This is for
very advanced use cases which want to do their own scoring. The problem
is segment level statistics must be recomputed for every document.
Additionally, this is not friendly to the terms index caching as the
order of looking up terms should be controlled by lucene.
This change removes _index from scripts. Anyone using it can and should
instead write a Similarity plugin, which is explicitly designed to allow
doing the calculations needed for a relevance score.
closes#19359
Today when an index is `read-only` the index is also blocked from
being deleted which sometimes is undesired since in-order to make
changes to a cluster indices must be deleted to free up space. This is
a likely scenario in a hosted environment when disk-space is limited to switch
indices read-only but allow deletions to free up space.
This moves the releasing logic to the base test, so that individual test cases don't need
to worry about releasing the aggregators. It's not a big deal for individual aggs,
but once tests start using sub-aggs, it can become tricky to free (without double-freeing)
all the aggregators.
Range queries with now based date ranges were previously not allowed,
but since #23921 these queries were allowed. This change should really
fix range queries with now based date ranges.
Adding a unit test to InternalAdjecencyMatrix that extends the shared InternalAggregationTestCase
that we use for testing aggregations.
Relates to #22278
The test check that the number of outgoing/incoming recoveries of a shard is 0 after recoveries were done. Sadly that is not guaranteed by the current recovery logic as we decrement the counters only when all references to the relevant RecoveryTarget object have been released. This may happen in an async fashion to the recovery completion which causes the test to fail. I looked at options to change the recovery logic to have the recovery counters decrease before the recovery is done *under normally circumstances* but I don't see a clean way to do it. Since it won't give hard guarantees anyway I opted to add assertBusy to the test
Closes#24669
This commit adds a deprecation warning if `_index` is used in scripts.
It is emitted each time a script is invoked, but not per document. There
is no test because constructing a LeafIndexLookup is quite difficult,
but the deprecation warning does show up in IndexLookupIT, there is just
no way to assert warnings in integ tests.
relates #19359
Today we assert hart if failure listeners are invoked more than once. Yet, this
can happen if we cancel the execution since the caller and the handler will get
the exception on the cancelable threads and will notify the listener concurrently
if timinig allows. This commit relaxes the assertion towards handling multiple
invocations with `ExecutionCancelledException`
Closes#24010Closes#24179
Closes vagnerclementino/elasticsearch/#98
Template script engines (mustache, the only one) currently return a
BytesReference that users must know is utf8 encoded. This commit
modifies all callers and mustache to have the template engine return
String. This is much simpler, and does not require decoding in order to
use (for example, in ingest).
When retrieving documents to extract terms from as part of a more like this query, the _routing value can be set, yet it gets lost. That leads to not being able to retrieve the documents, hence more_like_this used to return no matches all the time.
Closes#23699
We generally accept string values when a boolean is expected. We've been doing that in our parsing code, but we missed that bit when moving parsing code to ObjectParser, which throws an error instead. This commit makes ObjectParser parse also string values into booleans. It throws an error in case the value is not `true` or `false`.
Closes#21802
The rendering methods in String and Long Significant String aggregations
and buckets are very similar. They can be factored out in the
InternalSignificantTerms class an InternalMappedSignificantTerms class.
This commit changes SignificantTerms.Bucket so that it is not an
abstract class anymore but an interface. It will be easier for the Java
High Level Rest Client to provide its own implementation of
SignificantTerms and SignificantTerms.Bucket. Also, it is now more
coherent with the others aggregations.
The disruption tests sit in a single test suite which causes these tests
to be single-threaded. We can split this test suite into multiple suites
(logically, of course) enabling them to be run in parallel reducing the
total run time of all integration tests in core. This commit splits the
discovery with service disruptions test suite into three suites
- master disruptions
- discovery disruptions
- cluster disruptions
The last one could probably be better named, it is meant to represent
performing actions in the cluster (indexing, failing a shard, etc.)
while a disruption is taking place.
Relates #24662
This commit removes the deprecated support for .yaml and .json files. If
the files still exist, the node will fail to start, indicating the file
must be converted or renamed.
closes#19391
With the current implementation, SniffNodesSampler might close the
current connection right after a request is sent but before the response
is correctly handled. This causes to timeouts in the transport client
when the sniffing is activated.
closes#24575closes#24557
When constructing an array list, if we know the size of the list in
advance (because we are adding objects to it derived from another list),
we should size the array list to the appropriate capacity in advance (to
avoid resizing allocations). This commit does this in various places.
Relates #24439
* Add parent-join module
This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.
Relates #20257
Today we prune transport handlers in TransportService when a node is disconnected.
This can cause connections to starve in the TransportService if the connection is
opened as a short living connection ie. without sharing the connection to a node
via registering in the transport itself. This change now moves to pruning based
on the connections cache key to ensure we notify handlers as soon as the connection
is closed for all connections not just for registered connections.
Relates to #24632
Relates to #24575
Relates to #24557
- Removes clusterState, getInitialClusterState and getMinimumMasterNodes methods from Discovery interface.
- Sets PingContextProvider in ZenPing constructor
- Renames state in ZenDiscovery to committedState
This commit documents how to write a `ScriptEngine` in order to use
expert internal apis, such as using Lucene directly to find index term
statistics. These documents prepare the way to remove both native
scripts and IndexLookup.
The example java code is actually compiled and tested under a new gradle
subproject for example plugins. This change does not yet breakup
jvm-example into the new examples dir, which should be done separately.
relates #19359
relates #19966
Specifying s3 access and secret keys inside repository settings are not
secure. However, until there is a way to dynamically update secure
settings, this is the only way to dynamically add repositories with
credentials that are not known at node startup time. This commit adds
back `access_key` and `secret_key` s3 repository settings, but protects
it with a required system property `allow_insecure_settings`.
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.
This is re-adds this commit that was revered in 952feb58e4
If we fail to acquire the shard lock, need to retry and wait for the new
cluster state, we were sending the wrong kind of request for the replica
action. This commit fixes this issue.
This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key.
Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR).
Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time.
Closes#23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.
This commit addresses an issue in the missing active IDs prevent advance
test from the global checkpoint tracker. The assumptions this test was
making about reality were violated when global checkpoints were inlined
(specifically, the component of that change where the tracker's
knowledge of the global checkpoint was updated inline with updates to
the tracker's knowledge of local checkpoints for an allocatio ID). The
point of the test was to ensure that a lagging shard prevents the global
checkpoint from advancing, so this commit rewrites the test with that in
mind.
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.
With global checkpoints we also take into account if a global checkpoint must be fsynced.
Yet, with recent addition of inlining global checkpoints into indexing operations from a
test perspective unnecessary fsyncs might be reported if `Translog#syncNeeded` is checked.
Now the test only check if the last write location triggers an fsync instead.
Closes#24600
This commit fixes an issue in the checkpoints advance test. Namely, when
there zero documents indexed, after the global checkpoint is synced, the
global checkpoint will have advanced to the no ops performed. There is a
larger conceptual problem here, namely that the primary does not update
its knowledge of its own local checkpoint upon recovery which causes the
global checkpoint to initially be unassigned and then advance to no ops
performed, but this will be addressed in a follow-up.
Previously query weight was created for each search hit that needed to compute inner hits,
with this change the weight of the inner hit query is computed once for all search hits.
Closes#23917
We allow non-dynamic settings to be updated on closed indices but we don't
check if the updated settings can be used to open/create the index.
This can lead to unrecoverable state where the settings are updated but the index
cannot be reopened since the settings are not valid. Trying to update the invalid settings
is also not possible since the update will fail to validate the current settings.
This change adds the validation of the updated settings for closed indices and make sure that the new settings
do not prevent the reopen of the index.
Fixes#23787