Adds a global soft limit on the number of concurrently executing enrich policies.
Since an enrich policy is run on the generic thread pool, this is meant to limit
policy runs separately from the generic thread pool capacity.
This PR adds a background maintenance task that is scheduled on the master node only.
The deletion of an index is based on if it is not linked to a policy or if the enrich alias is not
currently pointing at it. Synchronization has been added to make sure that no policy
executions are running at the time of cleanup, and if any executions do occur, the marking
process delays cleanup until next run.
This commit adds permissions validation on the indices provided in the
enrich policy. These indices should be validated at store time so as not
to have cryptic error messages in the event the user does not have
permissions to access said indices.
A bug was introduced in 6.6.0 when we added support for
rollup indices. Rollup caps does NOT support looking at
remote indices, consequently, since we always look up rollup
caps, the datafeed fails with an error if its config
includes a concrete remote index. (When all remote indices
in a datafeed config are wildcards the problem did not
occur.)
The rollups feature does not support remote indices, so if
there is any remote index in a datafeed config (wildcarded
or not), we can skip the rollup cap checks. This PR
implements that change.
In MachineLearningIT.testStopDataFrameAnalytics we call start and
then assert the state is `started`. However, if things go fast enough,
the state could have already changed to `reindexing` or `analyzing`.
The test has been failing occasionally due to the state being
`reindexing`. We fix this by simply asserting the state is either
of `started`, `reindexing` or `analyzing`.
Closes#43924
This brings TokenizerFactory into line with CharFilterFactory and TokenFilterFactory,
and removes the need to pass around tokenizer names when building custom analyzers.
As this means that TokenizerFactory is no longer a functional interface, the commit also
adds a factory method to TokenizerFactory to make construction simpler.
Disjunction over two individual terms in a phrase query with multi-word synonyms
wrongly applies a prefix query to each of these terms. This change fixes this bug
by inversing the logic to use prefixes on `phrase_prefix` queries only.
Closes#43308
This commit changes async IO processor to release the promiseSemaphore
before notifying consumers. This ensures that a bad consumer that
sometimes does blocking (or otherwise slow) operations does not halt the
processor. This should slightly increase the concurrency for shard
fsync, but primarily improves safety so that one bad piece of code has
less effect on overall system performance.
Enables libs/geo parser to return a geometry format object that can
perform both serialization and deserialization functions. This can
be useful for ingest nodes that are trying to modify an existing
geometry in the source.
Relates to #43554
* [ML][Data Frame] Adding bwc tests for pivot transform (#43506)
* [ML][Data Frame] Adding bwc tests for pivot transform
* adding continuous transforms
* adding continuous dataframes to bwc
* adding continuous data frame tests
* Adding rolling upgrade tests for continuous df
* Fixing test
* Adjusting indices used in BWC, and handling NPE for seq_no_stats
* updating and muting specific bwc test
* Adjusting bwc tests for backport
This is a prerequisite of #42189:
* Add directory delete method to blob container specific to each implementation:
* Some notes on the implementations:
* AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered which allows us to simply delete in the same order that we receive the blobs from the listing request. For AWS this simply required listing without the delimiter setting (so we get a deep listing) and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is, that even for very large numbers of blobs the memory requirements are now capped nicely since we go page by page when deleting.
* For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block.
* HDFS and FS are trivial since we have directory delete methods available for them
* Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)
IndexAnalyzers has a close() method that should iterate through all its wrapped
analyzers and close each one in turn. However, instead of delegating to the
analyzers' close() methods, it instead wraps them in a Closeable interface,
which just returns a list of the analyzers. In addition, whitespace normalizers are
ignored entirely.
Introduced proxy api the handle the search request load that originates
from enrich processor. The enrich processor can execute many search
requests that execute asynchronously in parallel and that can easily overwhelm
the search thread pool on nodes. In order to protect this the Coordinator
queues the search requests and only executes a fixed number of search requests
in parallel.
Besides this; the Coordinator tries to include as much as possible search requests
(up to a defined maximum) inside a multi search request in order to reduce the number
of remote api calls to be made from the node that performs ingestion.
When profiling a call to `AllocationService#reroute()` in a large cluster
containing allocation filters of the form `node-name-*` I observed a nontrivial
amount of time spent in `Regex#simpleMatch` due to these allocation filters.
Patterns ending in a wildcard are not uncommon, and this change treats them as
a special case in `Regex#simpleMatch` in order to shave a bit of time off this
calculation. It also uses `String#regionMatches()` to avoid an allocation in
the case that the pattern's only wildcard is at the start.
Microbenchmark results before this change:
Result "org.elasticsearch.common.regex.RegexStartsWithBenchmark.performSimpleMatch":
1113.839 ±(99.9%) 6.338 ns/op [Average]
(min, avg, max) = (1102.388, 1113.839, 1135.783), stdev = 9.486
CI (99.9%): [1107.502, 1120.177] (assumes normal distribution)
Microbenchmark results with this change applied:
Result "org.elasticsearch.common.regex.RegexStartsWithBenchmark.performSimpleMatch":
433.190 ±(99.9%) 0.644 ns/op [Average]
(min, avg, max) = (431.518, 433.190, 435.456), stdev = 0.964
CI (99.9%): [432.546, 433.833] (assumes normal distribution)
The microbenchmark in question was:
@Fork(3)
@Warmup(iterations = 10)
@Measurement(iterations = 10)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@SuppressWarnings("unused") //invoked by benchmarking framework
public class RegexStartsWithBenchmark {
private static final String testString = "abcdefghijklmnopqrstuvwxyz";
private static final String[] patterns;
static {
patterns = new String[testString.length() + 1];
for (int i = 0; i <= testString.length(); i++) {
patterns[i] = testString.substring(0, i) + "*";
}
}
@Benchmark
public void performSimpleMatch() {
for (int i = 0; i < patterns.length; i++) {
Regex.simpleMatch(patterns[i], testString);
}
}
}
* Optimize Snapshot Finalization
* Delete index-N blobs and segement blobs in one single bulk delete instead of in separate ones to save RPC calls on implementations that have bulk deletes implemented
* Don't fail snapshot because deleting old index-N failed, this results in needlessly logging finalization failures and makes analysis of failures harder going forward as well as incorrect index.latest blobs
This introduces a `failed` state to which the data frame analytics
persistent task is set to when something unexpected fails. It could
be the process crashing, the results processor hitting some error,
etc. The failure message is then captured and set on the task state.
From there, it becomes available via the _stats API as `failure_reason`.
The df-analytics stop API now has a `force` boolean parameter. This allows
the user to call it for a failed task in order to reset it to `stopped` after
we have ensured the failure has been communicated to the user.
This commit also adds the analytics version in the persistent task
params as this allows us to prevent tasks to run on unsuitable nodes in
the future.
This commit deprecates the `transport.profiles.*.xpack.security.type`
setting. This setting is used to configure a profile that would only
allow client actions. With the upcoming removal of the transport client
the setting should also be deprecated so that it may be removed in
a future version.
* Add Ability to List Child Containers to BlobContainer (#42653)
* Add Ability to List Child Containers to BlobContainer
* This is a prerequisite of #42189
This adds the ability to execute an action for each element that occurs
in an array, for example you could sent a dedicated slack action for
each search hit returned from a search.
There is also a limit for the number of actions executed, which is
hardcoded to 100 right now, to prevent having watches run forever.
The watch history logs each action result and the total number of actions
the were executed.
Relates #34546
All valid licenses permit security, and the only license state where
we don't support security is when there is a missing license.
However, for safety we should attach the system (or xpack/security)
user to internally originated actions even if the license is missing
(or, more strictly, doesn't support security).
This allows all nodes to communicate and send internal actions (shard
state, handshake/pings, etc) even if a license is transitioning
between a broken state and a valid state.
Relates: #42215
Backport of: #43468
Document level security was depending on the shared
"BitsetFilterCache" which (by design) never expires its entries.
However, when using DLS queries - particularly templated ones - the
number (and memory usage) of generated bitsets can be significant.
This change introduces a new cache specifically for BitSets used in
DLS queries, that has memory usage constraints and access time expiry.
The whole cache is automatically cleared if the role cache is cleared.
Individual bitsets are cleared when the corresponding lucene index
reader is closed.
The cache defaults to 50MB, and entries expire if unused for 7 days.
Backport of: #43669
This change fixes the name of the index_prefix sub field when the `index_prefix`
option is set on a text field that is nested under an object or a multi-field.
We don't use the full path of the parent field to set the index_prefix field name
so the field is registered under the wrong name. This doesn't break queries since
we always retrieve the prefix field through its parent field but this breaks other
APIs like _field_caps which tries to find the parent of the `index_prefix` field
in the mapping but fails.
Closes#43741
If an item in the bulk request fails, that could be for a variety of
reasons - it may be that the underlying behaviour of security has
changed, or it may just be a transient failure during testing.
Simply asserting a `true`/`false` value produces failure messages that
are difficult to diagnose and debug. Using hamcert (`assertThat`) will
make it easier to understand the causes of failures in this test.
Backport of: #43725
Add the "Authorization" section to the API key API docs.
These APIs require The new manage_api_key cluster privilege.
Relates: #43865
Backport of: #43811
This commit allows bulk upserts to correctly read the default pipeline
for the concrete index that belongs to an alias.
Bulk upserts are modeled differently from normal index requests such that
the index request is a request inside of the update request. The update
request (outer) contains the index or alias name is not part of the (inner)
index request. This commit adds a secondary check against the update request
(outer) if the index request (inner) does not find an alias.