diskspace and creates a subfolder for storing data outside of Lucene
indexes, but as part of the ES data paths.
Details:
- tmp storage is managed and does not allow allocation if disk space is
below a threshold (5GB at the moment)
- tmp storage is supposed to be managed by the native component but in
case this fails cleanup is provided:
- on job close
- on process crash
- after node crash, on restart
- available space is re-checked for every forecast call (the native
component has to check again before writing)
Note: The 1st path that has enough space is chosen on job open (job
close/reopen triggers a new search)
Watcher tests now always fail hard when watches that were
tried to be triggered in a test using the trigger() method,
but could not because they were not found on any of the
nodes in the cluster.
This change adds version information in case a native ML process crashes, the version is important for choosing the right symbol files when analyzing the crash. Adding the version combines all necessary information on one line.
relates elastic/ml-cpp#94
I still do not like == false. However, I am so use to reading it that
today I read this line of code and could not understand how it could
possibly be doing the right thing. It was only when I finally noticed
the ! that the code made sense. This commit changes this code to be in
our style of == false. I still do not like == false.
Get Settings API changes have now been backported to version 6.4, and
therefore the latest version must send and expect the extra fields when
communicating with 6.4+ code.
Relates #29229#30494
It is possible for state documents to be
left behind in the state index. This may be
because of bugs or uncontrollable scenarios.
In any case, those documents may take up quite
some disk space when they add up. This commit
adds a step in the expired data deletion that
is part of the daily maintenance service. The
new step searches for state documents that
do not belong to any of the current jobs and
deletes them.
Closes#30551
There's no need for an extra blobExists() call when writing a blob to the GCS service. GCS provides
an option (with stronger consistency guarantees) on the insert method that guarantees that the
blob that's uploaded does not already exist.
Relates to #19749
Adjust fast forward for token expiration test
Adjusts the maximum fast forward time for token expiration tests
to be 5 seconds before actual token expiration so that the test
won't fail even when upperlimit is randomly selected.
Resolves: #30062
Currently in a rescore request if window_size is smaller than
the top N documents returned (N=size), explanation of scores could be incorrect
for documents that were a part of topN and not part of rescoring.
This PR corrects this, but saving in RescoreContext docIDs of documents
for which rescoring was applied, and adding rescoring explanation
only for these docIDs.
Closes#28725
The camel case name `nGram` should be removed in favour of `ngram` and
similar for `edgeNGram` and `edge_ngram`. Before removal, we need to
deprecate the camel case names first. This change adds deprecation
warnings for indices with versions 6.4.0 and higher and logs deprecation
warnings.
The part of the history template responsible for slack attachments had a
dynamic mapping configured which could lead to problems, when a string
value looking like a date was configured in the value field of an
attachment.
This commit fixes the template by setting this field always to text.
This also requires a change in the template numbering to be sure this
will be applied properly when starting watcher.
Since #30143, the Cluster State API should always returns the current
cluster_uuid in the response body, regardless of the metrics filters.
This is not exactly true as it is returned only if metadata metrics and
no specific indices are requested.
This commit fixes the behavior to always return the cluster_uuid and
add new test.
This test failed but the cause is not obvious. This commit adds more
debug logging traces so that if it reproduces we could gather more
information.
Related #30577
The default behaviour for Apache HTTP client is to mimic the standard
browser behaviour of clearing the authentication cache (for a given
host) if that host responds with 401.
This behaviour is appropriate in a interactive browser environment
where the user is given the opportunity to provide alternative
credentials, but it is not the preferred behaviour for the ES REST
client.
X-Pack may respond with a 401 status if a request is made before the
node/cluster has recovered sufficient state to know how to handle the
provided authentication credentials - for example the security index
need to be recovered before we can authenticate native users.
In these cases the correct behaviour is to retry with the same
credentials (rather than discarding those credentials).
Adds windows server 2012r2 and 2016 vagrant boxes to packaging tests.
They can only be used if IDs for their images are specified, which are
passed to gradle and then to vagrant via env variables. Adds options
to the project property `vagrant.boxes` to choose between linux and
windows boxes.
Bats tests are run only on linux boxes, and portable packaging tests run
on all boxes. Platform tests are only run on linux boxes since they are
not being maintained.
For #26741
This commit removes xpack from being a meta-plugin-as-a-module.
It also fixes a couple tests which were missing task dependencies, which
failed once the gradle execution order changed.
Removes dependency for server's version from the JDBC driver code. This
should allow us to dramatically reduce driver's size by removing the
server dependency from the driver.
Relates #29856
This commit increases the logging level around search to aid in
debugging failures in LicensingTests#testSecurityActionsByLicenseType
where we are seeing all shards failed error while trying to search the
security index.
See #30301
Date histograms on non-fixed timezones such as `Europe/Paris` proved much slower
than histograms on fixed timezones in #28727. This change mitigates the issue by
using a fixed time zone instead when shard data doesn't cross a transition so
that all timestamps share the same fixed offset. This should be a common case
with daily indices.
NOTE: Rewriting the aggregation doesn't work since the timezone is then also
used on the coordinating node to create empty buckets, which might be out of the
range of data that exists on the shard.
NOTE: In order to be able to get a shard context in the tests, I reused code
from the base query test case by creating a new parent test case for both
queries and aggregations: `AbstractBuilderTestCase`.
Mitigates #28727
This pipeline aggregation gives the user the ability to script functions that "move" across a window
of data, instead of single data points. It is the scripted version of MovingAvg pipeline agg.
Through custom script contexts, we expose a number of convenience methods:
- MovingFunctions.max()
- MovingFunctions.min()
- MovingFunctions.sum()
- MovingFunctions.unweightedAvg()
- MovingFunctions.linearWeightedAvg()
- MovingFunctions.ewma()
- MovingFunctions.holt()
- MovingFunctions.holtWinters()
- MovingFunctions.stdDev()
The user can also define any arbitrary logic via their own scripting, or combine with the above methods.
The TemplateUpgradeService is a system service that allows for plugins
to register templates that need to be upgraded. These template upgrades
should always happen in a system context as they are not a user
initiated action. For security integrations, the lack of running this
in a system context could lead to unexpected failures. The changes in
this commit set an empty system context for the execution of the
template upgrades performed by this service.
Relates #30603
When processing a top-level sibling pipeline, we destructively sublist
the path by assigning back onto the same variable. But if aggs are
specified such:
A. Multi-bucket agg in the first entry of our internal list
B. Regular agg as the immediate child of the multi-bucket in A
C. Regular agg with the same name as B at the top level, listed as the
second entry in our internal list
D. Finally, a pipeline agg with the path down to B
We'll get class cast exception. The first agg will sublist the path
from [A,B] to [B], and then when we loop around to check agg C,
the sublisted path [B] matches the name of C and it fails.
The fix is simple: we just need to store the sublist in a new object
so that the old path remains valid for the rest of the aggs in the loop
Closes#30608
* Fixes IndiceOptionsTests to serialise correctly
Previous to this change `IndicesOptionsTests.testSerialisation()` would
select a complete random version for both the `StreamOutput` and the
`StreamInput`. This meant that the output could be selected as 7.0+
while the input was selected as <7.0 causing the stream to be written
in the new format and read in teh old format (or vica versa). This
change splits the two cases into different test methods ensuring that
the Streams are at least on compatibile versions even if they are on
different versions.
* Use same random version for input and output streams
server/src/test/java/org/elasticsearch/action/support/IndicesOptionsTest
s.java
This change adds a `listTasks` method to the high level java
ClusterClient which allows listing running tasks through the
task management API.
Related to #27205
* Refactors ClientHelper to combine header logic
This change removes all the `*ClientHelper` classes which were
repeating logic between plugins and instead adds
`ClientHelper.executeWithHeaders()` and
`ClientHelper.executeWithHeadersAsync()` methods to centralise the
logic for executing requests with stored security headers.
* Removes Watcher headers constant