By default ML native processes are only allowed to use
30% of RAM, so the previous 2GB setting prevented the
test passing on VMs with only 4GB RAM. This change
reduces the limit to 1200MB, which means it can now
pass on VMs with 4GB RAM.
As the first record is random, there's a chance it will
be aligned on a bucket start. Thus we need to check the
bucket count is in [23, 24].
Closes#30715
These tests aim to check the set model memory limit is
respected. Additionally, it was asserting counts of
partition, by, over fields in an attempt to check that
the used memory is spent meaningfully. However, this
made the tests fragile, as changes in the ml-cpp could
lead to CI failures.
This commit removes those assertions. We are working on
adding tests in ml-cpp that will compensate.
diskspace and creates a subfolder for storing data outside of Lucene
indexes, but as part of the ES data paths.
Details:
- tmp storage is managed and does not allow allocation if disk space is
below a threshold (5GB at the moment)
- tmp storage is supposed to be managed by the native component but in
case this fails cleanup is provided:
- on job close
- on process crash
- after node crash, on restart
- available space is re-checked for every forecast call (the native
component has to check again before writing)
Note: The 1st path that has enough space is chosen on job open (job
close/reopen triggers a new search)
It is possible for state documents to be
left behind in the state index. This may be
because of bugs or uncontrollable scenarios.
In any case, those documents may take up quite
some disk space when they add up. This commit
adds a step in the expired data deletion that
is part of the daily maintenance service. The
new step searches for state documents that
do not belong to any of the current jobs and
deletes them.
Closes#30551
This commit fixes an issue with the data diagnostics were
empty buckets are not reported even though they should. Once
a job is reopened, the diagnostics do not get initialized from
the current data counts (especially the latest record timestamp).
The result is that if the data that is sent have a time gap compared
to the previous ones, that gap is not accounted for in the empty bucket
count.
This commit fixes that by initializing the diagnostics with the current
data counts.
Closes#30080
Tests need to wait for changes to the job's established memory usage to
propagate and an over enthusiastic optimisation meant jobs were updated
from stale state causing recent change to be lost.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.