If a job is deleted and the GetJobs API is called immediately afterwards,
it is possible for the deleted job to still be returned in the response. This is
likely due to the GetJobs API being executed on a node with a slightly
stale cluster state which still shows the job as existing.
So we delegate to the master node so that the list of jobs/tasks is current.
After routing to the master, we need to check whether the rollup job
is present in the persistent tasks section of the cluster state (CS). A cancel
can be acknowledged and the job removed from the CS while the allocated
task is still alive. So we first check the CS to make sure the job is really
there before going to the allocated task to get its status.
As an extra precaution, when running local to the task, we also make
sure the task isn't canceled before including it in the response.
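A rough sketch of that double-check (the `RollupJobTask` interface and the
method names below are illustrative stand-ins, not the actual x-pack classes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class GetJobsFilterSketch {
    /** Illustrative stand-in for an allocated rollup task. */
    interface RollupJobTask {
        String getJobId();
        boolean isCancelled();
        Object getStatus();
    }

    /**
     * Return the statuses of local tasks whose job still exists in the
     * (master-sourced) cluster state and which haven't been cancelled yet.
     */
    static List<Object> collectStatuses(Set<String> jobIdsInClusterState,
                                        List<RollupJobTask> locallyAllocatedTasks) {
        List<Object> statuses = new ArrayList<>();
        for (RollupJobTask task : locallyAllocatedTasks) {
            // 1) Only trust tasks whose job is still present in the cluster state
            if (jobIdsInClusterState.contains(task.getJobId()) == false) {
                continue;
            }
            // 2) Extra precaution: skip tasks that have already been cancelled
            if (task.isCancelled()) {
                continue;
            }
            statuses.add(task.getStatus());
        }
        return statuses;
    }
}
```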
relates elastic/x-pack-elasticsearch#4041
Original commit: elastic/x-pack-elasticsearch@3b6fb65e12
`doSaveState` can be invoked on different types of failure. Some of
these failures are recoverable (e.g. a search exception) and just cause
the job to reset until the next trigger time. Other exceptions might
be caused by an abort request.
Previously `doSaveState` assumed that the indexer state would be
INDEXING, STOPPED or STARTED and asserted that. But if we are ABORTING,
the assertion failed, and in production we would try to persist
that aborting state, which is not needed (and may complicate matters later).
This commit removes the assertion and only tries to persist if we
are not aborting. If we're aborting, we just invoke the next handler,
which is likely an onFailure handler.
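A minimal sketch of the new behavior, assuming a simplified state enum and
plain `Runnable` handlers rather than the real indexer API:

```java
class SaveStateSketch {
    enum IndexerState { STARTED, INDEXING, STOPPED, ABORTING }

    static void doSaveState(IndexerState state, Runnable persistState, Runnable next) {
        if (state == IndexerState.ABORTING) {
            // An abort is underway (e.g. the task is being cancelled), so persisting
            // this transient state is pointless; just invoke the next handler, which
            // is likely an onFailure handler.
            next.run();
        } else {
            // Recoverable failures and normal checkpoints persist the current state
            // (INDEXING, STOPPED or STARTED) before continuing.
            persistState.run();
            next.run();
        }
    }
}
```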
Relates to elastic/x-pack-elasticsearch#4243
Original commit: elastic/x-pack-elasticsearch@3643b7c0e4
If there are multiple jobs that are all the "best" (e.g. they share the
best interval), we have no way of knowing which is actually the best.
Unfortunately, we cannot just filter for all the jobs in a single
search because their doc_counts can potentially overlap.
To solve this, we execute an msearch-per-job so that the results
stay isolated. When rewriting the response, we iteratively
unroll and reduce the independent msearch responses into a single
"working tree". This allows us to intervene if there are
overlapping buckets and manually choose a doc_count.
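As a rough illustration of that reduction, with plain maps standing in for
the real agg objects and the overlap policy left to a caller-supplied resolver:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class WorkingTreeSketch {
    /**
     * Fold per-job bucket maps (bucket key -> doc_count) into one "working"
     * map. The resolver only runs when two jobs produced the same bucket,
     * i.e. when their doc_counts overlap and we have to choose one.
     */
    static Map<String, Long> reduce(List<Map<String, Long>> perJobBuckets,
                                    BinaryOperator<Long> overlapResolver) {
        Map<String, Long> working = new HashMap<>();
        for (Map<String, Long> jobBuckets : perJobBuckets) {
            for (Map.Entry<String, Long> bucket : jobBuckets.entrySet()) {
                working.merge(bucket.getKey(), bucket.getValue(), overlapResolver);
            }
        }
        return working;
    }
}
```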
Job selection is found by recursively descending through the aggregation
tree and independently pruning the list of valid job caps in each branch.
When a leaf node is reached in the branch, the remaining jobs are
sorted by "best'ness" (see comparator in RollupJobIdentifierUtils for the
implementation) and added to a global set of "best jobs". Once
all branches have been evaluated, the final set is returned to the
calling code.
Job "best'ness", briefly, means the job(s) that have (see the sketch below):
- The largest compatible date interval
- Fewer and larger interval histograms
- Fewer terms groups
Note: the final set of "best" jobs is not guaranteed to be minimal;
there may be redundant effort due to independent branches choosing
jobs that are subsets of jobs chosen by other branches.
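A hedged sketch of that ordering, using a hypothetical `JobCap` record rather
than the real caps classes (see RollupJobIdentifierUtils for the actual
comparator):

```java
import java.util.Comparator;

class BestnessSketch {
    /** Hypothetical summary of a job's capabilities, for comparison only. */
    record JobCap(long dateIntervalMillis, int histoGroupCount,
                  long smallestHistoInterval, int termsGroupCount) {}

    /** Orders job caps so that the "best" job sorts first. */
    static final Comparator<JobCap> BEST_FIRST =
        Comparator.comparingLong(JobCap::dateIntervalMillis).reversed()        // largest compatible date interval
            .thenComparing(Comparator.comparingInt(JobCap::histoGroupCount))   // fewer interval histograms...
            .thenComparing(Comparator.comparingLong(JobCap::smallestHistoInterval).reversed()) // ...and larger ones
            .thenComparing(Comparator.comparingInt(JobCap::termsGroupCount));  // fewer terms groups
}
```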
Related changes:
- We have to include the job's ID in the rollup doc's
hash so that different jobs don't overwrite the same summary
document (illustrated after this list).
- Now that we iteratively reduce the agg tree, the agg framework
injects empty buckets while we're working. In most cases this
is harmless, but for `avg` aggs the empty bucket is a SumAgg, while
any unrolled versions are converted into AvgAggs, causing a cast
exception. To get around this, avgs are renamed to
`{source_name}.value` to prevent the conflict.
- The job filtering has been pushed up into a query filter, since it
applies to the entire msearch rather than just individual agg components.
- We no longer add a filter agg clause about the date_histo's interval, because
that is handled by the job validation and pruning.
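For the first bullet, a toy illustration of folding the job ID into the rollup
document's ID; this is not the module's actual hashing scheme:

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

class RollupDocIdSketch {
    /**
     * Build a document ID from the job ID plus the group key, so two jobs
     * that roll up the same group key write to different summary documents.
     */
    static String docId(String jobId, Map<String, Object> groupKey) {
        // Sort the key/value pairs so the ID is stable regardless of map iteration order
        TreeMap<String, Object> sorted = new TreeMap<>(groupKey);
        int hash = Objects.hash(jobId, sorted.toString());
        return jobId + "$" + Integer.toHexString(hash);
    }
}
```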
Original commit: elastic/x-pack-elasticsearch@995be2a039
We had a Usage class before, but weren't registering it with XPack.
It would be nice to add more usage info in the future (like the running
jobs on each node), but it's unclear how best to do that since we'd need
to filter through the list of allocated tasks.
Original commit: elastic/x-pack-elasticsearch@5207d2758b
This commit fixes the Javadocs for the class o.e.x.r.j.RollupIndexer as
these Javadocs were referring to instance methods on the class
incorrectly (using a `this` prefix).
Original commit: elastic/x-pack-elasticsearch@fdcc7338f9
This wraps the stream (`.streamInput()`) that is passed to many of the
`createParser` instances in the enclosing (or a new) try-with-resources block.
This ensures the stream returned by `BytesReference.streamInput()` is closed.
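A generic, self-contained illustration of the pattern, with plain java.io
types standing in for `BytesReference.streamInput()` and the `createParser`
result:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class StreamClosingSketch {
    static String firstLine(byte[] source) throws IOException {
        // Both the "stream" and the "parser" built on top of it are declared in
        // the same try-with-resources block, so the stream is closed even if
        // constructing or using the parser throws.
        try (InputStream stream = new ByteArrayInputStream(source);
             BufferedReader parser = new BufferedReader(
                     new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            return parser.readLine();
        }
    }
}
```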
Relates to elastic/x-pack-elasticsearch#28504
Original commit: elastic/x-pack-elasticsearch@7546e3b4d4
Use AggregatorTestCase's `newIndexSearcher()` instead. Lucene's
version can randomly wrap the IndexReader with things we can't handle,
like ParallelCompositeReader.
Original commit: elastic/x-pack-elasticsearch@b4c0e9a601
The test job helper randomizes the index name with 1-10 characters,
which can cause the randomized index names to overlap and show fewer
caps than the test expects.
The solution is to just use index names "0"-"24" to ensure none
of the names overlap, and thus the caps don't overlap.
Original commit: elastic/x-pack-elasticsearch@74a6d13213
The arrangement of the final latch meant the latch could count down
and the test could end before the await() triggered, which caused the
thread to be interrupted and fail. The whole arrangement was incorrect
anyhow.
We need to await the latch before sending the search response as before,
but move the final AtomicBoolean to the second time the persistent
task status is updated, which is the signal that we are done
and can end the test.
If these tests continue to be flaky, we should probably just remove them.
The headers are tested elsewhere and are not required to be tested in this
context.
Original commit: elastic/x-pack-elasticsearch@0cf5603972
An incorrectly placed latch caused this test to run slowly (until the await
finished) and could probably cause failures due to incorrect ordering.
Original commit: elastic/x-pack-elasticsearch@ebeb8655da
The latches were not placed correctly, allowing the aborts
to be set before we checked the state for INDEXING the first time.
This was due to using the DelayingIndexer's built-in latch, which
isn't placed quite where we needed it.
Original commit: elastic/x-pack-elasticsearch@590cfa07b0
This adds a new Rollup module to XPack, which allows users to configure periodic "rollup jobs" to pre-aggregate data. That data is then available later for search through a special RollupSearch API, which mimics the DSL and functionality of regular search.
Rollups are used to drastically reduce the on-disk footprint of metric-based data (e.g. timestamped documents with numeric and keyword fields). They can also be used to speed up aggregations over large datasets, since the rolled-up data will be considerably smaller and contain fewer documents to search.
The PR adds seven new endpoints to interact with Rollups: create/get/delete job, start/stop job, a capabilities API similar to field-caps, and a Rollup-enabled search.
Original commit: elastic/x-pack-elasticsearch@dcde91aacf