Commit Graph

142 Commits

Author SHA1 Message Date
Jim Ferenczi ec63160243 Fix max boundary for rollups job that use a delay (#42158)
Rollup jobs can define how long they should wait before rolling up new documents.
However if the delay is smaller or if it's not a multiple of the rollup interval
the job can create incomplete buckets because the max boundary for a job is computed
from the time when the job started rounded to the interval minus the delay. This change
fixes this computation by applying the delay substraction before the rounding in order to ensure
that we never create a boundary that falls in a middle of a bucket.
2019-05-21 08:48:53 +02:00
Zachary Tong 6ae6f57d39
[7.x Backport] Force selection of calendar or fixed intervals (#41906)
The date_histogram accepts an interval which can be either a calendar
interval (DST-aware, leap seconds, arbitrary length of months, etc) or
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.

This leads to confusing arrangement where `1d` == calendar, but
`2d` == fixed.  And if you want a day of fixed time, you have to
specify `24h` (e.g. the next smallest unit).  This arrangement is very
error-prone for users.

This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.

The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the
old dual-purpose interval will be removed.

The change applies to both REST and java clients.
2019-05-20 12:07:29 -04:00
Zachary Tong f410f91f13 Cleanup RollupSearch exceptions, disallow partial results (#41272)
- msearch exceptions should be thrown directly instead of wrapping
in a RuntimeException
- Do not allow partial results (where some indices are missing), 
instead throw an exception if any index is missing
2019-05-08 12:38:42 -04:00
Ryan Ernst 6fd8924c5a Switch run task to use real distro (#41590)
The run task is supposed to run elasticsearch with the given plugin or
module. However, for modules, this is most realistic if using the full
distribution. This commit changes the run setup to use the default or
oss as appropriate.
2019-05-06 12:34:07 -07:00
Hendrik Muhs 0c03707704 [ML-DataFrame] reset/clear the position after indexer is done (#41736)
reset/clear the position after indexer is done
2019-05-06 09:41:51 +02:00
Zachary Tong ec5dd0594f Disallow null/empty or duplicate composite sources (#41359)
Adds some validation to prevent duplicate source names from being
used in the composite agg.

Also refactored to use a ConstructingObjectParser and removed the
private ctor and setter for sources, making it mandatory.
2019-04-24 13:23:31 -04:00
Zachary Tong 7e62ff2823 [Rollup] Validate timezones based on rules not string comparision (#36237)
The date_histogram internally converts obsolete timezones (such as
"Canada/Mountain") into their modern equivalent ("America/Edmonton").
But rollup just stored the TZ as provided by the user.

When checking the TZ for query validation we used a string comparison,
which would fail due to the date_histo's upgrading behavior.

Instead, we should convert both to a TimeZone object and check if their
rules are compatible.
2019-04-17 13:46:44 -04:00
Hendrik Muhs 3df6798c4c Rollup/DataFrame: disallow partial results (#41114)
disallow partial results in rollup and data frame, after this change the client throws an error directly
replacing the previous runtime exception thrown, allowing better error handling in implementations.
2019-04-12 07:31:22 +02:00
Hendrik Muhs c37b127a07 fix a timing issue: isFinished is used for a busy loop in testing, (#41055)
test: ensure state is persisted before the isFinished is changed

fixes #41046
2019-04-10 18:47:34 +02:00
Julie Tibshirani 0d5f86a001 Mute RollupIndexerStateTests#testIndexing.
Tracked in #41046.
2019-04-09 17:17:04 -07:00
Hendrik Muhs d5fcbf2f4a refactor onStart and onFinish to take runnables and executed them guarded by state (#40855)
refactor onStart and onFinish to take action listeners and execute them when indexer is in indexing state.
2019-04-07 21:46:37 +02:00
Jim Ferenczi a15f55b2de Rollup ignores time_zone on date histogram (#40844)
When translating the original aggregation for the rollup indices,
the timezone of the date histogram is validated against the rollup
job but the value is not copied in the newly created date_histogram.
2019-04-04 21:16:50 +02:00
Zachary Tong abbfc75052 Remove timezone validation on rollup range queries (#40647)
We enforced the timezone of range queries when using the rollup
search endpoint, but this validation is not needed.  Since
rollup dates are stored in UTC, and range queries are always
converted to UTC (even if specifying a `time_zone`) the validation
is not needed and can prevent legitimate queries from running.
2019-04-02 14:25:16 -04:00
Jay Modi 697911c31d
Fixed missed stopping of SchedulerEngine (#39193)
The SchedulerEngine is used in several places in our code and not all
of these usages properly stopped the SchedulerEngine, which could lead
to test failures due to leaked threads from the SchedulerEngine. This
change adds stopping to these usages in order to avoid the thread leaks
that cause CI failures and noise.

Closes #38875
2019-02-21 14:31:33 -07:00
Alexander Reelsen daa2ec8a60
Switch mapping/aggregations over to java time (#36363)
This commit moves the aggregation and mapping code from joda time to
java time. This includes field mappers, root object mappers, aggregations with date
histograms, query builders and a lot of changes within tests.

The cut-over to java time is a requirement so that we can support nanoseconds
properly in a future field mapper.

Relates #27330
2019-01-23 10:40:05 +01:00
Zachary Tong de52ba1f78 Fix RollupDocumentation test to wait for job to stop
Also adds some extra state debug information to various log messages
2019-01-11 14:14:58 -05:00
Jim Ferenczi e38cf1d0dc
Add the ability to set the number of hits to track accurately (#36357)
In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested.
It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected.

Relates #33028
2019-01-04 20:36:49 +01:00
Josh Soref 1df66d21fe Spelling: replace uknown with unknown (#37056) 2019-01-02 17:33:02 +01:00
Zachary Tong 6d49873ab7
Fix Rollup's metadata parser (#36791)
The parser used for rollup configs in _meta fields was not able to
handle unrelated data in the meta field.  If an unrelated object
was encountered, it would half-consume the JSON object, realize it
wasn'ta rollup config, then stop parsing.  This would leave the object
halfway consumed and the parsing framework would throw an exception.

This commit replaces the parsing logic with a set of minimal parsers,
each for the specific component we care about (`_doc`, `_meta`,
`_rollup`) and configured to ignore unknown fields where applicable.

More verbose, but less hacky than before and should be more robust.

Also adds tests (randomized and explicit) to make sure this doesn't
break in the future.
2018-12-18 16:35:39 -05:00
Ryan Ernst c4f4378006
Core: Rework multi date formatter merging (#36447)
This commit moves the MergedDateFormatter to a package private class and
reworks joda DateFormatter instances to use that instead of a single
DateTimeFormatter with multiple parsers. This will allow the java and
joda multi formats to share the same format parsing method in a
followup.
2018-12-11 23:47:44 -08:00
Nik Everett 03daad9812
Re-deprecate xpack rollup endpoints (#36451)
Redeprecates the `/_xpack/rollup` endpoints in favor of `/_rollup`.

When we cleanup the rollup in a cluster containing 6.x nodes we need to
use `/_xpack/rollup` instead of `/_rollup` because the 6.x nodes don't
know about `/_rollup`. In those cases we must ignore the deprecation
warnings that the 7.0 node will return for the end point.

Closes #36044
2018-12-11 19:43:17 -05:00
Jason Tedor 0909a631ba
Add non-X-Pack centric rollup endpoints (#36383)
* Add non-X-Pack centric rollup endpoints

This commit adds new endpoints for rollup that do not have _xpack in
their path. The purpose for this change is to take these endpoints into
6.x as well so that they can be available in mixed cluster tests too. A
follow-up change will deprecate the use of _xpack in the rollup
endpoints. And finally, in the future, we would remove the _xpack
endpoints.

* Remove import

* Fix typo
2018-12-10 14:50:30 -05:00
Ryan Ernst a27f2efca5
Core: Converge FormatDateTimeFormatter and DateFormatter apis (#36390)
This commit makes FormatDateTimeFormatter and DateFormatter apis close
to each other, so that the former can be removed in favor of the latter.
This PR does not change the uses of FormatDateTimeFormatter yet, so that
that future change can be purely mechanical.
2018-12-07 17:23:41 -08:00
Nik Everett ead2b9e08b
HLRC: Add rollup search (#36334)
Relates to #29827
2018-12-07 14:39:58 -05:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Martijn van Groningen 11935cd480
Replace Streamable w/ Writeable in BaseTasksResponse and subclasses (#36176)
This commit replaces usages of Streamable with Writeable for the
BaseTasksResponse / TransportTasksAction classes and subclasses of
these classes.

Note that where possible response fields were made final.

Relates to #34389
2018-12-05 13:14:10 +01:00
Martijn van Groningen 43773a32a4
Replace Streamable w/ Writeable in BaseTasksRequest and subclasses (#35854)
* Replace Streamable w/ Writeable in BaseTasksRequest and subclasses

This commit replaces usages of Streamable with Writeable for the
BaseTasksRequest / TransportTasksAction classes and subclasses of
these classes.

Relates to #34389
2018-12-03 08:04:29 +01:00
Zachary Tong 61c2db5ebb Revert "Deprecate X-Pack centric rollup endpoints (#35962)"
This reverts commit b84f1f6a3a.
2018-11-29 12:58:23 -05:00
Jim Ferenczi 8a7f3f75f3
Add support for rest_total_hits_as_int (#36051)
The support for rest_total_hits_as_int has already been merged to 6x
in #35848 so this change adds this new option to master. The plan was
to add this new option as part of #35848 but we've decided to wait a few
days before merging this breaking change so this commit just handles
the new option as a noop exactly like 6x for now. This will allow
users to migrate to this parameter before #35848 is merged.

Relates #33028
2018-11-29 18:36:16 +01:00
Jason Tedor b84f1f6a3a
Deprecate X-Pack centric rollup endpoints (#35962)
This commit is part of our plan to deprecate and ultimately remove the
use of _xpack in the REST APIs.
2018-11-27 20:34:17 -05:00
Zachary Tong 48fa251812
[Rollup] Add more diagnostic stats to job (#35471)
* [Rollup] Add more diagnostic stats to job

To help debug future performance issues, this adds the
 min/max/avg/count/total latencies (in milliseconds) for search
and bulk phase.  This latency is the total service time including
transfer between nodes, not just the `took` time.

It also adds the count of search/bulk failures encountered during
runtime.  This information is also in the log, but a runtime counter
will help expose problems faster

* review cleanup

* Remove dead ParseFields
2018-11-27 15:46:10 -05:00
Zachary Tong c346a0f027
[Rollup] Add `wait_for_completion` option to StopRollupJob API (#34811)
This adds a `wait_for_completion` flag which allows the user to block 
the Stop API until the task has actually moved to a stopped state, 
instead of returning immediately.  If the flag is set, a `timeout` parameter
can be specified to determine how long (at max) to block the API
call.  If unspecified, the timeout is 30s.

If the timeout is exceeded before the job moves to STOPPED, a
timeout exception is thrown.  Note: this is just signifying that the API
call itself timed out.  The job will remain in STOPPING and evenutally
flip over to STOPPED in the background.

If the user asks the API to block, we move over the the generic
threadpool so that we don't hold up a networking thread.
2018-11-13 16:37:17 -05:00
Jason Tedor 4f4fc3b8f8
Replicate index settings to followers (#35089)
This commit uses the index settings version so that a follower can
replicate index settings changes as needed from the leader.

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
2018-11-07 21:20:51 -05:00
Alexander Reelsen 409050e8de
Refactor: Remove settings from transport action CTOR (#35208)
As settings are not used in the transport action constructor, this
removes the passing of the settings in all the transport actions.
2018-11-05 13:08:18 +01:00
Nik Everett e28509fbfe
Core: Less settings to AbstractComponent (#35140)
Stop passing `Settings` to `AbstractComponent`'s ctor. This allows us to
stop passing around `Settings` in a *ton* of places. While this change
touches many files, it touches them all in fairly small, mechanical
ways, doing a few things per file:
1. Drop the `super(settings);` line on everything that extends
`AbstractComponent`.
2. Drop the `settings` argument to the ctor if it is no longer used.
3. If the file doesn't use `logger` then drop `extends
AbstractComponent` from it.
4. Clean up all compilation failure caused by the `settings` removal
and drop any now unused `settings` isntances and method arguments.

I've intentionally *not* removed the `settings` argument from a few
files:
1. TransportAction
2. AbstractLifecycleComponent
3. BaseRestHandler

These files don't *need* `settings` either, but this change is large
enough as is.

Relates to #34488
2018-10-31 21:23:20 -04:00
Zachary Tong f9dd33a0b9
[Rollup] Proactively resolve index patterns in RollupSearch endoint (#34930)
This changes the RollupSearch endpoint to proactively resolve index
patterns.  If the index pattern(s) match more than one rollup index,
an exception is throw as before.  But if the pattern only matches one
rollup index, execution is allowed to continue (unlike before where
it would assume all patterns were for raw data).

This also allows the search endpoint to resolve aliases that point to
a rollup index.

Also tweaks the documentation to make this clear.

Closes #34828
2018-10-30 13:50:50 -04:00
Pratik Sanglikar f1135ef0ce Core: Replace deprecated Loggers calls with LogManager. (#34691)
Replace deprecated Loggers calls with LogManager.

Relates to #32174
2018-10-29 15:52:30 -04:00
Andrey Atapin 5f588180f9 Improve IndexNotFoundException's default error message (#34649)
This commit adds the index name to the error message when an index is not found.
2018-10-24 12:53:31 -07:00
Benjamin Trent cd27b0b996
Revert "Rollup add default metrics to histo groups (#34534)" (#34815)
This reverts commit 4236358f5d.
2018-10-24 14:25:10 -05:00
Julie Tibshirani 90fd15bb56 Mute RollupIndexerIndexingTests#testRandomizedDateHisto as we await a fix. 2018-10-23 11:14:43 -07:00
Zachary Tong 4dbf498721
[Rollup] Job deletion should be invoked on the allocated task (#34574)
We should delete a job by directly talking to the allocated 
task and telling it to shutdown. Today we shut down a job 
via the persistent task framework. This is not ideal because, 
while the job has been removed from the persistent task 
CS, the allocated task continues to live until it gets the 
shutdown message.

This means a user can delete a job, immediately delete 
the rollup index, and then see new documents appear in
 the just-deleted index. This happens because the indexer
 in the allocated task is still running and indexes a few 
more documents before getting the shutdown command.

In this PR, the transport action is changed to a TransportTasksAction, 
and we invoke onCancelled() directly on the matching job. 
The race condition still exists after this PR (albeit less likely), 
but this was a precursor to fixing the issue and a self-contained
chunk of code. A second PR will followup to fix the race itself.
2018-10-23 12:23:22 -04:00
Benjamin Trent 4236358f5d
Rollup add default metrics to histo groups (#34534)
* Rollup: Adding default metrics for histo group timefield (#34379)

* Rollup: Adding default histo metrics and tests

* fixing failing client side test with new default values

* Adding HLRC docs for default values

* Addressing PR comments

* Removing value_count default agg

* Updating docs for rollups

* Minor interval change
2018-10-19 07:23:25 -05:00
Zachary Tong 45546e71c2
Add GetRollupCaps API to high level rest client (#32880)
Adds GetRollupCaps API to the HLRC, and tweaks some of the
Caps objects to be immutable.  Also various style tweaks
2018-10-18 17:12:38 -04:00
Zachary Tong ca51fb6873
[Rollup] Add support for date histo `format` (#34537)
Adds support for query-time formatting of the date histo keys
when executing a rollup search.

Closes #34391
2018-10-18 12:12:17 -04:00
Kazuhiro Sera d45fe43a68 Fix a variety of typos and misspelled words (#32792) 2018-10-03 18:11:38 +01:00
Benjamin Trent 10201e06cb
Allowing {index}/_xpack/rollup/data to accept comma delimited list (#34115)
* Allowing `{index}/_xpack/rollup/data` to accept comma delimited list

* Address PR comments
2018-10-02 06:21:46 -07:00
Hendrik Muhs e2f310b56c
Fix AggregationFactories.Builder equality and hash regarding order (#34005)
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization
and deserialization. This ensures storing configs with aggregation works properly.

This also addresses a potential issue in caching when the same query contains aggregations but in 
different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in 
the cache.
2018-09-28 13:30:50 +02:00
Ryan Ernst 7800b4fa91
Core: Abstract DateMathParser in an interface (#33905)
This commits creates a DateMathParser interface, which is already
implemented for both joda and java time. While currently the java time
DateMathParser is not used, this change will allow a followup which will
create a DateMathParser from a DateFormatter, so the caller does not
need to know the internals of the DateFormatter they have.
2018-09-26 07:56:25 -07:00
Christoph Büscher ba3ceeaccf
Clean up "unused variable" warnings (#31876)
This change cleans up "unused variable" warnings. There are several cases were we 
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
2018-09-26 14:09:32 +02:00
Tanguy Leroux e77835c6f5
Add create rollup job api to high level rest client (#33521)
This commit adds the Create Rollup Job API to the high level REST
client. It supersedes #32703 and adds dedicated request/response
objects so that it does not depend on server side components.

Related #29827
2018-09-17 09:10:23 +02:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Zachary Tong 90ce3a6224 [Rollup] Fix Caps Comparator to handle calendar/fixed time (#33336)
The comparator used TimeValue parsing, which meant it couldn't handle
calendar time.  This fixes the comparator to handle either (and potentially
mixed).  The mixing shouldn't be an issue since the validation code
upstream will prevent it, but was simplest to allow the comparator
to handle both.
2018-09-03 10:49:19 +02:00
Zachary Tong d93b2a2e9a
[Rollup] Only allow aggregating on multiples of configured interval (#32052)
We need to limit the search request aggregations to whole multiples
of the configured interval for both histogram and date_histogram.
Otherwise, agg buckets won't overlap with the rolled up buckets
and the results will be incorrect.

For histogram, the validation is very simple: request must be >= the config,
and modulo evenly.

Dates are more tricky.
- If both request and config are fixed dates, we can convert to millis
and treat them just like the histo
- If both are calendar, we make sure the request is >= the config with
a static lookup map that ranks the calendar values relatively.  All
calendar units are "singles", so they are evenly divisible already
- We disallow any other combination (one fixed, one calendar, etc)
2018-08-29 17:10:00 -04:00
Hendrik Muhs cfc003d485 [Rollup] Re-factor Rollup Indexer into a generic indexer for re-usability (#32743)
This extracts a super class out of the rollup indexer called the AsyncTwoPhaseIterator. 
The implementor of it can define the query, transformation of the response, 
indexing and the object to persist the position/state of the indexer.

The stats object used by the indexer to record progress is also now abstract, allowing
the implementation provide custom stats beyond what the indexer provides.  It also
allows the implementation to decide how the stats are presented (leaves toXContent()
up to the implementation).

This should allow new projects to reuse the search-then-index persistent task that Rollup
uses, but without the restrictions/baggage of how Rollup has to work internally to
satisfy time-based rollups.
2018-08-29 14:28:21 -04:00
Zachary Tong 353112a033
[Rollup] Better error message when trying to set non-rollup index (#32965)
We don't allow the user to configure a rollup index against an
existing index, but the exceptions that we return are not clear about
that.  They indicate issues with metadata, instead of stating
the real reason (not allowed to use a non-rollup index to store
rollup data).

This makes the exception better, and adds a bit more testing
2018-08-28 11:50:35 -04:00
Tanguy Leroux e1e8cf382f
[Rollup] Move toBuilders() methods out of rollup config objects (#32585) 2018-08-27 09:18:26 +02:00
Tanguy Leroux 879a90b999
[Rollup] Move getMetadata() methods out of rollup config objects (#32579)
This committ removes the getMetadata() methods from the DateHistoGroupConfig 
and HistoGroupConfig objects. This way the configuration objects do not rely on RollupField.formatMetaField() anymore and do not expose a getMetadata() 
method that is tighlty coupled to the rollup indexer.
2018-08-24 11:57:46 +02:00
Zachary Tong 8f8d3a5556
[Rollup] Return empty response when aggs are missing (#32796)
If a search request doesn't contain aggs (or an empty agg object),
we should just retun an empty response.  This is how the normal search
API works if you specify zero hits and empty aggs.

The existing behavior throws an exception because it tries to send
an empty msearch.

Closes #32256
2018-08-23 16:15:37 -04:00
Nik Everett 2c81d7f77e
Build: Rework shadow plugin configuration (#32409)
This reworks how we configure the `shadow` plugin in the build. The major
change is that we no longer bundle dependencies in the `compile` configuration,
instead we bundle dependencies in the new `bundle` configuration. This feels
more right because it is a little more "opt in" rather than "opt out" and the
name of the `bundle` configuration is a little more obvious.

As an neat side effect of this, the `runtimeElements` configuration used when
one project depends on another now contains exactly the dependencies needed
to run the project so you no longer need to reference projects that use the
shadow plugin like this:

```
testCompile project(path: ':client:rest-high-level', configuration: 'shadow')
```

You can instead use the much more normal:

```
testCompile "org.elasticsearch.client:elasticsearch-rest-high-level-client:${version}"
```
2018-08-21 20:03:28 -04:00
Nik Everett 462e91d362
Logging: Use settings when building daemon threads (#32751)
Subclasses of `EsIntegTestCase` run multiple Elasticsearch nodes in the
same JVM and when we log we look at the name of the thread to figure out
the node name. This makes sure that all calls to `daemonThreadFactory`
include the node name.

Closes #32574

I'd like to follow this up with more drastic changes that make it
impossible to do this incorrectly but that change is much larger than
this and I'd like to get these log lines fixed up sooner rather than
later.
2018-08-20 13:53:15 -04:00
Lee Hinman 48281ac5bc
Use generic AcknowledgedResponse instead of extended classes (#32859)
This removes custom Response classes that extend `AcknowledgedResponse` and do nothing, these classes are not needed and we can directly use the non-abstract super-class instead.

While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.
2018-08-15 08:06:14 -06:00
Tanguy Leroux 2e65bac5dd
[Rollup] Remove builders from RollupJobConfig (#32669) 2018-08-07 18:54:42 +02:00
Tanguy Leroux 1122314b3b
[Rollup] Remove builders from GroupConfig (#32614) 2018-08-07 09:39:24 +02:00
Zachary Tong fc9fb64ad5
[Rollup] Improve ID scheme for rollup documents (#32558)
Previously, we were using a simple CRC32 for the IDs of rollup documents.
This is a very poor choice however, since 32bit IDs leads to collisions
between documents very quickly.

This commit moves Rollups over to a 128bit ID.  The ID is a concatenation
of all the keys in the document (similar to the rolling CRC before),
hashed with 128bit Murmur3, then base64 encoded.  Finally, the job
ID and a delimiter (`$`) are prepended to the ID.

This gurantees that there are 128bits per-job.  128bits should
essentially remove all chances of collisions, and the prepended
job ID means that _if_ there is a collision, it stays "within"
the job.

BWC notes:

We can only upgrade the ID scheme after we know there has been a good
checkpoint during indexing.  We don't rely on a STARTED/STOPPED
status since we can't guarantee that resulted from a real checkpoint,
or other state.  So we only upgrade the ID after we have reached
a checkpoint state during an active index run, and only after the
checkpoint has been confirmed.

Once a job has been upgraded and checkpointed, the version increments
and the new ID is used in the future.  All new jobs use the
new ID from the start
2018-08-03 11:13:25 -04:00
Tanguy Leroux 21f660d801
[Rollup] Remove builders from DateHistogramGroupConfig (#32555)
Same motivation as #32507 but for the DateHistogramGroupConfig
configuration object. This pull request also changes the format of the
time zone from a Joda's DateTimeZone to a simple String.

It should help to port the API to the high level rest client and allows
clients to not be forced to use the Joda Time library. Serialization is
impacted but does not need a backward compatibility layer as
DateTimeZone are serialized as String anyway. XContent also expects
a String for timezone, so I found it easier to move everything to String.

Related to #29827
2018-08-03 13:11:00 +02:00
Tanguy Leroux 937dcfd716
[Rollup] Remove builders from MetricConfig (#32536)
Related to #29827
2018-08-03 10:01:20 +02:00
Tanguy Leroux 08e4f4be42
[Rollup] Remove builders from HistoGroupConfig (#32533)
Related to #29827
2018-08-02 17:55:00 +02:00
Tanguy Leroux 82fe67b225
[Rollup] Remove builders from TermsGroupConfig (#32507)
While working on adding the Create Rollup Job API to the 
high level REST client (#29827), I noticed that the configuration 
objects like TermsGroupConfig rely on the Builder pattern in 
order to create or parse instances. These builders are doing 
some validation but the same validation could be done within 
the constructor itself or on the server side when appropriate.

This commit removes the builder for TermsGroupConfig, 
removes some other methods that I consider not really usefull 
once the TermsGroupConfig object will be exposed in the 
high level REST client. It also simplifies the parsing logic.

Related to #29827
2018-08-01 09:43:32 +02:00
Zachary Tong 6cf7588c3d
[TEST] Fix failure due to exception message in java11 (#32321)
Java 11 uses more verbose exceptions messages, causing this assertion
to fail.  Changed the test to be less restrictive and only look
for the classes we care about.
2018-07-25 11:34:26 -04:00
Nik Everett e6b9f59e4e
Build: Shadow x-pack:protocol into x-pack:plugin:core (#32240)
This bundles the x-pack:protocol project into the x-pack:plugin:core
project because we'd like folks to consider it an implementation detail
of our build rather than a separate artifact to be managed and depended
on. It is now bundled into both x-pack:plugin:core and
client:rest-high-level. To make this work I had to fix a few things.

Firstly, I had to make PluginBuildPlugin work with the shadow plugin.
In that case we have to bundle only the `shadow` dependencies and the
shadow jar.

Secondly, every reference to x-pack:plugin:core has to use the `shadow`
configuration. Without that the reference is missing all of the
un-shadowed dependencies. I tried to make it so that applying the shadow
plugin automatically redefines the `default` configuration to mirror the
`shadow` configuration which would allow us to use bare project references
to the x-pack:plugin:core project but I couldn't make it work. It'd *look*
like it works but then fail for transitive dependencies anyway. I think
it is still a good thing to do but I don't have the willpower to do it
now.

Finally, I had to fix an issue where Eclipse and IntelliJ didn't properly
reference shadowed transitive dependencies. Neither IDE supports shadowing
natively so they have to reference the shadowed projects. We fix this by
detecting `shadow` dependencies when in "Intellij mode" or "Eclipse mode"
and adding `runtime` dependencies to the same target. This convinces
IntelliJ and Eclipse to play nice.
2018-07-24 11:53:04 -04:00
Zachary Tong 6ba144ae31
Add WeightedAvg metric aggregation (#31037)
Adds a new single-value metrics aggregation that computes the weighted 
average of numeric values that are extracted from the aggregated 
documents. These values can be extracted from specific numeric
fields in the documents.

When calculating a regular average, each datapoint has an equal "weight"; it
contributes equally to the final value.  In contrast, weighted averages
scale each datapoint differently.  The amount that each datapoint contributes 
to the final value is extracted from the document, or provided by a script.

As a formula, a weighted average is the `∑(value * weight) / ∑(weight)`

A regular average can be thought of as a weighted average where every value has
an implicit weight of `1`.

Closes #15731
2018-07-23 18:33:15 -04:00
Jim Ferenczi 644a92f158
Fix rollup on date fields that don't support epoch_millis (#31890)
The rollup indexer uses a range query to select the next page
of results based on the last time bucket of the previous round
and the `delay` configured on the rollup job. This query uses
the `epoch_millis` format implicitly but doesn't set the `format`.
This result in errors during the rollup job if the field
definition doesn't allow this format. It can also miss documents
if the format is not accepted but another format in the field
definition is able to parse the query (e.g.: `epoch_second`).
This change ensures that we use `epoch_millis` as the only format
to parse the rollup range query.
2018-07-19 09:34:23 +02:00
Zachary Tong 791b9b147c
[Rollup] Add new capabilities endpoint for concrete rollup indices (#30401)
This introduces a new GetRollupIndexCaps API which allows the user to retrieve rollup capabilities of a specific rollup index (or index pattern). This is distinct from the existing RollupCaps endpoint.

- Multiple jobs can be stored in multiple indices and point to a single target data index pattern (logstash-*). The existing API finds capabilities/config of all jobs matching that data index pattern.
- One rollup index can hold data from multiple jobs, targeting multiple data index patterns. This new API finds the capabilities based on the concrete rollup indices.
2018-07-16 17:20:50 -04:00
Zachary Tong 59191b4998
[Rollup] Replace RollupIT with a ESRestTestCase version (#31977)
The old RollupIT was a node IT, an flaky for a number of reasons.
This new version is an ESRestTestCase and should be a little more robust.

This was added to the multi-node QA tests as that seemed like the most
appropriate location.  It didn't seem necessary to create a whole new
QA module.

Note: The only test that was ported was the "Big" test for validating
a larger dataset.  The rest of the tests are represented in existing
yaml tests.

Closes #31258
Closes #30232
Related to #30290
2018-07-16 10:47:46 -04:00
Zachary Tong b7f07f03ed
[Rollup] Use composite's missing_bucket (#31402)
We can leverage the composite agg's new `missing_bucket` feature on
terms groupings.  This means the aggregation criteria used in the indexer
will now return null buckets for missing keys.  

Because all buckets are now returned (even if a key is null),
we can guarantee correct doc counts with
"combined" jobs (where a job rolls up multiple schemas).  This was
previously impossible since composite would ignore documents that
didn't have _all_ the keys, meaning non-overlapping schemas would
cause composite to return no buckets.

Note: date_histo does not use `missing_bucket`, since a timestamp is
always required.

The docs have been adjusted to recommend a single, combined job.  It
also makes reference to the previous issue to help users that are upgrading
(rather than just deleting the sections).
2018-07-13 10:07:42 -04:00
Vladimir Dolzhenko 6acb591012 mark RollupIT.testTwoJobsStartStopDeleteOne as AwaitsFix 2018-07-05 10:03:10 +02:00
Alpar Torok 0afec8f31c
Remove deprecation warnings to prepare for Gradle 5 (sourceSets.main.output.classesDirs) (#30389)
* Remove deprecation warnings to prepare for Gradle 5

Gradle replaced `project.sourceSets.main.output.classesDir` of type
`File` with `project.sourceSets.main.output.classesDirs` of type
`FileCollection`
(see [SourceSetOutput](https://github.com/gradle/gradle/blob/master/subprojects/plugins/src/main/java/org/gradle/api/tasks/SourceSetOutput.java))
Build output is now stored on a per language folder.

There are a few places where we use that, here's these and how it's
fixed:

- Randomized Test execution
    - look in all test folders ( pass the multi dir configuration to the
    ant runner )
    - DRY the task configuration by introducing `basedOn` for
      `RandomizedTestingTask` DSL
- Extend the naming convention test to support passing in multiple
  directories
- Fix the standalon test plugin, the dires were not passed trough,
  checked with a debuger and the statement had no affect due to a
  missing `=`.

Closes #30354

* Only check Java tests, PR feedback

- Name checker was ran for Groovy tests that don't adhere to the same
  convections causing the check to fail
- implement PR feedback

* Replace `add` with `addAll`

This worked because the list is passed to `project.files` that does the
right thing.

* Revert "Only check Java tests, PR feedback"

This reverts commit 9bd9389875d8b88aadb50df57a45cd0d2b073241.

* Remove `basedOn` helper

* Bring some changes back

Previus revert accidentally reverted too much

* Fix negation

* add back public

* revert name check changes

* Revert "revert name check changes"

This reverts commit a2800c0b363168339ea65e2a79ec8256e5883e6d.

* Pass all dirs to name check

Only run on Java for build-tools, this is safe because it's a self test.
It needs more work before we could pass in the Groovy classes as well as
these inherit from `GroovyTestCase`

* remove self tests from name check

The self complicates the task setup and disable real checks on
build-tools.
With this change there are no more self tests, and the build-tools tests
adhere to the conventions.
The self test will be replaced by gradle test kit, thus the addition of
the Gradle plugin builder plugin.

* First test to run a Gradle build

* Add tests that replace the name check self test

* Clean up integ test base class

* Always run tests

* Align with test naming conventions

* Make integ. test case inherit from unit test case

The check requires this

* Remove `import static org.junit.Assert.*`
2018-06-28 15:14:34 +03:00
Alpar Torok 8557bbab28
Upgrade gradle wrapper to 4.8 (#31525)
* Move to Gradle 4.8 RC1

* Use latest version of plugin

The current does not work with Gradle 4.8 RC1

* Switch to Gradle GA

* Add and configure build compare plugin

* add work-around for https://github.com/gradle/gradle/issues/5692

* work around https://github.com/gradle/gradle/issues/5696

* Make use of Gradle build compare with reference project

* Make the manifest more compare friendly

* Clear the manifest in compare friendly mode

* Remove animalsniffer from buildscript classpath

* Fix javadoc errors

* Fix doc issues

* reference Gradle issues in comments

* Conditionally configure build compare

* Fix some more doclint issues

* fix typo in build script

* Add sanity check to make sure the test task was replaced

Relates to #31324. It seems like Gradle has an inconsistent behavior and
the taks is not always replaced.

* Include number of non conforming tasks in the exception.

* No longer replace test task, create implicit instead

Closes #31324. The issue has full context in comments.

With this change the `test` task becomes nothing more than an alias for `utest`.
Some of the stand alone tests that had a `test` task now have `integTest`, and a
few of them that used to have `integTest` to run multiple tests now only
have `check`.
This will also help separarate unit/micro tests from integration tests.

* Revert "No longer replace test task, create implicit instead"

This reverts commit f1ebaf7d93e4a0a19e751109bf620477dc35023c.

* Fix replacement of the test task

Based on information from gradle/gradle#5730 replace the task taking
into account the task providres.
Closes #31324.

* Only apply build comapare plugin if needed

* Make sure test runs before integTest

* Fix doclint aftter merge

* PR review comments

* Switch to Gradle 4.8.1 and remove workaround

* PR review comments

* Consolidate task ordering
2018-06-28 08:13:21 +03:00
Tanguy Leroux be9292cac6
[Test] Add full cluster restart test for Rollup (#31533)
This pull request adds a full cluster restart test for a Rollup job. 
The test creates and starts a Rollup job on the cluster and checks 
that the job already exists and is correctly started on the upgraded 
cluster.

This test allows to test that the persistent task state is correctly 
parsed from the cluster state after the upgrade, as the status field 
has been renamed to state in #31031.

The test undercovers a ClassCastException that can be thrown in 
the RollupIndexer when the timestamp as a very low value that fits 
into an integer. When it's the case, the value is parsed back as an 
Integer instead of Long object and (long) position.get(rollupFieldName) 
fails.
2018-06-26 10:07:25 +02:00
Ryan Ernst 7a150ec06d
Core: Combine doExecute methods in TransportAction (#31517)
TransportAction currently contains 2 doExecute methods, one which takes
a the task, and one that does not. The latter is what some subclasses
implement, while the first one just calls the latter, dropping the given
task. This commit combines these methods, in favor of just always
assuming a task is present.
2018-06-22 15:03:01 -07:00
Ryan Ernst 59e7c6411a
Core: Combine messageRecieved methods in TransportRequestHandler (#31519)
TransportRequestHandler currently contains 2 messageReceived methods,
one which takes a Task, and one that does not. The first just delegates
to the second. This commit changes all existing implementors of
TransportRequestHandler to implement the version which takes Task, thus
allowing the class to be a functional interface, and eliminating the
need to throw exceptions when a task needs to be ensured.
2018-06-22 07:36:03 -07:00
Ryan Ernst 4f9332ee16
Core: Remove ThreadPool from base TransportAction (#31492)
Most transport actions don't need the node ThreadPool. This commit
removes the ThreadPool as a super constructor parameter for
TransportAction. The actions that do need the thread pool then have a
member added to keep it from their own constructor.
2018-06-21 11:25:26 -07:00
Ryan Ernst 401800d958
Core: Remove index name resolver from base TransportAction (#31002)
Most transport actions don't need to resolve index names. This commit
removes the index name resolver as a super constructor parameter for
TransportAction. The actions that do need the resolver then have a
member added to keep the resolver from their own constructor.
2018-06-19 17:06:09 -07:00
Tanguy Leroux 992c7889ee
Uncouple persistent task state and status (#31031)
This pull request removes the relationship between the state 
of persistent task (as stored in the cluster state) and the status 
of the task (as reported by the Task APIs and used in various 
places) that have been confusing for some time (#29608).

In order to do that, a new PersistentTaskState interface is added. 
This interface represents the persisted state of a persistent task. 
The methods used to update the state of persistent tasks are 
renamed: updatePersistentStatus() becomes updatePersistentTaskState() 
and now takes a PersistentTaskState as a parameter. The 
Task.Status type as been changed to PersistentTaskState in all 
places were it make sense (in persistent task customs in cluster 
state and all other methods that deal with the state of an allocated 
persistent task).
2018-06-15 09:26:47 +02:00
Tanguy Leroux bf58660482
Remove all unused imports and fix CRLF (#31207)
The X-Pack opening and the recent other refactorings left a lot of 
unused imports in the codebase. This commit removes them all.
2018-06-11 15:12:12 +02:00
Zachary Tong a1c9def64e
[Rollup] Disallow index patterns that match the rollup index (#30491)
We should not allow the user to configure index patterns that also match
the index which stores the rollup index.

For example, it is quite natural for a user to specify `metricbeat-*`
as the index pattern, and then store the rollups in `metricbeat-rolled`.
This will start throwing errors as soon as the rollup index is created
because the indexer will try to search it.

Note: this does not prevent the user from matching against existing
rollup indices.  That should be prevented by the field-level validation
during job creation.
2018-06-05 15:00:34 -04:00
Jim Ferenczi 7f850bb8ce
Allow terms query in _rollup_search (#30973)
This change adds the `terms` query to the list of accepted queries
for the _rollup_search endpoint.
2018-06-05 16:51:14 +02:00
Yannick Welsch e1649b8669
Allow rollup job creation only if cluster is x-pack ready (#30963)
Otherwise we could end up with persistent tasks metadata in the cluster that some of the nodes
might not understand in case where the cluster is during rolling upgrade from the default 6.2 to the
default 6.3 distribution.

Follow-up to #30743
2018-06-01 10:47:53 +02:00
Tanguy Leroux a0af0e7f1e
Rename methods in PersistentTasksService (#30837)
This commit renames methods in the PersistentTasksService, to 
make obvious that the methods send requests in order to change 
the state of persistent tasks. 

Relates to #29608.
2018-05-30 09:20:14 +02:00
Colin Goodheart-Smithe a75b8adce5
Refactors ClientHelper to combine header logic (#30620)
* Refactors ClientHelper to combine header logic

This change removes all the `*ClientHelper` classes which were
repeating logic between plugins and instead adds
`ClientHelper.executeWithHeaders()` and
`ClientHelper.executeWithHeadersAsync()` methods to centralise the
logic for executing requests with stored security headers.

* Removes Watcher headers constant
2018-05-16 11:38:24 +01:00
Zachary Tong 1c0d339904
[Rollup] Validate timezone in range queries (#30338)
When validating the search request, we make sure any date_histogram
aggregations have timezones that match the jobs.  But we didn't
do any such validation on range queries.

While it wouldn't produce incorrect results, it would be confusing
to the user as no documents would match the aggregation (because we
add a filter clause on the timezone for the agg).

Now the user gets an exception up front, and some helpful text about
why the range query didnt match, and which timezones are acceptable
2018-05-04 10:45:16 -07:00
Ryan Ernst 2efd22454a Migrate x-pack-elasticsearch source to elasticsearch 2018-04-20 15:29:54 -07:00