OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tanguy Leroux	e77835c6f5	Add create rollup job api to high level rest client (#33521 ) This commit adds the Create Rollup Job API to the high level REST client. It supersedes #32703 and adds dedicated request/response objects so that it does not depend on server side components. Related #29827	2018-09-17 09:10:23 +02:00
Jim Ferenczi	79cd6385fe	Collapse package structure for metrics aggs (#33463 ) This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`. It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package. Relates #22868	2018-09-07 10:58:06 +02:00
Zachary Tong	90ce3a6224	[Rollup] Fix Caps Comparator to handle calendar/fixed time (#33336 ) The comparator used TimeValue parsing, which meant it couldn't handle calendar time. This fixes the comparator to handle either (and potentially mixed). The mixing shouldn't be an issue since the validation code upstream will prevent it, but was simplest to allow the comparator to handle both.	2018-09-03 10:49:19 +02:00
Zachary Tong	d93b2a2e9a	[Rollup] Only allow aggregating on multiples of configured interval (#32052 ) We need to limit the search request aggregations to whole multiples of the configured interval for both histogram and date_histogram. Otherwise, agg buckets won't overlap with the rolled up buckets and the results will be incorrect. For histogram, the validation is very simple: request must be >= the config, and modulo evenly. Dates are more tricky. - If both request and config are fixed dates, we can convert to millis and treat them just like the histo - If both are calendar, we make sure the request is >= the config with a static lookup map that ranks the calendar values relatively. All calendar units are "singles", so they are evenly divisible already - We disallow any other combination (one fixed, one calendar, etc)	2018-08-29 17:10:00 -04:00
Hendrik Muhs	cfc003d485	[Rollup] Re-factor Rollup Indexer into a generic indexer for re-usability (#32743 ) This extracts a super class out of the rollup indexer called the AsyncTwoPhaseIterator. The implementor of it can define the query, transformation of the response, indexing and the object to persist the position/state of the indexer. The stats object used by the indexer to record progress is also now abstract, allowing the implementation to provide custom stats beyond what the indexer provides. It also allows the implementation to decide how the stats are presented (leaves toXContent() up to the implementation). This should allow new projects to reuse the search-then-index persistent task that Rollup uses, but without the restrictions/baggage of how Rollup has to work internally to satisfy time-based rollups.	2018-08-29 14:28:21 -04:00
Zachary Tong	353112a033	[Rollup] Better error message when trying to set non-rollup index (#32965 ) We don't allow the user to configure a rollup index against an existing index, but the exceptions that we return are not clear about that. They indicate issues with metadata, instead of stating the real reason (not allowed to use a non-rollup index to store rollup data). This makes the exception better, and adds a bit more testing	2018-08-28 11:50:35 -04:00
Tanguy Leroux	e1e8cf382f	[Rollup] Move toBuilders() methods out of rollup config objects (#32585 )	2018-08-27 09:18:26 +02:00
Tanguy Leroux	879a90b999	[Rollup] Move getMetadata() methods out of rollup config objects (#32579 ) This commit removes the getMetadata() methods from the DateHistoGroupConfig and HistoGroupConfig objects. This way the configuration objects do not rely on RollupField.formatMetaField() anymore and do not expose a getMetadata() method that is tightly coupled to the rollup indexer.	2018-08-24 11:57:46 +02:00
Zachary Tong	8f8d3a5556	[Rollup] Return empty response when aggs are missing (#32796 ) If a search request doesn't contain aggs (or an empty agg object), we should just return an empty response. This is how the normal search API works if you specify zero hits and empty aggs. The existing behavior throws an exception because it tries to send an empty msearch. Closes #32256	2018-08-23 16:15:37 -04:00
Nik Everett	462e91d362	Logging: Use settings when building daemon threads (#32751 ) Subclasses of `EsIntegTestCase` run multiple Elasticsearch nodes in the same JVM and when we log we look at the name of the thread to figure out the node name. This makes sure that all calls to `daemonThreadFactory` include the node name. Closes #32574 I'd like to follow this up with more drastic changes that make it impossible to do this incorrectly but that change is much larger than this and I'd like to get these log lines fixed up sooner rather than later.	2018-08-20 13:53:15 -04:00
Lee Hinman	48281ac5bc	Use generic AcknowledgedResponse instead of extended classes (#32859 ) This removes custom Response classes that extend `AcknowledgedResponse` and do nothing; these classes are not needed and we can directly use the non-abstract super-class instead. While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.	2018-08-15 08:06:14 -06:00
Tanguy Leroux	2e65bac5dd	[Rollup] Remove builders from RollupJobConfig (#32669 )	2018-08-07 18:54:42 +02:00
Tanguy Leroux	1122314b3b	[Rollup] Remove builders from GroupConfig (#32614 )	2018-08-07 09:39:24 +02:00
Zachary Tong	fc9fb64ad5	[Rollup] Improve ID scheme for rollup documents (#32558 ) Previously, we were using a simple CRC32 for the IDs of rollup documents. This is a very poor choice however, since 32bit IDs lead to collisions between documents very quickly. This commit moves Rollups over to a 128bit ID. The ID is a concatenation of all the keys in the document (similar to the rolling CRC before), hashed with 128bit Murmur3, then base64 encoded. Finally, the job ID and a delimiter (`$`) are prepended to the ID. This guarantees that there are 128bits per-job. 128bits should essentially remove all chances of collisions, and the prepended job ID means that _if_ there is a collision, it stays "within" the job. BWC notes: We can only upgrade the ID scheme after we know there has been a good checkpoint during indexing. We don't rely on a STARTED/STOPPED status since we can't guarantee that resulted from a real checkpoint, or other state. So we only upgrade the ID after we have reached a checkpoint state during an active index run, and only after the checkpoint has been confirmed. Once a job has been upgraded and checkpointed, the version increments and the new ID is used in the future. All new jobs use the new ID from the start	2018-08-03 11:13:25 -04:00
Tanguy Leroux	21f660d801	[Rollup] Remove builders from DateHistogramGroupConfig (#32555 ) Same motivation as #32507 but for the DateHistogramGroupConfig configuration object. This pull request also changes the format of the time zone from a Joda's DateTimeZone to a simple String. It should help to port the API to the high level rest client and allows clients to not be forced to use the Joda Time library. Serialization is impacted but does not need a backward compatibility layer as DateTimeZone are serialized as String anyway. XContent also expects a String for timezone, so I found it easier to move everything to String. Related to #29827	2018-08-03 13:11:00 +02:00
Tanguy Leroux	937dcfd716	[Rollup] Remove builders from MetricConfig (#32536 ) Related to #29827	2018-08-03 10:01:20 +02:00
Tanguy Leroux	08e4f4be42	[Rollup] Remove builders from HistoGroupConfig (#32533 ) Related to #29827	2018-08-02 17:55:00 +02:00
Tanguy Leroux	82fe67b225	[Rollup] Remove builders from TermsGroupConfig (#32507 ) While working on adding the Create Rollup Job API to the high level REST client (#29827), I noticed that the configuration objects like TermsGroupConfig rely on the Builder pattern in order to create or parse instances. These builders are doing some validation but the same validation could be done within the constructor itself or on the server side when appropriate. This commit removes the builder for TermsGroupConfig and removes some other methods that I consider not really useful once the TermsGroupConfig object is exposed in the high level REST client. It also simplifies the parsing logic. Related to #29827	2018-08-01 09:43:32 +02:00
Zachary Tong	6cf7588c3d	[TEST] Fix failure due to exception message in java11 (#32321 ) Java 11 uses more verbose exceptions messages, causing this assertion to fail. Changed the test to be less restrictive and only look for the classes we care about.	2018-07-25 11:34:26 -04:00
Zachary Tong	6ba144ae31	Add WeightedAvg metric aggregation (#31037 ) Adds a new single-value metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents. These values can be extracted from specific numeric fields in the documents. When calculating a regular average, each datapoint has an equal "weight"; it contributes equally to the final value. In contrast, weighted averages scale each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the document, or provided by a script. As a formula, a weighted average is the `∑(value * weight) / ∑(weight)` A regular average can be thought of as a weighted average where every value has an implicit weight of `1`. Closes #15731	2018-07-23 18:33:15 -04:00
Jim Ferenczi	644a92f158	Fix rollup on date fields that don't support epoch_millis (#31890 ) The rollup indexer uses a range query to select the next page of results based on the last time bucket of the previous round and the `delay` configured on the rollup job. This query uses the `epoch_millis` format implicitly but doesn't set the `format`. This results in errors during the rollup job if the field definition doesn't allow this format. It can also miss documents if the format is not accepted but another format in the field definition is able to parse the query (e.g.: `epoch_second`). This change ensures that we use `epoch_millis` as the only format to parse the rollup range query.	2018-07-19 09:34:23 +02:00
Zachary Tong	791b9b147c	[Rollup] Add new capabilities endpoint for concrete rollup indices (#30401 ) This introduces a new GetRollupIndexCaps API which allows the user to retrieve rollup capabilities of a specific rollup index (or index pattern). This is distinct from the existing RollupCaps endpoint. - Multiple jobs can be stored in multiple indices and point to a single target data index pattern (logstash-*). The existing API finds capabilities/config of all jobs matching that data index pattern. - One rollup index can hold data from multiple jobs, targeting multiple data index patterns. This new API finds the capabilities based on the concrete rollup indices.	2018-07-16 17:20:50 -04:00
Zachary Tong	59191b4998	[Rollup] Replace RollupIT with a ESRestTestCase version (#31977 ) The old RollupIT was a node IT, and flaky for a number of reasons. This new version is an ESRestTestCase and should be a little more robust. This was added to the multi-node QA tests as that seemed like the most appropriate location. It didn't seem necessary to create a whole new QA module. Note: The only test that was ported was the "Big" test for validating a larger dataset. The rest of the tests are represented in existing yaml tests. Closes #31258 Closes #30232 Related to #30290	2018-07-16 10:47:46 -04:00
Zachary Tong	b7f07f03ed	[Rollup] Use composite's missing_bucket (#31402 ) We can leverage the composite agg's new `missing_bucket` feature on terms groupings. This means the aggregation criteria used in the indexer will now return null buckets for missing keys. Because all buckets are now returned (even if a key is null), we can guarantee correct doc counts with "combined" jobs (where a job rolls up multiple schemas). This was previously impossible since composite would ignore documents that didn't have _all_ the keys, meaning non-overlapping schemas would cause composite to return no buckets. Note: date_histo does not use `missing_bucket`, since a timestamp is always required. The docs have been adjusted to recommend a single, combined job. It also makes reference to the previous issue to help users that are upgrading (rather than just deleting the sections).	2018-07-13 10:07:42 -04:00
Vladimir Dolzhenko	6acb591012	mark RollupIT.testTwoJobsStartStopDeleteOne as AwaitsFix	2018-07-05 10:03:10 +02:00
Alpar Torok	8557bbab28	Upgrade gradle wrapper to 4.8 (#31525 ) * Move to Gradle 4.8 RC1 * Use latest version of plugin The current does not work with Gradle 4.8 RC1 * Switch to Gradle GA * Add and configure build compare plugin * add work-around for https://github.com/gradle/gradle/issues/5692 * work around https://github.com/gradle/gradle/issues/5696 * Make use of Gradle build compare with reference project * Make the manifest more compare friendly * Clear the manifest in compare friendly mode * Remove animalsniffer from buildscript classpath * Fix javadoc errors * Fix doc issues * reference Gradle issues in comments * Conditionally configure build compare * Fix some more doclint issues * fix typo in build script * Add sanity check to make sure the test task was replaced Relates to #31324. It seems like Gradle has an inconsistent behavior and the task is not always replaced. * Include number of non conforming tasks in the exception. * No longer replace test task, create implicit instead Closes #31324. The issue has full context in comments. With this change the `test` task becomes nothing more than an alias for `utest`. Some of the stand alone tests that had a `test` task now have `integTest`, and a few of them that used to have `integTest` to run multiple tests now only have `check`. This will also help separate unit/micro tests from integration tests. * Revert "No longer replace test task, create implicit instead" This reverts commit f1ebaf7d93e4a0a19e751109bf620477dc35023c. * Fix replacement of the test task Based on information from gradle/gradle#5730 replace the task taking into account the task providers. Closes #31324. * Only apply build compare plugin if needed * Make sure test runs before integTest * Fix doclint after merge * PR review comments * Switch to Gradle 4.8.1 and remove workaround * PR review comments * Consolidate task ordering	2018-06-28 08:13:21 +03:00
Tanguy Leroux	be9292cac6	[Test] Add full cluster restart test for Rollup (#31533 ) This pull request adds a full cluster restart test for a Rollup job. The test creates and starts a Rollup job on the cluster and checks that the job already exists and is correctly started on the upgraded cluster. This test verifies that the persistent task state is correctly parsed from the cluster state after the upgrade, as the status field has been renamed to state in #31031. The test uncovers a ClassCastException that can be thrown in the RollupIndexer when the timestamp has a very low value that fits into an integer. When that is the case, the value is parsed back as an Integer instead of a Long object and (long) position.get(rollupFieldName) fails.	2018-06-26 10:07:25 +02:00
Ryan Ernst	7a150ec06d	Core: Combine doExecute methods in TransportAction (#31517 ) TransportAction currently contains 2 doExecute methods, one which takes the task, and one that does not. The latter is what some subclasses implement, while the first one just calls the latter, dropping the given task. This commit combines these methods, in favor of just always assuming a task is present.	2018-06-22 15:03:01 -07:00
Ryan Ernst	59e7c6411a	Core: Combine messageReceived methods in TransportRequestHandler (#31519 ) TransportRequestHandler currently contains 2 messageReceived methods, one which takes a Task, and one that does not. The first just delegates to the second. This commit changes all existing implementors of TransportRequestHandler to implement the version which takes Task, thus allowing the class to be a functional interface, and eliminating the need to throw exceptions when a task needs to be ensured.	2018-06-22 07:36:03 -07:00
Ryan Ernst	4f9332ee16	Core: Remove ThreadPool from base TransportAction (#31492 ) Most transport actions don't need the node ThreadPool. This commit removes the ThreadPool as a super constructor parameter for TransportAction. The actions that do need the thread pool then have a member added to keep it from their own constructor.	2018-06-21 11:25:26 -07:00
Ryan Ernst	401800d958	Core: Remove index name resolver from base TransportAction (#31002 ) Most transport actions don't need to resolve index names. This commit removes the index name resolver as a super constructor parameter for TransportAction. The actions that do need the resolver then have a member added to keep the resolver from their own constructor.	2018-06-19 17:06:09 -07:00
Tanguy Leroux	992c7889ee	Uncouple persistent task state and status (#31031 ) This pull request removes the relationship between the state of a persistent task (as stored in the cluster state) and the status of the task (as reported by the Task APIs and used in various places), which has been confusing for some time (#29608). In order to do that, a new PersistentTaskState interface is added. This interface represents the persisted state of a persistent task. The methods used to update the state of persistent tasks are renamed: updatePersistentStatus() becomes updatePersistentTaskState() and now takes a PersistentTaskState as a parameter. The Task.Status type has been changed to PersistentTaskState in all places where it makes sense (in persistent task customs in cluster state and all other methods that deal with the state of an allocated persistent task).	2018-06-15 09:26:47 +02:00
Tanguy Leroux	bf58660482	Remove all unused imports and fix CRLF (#31207 ) The X-Pack opening and the recent other refactorings left a lot of unused imports in the codebase. This commit removes them all.	2018-06-11 15:12:12 +02:00
Zachary Tong	a1c9def64e	[Rollup] Disallow index patterns that match the rollup index (#30491 ) We should not allow the user to configure index patterns that also match the index which stores the rollup index. For example, it is quite natural for a user to specify `metricbeat-*` as the index pattern, and then store the rollups in `metricbeat-rolled`. This will start throwing errors as soon as the rollup index is created because the indexer will try to search it. Note: this does not prevent the user from matching against existing rollup indices. That should be prevented by the field-level validation during job creation.	2018-06-05 15:00:34 -04:00
Jim Ferenczi	7f850bb8ce	Allow terms query in _rollup_search (#30973 ) This change adds the `terms` query to the list of accepted queries for the _rollup_search endpoint.	2018-06-05 16:51:14 +02:00
Yannick Welsch	e1649b8669	Allow rollup job creation only if cluster is x-pack ready (#30963 ) Otherwise we could end up with persistent tasks metadata in the cluster that some of the nodes might not understand, in the case where the cluster is in the middle of a rolling upgrade from the default 6.2 to the default 6.3 distribution. Follow-up to #30743	2018-06-01 10:47:53 +02:00
Tanguy Leroux	a0af0e7f1e	Rename methods in PersistentTasksService (#30837 ) This commit renames methods in the PersistentTasksService, to make obvious that the methods send requests in order to change the state of persistent tasks. Relates to #29608.	2018-05-30 09:20:14 +02:00
Colin Goodheart-Smithe	a75b8adce5	Refactors ClientHelper to combine header logic (#30620 ) * Refactors ClientHelper to combine header logic This change removes all the `ClientHelper` classes which were repeating logic between plugins and instead adds `ClientHelper.executeWithHeaders()` and `ClientHelper.executeWithHeadersAsync()` methods to centralise the logic for executing requests with stored security headers. Removes Watcher headers constant	2018-05-16 11:38:24 +01:00
Zachary Tong	1c0d339904	[Rollup] Validate timezone in range queries (#30338 ) When validating the search request, we make sure any date_histogram aggregations have timezones that match the jobs. But we didn't do any such validation on range queries. While it wouldn't produce incorrect results, it would be confusing to the user as no documents would match the aggregation (because we add a filter clause on the timezone for the agg). Now the user gets an exception up front, and some helpful text about why the range query didn't match, and which timezones are acceptable	2018-05-04 10:45:16 -07:00
Ryan Ernst	2efd22454a	Migrate x-pack-elasticsearch source to elasticsearch	2018-04-20 15:29:54 -07:00
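
Commit d93b2a2e9a describes the interval rules for rollup search requests: a histogram request must be at least the configured interval and a whole multiple of it, fixed date intervals are compared in milliseconds the same way, calendar intervals are compared by rank, and mixed fixed/calendar combinations are rejected. The following is a minimal Python sketch of those rules, not the Elasticsearch implementation. The function names, the `CALENDAR_RANK` table and its unit names, and the convention that an `int` means fixed milliseconds while a `str` means a calendar unit are all assumptions made for illustration.

```python
# Hypothetical ranking of calendar units, smallest to largest (assumed names).
CALENDAR_RANK = {
    "minute": 1, "hour": 2, "day": 3, "week": 4,
    "month": 5, "quarter": 6, "year": 7,
}


def valid_histogram_interval(request, config):
    """Request must be >= the configured interval and divide evenly by it."""
    if config <= 0:
        return False
    return request >= config and request % config == 0


def valid_date_interval(request, config):
    """int = fixed interval in milliseconds, str = calendar unit name."""
    if isinstance(request, int) and isinstance(config, int):
        # Both fixed: treat the millisecond values like a plain histogram.
        return valid_histogram_interval(request, config)
    if isinstance(request, str) and isinstance(config, str):
        # Both calendar: the request unit must rank at or above the config unit.
        return CALENDAR_RANK[request] >= CALENDAR_RANK[config]
    # One fixed and one calendar: disallowed.
    return False
```

Under these assumptions, `valid_date_interval(3_600_000, 60_000)` accepts an hourly request against a one-minute job, while `valid_date_interval("day", 60_000)` is rejected because it mixes calendar and fixed intervals.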

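
Commit fc9fb64ad5 describes the rollup document ID as the document's keys concatenated, hashed to 128 bits, base64 encoded, and prefixed with the job ID and a `$` delimiter. The Python sketch below illustrates that shape only. Python's standard library has no Murmur3, so 128-bit BLAKE2b is swapped in as a stand-in hash; the function name, the key separator byte, the handling of missing keys, and the URL-safe unpadded base64 variant are likewise assumptions, so the IDs it produces will not match those generated by Elasticsearch.

```python
import base64
import hashlib


def rollup_doc_id(job_id, keys):
    """Build '<job_id>$<base64 of a 128-bit hash over the document keys>'."""
    # Stand-in for 128-bit Murmur3: BLAKE2b truncated to 16 bytes (128 bits).
    hasher = hashlib.blake2b(digest_size=16)
    for key in keys:
        # Assumed encoding: missing keys become a single NUL byte.
        hasher.update(b"\x00" if key is None else str(key).encode("utf-8"))
        # Assumed separator so ("ab", "c") and ("a", "bc") hash differently.
        hasher.update(b"\x1f")
    encoded = base64.urlsafe_b64encode(hasher.digest()).rstrip(b"=").decode("ascii")
    # Prefixing the job ID keeps any collision confined to a single job.
    return f"{job_id}${encoded}"
```

Because the job ID is part of the final string, two jobs that roll up identical keys still produce distinct document IDs, which is the property the commit message relies on.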

90 Commits
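
Commit 6ba144ae31 gives the weighted average as `∑(value * weight) / ∑(weight)` and notes that a regular average is the special case where every weight is `1`. The short Python sketch below works through that formula on plain lists. The function name and the choice to return `None` when the weights sum to zero are assumptions for illustration, not the behavior of the aggregation itself.

```python
def weighted_avg(values, weights):
    """Compute sum(value * weight) / sum(weight) over paired inputs."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumed convention: no meaningful average without any weight.
        return None
    return sum(v * w for v, w in zip(values, weights)) / total_weight
```

For example, values `10` and `20` with weights `3` and `1` give `(10*3 + 20*1) / (3 + 1) = 12.5`, whereas equal weights give the ordinary mean of `15`.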