OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-18 19:05:06 +00:00

Author	SHA1	Message	Date
Gordon Brown	47b1e2b3d0	[7.x] Use rollover for SLM's history indices (#45686 ) Following our own guidelines, SLM should use rollover instead of purely time-based indices to keep shard counts low. This commit implements lazy index creation for SLM's history indices, indexing via an alias, and rollover in the built-in ILM policy.	2019-08-21 13:42:11 -06:00
William Brafford	2b549e7342	CLI tools: write errors to stderr instead of stdout (#45586 ) Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard error. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools. This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr. Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).	2019-08-21 14:46:07 -04:00
Przemysław Witek	bf701b83d2	Shorten field names in EstimateMemoryUsageResponse (#45719 ) (#45772 )	2019-08-21 12:45:09 +02:00
Zachary Tong	6b391cd0d5	Mute ShapeQueryTests#testFieldAlias() Tracking issue: https://github.com/elastic/elasticsearch/issues/45628	2019-08-21 10:31:13 +01:00
David Kyle	982560afeb	Mute RollupIndexerStateTests See #45770	2019-08-21 10:05:15 +01:00
Przemysław Witek	c6709f0979	Mute tests affected by renaming fields in Estimate memory usage response (#45743 ) (#45766 )	2019-08-21 09:57:23 +02:00
Dimitris Athanasiou	d5c3d9b50f	[7.x][ML] Do not skip rows with missing values for regression (#45751 ) (#45754 ) Regression analysis supports missing fields. Moreover, the dependent variable is expected to have missing values in the part of the data frame that is not used for training. This commit makes it possible to declare that an analysis supports missing values. For such analyses, rows with missing values are not skipped. Instead, they are written as normal, with empty strings used for the missing values. This also contains a fix to the integration test. Closes #45425	2019-08-21 08:15:38 +03:00
Benjamin Trent	ba7b677618	[ML] better handle empty results when evaluating regression (#45745 ) (#45759 ) * [ML] better handle empty results when evaluating regression * adding new failure test to ml_security black list * fixing equality check for regression results	2019-08-20 17:37:04 -05:00
Armin Braun	a01bd6c5a3	Stop Executing SLM Policy Transport Action on Snapshot Pool (#45727 ) (#45748 ) * Executing SLM policies on the snapshot thread will block until a snapshot finishes if the pool is completely busy executing that snapshot * Fixes #45594	2019-08-20 19:15:36 +02:00
Nhat Nguyen	99b21d50b8	Include leases in ccr errmsg when ops no longer available (#45681 ) The setting index.soft_deletes.retention.operations is no longer needed nor recommended in CCR. We, therefore, should hint users about the retention leases period setting instead when operations are no longer available for replicating.	2019-08-20 10:40:12 -04:00
Benjamin Trent	43bb5924e6	[ML][Data Frame] fixing _start?force=true bug (#45660 ) (#45734 ) * [ML][Data Frame] fixing _start?force=true bug * removing unused import * removing old TODO	2019-08-20 09:23:07 -05:00
Dimitris Athanasiou	49edf9e5b5	[7.x][ML] Remove timeout on waiting for DF analytics result processor to complete (#45724 ) (#45733 ) We cannot know how long the analysis will take to complete, so we should not have a timeout. Note that if the process crashes, the result processor will pick up the exception caused by the stream closing. Closes #45723	2019-08-20 17:21:40 +03:00
Przemysław Witek	b37ebd1adf	Prepare the codebase for new Auditor subclasses (#45716 ) (#45731 )	2019-08-20 16:03:50 +02:00
Przemysław Witek	80dd0a0948	Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718 ) (#45725 )	2019-08-20 15:49:17 +02:00
Benjamin Trent	88641a08af	[ML][Data frame] fixing failure state transitions and race condition (#45627 ) (#45656 ) * [ML][Data frame] fixing failure state transitions and race condition (#45627) There is a small window for a race condition while we are flagging a task as failed. Here are the steps where the race condition occurs: 1. A failure occurs 2. Before `AsyncTwoPhaseIndexer` calls the `onFailure` handler it does the following: a. `finishAndSetState()` which sets the IndexerState to STARTED b. `doSaveState(...)` which attempts to save the current state of the indexer 3. Another trigger is fired BEFORE `onFailure` can fire, but AFTER `finishAndSetState()` occurs. The trick here is that we will eventually set the indexer to failed, but possibly not before another trigger had the opportunity to fire. This could obviously cause some weird state interactions. To combat this, I have put in some predicates to verify the state before taking actions. This is so that if the state is indeed marked failed, the "second trigger" stops ASAP. Additionally, I move the task state checks INTO the `start` and `stop` methods, which will now require a `force` parameter. `start`, `stop`, `trigger` and `markAsFailed` are all `synchronized`. This should give us some guarantees that one will not switch states out from underneath another. I also flag the task as `failed` BEFORE we successfully write it to cluster state; this is to allow us to make the task fail more quickly. But this does add the behavior where the task is "failed" but the cluster state does not indicate as much. Adding the checks in `start` and `stop` will handle this "real state vs cluster state" race condition. This has always been a problem for `_stop` as it is not a master node action and doesn't always have the latest cluster state. closes #45609 Relates to #45562 * [ML][Data Frame] moves failure state transition for MT safety (#45676) * [ML][Data Frame] moves failure state transition for MT safety * removing unused imports	2019-08-20 07:30:17 -05:00
markharwood	7d5ab17bb2	Search enhancement: pinned queries (#44345 ) (#45657 ) * Search enhancement: pinned queries (#44345) Search enhancement: - new query type allows selected documents to be promoted above any "organic" search results. This is the first feature in a new module `search-business-rules` which will house licensed (non OSS) logic for rewriting queries according to business rules. The PinnedQueryBuilder class offers a new `pinned` query in the DSL that takes an array of promoted IDs and an "organic" query and ensures the documents with the promoted IDs rank higher than the organic matches. Closes #44074	2019-08-20 11:38:22 +01:00
Costin Leau	0f51dd69cb	SQL: Improve serialization of SQL processors (#45678 ) Encapsulate the serialization/deserialization of SQL client classes. Make configuration specific parameters (such as ZoneId) generic just like the version and remove the need for consumer classes to manage them individually. This is not only consistent but also provides significant savings in the cursor. Fix #40216 (cherry picked from commit 5c844798045d7baa0d932289d2e3d1607ba6a9a4)	2019-08-20 11:50:47 +03:00
Przemysław Witek	7bc8400222	Call the new _estimate_memory_usage API endpoint on df analytics _start (#45536 ) (#45701 )	2019-08-19 21:37:55 +02:00
Costin Leau	1cd58c8ea8	SQL: Break TextFormatter/Cursor dependency (#45613 ) Improve the initialization and state passing of TextFormatter in CLI and TEXT mode by leveraging the Page listener hook. Additionally simplify the code inside RestSqlQueryAction. (cherry picked from commit a56db2fa119cf9e8748723e19f1fc9f6a8afe5fc)	2019-08-17 00:16:08 +03:00
Costin Leau	96883dd028	SQL: Refactor away the cycle between Rowset and Cursor (#45516 ) Improve encapsulation of pagination of rowsets by breaking the cycle between cursor and associated rowset implementation, all logic now residing inside each cursor implementation. (cherry picked from commit be8fe0a0ce562fe732fae12a0b236b5731e4638c)	2019-08-17 00:16:05 +03:00
Gordon Brown	ecb3ebd796	Clean SLM and ongoing snapshots in test framework (#45564 ) Adjusts the cluster cleanup routine in ESRestTestCase to clean up SLM test cases, and optionally wait for all snapshots to be deleted. Waiting for all snapshots to be deleted, rather than failing if any are in progress, is necessary for tests which use SLM policies because SLM policies may be in the process of executing when the test ends.	2019-08-16 14:17:34 -06:00
Igor Motov	98c850c08b	Geo: Change order of parameter in Geometries to lon, lat 7.x (#45618 ) Changes the order of parameters in Geometries from lat, lon to lon, lat and moves all Geometry classes to the org.elasticsearch.geometry package. Backport of #45332 Closes #45048	2019-08-16 14:42:02 -04:00
Luca Cavanna	c31cddf27e	Update the schema for the REST API specification (#42346 ) * Update the REST API specification This patch updates the REST API specification in JSON files to better encode deprecated entities, to improve specification of URL paths, and to open up the schema for future extensions. Notably, it changes the `paths` from a list of strings to a list of objects, where each particular object encodes all the information for this particular path: the `parts` and the `methods`. Among the benefits of this approach is, e.g., encoding the difference between using the `PUT` and `POST` methods in the Index API, to either use a specific document ID, or let Elasticsearch generate one. Also `documentation` becomes an object that supports a `url` and also a `description`, which is a new field. * Adapt YAML runner to new REST API specification format The logic for choosing the path to use when running tests has been simplified, as a consequence of the path parts being listed under each path in the spec. The special case for create and index has been removed. Also the parsing code has been hardened so that errors are thrown earlier when the structure of the spec differs from what is expected, and their error messages should be more helpful.	2019-08-16 14:40:00 +02:00
Andrei Stefan	30a0711777	Remove deprecated use of "interval" method, in favor of "fixedInterval". (#45501 ) (cherry picked from commit 3fef65160f9e61883e9f8f7f345b814f945e2f4b)	2019-08-16 15:03:43 +03:00
Alpar Torok	7119e54be5	Mute data frame tests on 7.x Tracking in #45610 #45609	2019-08-15 17:07:53 +03:00
David Roberts	d40f3718f2	[ML] Muting 5 SSLErrorMessageTests tests on Windows (#45602 ) Due to https://github.com/elastic/elasticsearch/issues/45598	2019-08-15 11:05:00 +01:00
Benjamin Trent	fde5dae387	[ML][Data Frame] adjusting change detection workflow (#45511 ) (#45580 ) * [ML][Data Frame] adjusting change detection workflow * adjusting for PR comment * disallowing null as an argument value	2019-08-14 17:26:24 -05:00
Nick Knize	647a8308c3	[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363 ) * Introduce Spatial Plugin (#44389) Introduce a skeleton Spatial plugin that holds new licensed features coming to Geo/Spatial land! * [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923) Refactor DeprecatedParameters specific to legacy geo_shape out of AbstractGeometryFieldMapper.TypeParser#parse. * [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980) Add a new ShapeFieldMapper to the xpack spatial module for indexing arbitrary cartesian geometries using a new field type called shape. The indexing approach leverages lucene's new XYShape field type which is backed by BKD in the same manner as LatLonShape but without the WGS84 latitude longitude restrictions. The new field mapper builds on and extends the refactoring effort in AbstractGeometryFieldMapper and accepts shapes in either GeoJSON or WKT format (both of which support non geospatial geometries). Tests are provided in the ShapeFieldMapperTest class in the same manner as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests. Documentation for how to use the new field type and what parameters are accepted is included. The QueryBuilder for searching indexed shapes is provided in a separate commit. * [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108) Add a new ShapeQueryBuilder to the xpack spatial module for querying arbitrary Cartesian geometries indexed using the new shape field type. The query builder extends AbstractGeometryQueryBuilder and leverages the ShapeQueryProcessor added in the previous field mapper commit. Tests are provided in ShapeQueryTests in the same manner as GeoShapeQueryTests and docs are updated to explain how the query works.	2019-08-14 16:35:10 -05:00
Benjamin Trent	0c343d8443	[7.x] [ML][Transforms] adjusting stats.progress for cont. transforms (#45361 ) (#45551 ) * [ML][Transforms] adjusting stats.progress for cont. transforms (#45361) * [ML][Transforms] adjusting stats.progress for cont. transforms * addressing PR comments * rename fix * Adjusting bwc serialization versions	2019-08-14 13:08:27 -05:00
Przemysław Witek	df574e5168	[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188 ) (#45510 )	2019-08-14 08:26:03 +02:00
Gordon Brown	3f5dab99c3	Properly set origin for SLM history store client (#45515 ) The origin was not set properly for the SnapshotHistoryStore client, resulting in errors when SLM was used when security was enabled.	2019-08-13 18:23:20 -06:00
Andrei Stefan	adf8e20021	SQL: adds format parameter to range queries for constant date comparisons (#45503 ) * Add format parameter to the range queries built for CURRENT_* functions used in comparison conditions * Use range queries for date fields equality/non-equality as well. (cherry picked from commit c1e81e90f937ee5a002524d632bfce74d76962f9)	2019-08-13 23:04:30 +03:00
Armin Braun	90803a5caf	Reenable Integ Tests in native-multi-node-tests (#45482 ) (#45496 ) * Reenable Integ Tests in native-multi-node-tests * The tests broken here were likely fixed by #45463 => let's reenable them and see if things run fine again * Relates #45405, #45455	2019-08-13 15:55:54 +02:00
Alexander Reelsen	dd527b4e91	Fix watcher HttpClient URL creation (#45207 ) When encoding, the http client could end up creating URLs that did not resemble the original one. This fixes a couple of corner cases where too many or too few slashes were added to a URI. Closes #44970	2019-08-13 12:15:54 +02:00
Przemysław Witek	1aed388a24	Add view_index_metadata to roles.yml and remove as many df analytics test cases from build.gradle blacklist as possible. (#45451 ) (#45465 )	2019-08-13 08:31:58 +02:00
Yogesh Gaikwad	471d940c44	Refactor cluster privileges and cluster permission (#45265 ) (#45442 ) The current implementations make it difficult to add new privileges (example: a cluster privilege which is more than cluster action-based and not exposed to the security administrator). At a high level, we would like our cluster privilege to be either: - a named cluster privilege This corresponds to the `cluster` field from the role descriptor - or a configurable cluster privilege This corresponds to the `global` field from the role descriptor and allows a security administrator to configure them. Some of the responsibilities, like the merging of action-based cluster privileges, are now pushed to the cluster permission level. How to implement the predicate (using Automaton) is now enforced by cluster permission. `ClusterPermission` helps in enforcing the cluster level access either by performing checks against cluster action and optionally against a request. It is a collection of one or more permission checks where if any of the checks allow access then the permission allows access to a cluster action. Implementations of cluster privilege must be able to provide information regarding the predicates to the cluster permission so that they can be enforced. This is enforced by making implementations of cluster privilege aware of cluster permission builder and providing a way to specify how the permission is to be built for a given privilege. This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`. `ConfigurableClusterPrivilege` is a renderable cluster privilege exposed as a `global` field in role descriptor. Other than this, there is a requirement where we would want to know if a cluster permission is implied by another cluster permission (`has-privileges`). This is helpful in addressing queries related to privileges for a user. This is not simply a check of cluster permissions, since we do not have access to runtime information (like the request object). This refactoring does not try to address those scenarios. Relates #44048	2019-08-13 09:06:18 +10:00
Mark Vieira	7e3379444b	Fix build failure due to unknown task and disable test conventions (cherry picked from commit 8ed84bc5cef9bcfae6c817059f764d97e4451a4a)	2019-08-12 09:18:39 -07:00
Przemyslaw Gomulka	421e9b8e8b	Mute integ tests in native-multi-node-tests (#45457 ) Tracked at #45405	2019-08-12 17:42:24 +02:00
Przemyslaw Gomulka	d11ae08467	Muting ForecastIT.testOverflowToDisk (#45435 ) (#45438 ) awaits #45405	2019-08-12 11:01:32 +02:00
Dimitris Athanasiou	d02d6e40c2	[ML] Mute regression integ test Relates #45425	2019-08-12 10:59:24 +03:00
Armin Braun	a9e1402189	Remove Settings from BaseRestRequest Constructor (#45418 ) (#45429 ) * Resolving the todo, cleaning up the unused `settings` parameter * Cleaning up some other minor dead code in affected classes	2019-08-12 05:14:45 +02:00
Benjamin Trent	fac1a6f8e8	[ML][Data Frame] have DataFrameTransformConfigUpdate#apply set Version (#45391 ) (#45400 )	2019-08-09 14:32:49 -05:00
Hendrik Muhs	bf4da6c6ad	[ML-DataFrame] fix starting a batch data frame after stopping at runtime (#45340 ) (#45381 ) fix loading of next checkpoint after data frame transform has been stopped/started within one run closes #45339	2019-08-09 20:30:11 +02:00
Dimitris Athanasiou	27497ff75f	[7.x][ML] Add regression analysis to DF analytics (#45292 ) (#45388 ) This commit adds a first draft of a regression analysis to data frame analytics. There is a high probability that the exact syntax will change. This commit adds the new analysis type and its parameters as well as appropriate validation. It also modifies the extractor and the fields detector to be able to handle categorical fields, as regression analysis supports them.	2019-08-09 19:31:13 +03:00
Alpar Torok	634a070430	Restrict which tasks can use testclusters (#45198 ) * Restrict which tasks can use testclusters This PR fixes a problem between the interaction of test-clusters and build cache. Before this any task could have used a cluster without tracking it as input. With this change a new interface is introduced to track the tasks that can use clusters and we do consider the cluster as input for all of them.	2019-08-09 13:38:01 +03:00
Hendrik Muhs	7d0aff0ed5	[ML-DataFrame] fix test failure in checkpoint retrieval (#45297 ) gracefully handle if index response returns null, increase and assert timeout closes #45238	2019-08-09 09:04:53 +02:00
Hendrik Muhs	68f9102550	[ML-DataFrame] audit changes in the source index (#45282 ) add audits when the set of source indexes changes and in a special case runs empty	2019-08-08 23:31:55 +02:00
Andrei Stefan	740d58fd46	SQL: Uniquely named inner_hits sections for each nested field condition (#45341 ) * Name each inner_hits section of nested queries differently and extract and combine the multiple values it generates into a single list. This also introduces a limitation (which originates in Elasticsearch itself) on the sorting capabilities when the sorting is based on the nested fields filtered: only one of the conditions applied to nested documents will be used in the nested sorting. (cherry picked from commit cfc5cf68f6e83b07bb9006986d0903d6be418ec6)	2019-08-09 00:22:49 +03:00
David Roberts	14545f8958	[ML-DataFrame] Combine task_state and indexer_state in _stats (#45324 ) This commit replaces task_state and indexer_state in the data frame _stats output with a single top level state that combines the two. It is defined as: - failed if what's currently reported as task_state is failed - stopped if there is no persistent task - Otherwise what's currently reported as indexer_state Backport of #45276	2019-08-08 16:24:26 +01:00
Ioannis Kakavas	99ddb8b3d8	Allow empty token endpoint for implicit flow (#45038 ) When using the implicit flow in OpenID Connect, the op.token_endpoint_url should not be mandatory as there is no need to contact the token endpoint of the OP.	2019-08-08 12:50:53 +03:00
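
The three-way rule in commit 14545f8958 above (combining task_state and indexer_state into one top-level state) can be sketched as follows. This is only an illustration of the rule as the commit message words it, not the Elasticsearch implementation; the function name `combined_state` and its parameters are hypothetical.

```python
from typing import Optional


def combined_state(
    task_state: Optional[str],
    indexer_state: Optional[str],
    has_persistent_task: bool,
) -> Optional[str]:
    """Collapse the two legacy states into one, per the commit message.

    All names here are hypothetical; only the ordering of the three
    rules is taken from the commit message.
    """
    # Rule 1: a failed task wins over everything else.
    if task_state == "failed":
        return "failed"
    # Rule 2: with no persistent task, report "stopped".
    if not has_persistent_task:
        return "stopped"
    # Rule 3: otherwise report whatever the indexer currently reports.
    return indexer_state


if __name__ == "__main__":
    print(combined_state("failed", "started", True))
    print(combined_state(None, None, False))
    print(combined_state("started", "indexing", True))
```

The order of the checks matters: a failed task is reported as failed before the persistent-task check is consulted, as in the commit message's list.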

3263 Commits