OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jason Tedor	a8d4ee1620	Remove PipelineExecutionService#executeIndexRequest (#29537 ) With the move long ago to execute all single-document indexing requests as bulk indexing request, the method PipelineExecutionService#executeIndexRequest is unused and will never be used in production code. This commit removes this method and cuts over all tests to use PipelineExecutionService#executeBulkRequest.	2018-04-16 14:55:26 -04:00
Igor Motov	e334baf6fc	Fix overflow error in parsing of long geohashes (#29418 ) Fixes a possible overflow error that geohashes longer than 12 characters can cause during parsing. Fixes #24616	2018-04-16 12:37:38 -04:00
David Turner	34ec403a2e	Remove unused index.ttl.disable_purge setting (#29527 ) This setting does nothing, and is deprecated in the 6.x series by #29526. This change removes it entirely in 7.0.	2018-04-16 17:10:55 +01:00
Ke Li	0bfb59dcf2	Using ObjectParser in UpdateRequest (#29293 ) CRUD: Parsing changes for UpdateRequest (#29293) Use `ObjectParser` to parse `UpdateRequest` so we reject unknown fields and drop support for the `_fields` parameter because it was deprecated in 5.x.	2018-04-16 08:39:35 -04:00
Christoph Büscher	a004a33803	Prevent accidental changes of default values (#29528 ) The default percentiles values and the default highlighter per- and post-tags are currently publicly accessible and can be altered any time. This change prevents this by restricting field access.	2018-04-16 13:41:42 +02:00
Jason Tedor	00fd73acc4	Avoid self-deadlock in the translog (#29520 ) Today when reading an operation from the current generation fails tragically we attempt to close the translog. However, by invoking close before releasing the read lock we end up in self-deadlock because closing tries to acquire the write lock and the read lock can not be upgraded to a write lock. To avoid this, we move the close invocation outside of the try-with-resources that acquired the read lock. As an extra guard against this, we document the problem and add an assertion that we are not trying to invoke close while holding the read lock.	2018-04-15 16:26:09 -04:00
javanna	485d5d19bc	Mute TranslogTests#testFatalIOExceptionsWhileWritingConcurrently This test has been failing quite a few times with a suite timeout, opened #29509 for it.	2018-04-13 17:03:09 +02:00
Simon Willnauer	694e2a9970	Add remote cluster client (#29495 ) This change adds a client that is connected to a remote cluster. This allows plugins and internal structures to invoke actions on remote clusters just as if they were a local cluster. The remote cluster must be configured via the cross cluster search infrastructure.	2018-04-13 15:23:44 +02:00
Simon Willnauer	eab530ce11	Ensure flush happens on shard idle This adds 2 test cases that verify that when a shard goes idle, pending (uncommitted) segments are committed and unreferenced files are freed. Relates to #29482	2018-04-13 15:06:51 +02:00
Chandan83	782517b452	Adds SpanGapQueryBuilder in the query DSL (#28636 ) This change adds the support for a `span_gap` query inside the span query DSL.	2018-04-13 14:51:03 +02:00
Mayya Sharipova	5dcfdb09cb	Control max size and count of warning headers (#28427 ) Control max size and count of warning headers Add a static persistent cluster level setting "http.max_warning_header_count" to control the maximum number of warning headers in client HTTP responses. Defaults to unbounded. Add a static persistent cluster level setting "http.max_warning_header_size" to control the maximum total size of warning headers in client HTTP responses. Defaults to unbounded. With every warning header that exceeds these limits, a message will be logged in the main ES log, and any more warning headers for this response will be ignored.	2018-04-13 05:55:33 -04:00
Adrien Grand	553c718d66	Make index APIs work without types. (#29479 ) Unlike the `indices.create`, `indices.get_mapping` and `indices.put_mapping` APIs, the index APIs do not need the `include_type_name` option; they can work with and without types without knowing whether types are being used. Internally, `_doc` is used as a type if no type is provided, like for the `indices.put_mapping` API.	2018-04-13 09:08:45 +02:00
Adrien Grand	ebd6b5b7ba	Deprecate filtering on `_type`. (#29468 ) As indices are only allowed to have one type now, and types are going away in the future, we should deprecate filtering by `_type`. Relates #15613	2018-04-13 09:07:51 +02:00
Nhat Nguyen	f96e00badf	Add primary term to translog header (#29227 ) This change adds the current primary term to the header of the current translog file. Having a term in a translog header is a prerequisite step that allows us to trim translog operations given the max valid seq# for that term. This commit also updates tests to conform to the primary term invariant, which guarantees that all translog operations in a translog file have terms at most the term stored in the translog header.	2018-04-12 13:57:59 -04:00
Lee Hinman	14097359a4	Move TimeValue into elasticsearch-core project (#29486 ) This commit moves the `TimeValue` class into the elasticsearch-core project. This allows us to use this class in many of our other projects without relying on the entire `server` jar. Relates to #28504	2018-04-12 10:24:58 -06:00
Igor Motov	0aa19186ae	Fix NPE in InternalGeoCentroidTests#testReduceRandom (#29481 ) In some rare cases all inputs might have a zero count, resulting in a zero totalCount and a null centroid, which causes an NPE. Closes #29480	2018-04-12 10:13:40 -04:00
Martijn van Groningen	fac009630d	test: Index more docs, so that the search request is more likely to time out. Closes #29221	2018-04-12 11:41:41 +02:00
Nhat Nguyen	067fbb8ecd	Backport periodic flush count to v6.3.0 Relates #29360	2018-04-11 17:14:28 -04:00
Lee Hinman	263349f628	Decouple TimeValue from Elasticsearch server classes (#29454 ) * Decouple TimeValue from Elasticsearch server classes This commit decouples the `TimeValue` class from the other server classes. This is in preparation to move `TimeValue` into the `elasticsearch-core` jar, allowing us to use it from projects that cannot depend on the elasticsearch-core library. Relates to #28504	2018-04-11 14:58:15 -06:00
Nhat Nguyen	0ae627fc79	ElasticsearchMergePolicy extend from MergePolicyWrapper (#29476 ) The skeleton of ElasticsearchMergePolicy is quite similar to MergePolicyWrapper. This commit therefore makes ElasticsearchMergePolicy inherited from MergePolicyWrapper instead of MergePolicy.	2018-04-11 11:32:19 -04:00
Nhat Nguyen	4e6a8900a3	Add periodic flush count to flush stats (#29360 ) Currently, flush stats contain only the total flush count, which is the sum of manual flushes (via API) and periodic flushes (triggered asynchronously when the uncommitted translog size exceeds the flush threshold). Sometimes, it's useful to know these two numbers independently. This commit tracks and returns a periodic flush count in the flush stats.	2018-04-11 11:15:33 -04:00
Adrien Grand	6a6c0ea5e6	Add an `include_type_name` option. (#29453 ) This adds an `include_type_name` option to the `indices.create`, `indices.get_mapping` and `indices.put_mapping` APIs, which defaults to `true`. When set to `false`, then mappings will be returned directly in the body of the `indices.get_mapping` API, without keying them by the type name, the `indices.create` will expect mappings directly under the `mappings` key, and the `indices.put_mapping` will use `_doc` as a type name and fail if a `type` is provided explicitly. Relates #15613	2018-04-11 15:54:16 +02:00
Simon Willnauer	45e7e24736	Restrict Document list access in ParseContext (#29463 ) Today we expose a mutable list of documents in ParseContext via ParseContext#docs(). This, on the one hand, places knowledge of how to access nested documents in multiple places and, on the other, allows for potential illegal access to nested-only docs after the docs are reversed. This change restricts the access and streamlines nested / non-root doc access.	2018-04-11 15:09:44 +02:00
Jim Ferenczi	1b6d5e531b	Fail _search request with trailing tokens (#29428 ) This change validates that the `_search` request does not have trailing tokens after the main object and fails the request with a parsing exception otherwise. Closes #28995	2018-04-11 13:10:22 +02:00
Adrien Grand	4918924fae	Remove legacy mapping code. (#29224 ) Some features have been deprecated since `6.0` like the `_parent` field or the ability to have multiple types per index. This allows us to remove quite a bit of code, which in turn will hopefully make it easier to proceed with the removal of types.	2018-04-11 09:41:37 +02:00
Andrew Odendaal	d15cad4afb	Grammar matters.. (#29462 ) Update `all indices on this node will marked read-only` to `all indices on this node will be marked read-only`	2018-04-11 09:30:33 +02:00
Zachary Tong	c341b41c54	[TEST] Temporarily silence MovAvgIT tests due to change in double comparisons #29409 removed the nearlyEquals() double comparison snippet, which makes these tests very flaky because they can generate very large or very small doubles which don't work well with absolute error comparison. We need to either refactor these tests to guarantee they stay in a small range (which could be difficult due to holt/holt-winters) or re-implement the more robust double comparison. Tracking issue: #29456	2018-04-10 20:45:33 +00:00
Jason Tedor	bca192a327	Simplify TranslogWriter#closeWithTragicEvent (#29412 ) This commit simplifies the exception handling in TranslogWriter#closeWithTragicEvent. When invoking this method, the inner close method could throw an exception which we always catch and suppress into the exception that led us to tragically close. This commit moves that repeated logic into closeWithTragicException and now callers simply need to catch, invoke closeWithTragicException, and rethrow.	2018-04-10 10:15:54 -04:00
Lee Hinman	0f40199d10	Remove custom PeriodType formatting from TimeValue (#29433 ) In order to decouple TimeValue from Joda, this removes the unused `format` methods. Relates to #28504	2018-04-10 08:02:56 -06:00
Adrien Grand	aeac682869	Make purely negative queries return scores of 0. (#26015 ) It would make them consistent with queries that are only made of filters. Closes #23449	2018-04-10 14:31:06 +02:00
Adrien Grand	a091d950a7	Deprecate slicing on `_uid`. (#29353 ) Deprecate slicing on `_uid`. `_id` should be used instead on 6.x.	2018-04-10 14:28:30 +02:00
Vladimir Dolzhenko	03d1a7e132	Version conflict exception message enhancement (#29432 ) On PUT ?version=X, report that the document is not found rather than that the current version [-1] is different than the one provided. Closes #21278	2018-04-10 13:42:59 +02:00
Christoph Büscher	13da9dd7c0	Remove 5x bwc in LocaleUtils#parse (#29417 ) Remove the special treatment of parsing the locale property for old 5.x indices since in 7.0 we only need to support reading from 6.x indices.	2018-04-10 12:40:36 +02:00
tomcallahan	ec65710926	Remove copy-pasted code (#29409 ) * Remove copy-pasted code We had two instances of copy-pasted code with a bad license from another website. The code was doing something rather simple, and that functionality already exists within junit. This PR simply leverages the junit functionality.	2018-04-09 18:32:32 -04:00
Adrien Grand	dfcce2d872	Speed up some of our slowest unit tests. (#29414 ) `BaseRandomBinaryDocValuesRangeQueryTestCase.testRandomBig` should only run with nightly tests. It doesn't make sense to make it part of every test run. `UUIDTests` had a slow test for compression, which I made a bit faster by decreasing the number of indexed docs.	2018-04-09 16:35:47 +02:00
Jim Ferenczi	d755fcfd4b	Fix date and ip sources in the composite aggregation (#29370 ) This commit fixes the formatting of the values in the composite aggregation response. `date` fields should return timestamp as longs when used in a `terms` source and `ip` fields should always be formatted as strings. This commit also fixes the parsing of the `after` key for these field types. Finally, this commit disables the index optimization for the `ip` field and any source that provides a `missing` value.	2018-04-09 10:49:29 +02:00
Jason Tedor	11a534932d	Simplify Translog#closeOnTragicEvent (#29413 ) This commit simplifies the invocations to Translog#closeOnTragicEvent. This method already catches all possible exceptions and suppresses the non-AlreadyClosedExceptions into the exception that triggered the invocation. Therefore, there is no need for callers to do this same logic (which would never execute).	2018-04-06 17:59:42 -04:00
Lee Hinman	a07ba9e400	Move Streams.copy into elasticsearch-core and make a multi-release jar (#29322 ) * Move Streams.copy into elasticsearch-core and make a multi-release jar This moves the method `Streams.copy(InputStream in, OutputStream out)` into the `elasticsearch-core` project (inside the `o.e.core.internal.io` package). It also makes this class into a multi-release class where the Java 9 equivalent uses `InputStream#transferTo`. This is a followup from https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495	2018-04-06 11:07:20 -06:00
Lee Hinman	a93c942927	Move ObjectParser into the x-content lib (#29373 ) * Move ObjectParser into the x-content lib This moves `ObjectParser`, `AbstractObjectParser`, and `ConstructingObjectParser` into the libs/x-content dependency. This decoupling allows them to be used for parsing for projects that don't want to depend on the entire Elasticsearch jar. Relates to #28504	2018-04-06 09:41:14 -06:00
Lee Hinman	160d25fcdb	Move Tuple into elasticsearch-core (#29375 ) * Move Tuple into elasticsearch-core This allows us to use Tuple from other projects that don't want to rely on the entire Elasticsearch jar. I have also added very simple tests, since there were none. Relates tangentially to #28504	2018-04-06 08:58:24 -06:00
Jason Tedor	cb3295b212	Close translog writer if exception on write channel (#29401 ) Today we close the translog writer tragically if we experience any I/O exception on a write. These tragic closes lead to us closing the translog and failing the engine. Yet, there is one case that is missed, which is when we touch the write channel during a read (checking if reading from the writer would put us past what has been flushed). This commit addresses this by closing the writer tragically if we encounter an I/O exception on the write channel while reading. This becomes interesting when we consider that this method is invoked from the engine through the translog as part of getting a document from the translog. This means we have to consider closing the translog here as well, which will cascade up into us finally failing the engine. Note that there is no semantic change to, for example, primary/replica resync and recovery. These actions will take a snapshot of the translog, which syncs the translog to disk. If an I/O exception occurs during the sync we already close the writer tragically, and once we have synced we do not ever read past the position that was synced while taking the snapshot.	2018-04-06 10:33:21 -04:00
Colin Goodheart-Smithe	55c8e80532	Fixes query_string query equals timezone check (#29406 ) * Fixes query_string query equals timezone check This change fixes a bug where two `QueryStringQueryBuilder`s were found to be equal if they had the same timezone set even if the query string in the builders were different Closes #29403 * Adds mutate function to QueryStringQueryBuilderTests * iter	2018-04-06 11:45:34 +01:00
Menno Oudshoorn	28631d7163	Fix some code smells in equals methods (#29348 ) Fixes instances of - Equals methods without type check - Equals methods where the field of `this` was compared to the same field of `this` instead of the `that` object that is compared to	2018-04-06 10:41:25 +01:00
Tanguy Leroux	ae2a9f7108	[Test] Fix SnapshotShardsServiceIT.testRetryPostingSnapshotStatusMessages This test requires a bit more time than 10 seconds for the snapshot to be completed; it is now 30s. Closes #29270	2018-04-06 10:24:55 +02:00
Jason Tedor	451a328281	Remove double space in BaseTranslogReader (#29400 ) My eyes! The goggles do nothing!	2018-04-05 17:54:59 -04:00
Jason Tedor	e9576806e8	Remove dead write checkpoint method in translog (#29402 ) This commit removes a dead method from TranslogWriter.java.	2018-04-05 17:54:47 -04:00
David Turner	fb1aba9389	Improve NodeVersionAllocationDecider messages (#29356 ) Since #26542 the NodeVersionAllocationDecider tries to explain its NO decisions as follows: ... may not support codecs or postings formats for a newer Lucene version However, this message often appears during a rolling upgrade, and experience has shown that it seems to cause more confusion and worry than it needs to. This change fixes that by removing the explanation again, reducing the message to a statement of fact about the respective nodes' versions. Additionally, the same wording was used for version incompatibilities when allocating a primary (vs its previous location) and a replica (vs its primary). This change separates these two cases so they can have separate, clearer wording. Fixes #29228	2018-04-05 15:13:48 +01:00
Igor Motov	2c20f7a164	Allow using distance measure in the geo context precision (#29273 ) Adds support for distance measure, such as "4km", "5m" in the precision field of the geo location context in context suggesters. Fixes #24807	2018-04-04 17:39:30 -04:00
Jim Ferenczi	644e5ea97a	Fixed quote_field_suffix in query_string (#29332 ) This change fixes the handling of the `quote_field_suffix` option on `query_string` query. The expansion was not applied to default fields query. Closes #29324	2018-04-04 17:29:09 +02:00
Luca Cavanna	25d411eb32	Remove undocumented action.master.force_local setting (#29351 ) `action.master.force_local` was only ever used internally and never documented. It was one of those settings that were automatically added to a tribe node, to make sure that cluster state read operations would work locally rather than failing when trying to forward the request to the master (as the tribe node never had a master). Given that we recently removed the tribe node, we can also remove this setting.	2018-04-04 14:50:23 +02:00
Jason Tedor	c95e7539e7	Enhance error for out of bounds byte size settings (#29338 ) Today when you input a byte size setting that is out of bounds for the setting, you get an error message that indicates the maximum value of the setting. The problem is that because we use ByteSize#toString, we end up with a representation of the value that does not really tell you what the bound is. For example, if the bound is 2^31 - 1 bytes, the output would be 1.9gb, which does not really tell you what the limit is, as there are many byte size values that we format to the same 1.9gb with ByteSize#toString. We have a method ByteSize#getStringRep that uses the input units to the value as the output units for the string representation, so we end up with no loss if we use this to report the bound. This commit does this.	2018-04-04 07:22:13 -04:00
Stéphane Campinas	38a651e5f1	[Docs] Correct javadoc of GetIndexRequest (#29364 )	2018-04-04 12:11:29 +02:00
Yannick Welsch	1891d4f83d	Check presence of multi-types before validating new mapping (#29316 ) Before doing any kind of validation on a new mapping, we should first do the multi-type validation in order to provide better error messages. For #29313, this means that the exception message will be Rejecting mapping update to [range_index_new] as the final mapping would have more than 1 type: [_doc, mytype] instead of [expected_attendees] is defined as an object in mapping [mytype] but this name is already used for a field in other types	2018-04-04 10:26:50 +01:00
Jason Tedor	8fdca6a89a	Align cat thread pool info to thread pool config (#29195 ) Today we report thread pool info using a common object. This means that we use a shared set of terminology that is not consistent with the terminology used to configure thread pools. This holds in particular for the minimum and maximum number of threads in the thread pool where we use the following terminology: thread pool info | fixed | scaling; min | core | size; max | max | size. A previous change addressed this for the nodes info API. This commit changes the display of thread pool info in the cat thread pool API too to be dependent on the type of the thread pool so that we can align the terminology in the output of thread pool info with the terminology used to configure a thread pool.	2018-04-03 17:27:26 -04:00
Nhat Nguyen	8e2f2be249	Track Lucene operations in engine explicitly (#29357 ) Today we rely on `IndexWriter#hasDeletions` to check if an index contains "update" operations. However, this check considers both deletes and updates. This commit replaces that check by tracking and checking Lucene operations explicitly. This would provide us stronger assertions.	2018-04-03 16:45:53 -04:00
Adrien Grand	569d0c0e89	Improve similarity integration. (#29187 ) This improves the way similarities are plugged in in order to: - reject the classic similarity on 7.x indices and emit a deprecation warning otherwise - reject unknown parameters on 7.x indices and emit a deprecation warning otherwise Even though this breaks the plugin API, I'd like to backport to 7.x so that users can get deprecation warnings when they are doing something that will become unsupported in the future. Closes #23208 Closes #29035	2018-04-03 16:45:25 +02:00
Lee Hinman	db8ed36436	Move Nullable into core (#29341 ) This moves the `Nullable` annotation into the elasticsearch-core project, so it may be used without relying entirely on the server jar. This will allow us to decouple more pieces to make them smaller. In addition, there were two different `Nullable` annotations, these have all been moved to the ES version rather than the inject version.	2018-04-03 07:57:21 -06:00
Adrien Grand	befa66ae35	Elasticsearch 6.3.0 is now on Lucene 7.3.	2018-04-03 14:21:16 +02:00
Yannick Welsch	d4538df893	Improve exception handling on TransportMasterNodeAction (#29314 ) We have seen exceptions bubble up to the uncaught exception handler. Checking the blocks can lead for example to IndexNotFoundException when the indices are resolved. In order to make TransportMasterNodeAction more resilient against such expected exceptions, this code change wraps the execution of doStart() into a try catch and informs the listener in case of failures.	2018-04-03 11:57:58 +02:00
Yannick Welsch	2dc546ccec	Don't break allocation if resize source index is missing (#29311 ) DiskThresholdDecider currently assumes that the source index of a resize operation (e.g. shrink) is available, and throws an IndexNotFoundException otherwise, thereby breaking any kind of shard allocation. This can be quite harmful if the source index is deleted during a shrink, or if the source index is unavailable during state recovery. While this behavior has been partly fixed in 6.1 and above (due to #26931), it relies on the order in which AllocationDeciders are executed (i.e. that ResizeAllocationDecider returns NO, ensuring that DiskThresholdDecider does not run, something that for example does not hold for the allocation explain API). This change adds a more complete fix, and also solves the situation for 5.6.	2018-04-03 11:51:06 +02:00
rationull	0028563aac	Pass through script params in scripted metric agg (#29154 ) * Pass script level params into scripted metric aggs (#28819) Now params that are passed at the script level and at the aggregation level are merged and can both be used in the aggregation scripts. If there are any conflicts, aggregation level params will win. This may be followed by another change detecting that case and throwing an exception to disallow such conflicts. * Disallow duplicate parameter names between scripted agg and script (#28819) If a scripted metric aggregation has aggregation params and script params which have the same name, throw an IllegalArgumentException when merging the parameter lists.	2018-04-03 09:57:49 +01:00
Adrien Grand	3bdfc8f3fb	Upgrade to lucene-7.3.0-snapshot-98a6b3d. (#29298 ) Most notable changes include: - this release doesn't have the 7.2.1 version constant so I had to create one - spatial4j and jts were upgraded	2018-04-03 09:27:14 +02:00
Jason Tedor	1df43a09b7	Remove HTTP max content length leniency (#29337 ) I am not sure why we have this leniency for HTTP max content length, it has been there since the beginning (`5ac51ee93f`) with no explanation of its source. That said, our philosophy today is different than the philosophy of the past where Elasticsearch would be quite lenient in its handling of settings and today we aim for predictability for both users and us. This commit removes leniency in the parsing of http.max_content_length.	2018-04-02 20:20:01 -04:00
Lee Hinman	6b2167f462	Begin moving XContent to a separate lib/artifact (#29300 ) * Begin moving XContent to a separate lib/artifact This commit moves a large portion of the XContent code from the `server` project to the `libs/xcontent` project. For the pieces that have been moved, some helpers have been duplicated to allow them to be decoupled from ES helper classes. In addition, `Booleans` and `CheckedFunction` have been moved to the `elasticsearch-core` project. This decoupling is a move so that we can eventually make things like the high-level REST client not rely on the entire ES jar, only the parts it needs. There are some pieces that are still not decoupled; in particular, some of the XContent tests still remain in the server project because they test a large portion of the pluggable xcontent pieces through `XContentElasticsearchException`. They may be decoupled in future work. Additionally, there may be more pieces that we want to move to the xcontent lib in the future that are not part of this PR; this is a starting point. Relates to #28504	2018-04-02 15:58:31 -06:00
David Turner	3be960d1c2	Minor cleanup in the InternalEngine (#29241 ) Fix a couple of minor things in the InternalEngine: * Rename loadOrGenerateHistoryUUID to reflect that it always generates a UUID * Move .acquire() call next to the associated try {} block.	2018-04-02 10:07:28 +01:00
Mayya Sharipova	e70cd35bda	Revert "REST high-level client: add support for Indices Update Settings API (#28892 )" (#29323 ) This reverts commit `b67b5b1bbd`.	2018-03-30 16:26:46 -07:00
Andy Bristol	b7e6fb9ac5	[test] remove Streamable serde assertions (#29307 ) Removes a set of assertions in the test framework that verified that Streamable objects could be serialized and deserialized across different versions. When this was discussed the consensus was that this approach has not caught many bugs in a long time and that serialization testing of objects was best left to their respective unit and integration tests. This commit also removes a transport interceptor that was used in ESIntegTestCase tests to make these assertions about objects coming in or off the wire.	2018-03-30 14:09:26 -07:00
javanna	bcc9cbfba7	Resolve unchecked cast warnings introduced with #28892	2018-03-30 10:58:40 +02:00
olcbean	b67b5b1bbd	REST high-level client: add support for Indices Update Settings API (#28892 ) Relates to #27205	2018-03-30 10:53:29 +02:00
Ryan Ernst	54f8f819ef	Search: Validate script query is run with a single script (#29304 ) The parsing code for script query currently silently skips any tokens it does not know about within its parsing loop. The only token it does not catch is an array, which means passing multiple scripts in via an array will cause the last script to be parsed and used, silently dropping the others. This commit adds validation that arrays are not seen while parsing.	2018-03-29 22:10:03 -07:00
Nhat Nguyen	04dd738782	TEST: trim unsafe commits before opening engine Since #29260, unsafe commits must be trimmed before opening an engine. This makes the engine constructor follow Lucene standard semantics and use the last commit. However, we haven't fully applied this change in some tests. Relates #29260	2018-03-29 14:25:42 -04:00
Boaz Leskes	eb8b31746a	Move trimming unsafe commits from engine ctor to store (#29260 ) As follow up to #28245 , this PR removes the logic for selecting the right start commit from the Engine constructor in favor of explicitly trimming them in the Store, before the engine is opened. This makes the constructor in engine follow standard Lucene semantics and use the last commit. Relates #28245 Relates #29156	2018-03-29 13:35:57 -04:00
Igor Motov	04d0edc8ee	Fix incorrect geohash for lat 90, lon 180 (#29256 ) Due to special treatment for the 0xFFFFFF... value in GeoHashUtils' encodeLatLon method, the hashcode for lat 90, lon 180 is incorrectly encoded as `"000000000000"` instead of "zzzzzzzzzzzz". This commit removes the special treatment and fixes the issue. Closes #22163	2018-03-29 09:23:43 -04:00
Tanguy Leroux	b6568d0cfd	Do not load global state when deleting a snapshot (#29278 ) When deleting a snapshot, it is not necessary to load and to parse the global metadata of the snapshot to delete. Now that indices are stored in the snapshot metadata file, we have all the information to resolve the shards files to delete. This commit removes the readSnapshotMetaData() method that was used to load both global and index metadata files. Test coverage should be enough as SharedClusterSnapshotRestoreIT already contains several deletion tests. Related to #28934	2018-03-29 09:16:53 +02:00
Nhat Nguyen	9bc167466f	TEST: add log testDoNotRenewSyncedFlushWhenAllSealed This test failed recently. This commit enables debug log and prints out seals. https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=oraclelinux/2234/console https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+intake/1437/console	2018-03-28 22:05:00 -04:00
Jason Tedor	4ef3de40bc	Fix handling of bad requests (#29249 ) Today we have a few problems with how we handle bad requests: - handling requests with bad encoding - handling requests with invalid value for filter_path/pretty/human - handling requests with a garbage Content-Type header There are two problems: - in every case, we give an empty response to the client - in most cases, we leak the byte buffer backing the request! These problems are caused by a broader problem: poor handling preparing the request for handling, or the channel to write to when the response is ready. This commit addresses these issues by taking a unified approach to all of them that ensures that: - we respond to the client with the exception that blew us up - we do not leak the byte buffer backing the request	2018-03-28 16:25:01 -04:00
Simon Willnauer	13e19e7428	Allow _update and upsert to read from the transaction log (#29264 ) We historically removed reading from the transaction log to get consistent results from _GET calls. There was also the motivation that the read-modify-update principle we apply should not be hidden from the user. We still agree that we should not hide these aspects, but the impact on updates is quite significant, especially if the same document is updated before it's written to disk and made searchable. This change adds back the ability to read from the transaction log but only for update calls. Calls to the _GET API will always do a refresh if necessary to return consistent results, i.e. if stored fields or DocValues fields are requested. Closes #26802	2018-03-28 18:03:34 +02:00
Christoph Büscher	27e45fc552	Remove IndicesOptions bwc serialization layer (#29281 ) On master we don't need to talk to pre-6.0 nodes anymore.	2018-03-28 16:19:45 +02:00
Luca Cavanna	245dd73156	Bulk processor#awaitClose to close scheduler (#29263 ) When the `BulkProcessor` is used with the high-level REST client, a scheduler is internally created that allows to schedule tasks. Such scheduler is not exposed to users and needs to be closed once the `BulkProcessor` is closed. There are two ways to close the `BulkProcessor` though, one is the ordinary `close` method and the other one is `awaitClose`. The former closes the scheduler while the latter doesn't, leaving threads lingering.	2018-03-28 16:09:18 +02:00
Yannick Welsch	cacf759213	Remove RELOCATED index shard state (#29246 ) as this information is already covered by ReplicationTracker.primaryMode.	2018-03-28 12:25:46 +02:00
Robin Neatherway	ea8e3661d0	Fix a type check that is always false (#27726 ) DocumentParser: The checks for Text and Keyword were masked by the earlier check for String, which they are child classes of. As String field types are no longer supported, this check can be removed.	2018-03-28 10:20:20 +02:00
Tanguy Leroux	36f8531bf4	Don't load global state when only restoring indices (#29239 ) Restoring a snapshot, or getting the status of finished snapshots, currently always loads the global state metadata file from the repository even if it is not required. This slows down the restore process (or listing statuses process) and can also be an issue if the global state cannot be deserialized (because it has unknown customs for example). This commit splits the Repository.getSnapshotMetadata() method into two distinct methods: getGlobalMetadata() and getIndexMetadata() that are now called only when needed.	2018-03-28 09:35:05 +02:00
Lee Hinman	eebda6974d	Decouple NamedXContentRegistry from ElasticsearchException (#29253 ) * Decouple NamedXContentRegistry from ElasticsearchException This commit decouples `NamedXContentRegistry` from using either `ElasticsearchException`, `ParsingException`, or `UnknownNamedObjectException`. This will allow us to move NamedXContentRegistry to its own lib as part of the xcontent extraction work. Relates to #28504	2018-03-27 16:51:31 -06:00
Lee Hinman	7df66abaf5	[TEST] Fix issue with HttpInfo passed invalid parameter HttpInfo is passed the maxContentLength as a parameter, but this value should never be negative. This fixes the test to only pass a positive random value.	2018-03-27 14:20:06 -06:00
Lee Hinman	b4c78019b0	Remove all dependencies from XContentBuilder (#29225 ) * Remove all dependencies from XContentBuilder This commit removes all of the non-JDK dependencies from XContentBuilder, with the exception of `CollectionUtils.ensureNoSelfReferences`. It adds a third extension point around dealing with time-based fields and formatters to work around the Joda dependency. This decoupling allows us to be able to move XContentBuilder to a separate lib so it can be available for things like the high level rest client. Relates to #28504	2018-03-27 12:58:22 -06:00
Jim Ferenczi	3db6f1c9d5	Fix sporadic failure in CompositeValuesCollectorQueueTests This commit fixes a test bug that causes an NPE on empty segments. Closes #29269	2018-03-27 20:11:21 +02:00
Jim Ferenczi	2aaa057387	Propagate ignore_unmapped to inner_hits (#29261 ) In 5.2 `ignore_unmapped` was added to `inner_hits` in order to ignore invalid mapping. This value was automatically set to the value defined in the parent query (`nested`, `has_child`, `has_parent`) but the refactoring of the parent/child in 5.6 removed this behavior unintentionally. This commit restores this behavior but also makes sure that we always automatically enforce this value when the query builder is used directly (previously this was only done by the XContent deserialization). Closes #29071	2018-03-27 18:55:42 +02:00
Nhat Nguyen	dfc9e721d8	TEST: Increase timeout for testPrimaryReplicaResyncFailed The default timeout (e.g. 10 seconds) may not be enough for CI to re-allocate shards after the partition is healed. This commit increases the timeout to 30 seconds and enables logging in order to have more detailed information in case this test fails again. Closes #29060	2018-03-27 12:18:09 -04:00
Nhat Nguyen	d1d3edf156	TEST: Use different translog dir for a new engine In #testPruneOnlyDeletesAtMostLocalCheckpoint, we create a new engine but mistakenly use the same translog directory of the existing engine. This prevents translog files from cleaning up when closing the engines. ERROR 0.12s J2 \| InternalEngineTests.testPruneOnlyDeletesAtMostLocalCheckpoint <<< FAILURES! > Throwable #1: java.io.IOException: could not remove the following files (in the order of attempts): > translog-primary-060/translog-2.tlog: java.io.IOException: access denied: This commit makes sure to use a separate directory for each engine in this test.	2018-03-27 09:45:51 -04:00
Christoph Büscher	8d6832c5ee	Make SearchStats implement Writeable (#29258 ) Moves another class over from Streamable to Writeable. By this, also some constructors can be removed or made private.	2018-03-27 15:21:11 +02:00
Nhat Nguyen	0ac89a32cc	Do not optimize append-only if seen normal op with higher seqno (#28787 ) When processing an append-only operation, the primary knows that operations can only conflict with another instance of the same operation. This is true as the id was freshly generated. However this property doesn't hold for replicas. As soon as an auto-generated ID was indexed into the primary, it can be exposed to a search and users can issue a follow-up operation on it. In extremely rare cases, the follow-up operation can arrive and be processed on a replica before the original append-only request. In this case we can't simply proceed with the append-only request and blindly add it to the index without consulting the version map. The following scenario can cause a difference between primary and replica. 1. Primary indexes an auto-gen-id doc. (id=X, v=1, s#=20) 2. A refresh cycle happens on primary 3. The new doc is picked up and modified - say by a delete by query request - Primary gets a delete doc (id=X, v=2, s#=30) 4. Delete doc is processed first on the replica (id=X, v=2, s#=30) 5. Indexing operation arrives on the replica, since it's an auto-gen-id request and the retry marker is lower, we put it into lucene without any check. Replica has a doc the primary doesn't have. To deal with a potential conflict between an append-only operation and a normal operation on replicas, we need to rely on sequence numbers. This commit maintains the max seqno of non-append-only operations on the replica and applies the optimization to an append-only operation only if its seq# is higher than the seq# of all non-append-only operations.	2018-03-26 16:56:12 -04:00
Nhat Nguyen	87957603c0	Prune only gc deletes below local checkpoint (#28790 ) Once a document is deleted and Lucene is refreshed, we will not be able to look up the `version/seq#` associated with that delete in Lucene. As conflicting operations can still be indexed, we need another mechanism to remember these deletes. Therefore deletes should still be stored in the Version Map, even after Lucene is refreshed. Obviously, we can't remember all deletes forever so a trimming mechanism is needed. Currently, we remember deletes for at least 1 minute (the default GC deletes cycle) and clean them periodically. This is, at the moment, the best we can do on the primary for user facing APIs but this arbitrary time limit is problematic for replicas. Furthermore, we can't rely on the primary and replicas doing the trimming in a synchronized manner, and failing to do so results in the replica and primary making different decisions. The following scenario can cause inconsistency between primary and replica. 1. Primary indexes doc (index, id=1, v2) 2. Network packet issue causes index operation to back off and wait 3. Primary deletes doc (delete, id=1, v3) 4. Replica processes delete (delete, id=1, v3) 5. 1+ minute passes (GC deletes runs on replica) 6. Indexing op is finally sent to the replica which now processes it because it forgot about the delete. We can rely on sequence numbers to prevent this issue. If we prune only deletes whose seqno is at most the local checkpoint, a replica will correctly remember what it needs. The correctness is explained as follows: Suppose o1 and o2 are two operations on the same document with seq#(o1) < seq#(o2), and o2 arrives before o1 on the replica. o2 is processed normally since it arrives first; when o1 arrives it should be discarded: 1. If seq#(o1) <= LCP, then it will not be added to Lucene, as it was already previously added. 2. If seq#(o1) > LCP, then it depends on the nature of o2: - If o2 is a delete then its seq# is recorded in the VersionMap, since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and determine that o1 is stale. - If o2 is an indexing then its seq# is either in Lucene (if refreshed) or the VersionMap (if not refreshed yet), so a real-time lookup can find it and determine that o1 is stale. In this PR, we prefer to deploy a single trimming strategy, which satisfies both requirements, on primary and replicas because: - It's simpler - no need to distinguish if an engine is running at primary mode or replica mode or being promoted. - If a replica subsequently is promoted, user experience is fully maintained as that replica remembers deletes for the last GC cycle. However, the version map may consume less memory if we deploy two different trimming strategies for primary and replicas.	2018-03-26 13:42:08 -04:00
Boaz Leskes	bca264699a	remove testUnassignedShardAndEmptyNodesInRoutingTable testUnassignedShardAndEmptyNodesInRoutingTable is as old as time and does a very bogus thing. It is an IT test which extracts the GatewayAllocator from the node and tells it to allocate unassigned shards, while giving it a conjured cluster state with no nodes in it (it uses DiscoveryNodes.EMPTY_NODES). This is never a cluster state we want to reroute on (we always have at least a master node in it). I'm going to just delete the test as I don't think it adds much value. Closes #21463	2018-03-26 17:10:57 +02:00
Boaz Leskes	f5d4550e93	Fold EngineDiskUtils into Store, for better lock semantics (#29156 ) #28245 has introduced the utility class `EngineDiskUtils` with a set of methods to prepare/change translog and lucene commit points. That util class bundled everything that's needed to create an empty shard, bootstrap a shard from a lucene index that was just restored etc. In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That would sometimes fail due to concurrent shard store fetching or other short activities that require the files not to be changed while they read from them. Since there is no way to wait on the index writer lock, the `Store` class has other locks to make sure that once we try to acquire the IW lock, it will succeed. To sidestep this waiting problem, this PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the translog manipulations on their own.	2018-03-26 14:08:03 +02:00
Christoph Büscher	318b0af953	Remove execute mode bit from source files Some source files seem to have the execute bit (a+x) set, which doesn't really seem to hurt but is a bit odd. This change removes those, making the permissions similar to other source files in the repository.	2018-03-26 13:37:55 +02:00
Jim Ferenczi	5288235ca3	Optimize the composite aggregation for match_all and range queries (#28745 ) This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate the collection when the leading source value is greater than the lowest value in the queue. Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents in the order of the values present in the leading source. For instance the following aggregation: ``` "composite" : { "sources" : [ { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } } ], "size": 10 } ``` ... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents. For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough composite buckets. For instance if visiting the first two lowest timestamps created 10 composite buckets we can early terminate the collection since it is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the ones already visited. This mode can execute iff: * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`. * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition. * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only). If these conditions are not met this aggregation visits each document like any other agg.	2018-03-26 09:51:37 +02:00
Nicholas Knize	fede633563	Add Z value support to geo_shape This enhancement adds Z value support (source only) to geo_shape fields. If vertices are provided with a third dimension, the third dimension is ignored for indexing but returned as part of source. Like before, any values greater than the 3rd dimension are ignored. closes #23747	2018-03-23 08:50:55 -05:00
Nhat Nguyen	794de63232	Remove type casts in logging in server component (#28807 ) This commit removes type-casts in logging in the server component (other components will be done later). This also adds a parameterized message test which would catch breaking-changes related to lambdas in Log4J.	2018-03-23 07:35:50 -04:00
Yu	4a8099c696	Change BroadcastResponse from ToXContentFragment to ToXContentObject (#28878 ) While working on #27799, we find that it might make sense to change BroadcastResponse from ToXContentFragment to ToXContentObject, seeing that it's rather a complete XContent object and also the other Responses are normally ToXContentObject. By doing this, we can also move the XContent build logic of BroadcastResponse's subclasses, from Rest Layer to the concrete classes themselves. Relates to #3889	2018-03-23 10:53:37 +01:00
Milan Chovatiya	8328b9c5cd	REST : Split `RestUpgradeAction` into two actions (#29124 ) Closes #29062	2018-03-23 10:37:31 +01:00

483 Commits