OpenSearch

Commit Graph

Author	SHA1	Message	Date
Armin Braun	c02850f335	Fix S3ClientSettings Leak (#56703 ) (#56862 ) Fixes the fact that repository metadata with the same settings still results in multiple settings instances being cached as well as leaking settings on closing a repository. Closes #56702	2020-05-17 09:18:20 +02:00
Armin Braun	cac85a6f18	Shorter Path in Netty ByteBuf Unwrap (#56740 ) (#56857 ) In most cases we are seeing a `PooledHeapByteBuf` here now. No need to redundantly create a new `ByteBuffer` and single element array for it here when we can just directly unwrap its internal `byte[]`.	2020-05-16 11:54:36 +02:00
Ioannis Kakavas	239ada1669	Test adjustments for FIPS 140 (#56526 ) This change aims to fix our setup in CI so that we can run 7.x in FIPS 140 mode. The major issue that we have in 7.x and did not have in master is that we can't use the diagnostic trust manager in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it explicitly disallows the wrapping of X509TrustManager. Previous attempts like #56427 and #52211 focused on disabling the setting in all of our tests when creating a Settings object or on setting fips_mode.enabled accordingly (which implicitly disables the diagnostic trust manager). The attempts weren't future proof though, as nothing would prevent someone from adding new tests without setting the necessary setting, and forcing this would be very inconvenient for any other case (see #56427 (comment) for the full argumentation). This change introduces a runtime check in SSLService that overrides the configuration value of xpack.security.ssl.diagnose.trust and disables the diagnostic trust manager when we are running in Java 8 and the SunJSSE provider is set in FIPS mode.	2020-05-15 18:10:45 +03:00
Alan Woodward	d33d13f2be	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:14:49 +01:00
Francisco Fernández Castaño	1530bff0cb	Move azure client logic from AzureStorageService to AzureBlobStore (#56806 ) Backport of #56782	2020-05-15 11:30:15 +02:00
Ryan Ernst	9fb80d3827	Move publishing configuration to a separate plugin (#56727 ) This is another part of the breakup of the massive BuildPlugin. This PR moves the code for configuring publications to a separate plugin. Most of the time these publications are jar files, but this also supports the zip publication we have for integ tests.	2020-05-14 20:23:07 -07:00
Mark Vieira	0fd756d511	Enforce strict license distribution requirements (#56642 )	2020-05-14 13:57:56 -07:00
Armin Braun	14a042fbe5	Make No. of Transport Threads == Available CPUs (#56488 ) (#56780 ) We never do any file IO or other blocking work on the transport threads so no tangible benefit can be derived from using more threads than CPUs for IO. There are however significant downsides to using more threads than necessary with Netty in particular. Since we use the default setting for `io.netty.allocator.useCacheForAllThreads` which is `true` we end up using up to `16MB` of thread local buffer cache for each transport thread. Meaning we potentially waste CPUs * 16MB of heap for unnecessary IO threads in addition to obvious inefficiencies of artificially adding extra context switches.	2020-05-14 21:33:46 +02:00
Mark Tozzi	b718193a01	Clean up DocValuesIndexFieldData (#56372 ) (#56684 )	2020-05-14 12:42:37 -04:00
Francisco Fernández Castaño	97bf47f5b9	Track GET/LIST GoogleCloudStorage API calls (#56758 ) Backporting #56585 to 7.x branch. Adds tracking for the API calls performed by the GoogleCloudStorage underlying SDK. It hooks an HttpResponseInterceptor to the SDK transport layer and does http request filtering based on the URI paths that we are interested to track. Unfortunately we cannot hook a wrapper into the ServiceRPC interface since we're using different levels of abstraction to implement retries during reads (GoogleCloudStorageRetryingInputStream).	2020-05-14 14:03:21 +02:00
Nik Everett	b98b260048	Merge significant_terms into the terms package (backport of #56699 ) (#56715 ) This merges the code for the `significant_terms` agg into the package for the code for the `terms` agg. They are super entangled already, this mostly just admits that to ourselves. Precondition for the terms work in #56487	2020-05-13 17:36:21 -04:00
Ignacio Vera	b4521d5183	upgrade to Lucene 8.6.0 snapshot (#56661 )	2020-05-13 14:25:16 +02:00
Armin Braun	0a879b95d1	Save Bounds Checks in BytesReference (#56577 ) (#56621 ) Two spots that allow for some optimization: * We are often creating a composite reference of just a single item in the transport layer => special cased via static constructor to make sure we never do that * Also removed the pointless case of an empty composite bytes ref * `ByteBufferReference` is practically always created from a heap buffer these days so there is no point of dealing with all the bounds checks and extra references to sliced buffers from that and we can just use the underlying array directly	2020-05-12 20:33:45 +02:00
David Turner	8f4af292a7	Hide c.a.a.p.i.BasicProfileConfigFileLoader noise (#56346 ) A recent AWS SDK upgrade has introduced a new source of spurious `WARN` logs when the security manager prevents access to the user's home directory and therefore to their shared client configuration. This is actually the behaviour we want, and it's harmless and handled by the SDK as if the profile config doesn't exist, so this log message is unnecessary noise. This commit suppresses this noisy logging by default. Relates #20313 Closes #56333	2020-05-07 17:00:58 +01:00
Armin Braun	60b6d4eddc	Increase Timeout in S3 Cooldown Test (#56267 ) (#56323 ) Moving from `5s` to `10s` here because of #56095. This adds `10s` to the overall runtime of the test which should be a reasonable tradeoff for stability. Closes #56095	2020-05-07 11:23:07 +02:00
Jason Tedor	33669c0420	Upgrade to Jackson 2.10.4 (#56188 ) Another Jackson release is available. There are some CVEs addressed, none of which impact us, but since we can now bump Jackson easily, let us move along with the train to avoid the false positives from security scanners.	2020-05-06 17:20:23 -04:00
Julie Tibshirani	e852bb29b7	Simplify signature of FieldMapper#parseCreateField. (#56144 ) `FieldMapper#parseCreateField` accepts the parse context, plus a list of fields as an output parameter. These fields are immediately added to the document through `ParseContext#doc()`. This commit simplifies the signature by removing the list of fields, and having the mappers add the fields directly to `ParseContext#doc()`. I think this is nicer for implementors, because previously fields could be added either through the list, or the context (through `add`, `addWithKey`, etc.)	2020-05-06 11:12:09 -07:00
Tim Brooks	6a51017cb2	Upgrade netty to 4.1.49.Final (#56059 )	2020-05-05 10:40:23 -06:00
Armin Braun	3a64ecb6bf	Allow Deleting Multiple Snapshots at Once (#55474 ) (#56083 ) * Allow Deleting Multiple Snapshots at Once (#55474) Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise. This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.	2020-05-03 20:30:58 +02:00
Tim Brooks	80662f31a1	Introduce mechanism to stub request handling (#55832 ) Currently there is a clear mechanism to stub sending a request through the transport. However, this is limited to testing exceptions on the sender side. This commit reworks our transport related testing infrastructure to allow stubbing request handling on the receiving side.	2020-04-27 16:57:15 -06:00
Rory Hunter	d66af46724	Always use deprecateAndMaybeLog for deprecation warnings (#55319 ) Backport of #55115. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages can potentially be deduplicated.	2020-04-23 09:20:54 +01:00
Armin Braun	db7eb8e8ff	Remove Redundant CS Update on Snapshot Finalization (#55276 ) (#55528 ) This change folds the removal of the in-progress snapshot entry into setting the safe repository generation. Outside of removing an unnecessary cluster state update, this also has the advantage of removing a somewhat inconsistent cluster state where the safe repository generation points at `RepositoryData` that contains a finished snapshot while it is still in-progress in the cluster state, making it easier to reason about the state machine of upcoming concurrent snapshot operations.	2020-04-21 15:33:17 +02:00
Yannick Welsch	ba39c261e8	Use streaming reads for GCS (#55506 ) To read from GCS repositories we're currently using Google SDK's official BlobReadChannel, which issues a new request every 2MB (default chunk size for BlobReadChannel) using range requests, and fully downloads the chunk before exposing it to the returned InputStream. This means that the SDK issues an awfully high number of requests to download large blobs. Increasing the chunk size is not an option, as that will mean that an awfully high amount of heap memory will be consumed by the download process. The Google SDK does not provide the right abstractions for a streaming download. This PR uses the lower-level primitives of the SDK to implement a streaming download, similar to what S3's SDK does. Also closes #55505	2020-04-21 13:22:26 +02:00
Ignacio Vera	4783f1894c	mute test testReadRangeBlobWithRetries (#55507 ) (#55508 )	2020-04-21 10:59:35 +02:00
Yannick Welsch	b9da307cd1	Add GCS support for searchable snapshots (#55403 ) Adds ranged read support for GCS repositories in order to enable searchable snapshot support for GCS. As part of this PR, I've extracted some of the test infrastructure to make sure that GoogleCloudStorageBlobContainerRetriesTests and S3BlobContainerRetriesTests are covering similar tests (as I saw those diverging in what they cover)	2020-04-20 13:02:59 +02:00
Armin Braun	5550d8f3f6	Fix Path Style Access Setting Priority (#55439 ) (#55444 ) * Fix Path Style Access Setting Priority Fixing obvious bug in handling path style access if it's the only setting overridden by the repository settings. Closes #55407	2020-04-20 11:47:41 +02:00
Jason Tedor	0a1b566c65	Fix security manager bug writing large blobs to GCS (#55421 ) * Fix security manager bug writing large blobs to GCS This commit addresses a security manager permissions issue writing large blobs (on the resumable upload path) to GCS. The underlying issue here is that we need to wrap the close and write calls on the channel. It is not enough to do this: `SocketAccess.doPrivilegedVoidIOException(() -> Streams.copy(inputStream, Channels.newOutputStream(client().writer(blobInfo, writeOptions))));` The reason that this is not enough is that Streams#copy will be in the stacktrace and it is not granted the security manager permissions needed to close or write this channel. We only grant those permissions to classes loaded in the plugin classloader, and Streams#copy is from the parent classloader. This is why we must wrap the close and write calls as privileged, to truncate the Streams#copy call out of the stacktrace. The reason that this issue is not caught in testing is that the size of data that we use in testing is too small to trigger the large blob resumable upload path. Therefore, we address this by adding a system property to control the threshold, which we can then set in tests to exercise this code path. Prior to rewriting the writeBlobResumable method to wrap the close and write calls as privileged, with this additional test, we are able to reproduce the security manager permissions issue. After adding the wrapping, this test now passes. * Fix forbidden APIs issue * Remove leftover debugging	2020-04-17 18:49:10 -04:00
William Brafford	49e30b15a2	Deprecate disabling basic-license features (#54816 ) (#55405 ) We believe there's no longer a need to be able to disable basic-license features completely using the "xpack..enabled" settings. If users don't want to use those features, they simply don't need to use them. Having such features always available lets us build more complex features that assume basic-license features are present. This commit deprecates settings of the form "xpack..enabled" for basic-license features, excluding "security", which is a special case. It also removes deprecated settings from integration tests and unit tests where they're not directly relevant; e.g. monitoring and ILM are no longer disabled in many integration tests.	2020-04-17 15:04:17 -04:00
Armin Braun	73ab3719e8	Mute GCS Retry Tests on JDK8 (#55372 ) Same as #53119 but for the retries tests. Closes #55317	2020-04-17 12:19:35 +02:00
William Brafford	2ba3be9db6	Remove deprecated third-party methods from tests (#55255 ) (#55269 ) I've noticed that a lot of our tests are using deprecated static methods from the Hamcrest matchers. While this is not a big deal in any objective sense, it seems like a small good thing to reduce compilation warnings and be ready for a new release of the matcher library if we need to upgrade. I've also switched a few other methods in tests that have drop-in replacements.	2020-04-15 17:54:47 -04:00
Ryan Ernst	29b70733ae	Use task avoidance with forbidden apis (#55034 ) Currently forbidden apis accounts for 800+ tasks in the build. These tasks are aggressively created by the plugin. In forbidden apis 3.0, we will get task avoidance (https://github.com/policeman-tools/forbidden-apis/pull/162), but we need to ourselves use the same task avoidance mechanisms to not trigger these task creations. This commit does that for our forbidden apis usages, in preparation for upgrading to 3.0 when it is released.	2020-04-15 13:27:53 -07:00
Ignacio Vera	a677b63daa	Upgrade to lucene 8.5.1 release (#55229 ) (#55235 ) Upgrade to lucene 8.5.1 release that contains a bug fix for a bug that might introduce index corruption when deleting data from an index that was previously shrunk.	2020-04-15 17:35:42 +02:00
Armin Braun	2f91e2aab7	Fix Race in Snapshot Abort (#54873 ) (#55233 ) We can be a little more efficient when aborting a snapshot. Since we know the new repository data after finalizing the aborted snapshot we can pass it down to the snapshot completion listeners. This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.	2020-04-15 15:42:15 +02:00
Mark Vieira	ce85063653	[7.x] Re-add origin url information to publish POM files (#55173 )	2020-04-14 13:24:15 -07:00
Yannick Welsch	a610513ec7	Provide repository-level stats for searchable snapshots (#55051 ) Provides basic repository-level stats that will allow us to get some insight into how many requests are actually being made by the underlying SDK. Currently only tracks GET and LIST calls for S3 repositories. Most of the code is unfortunately boiler plate to add a new endpoint that will help us better understand some of the low-level dynamics of searchable snapshots.	2020-04-14 14:34:08 +02:00
Jake Landis	a2fafa6af4	[7.x] Lazy test cluster module and plugins (#54852 ) (#55087 ) This change converts the module and plugin parameters for testClusters to be lazy. Meaning that the values are not resolved until they are actually used. This removes the requirement to use project.afterEvaluate to be able to resolve the bundle artifact. Note - this does not completely remove the need for afterEvaluate since it is still needed for the custom resource extension.	2020-04-13 10:53:35 -05:00
Jason Tedor	9eeae59a83	Clarify available processors (#54907 ) The use of available processors, the terminology, and the settings around it have evolved over time. This commit cleans up some places in the code and in the docs to adjust to the current terminology.	2020-04-10 08:48:27 -04:00
Armin Braun	f6bdd30165	Fix S3 Blob Container Retries Test Range Handling (#55000 ) (#55002 ) The ranges in HTTP headers are using inclusive values for start and end of the range. The math we used was off in so far that start equals end for the range resulted in length `0` instead of the correct value of `1`. Closes #54981 Closes #54995	2020-04-09 10:58:42 +02:00
Mark Vieira	ac6d1f7b24	Mute S3BlobContainerRetriesTests.testReadRangeBlobWithRetries	2020-04-08 16:45:38 -07:00
Mark Vieira	264bfaca56	Mute S3BlobContainerRetriesTests.testReadBlobWithPrematureConnectionClose	2020-04-08 13:05:35 -07:00
Armin Braun	411dc2f607	Fix Broken Math in S3 Retries Tests (#54952 ) (#54972 ) If we run into `length == 0` we trip an assertion in `randomIntBetween(0, length -1)`.	2020-04-08 20:32:21 +02:00
Ryan Ernst	37795d259a	Remove guava from transitive compile classpath (#54309 ) (#54695 ) Guava was removed from Elasticsearch many years ago, but remnants of it remain due to transitive dependencies. When a dependency pulls guava into the compile classpath, devs can inadvertently begin using methods from guava without realizing it. This commit moves guava to a runtime dependency in the modules that it is needed. Note that one special case is the html sanitizer in watcher. The third party dep uses guava in the PolicyFactory class signature. However, only calling a method on the PolicyFactory actually causes the class to be loaded, a reference alone does not trigger compilation to look at the class implementation. There we utilize a MethodHandle for invoking the relevant method at runtime, where guava will continue to exist.	2020-04-07 23:20:17 -07:00
Tim Brooks	619028c33e	Implement transport circuit breaking in aggregator (#54927 ) This commit moves the action name validation and circuit breaking into the InboundAggregator. This work is valuable because it lays the groundwork for incrementally circuit breaking as data is received. This PR includes the following behavioral change: Handshakes contribute to circuit breaking, but cannot be broken. They currently do not contribute nor are they broken.	2020-04-07 17:10:31 -06:00
Tim Brooks	9cf2406cf1	Move network stats marking into InboundPipeline (#54908 ) This is a follow-up to #48263. It moves the inbound stats tracking inside of the InboundPipeline.	2020-04-07 13:34:05 -06:00
Tanguy Leroux	4d36917e52	Merge feature/searchable-snapshots branch into 7.x (#54803 ) (#54825 ) This is a backport of #54803 for 7.x. This pull request cherry picks the squashed commit from #54803 with the additional commits: 6f50c92 which adjusts master code to 7.x a114549 to mute a failing ILM test (#54818) 48cbca1 and 50186b2 that cleans up and fixes the previous test aae12bb that adds a missing feature flag (#54861) 6f330e3 that adds missing serialization bits (#54864) bf72c02 that adjust the version in YAML tests a51955f that adds some plumbing for the transport client used in integration tests Co-authored-by: David Turner <david.turner@elastic.co> Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 13:28:53 +02:00
Jason Tedor	f3a0018175	Update link to JDK 14 compiler bug This commit updates the link to the JDK 14 compiler bug that we have found. At the time that we committed the workaround, we had a submission ID, but not yet the public bug URL. This commit adds the public bug URL.	2020-04-07 06:26:14 -04:00
Jason Tedor	f2590b9984	Workaround JDK 14 compiler bug (#54689 ) This commit works around a bug in the JDK 14 compiler. It is choking on a method reference, so we substitute a lambda expression instead. The JDK bug ID is 9064309.	2020-04-02 19:45:52 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Zachary Tong	c9db2de41d	[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451 ) * Comprehensively test supported/unsupported field type:agg combinations (#52493) This adds a test to AggregatorTestCase that allows us to programmatically verify that an aggregator supports or does not support a particular field type. It fetches the list of registered field type parsers, creates a MappedFieldType from the parser and then attempts to run a basic agg against the field. A supplied list of supported VSTypes is then compared against the output (success or exception), and the test succeeds or fails accordingly. Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com> * Skip fields that are not aggregatable * Use newIndexSearcher() to avoid incompatible readers (#52723) Lucene's `newSearcher()` can generate readers like ParallelCompositeReader which we can't use. We need to instead use our helper `newIndexSearcher`	2020-03-31 14:35:03 -04:00
Martijn van Groningen	4b4fbc160d	Refactor AliasOrIndex abstraction. (#54394 ) Backport of #53982 In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams, the abstraction needs to be made more flexible, because currently it really can be only an alias or an index. * Renamed `AliasOrIndex` to `IndexAbstraction`. * Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is. * Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum. * Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface. * Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`. * Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method. Relates to #53100	2020-03-30 10:12:16 +02:00
Tim Brooks	2ccddbfa88	Move transport decoding and aggregation to server (#54360 ) Currently all of our transport protocol decoding and aggregation occurs in the individual transport modules. This means that each implementation (test, netty, nio) must implement this logic. Additionally, it means that the entire message has been read from the network before the server package receives it. This commit creates a pipeline in server which can be passed arbitrary bytes to handle. Internally, the pipeline will decode, decompress, and aggregate the messages. Additionally, this allows us to run many megabytes of bytes through the pipeline in tests to ensure that the logic works. This work will enable future work: Circuit breaking or backoff logic based on message type and byte in the content aggregator. Sharing bytes with the application layer using the ref counted releasable network bytes. Improved network monitoring based specifically on channels. Finally, this fixes the bug where we do not circuit break on the correct message size when compression is enabled.	2020-03-27 14:13:10 -06:00
Tim Brooks	f5b4020819	Remove netty BytesReference implementations (#54355 ) Elasticsearch has a number of different BytesReference implementations. These implementations can all implement the interface in different ways with subtly different behavior and performance characteristics. On the other hand, the JVM only represents bytes as an array or a direct byte buffer. This commit deletes the specialized Netty implementations and moves to using a generic ByteBuffer reference type. This will allow us to focus on standardizing performance and behavior around a smaller number of implementations that can be used by all components in Elasticsearch.	2020-03-27 11:01:33 -06:00
Armin Braun	d9d11f6d16	Remove Unused Apache Http Dependency from GCS Repo Plugin (#54331 ) (#54342 ) We are not using the Apache HTTP client backed http transport with the GCS repo. Same as with the app engine type transport we can save ourselves the dependency on the http client here and ignore the missing classes.	2020-03-27 15:10:19 +01:00
Armin Braun	70b378cd1b	Upgrade GCS Dependency to 1.106.0 (#54092 ) (#54112 ) * Upgrade GCS Dependency to 1.106.0 (#54092) Upgrading GCS Dep + related dependencies as it seems some more retry bugs were fixed between .104 and .106	2020-03-25 19:05:01 +01:00
James Baiera	b84c74cf70	Update the HDFS version used by HDFS Repo (#53693 ) (#54125 )	2020-03-25 14:01:29 -04:00
Mark Vieira	7728ccd920	Enforce consistent compile options across all projects (#54120 ) (cherry picked from commit ddd068a7e92dc140774598664efdc15155ab05c2)	2020-03-25 08:24:21 -07:00
Armin Braun	4271963462	Revert "Use Azure Bulk Deletes in Azure Repository (#53919 )" (#54089 ) (#54111 ) This reverts commit 23cccf088810b8416ed278571352393cc2de9523. Unfortunately SAS token auth still doesn't work with bulk deletes so we can't use them yet. Closes #54080	2020-03-25 12:13:25 +01:00
Ioannis Kakavas	7d4ae7d982	Upgrade Tika to 1.24 (#54130 ) (#54150 ) Also updates commons-compress to 1.19, pdfbox to 2.0.19 and POI to 4.1.2. Adds a compile dependency to commons-math3 3.6.1 and SparseBitSet 1.2	2020-03-25 11:03:26 +02:00
Alan Woodward	39d7d0dc10	Upgrade to lucene 8.5.0 release (#54077 ) Upgrades our lucene dependency to the released 8.5.0 version.	2020-03-24 13:45:50 +00:00
Mark Vieira	70cfedf542	Refactor global build info plugin to leverage JavaInstallationRegistry (#54026 ) This commit removes the configuration time vs execution time distinction with regards to certain BuildParms properties. Because of the cost of determining Java versions for configuration JDK locations we deferred this until execution time. This had two main downsides. First, we had to implement all this build logic in tasks, which required a bunch of additional plumbing and complexity. Second, because some information wasn't known during configuration time, we had to nest any build logic that depended on this in awkward callbacks. We now defer to the JavaInstallationRegistry recently added in Gradle. This utility uses a much more efficient method for probing Java installations vs our jrunscript implementation. This, combined with some optimizations to avoid probing the current JVM as well as deferring some evaluation via Providers when probing installations for BWC builds we can maintain effectively the same configuration time performance while removing a bunch of complexity and runtime cost (snapshotting inputs for the GenerateGlobalBuildInfoTask was very expensive). The end result should be a much more responsive build execution in almost all scenarios. (cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)	2020-03-23 15:30:10 -07:00
Namgyu Kim	bc2289c258	Add nori_number token filter in analysis-nori (#53583 ) This change adds the `nori_number` token filter. It also adds a `discard_punctuation` option in nori_tokenizer that should be used in conjunction with the new filter.	2020-03-23 19:53:34 +01:00
Armin Braun	754d071c4e	Upgrade to AWS SDK 1.11.749 (#53962 ) (#53974 ) Upgrading AWS SDK to v1.11.749. Required building clients inside privileged contexts because some class loading that requires privileges now happens there and working around a new SDK bug in the S3 client builder. Closes #53191	2020-03-23 15:31:29 +01:00
Armin Braun	b51ea25a00	Use Azure Bulk Deletes in Azure Repository (#53919 ) (#53967 ) Now that we upgraded the Azure SDK to 8.6.2 in #53865 we can make use of bulk deletes.	2020-03-23 13:35:05 +01:00
Armin Braun	69a35158ce	Fix Azure Repository with HTTPs Endpoint (#53903 ) (#53963 ) Upgrading to 8.6.2 in #53865 broke running against HTTPs endpoints (and hence real azure) because the https url connection needs the newly added permission to work.	2020-03-23 12:16:33 +01:00
Armin Braun	41301d74b0	Upgrade to Azure SDK 8.6.2 (#53865 ) (#53886 ) This fixes some bugs around retrying and URL encoding and should enable a follow-up that finally adds bulk deletes on Azure.	2020-03-20 18:27:02 +01:00
Armin Braun	a70ebef366	Longer Timeout in S3 Retries Test (#53841 ) (#53847 ) The lower end of the timeout range of 100ms is prone to time out on CI before the mock REST server gets to sending a response that is not supposed to be a timeout. Using 1-3s here should make this safe at the cost of randomly making this test take a few seconds. Closes #53506	2020-03-20 12:23:40 +01:00
Jake Landis	db3420d757	[7.x] Optimize which Rest resources are used by the Rest tests… (#53766 ) This should help with Gradle's incremental compile such that projects only depend upon the resources they use. related #52114	2020-03-19 12:28:59 -05:00
Ryan Ernst	5c472fcb47	Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642 ) Re-applies the change from #53523 along with test fixes. closes #53626 closes #53624 closes #53622 closes #53625 Co-authored-by: Nik Everett <nik9000@gmail.com> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Jake Landis <jake.landis@elastic.co>	2020-03-17 10:28:51 -07:00
Alan Woodward	71b703edd1	Rename AtomicFieldData to LeafFieldData (#53554 ) This conforms with lucene's LeafReader naming convention, and matches other per-segment structures in elasticsearch.	2020-03-17 12:30:12 +00:00
Mark Vieira	2f0aca992b	Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 )" This reverts commit `b7dbadeea0`.	2020-03-15 18:10:40 -07:00
Jason Tedor	b7dbadeea0	Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 ) This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2 dependency to 2.13.1. Relates #53523	2020-03-14 13:28:06 -04:00
Jason Tedor	32dd852210	Update jackson-databind to 2.8.11.6 (#53522 ) This commit upgrades the jackson-databind dependency to 2.8.11.6. Additionally, we revert a previous change that put ingest-geoip on the version of jackson-databind from the version properties file. This is because upgrading ingest-geoip to a later version of jackson-databind also requires an upgrade to the geoip2 dependency which is currently blocked. Therefore, if we can get to a point where we otherwise upgrade our Jackson dependencies, we do not want ingest-geoip to automatically come along with it.	2020-03-12 20:15:13 -04:00
Alan Woodward	5c861cfe6e	Upgrade to final lucene 8.5.0 snapshot (#53293 ) Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use the latest snapshot to check that there are no last-minute bugs or regressions.	2020-03-10 09:32:59 +00:00
Nhat Nguyen	5476a49833	Revert "upgrade to lucene-snapshot-fa75139efea (#53150 ) (#53151 )" This reverts commit `058113aa42`.	2020-03-05 17:33:00 -05:00
Armin Braun	204c366a4e	Upgrade GCS SDK to 1.104.0 (#52839 ) (#53152 ) Upgrading the GCS SDK to the most recent version. Adjusting (i.e. improving) the REST mock accordingly. This should significantly boost performance by pulling in https://github.com/googleapis/java-core/issues/86 in some cases.	2020-03-05 11:18:18 +01:00
Ignacio Vera	058113aa42	upgrade to lucene-snapshot-fa75139efea (#53150 ) (#53151 )	2020-03-05 10:04:05 +01:00
Tanguy Leroux	52d4807f8d	Mute GoogleCloudStorageBlobStoreRepositoryTests on jdk8 (#53119 ) Tests in GoogleCloudStorageBlobStoreRepositoryTests are known to be flaky on JDK 8 (#51446, #52430 ) and we suspect a JDK bug (https://bugs.openjdk.java.net/browse/JDK-8180754) that triggers some assertion on the server side logic that emulates the Google Cloud Storage service. Sadly we were not able to reproduce the failures, even when using the same OS (Debian 9, Ubuntu 16.04) and JDK (Oracle Corporation 1.8.0_241 [Java HotSpot(TM) 64-Bit Server VM 25.241-b07]) of almost all the test failures on CI. While we spent some time fixing code (#51933, #52431) to circumvent the JDK bug, they are still flaky on JDK 8. This commit mutes these tests for JDK 8 only. Close #52906	2020-03-05 09:18:05 +01:00
Nhat Nguyen	e6755afeeb	Upgrade to Lucene 8.5.0-snapshot-c4475920b08 (#52950 ) (#52977 ) To give LUCENE-9228 more CI cycles	2020-02-29 09:29:16 -05:00
Lee Hinman	a47e404732	Mute GoogleCloudStorageBlobStoreRepositoryTests (#52926 ) These intermittently fail due to an assertion triggered by a JDK bug. Relates to #52906	2020-02-27 15:16:48 -07:00
Mark Vieira	f46b370e7a	Fix cacheability of repository-hdfs integ tests (#52858 )	2020-02-27 09:53:51 -08:00
Mark Vieira	bc9c3f0135	Ignore test seed in third party test system property inputs (#52849 )	2020-02-26 14:29:34 -08:00
Mark Vieira	f06d692706	[Backport] Consolidate docker availability logic (#52656 )	2020-02-21 15:24:05 -08:00
markharwood	96d603979b	Upgrade Lucene to 8.5.0-snapshot-b01d7cb (#52584 ) Upgrading 7x to same Lucene 8.5 version used in master	2020-02-21 10:25:03 +00:00
Armin Braun	5a7db0c520	Fix GCS Test testReadLargeBlobWithRetries (#52619 ) (#52624 ) The countdown didn't work well here because it only returns `true` once the countdown reaches `0` but can on subsequent executions return `false` again if a countdown at `0` is counted down again, leading to more than the expected number of simulated failures. Closes #52607	2020-02-21 10:34:53 +01:00
Armin Braun	1662cd45a4	Add Region and Signer Algorithm Overrides to S3 Repos (#52112 ) (#52562 ) Exposes S3 SDK signing region and algorithm override settings as requested in #51861. Closes #51861	2020-02-21 10:21:20 +01:00
Armin Braun	0a09e15959	Add Caching for RepositoryData in BlobStoreRepository (#52341 ) (#52566 ) Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode). `RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO. The benefits of this move are: * Much faster repository status API calls * listing all snapshot names becomes instant * Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data * Additional cloud cost savings * Better resiliency, saving another spot where an IO issue could break the snapshot * We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.	2020-02-21 10:20:07 +01:00
Armin Braun	4bb780bc37	Refactor Inflexible Snapshot Repository BwC (#52365 ) (#52557 ) * Refactor Inflexible Snapshot Repository BwC (#52365) Transport the version to use for a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere. Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.	2020-02-21 09:14:34 +01:00
Mark Vieira	4bce9984e6	Mute GoogleCloudStorageBlobContainerRetriesTests.testReadLargeBlobWithRetries Signed-off-by: Mark Vieira <portugee@gmail.com>	2020-02-20 15:13:34 -08:00
Armin Braun	aeb7b777e6	Add Blob Download Retries to GCS Repository (#52479 ) (#52521 ) * Add Blob Download Retries to GCS Repository Exactly as #46589 (and kept as close to it as possible code wise so we can dry things up in a follow-up potentially) but for GCS. Closes #52319	2020-02-19 18:29:13 +01:00
Tim Brooks	e752221fc6	Upgrade netty to 4.1.45.Final (#51689 ) Upgrade netty.	2020-02-18 09:11:29 -07:00
Ioannis Kakavas	d9ce0e6733	Update BouncyCastle to 1.64 (#52185 ) (#52464 ) This commit upgrades the bouncycastle dependency from 1.61 to 1.64.	2020-02-18 14:11:34 +02:00
Armin Braun	a9c7557ac4	Fix Failure to Drain Stream in GCS Repo Tests (#52431 ) (#52454 ) Same as #51933 but for the custom handler just used in this test. Closes #52430	2020-02-18 11:37:34 +01:00
Marios Trivyzas	dac720d7a1	Add a cluster setting to disallow expensive queries (#51385 ) (#52279 ) Add a new cluster setting `search.allow_expensive_queries` which by default is `true`. If set to `false`, certain queries that usually have slow performance cannot be executed and an error message is returned. - Queries that need to do linear scans to identify matches: - Script queries - Queries that have a high up-front cost: - Fuzzy queries - Regexp queries - Prefix queries (without index_prefixes enabled) - Wildcard queries - Range queries on text and keyword fields - Joining queries - HasParent queries - HasChild queries - ParentId queries - Nested queries - Queries on deprecated 6.x geo shapes (using PrefixTree implementation) - Queries that may have a high per-document cost: - Script score queries - Percolate queries Closes: #29050 (cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)	2020-02-12 22:56:14 +01:00
Armin Braun	6ea3f5ada1	Move EC2 Discovery Tests to Mock Rest API (#50605 ) (#52270 ) Move EC2 discovery tests to using the mock REST API introduced in https://github.com/elastic/elasticsearch/pull/50550 instead of mocking the AWS SDK classes manually. Move the trivial remaining AWS SDK mocks to the single test suite that was using them.	2020-02-12 18:35:50 +01:00
Ignacio Vera	80e3c97210	Upgrade to lucene-8.5.0-snapshot-d62f6307658 (#52039 ) (#52130 )	2020-02-10 10:13:22 +01:00
Ioannis Kakavas	343fb36c7f	Test modifications for FIPS 140 mode (#51832 ) (#52128 ) - Enable SunJGSS provider for Kerberos tests - Handle the fact that the decrypt method in KeyStoreWrapper might not throw immediately when the GCM cipher is from BouncyCastle FIPS and we end up with a DataInputStream that has reached its end. - Disable tests, jarHell, testingConventions for ingest attachment plugin. We don't support this plugin (and document this) in FIPS mode. - Don't attempt to install ingest-attachment in smoke-test-plugins	2020-02-10 10:57:03 +02:00
Jay Modi	3edadfefd0	RestHandlers declare handled routes (#52123 ) This commit changes how RestHandlers are registered with the RestController so that a RestHandler no longer needs to register itself with the RestController. Instead the RestHandler interface has new methods which when called provide information about the routes (method and path combinations) that are handled by the handler including any deprecated and/or replaced combinations. This change also makes the publication of RestHandlers safe since they no longer publish a reference to themselves within their constructors. Closes #51622 Co-authored-by: Jason Tedor <jason@tedor.me> Backport of #51950	2020-02-09 22:48:32 -07:00
Ioannis Kakavas	8c0b49cd32	Adjust jarHell and 3rd party audit exclusions (#51733 ) (#51766 ) Now that the FIPS 140 security provider is simply a test dependency we don't need the thirdPartyAudit exceptions, but plugin-cli and transport-netty4 do need jarHell disabled as they use the non fips BouncyCastle security provider as a test dependency too.	2020-02-10 07:38:59 +02:00
Julie Tibshirani	337d73a7c6	Rename MapperService#fullName to fieldType. The new name more accurately describes what the method returns.	2020-02-07 10:35:53 -08:00
Armin Braun	91e938ead8	Add Trace Logging of REST Requests (#51684 ) (#52015 ) Being able to trace log all REST requests to a node would make debugging a number of issues a lot easier.	2020-02-07 09:03:20 +01:00
Maria Ralli	8d3e73b3a0	Add host address to BindTransportException message (#51269 ) When bind fails, show the host address in addition to the port. This helps debugging cases with wrong "network.host" values. Closes #48001	2020-02-04 17:13:19 +00:00
Mayya Sharipova	42b885f050	Upgrade to lucene-8.5.0-snapshot-3333ce7da6d (#51749 ) Backport for #51327	2020-01-31 11:20:15 -05:00
Ioannis Kakavas	1dc965f03f	Mute ec2 test in FIPS 140 mode (#51686 ) (#51726 ) as it needs an extra permission, until we can figure out how to grant the permission in FIPS 140 mode too. See: https://github.com/elastic/elasticsearch/issues/51685	2020-01-31 09:35:20 +02:00
Armin Braun	74e3694234	Optimize GCS Repo Uploads (#51596 ) (#51618 ) For small uploads (that can still be up to 5MB!) we were needlessly reading the `InputStream` into a BAOS which entailed allocating the `byte[]` for the stream contents twice (because `toByteArray` on the BAOS copies). Also, for resumable uploads we were needlessly wrapping the output channel and running each individual write in its own privileged context when we could just wrap the whole upload in a single privileged context. Relates #51593	2020-01-29 16:07:30 +01:00
Armin Braun	7914c1a734	Optimize GCS Mock (#51593 ) (#51594 ) This test was still very GC heavy in Java 8 runs in particular which seems to slow down request processing to the point of timeouts in some runs. This PR completely removes the large number of O(MB) `byte[]` allocations that were happening in the mock http handler which cuts the allocation rate by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles` run. Closes #51446 Closes #50754	2020-01-29 11:06:05 +01:00
Ioannis Kakavas	ee202a642f	Enable tests in FIPS 140 in JDK 11 (#49485 ) This change changes the way to run our test suites in JVMs configured in FIPS 140 approved mode. It does so by: - Configuring any given runtime Java in FIPS mode with the bundled policy and security properties files, setting the system properties java.security.properties and java.security.policy with the == operator that overrides the default JVM properties and policy. - When runtime java is 11 and higher, using BouncyCastle FIPS Cryptographic provider and BCJSSE in FIPS mode. These are used as testRuntime dependencies for unit tests and internal clusters, and copied (relevant jars) explicitly to the lib directory for testclusters used in REST tests - When runtime java is 8, using BouncyCastle FIPS Cryptographic provider and SunJSSE in FIPS mode. Running the tests in FIPS 140 approved mode doesn't require an additional configuration either in CI workers or locally and is controlled by specifying -Dtests.fips.enabled=true	2020-01-27 11:14:52 +02:00
Armin Braun	3e3673b518	Fix ByteBuf Leak in Nio HTTP Tests (#51444 ) (#51457 ) It is the job of the http server transport to release the request in the handler but the mock fails to do so since we never override `incomingRequest`.	2020-01-25 16:19:49 +01:00
Armin Braun	c29b235a5a	Stop Copying Bulk HTTP Requests in NIO Networking (#49819 ) (#51393 ) Same as #44564 but for NIO.	2020-01-24 11:23:16 +01:00
Mark Vieira	f86de2a9cb	Always test against default distribution when in a FIPS JVM (#51273 ) (#51333 )	2020-01-23 14:54:57 -08:00
Mark Vieira	c08c282c0e	Revert "Always test against default distribution when in a FIPS JVM (#51273 )" This reverts commit `0169498711`. This reverts commit `c5a032b594`.	2020-01-22 12:15:57 -08:00
Mark Vieira	c5a032b594	Always test against default distribution when in a FIPS JVM (#51273 ) (cherry picked from commit e34d7fdaf7b511627c64a9e16805fd82f980b8c6)	2020-01-22 11:30:25 -08:00
Armin Braun	c5f1a90159	Add CoolDown Period to S3 Repository (#51074 ) (#51213 ) Add cool down period after snapshot finalization and delete to prevent eventually consistent AWS S3 from corrupting shard level metadata as long as the repository is using the old format metadata on the shard level.	2020-01-20 12:18:16 +01:00
Nik Everett	f6c89b4599	Move test of custom sig heuristic to plugin (#50891 ) (#51067 ) This moves the testing of custom significance heuristic plugins from an `ESIntegTestCase` to an example plugin. This is much more "real" and can be used as an example for anyone that needs to actually build such a plugin. The old test had testing concerns and the example all jumbled together.	2020-01-16 14:49:12 -05:00
Armin Braun	4a7e09f624	Enforce Logging of Errors in GCS Rest RetriesTests (#50761 ) (#50783 ) It's impossible to tell why #50754 fails without this change. We're failing to close the `exchange` somewhere and there is no write timeout in the GCS SDK (something to look into separately), only a read timeout on the socket, so if we're failing on an assertion without reading the full request body (at least into the read-buffer) we're locking up waiting forever on `write0`. This change ensures the `exchange` is closed in the tests where we could lock up on a write and logs the failure so we can find out what broke #50754.	2020-01-09 10:46:07 +01:00
Adrien Grand	4f2299c714	Upgrade to Lucene 8.4.0. (#50518 ) (#50750 )	2020-01-08 18:53:59 +01:00
Armin Braun	a725896c92	Fix and Reenable SnapshotTool Minio Tests (#50736 ) (#50745 ) This solves half of the problem in #46813 by moving the S3 tests to using the shared minio fixture so we at least have some non-3rd-party, constantly running coverage on these tests.	2020-01-08 16:33:36 +01:00
Armin Braun	8819fa4ebe	Make EC2 Discovery Cache Empty Seed Hosts List (#50607 ) (#50626 ) Follow up to #50550. Cache empty nodes lists (`fetchDynamicNodes` will return an empty list in case of failure) now that the plugin properly retries requests to AWS EC2 APIs.	2020-01-03 21:32:36 +01:00
Armin Braun	8092a4991e	Make EC2 Discovery Plugin Retry Requests (#50550 ) (#50558 ) Use the default retry condition instead of never retrying in the discovery plugin causing hot retries upstream and add a test that verifies retrying works. Closes #50462	2020-01-02 17:39:59 +01:00
Alexander Reelsen	541dc262bb	Remove accidentally added license files (#50370 ) As license infos and sha files belong to the licenses/ folder, these files seem to have been added accidentally some time ago.	2019-12-20 13:53:55 +01:00
Stuart Tettemer	689df1f28f	Scripting: ScriptFactory not required by compile (#50344 ) (#50392 ) Avoid backwards incompatible changes for 8.x and 7.6 by removing type restriction on compile and Factory. Factories may optionally implement ScriptFactory. If so, then they can indicate determinism and thus cacheability. Backport Relates: #49466	2019-12-19 12:50:25 -07:00
Tanguy Leroux	903305284d	Remove snapshots left by previous tests failures (#50380 ) When a third party test failed, it potentially left some snapshots in the repository. In case of tests running against an external service like Azure, the remaining snapshots can fail future test executions as they are not supposed to exist. Similarly to what has been done for S3 and GCS, this commit cleans up remaining snapshots before the test execution. Closes #50304	2019-12-19 17:51:51 +01:00
Armin Braun	ce294e1564	Better Logging S3 Bulk Delete Failures (#50203 ) (#50262 ) Unfortunately bulk delete exceptions don't show the individual delete errors when a bulk delete fails when you log them outright so I added this work-around to get the individual details to get useful logging.	2019-12-17 09:42:39 +01:00
Armin Braun	761d6e8e4b	Remove BlobContainer Tests against Mocks (#50194 ) (#50220 ) * Remove BlobContainer Tests against Mocks Removing all these weird mocks as asked for by #30424. All these tests are now part of real repository ITs and otherwise left unchanged if they had independent tests that didn't call the `createBlobStore` method previously. The HDFS tests also get added coverage as a side-effect because they did not have an implementation of the abstract repository ITs. Closes #30424	2019-12-16 11:37:09 +01:00
Ignacio Vera	b5ec227de8	upgrade to lucene 8.4.0-snapshot-08b8d116f8f (#50129 ) (#50132 )	2019-12-12 13:13:37 +01:00
Armin Braun	6eee41e253	Remove Unused Single Delete in BlobStoreRepository (#50024 ) (#50123 ) * Remove Unused Single Delete in BlobStoreRepository There are no more production uses of the non-bulk delete or the delete that throws on missing so this commit removes both these methods. Only the bulk delete logic remains. Where the bulk delete was derived from single deletes, the single delete code was inlined into the bulk delete method. Where single delete was used in tests it was replaced by bulk deleting.	2019-12-12 11:17:46 +01:00
Armin Braun	d19c8db4e4	Fix GCS Mock Batch Delete Behavior (#50034 ) (#50084 ) Batch deletes get a response for every delete request, not just those that actually hit an existing blob. The fact that we only responded for existing blobs leads to a degenerate response that throws a parse exception if a batch delete only contains non-existent blobs.	2019-12-11 17:40:25 +01:00
Adrien Grand	87e72156ce	Upgrade to lucene 8.4.0-snapshot-662c455. (#50016 ) (#50039 ) Lucene 8.4 is about to be released so we should check it doesn't cause problems with Elasticsearch.	2019-12-10 18:04:58 +01:00
Jason Tedor	bfb2dc1353	Enable dependent settings values to be validated (#49942 ) Today settings can declare dependencies on another setting. This declaration is implemented so that if the declared setting is not set when the declaring setting is, settings validation fails. Yet, in some cases we want not only that the setting is set, but that it also has a specific value. For example, with the monitoring exporter settings, if xpack.monitoring.exporters.my_exporter.host is set, we not only want that xpack.monitoring.exporters.my_exporter.type is set, but that it is also set to local. This commit extends the settings infrastructure so that this declaration is possible. The use of this in the monitoring exporter settings will be implemented in a follow-up.	2019-12-09 12:45:50 -05:00
Stuart Tettemer	17cda5b2c0	Scripting: Groundwork for caching script results (#49895 ) (#49944 ) In order to cache script results in the query shard cache, we need to check if scripts are deterministic. This change adds a default method to the script factories, `isResultDeterministic() -> false` which is used by the `QueryShardContext`. Script results were never cached and that does not change here. Future changes will implement this method based on whether the results of the scripts are deterministic or not and therefore cacheable. Refs: #49466 Backport	2019-12-06 15:08:05 -07:00
Jake Landis	1c5a139968	Update jackson-databind to 2.8.11.4 (#49347 ) (#49937 )	2019-12-06 13:39:33 -06:00
Alexander Reelsen	d299bf5760	Add tests for ingesting CBOR data attachments (#49715 ) Our docs specifically mention that CBOR is supported when ingesting attachments. However this is not tested anywhere. This adds a test, that uses specifically CBOR format in its IndexRequest and another one that behaves like CBOR in the ingest attachment unit tests.	2019-12-06 14:33:39 +01:00
Stuart Tettemer	426c7a5e8f	Scripting: add available languages & contexts API (#49652 ) (#49815 ) Adds `GET /_script_language` to support Kibana dynamic scripting language selection. Response contains whether `inline` and/or `stored` scripts are enabled as determined by the `script.allowed_types` settings. For each scripting language registered, such as `painless`, `expression`, `mustache` or custom, available contexts for the language are included as determined by the `script.allowed_contexts` setting. Response format: ``` { "types_allowed": [ "inline", "stored" ], "language_contexts": [ { "language": "expression", "contexts": [ "aggregation_selector", "aggs" ... ] }, { "language": "painless", "contexts": [ "aggregation_selector", "aggs", "aggs_combine", ... ] } ... ] } ``` Fixes: #49463 Backport	2019-12-04 16:18:22 -07:00
Armin Braun	996cddd98b	Stop Copying Every Http Request in Message Handler (#44564 ) (#49809 ) * Copying the request is not necessary here. We can simply release it once the response has been generated and save a lot of `Unpooled` allocations that way * Relates #32228 * I think the issue that prevented that PR from being merged was solved by #39634 that moved the bulk index marker search to ByteBuf bulk access so the composite buffer shouldn't require many additional bounds checks (I'd argue the bounds checks we add, we save when copying the composite buffer) * I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)	2019-12-04 08:41:42 +01:00
Armin Braun	813b49adb4	Make BlobStoreRepository Aware of ClusterState (#49639 ) (#49711 ) * Make BlobStoreRepository Aware of ClusterState (#49639) This is a preliminary to #49060. It does not introduce any substantial behavior change to how the blob store repository operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation (create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple. This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the repository operates in #49060	2019-11-29 14:57:47 +01:00
Mayya Sharipova	2dafecc398	Upgrade lucene to 8.4.0-snapshot-e648d601efb (#49641 )	2019-11-28 11:59:58 -05:00
Jim Ferenczi	d6445fae4b	Add a cluster setting to disallow loading fielddata on _id field (#49166 ) This change adds a dynamic cluster setting named `indices.id_field_data.enabled`. When set to `false` any attempt to load the fielddata for the `_id` field will fail with an exception. The default value in this change is set to `false` in order to prevent fielddata usage on this field for future versions but it will be set to `true` when backporting to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472 is implemented. Closes #43599	2019-11-28 09:35:28 +01:00
Armin Braun	3862400270	Remove Redundant EsBlobStoreTestCase (#49603 ) (#49605 ) All the implementations of `EsBlobStoreTestCase` use the exact same bootstrap code that is also used by their implementation of `EsBlobStoreContainerTestCase`. This means all tests might as well live under `EsBlobStoreContainerTestCase` saving a lot of code duplication. Also, there was no HDFS implementation for `EsBlobStoreTestCase` which is now automatically resolved by moving the tests over since there is a HDFS implementation for the container tests.	2019-11-26 20:57:19 +01:00
Alan Woodward	fe2c65185e	Annotated text type should extend TextFieldType (#49555 ) The annotated text mapper has a field type that currently extends StringFieldType, which means that all the positional-related query factory methods need to be copied over from TextFieldType. In addition, MappedFieldType.intervals() hasn't been overridden, so you can't use intervals queries with annotated text - a major drawback, since one of the purposes of annotated text is to be able to run positional queries against annotations. This commit changes the annotated text field type to extend TextFieldType instead, adding tests to ensure that position queries work correctly. Closes #49289	2019-11-26 16:52:21 +00:00
Armin Braun	495b543e63	Improve Stability of GCS Mock API (#49592 ) (#49597 ) Same as #49518 pretty much but for GCS. Fixing a few more spots where input stream can get closed without being fully drained and adding assertions to make sure it's always drained. Moved the no-close stream wrapper to production code utilities since there's a number of spots in production code where it's also useful (will reuse it there in a follow-up).	2019-11-26 16:53:51 +01:00
Armin Braun	231d079bf8	Fix Azure Mock Issues (#49377 ) (#49381 ) Fixing a few small issues found in this code: 1. We weren't reading the request headers but the response headers when checking for blob existence in the mocked single upload path 2. Error code can never be `null` removed the dead code that resulted 3. In the logging wrapper we weren't checking for `Throwable` so any failing assertions in the http mock would not show up since they run on a thread managed by the mock http server	2019-11-21 19:57:50 +01:00
Tanguy Leroux	6bad28a835	Mute AzureBlobStoreRepositoryTests (#49364 ) Relates #48978	2019-11-20 11:16:16 +01:00
Tanguy Leroux	f753fa2265	HttpHandlers should return correct list of objects (#49283 ) This commit fixes the server side logic of "List Objects" operations of Azure and S3 fixtures. Until today, the fixtures were returning a "flat" view of stored objects and were not correctly handling the delimiter parameter. This causes some object listings to be wrongly interpreted by the snapshot deletion logic in Elasticsearch which relies on the ability to list child containers of BlobContainer (#42653) to correctly delete stale indices. As a consequence, the blobs were not correctly deleted from the emulated storage service and stayed in heap until they got garbage collected, causing CI failures like #48978. This commit fixes the server side logic of the Azure and S3 fixtures when listing objects so that it now returns correct common blob prefixes as expected by the snapshot deletion process. It also adds an after-test check to ensure that tests leave the repository empty (besides the root index files). Closes #48978	2019-11-20 09:26:42 +01:00
Tanguy Leroux	ca4f55f2e4	Add docker-compose fixtures for S3 integration tests (#49107 ) (#49229 ) Similarly to what has been done for Azure (#48636) and GCS (#48762), this commit removes the existing Ant fixture that emulates a S3 storage service in favor of multiple docker-compose based fixtures. The goals here are multiple: be able to reuse a s3-fixture outside of the repository-s3 plugin; allow parallel execution of integration tests; replace the existing AmazonS3Fixture, which has evolved into a weird beast, with dedicated, more maintainable fixtures. The server side logic that emulates S3 mostly comes from the latest HttpHandler made for S3 blob store repository tests, with additional features extracted from the (now removed) AmazonS3Fixture: authentication checks, session token checks and improved response errors. Chunked upload request support for S3 objects has been added too. The server side logic of all tests now resides in a single S3HttpHandler class. Whereas AmazonS3Fixture contained logic for basic tests, session token tests, EC2 tests or ECS tests, the S3 fixtures are now dedicated to each kind of test. Fixtures inherit from each other, making things easier to maintain.	2019-11-18 05:56:59 -05:00
Rory Hunter	c46a0e8708	Apply 2-space indent to all gradle scripts (#49071 ) Backport of #48849. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-14 11:01:23 +00:00
Tanguy Leroux	20fc1dbe18	Move MinIO fixture in its own project (#49036 ) This commit moves the MinIO docker-compose fixture from the :plugins:repository-s3 to its own :test:minio-fixture Gradle project.	2019-11-13 10:03:59 -05:00
Tanguy Leroux	8a14ea5567	Add docker-composed based test fixture for GCS (#48902 ) Similarly to what has been done for Azure in #48636, this commit adds a new :test:fixtures:gcs-fixture project which provides two docker-compose based fixtures that emulate a Google Cloud Storage service. Some code has been extracted from existing tests and placed into this new project so that it can be easily reused in other projects.	2019-11-07 13:27:22 -05:00
Mark Vieira	6ab4645f4e	[7.x] Introduce type-safe and consistent pattern for handling build globals (#48818 ) This commit introduces a consistent and type-safe manner for handling global build parameters throughout our build logic. Primarily this replaces the existing usages of extra properties with static accessors. It also introduces an explicit API for initialization and mutation of any such parameters, as well as better error handling for uninitialized or eager access of parameter values. Closes #42042	2019-11-01 11:33:11 -07:00
Andrey Ershov	088988bb37	GCS snapshot cleanup tool backport to 7.x (#48750 ) This is the backport of #45076 with dependent changes.	2019-10-31 18:21:36 +03:00
Tanguy Leroux	989467ca1e	Add docker-compose based test fixture for Azure (#48736 ) This commit adds a new :test:fixtures:azure-fixture project which provides a docker-compose based container that runs a AzureHttpFixture Java class that emulates an Azure Storage service. The logic to emulate the service is extracted from existing tests and placed in AzureHttpHandler into the new project so that it can be easily reused. The :plugins:repository-azure project is an example of such utilization. The AzureHttpFixture fixture is just a wrapper around AzureHttpHandler and is now executed within the docker container. The :plugins:repository-azure:qa:microsoft-azure project uses the new test fixture and the existing AzureStorageFixture has been removed.	2019-10-31 10:43:43 +01:00
Tanguy Leroux	24f6985235	Reduce allocations when draining HTTP requests bodies in repository tests (#48541 ) In repository integration tests, we drain the HTTP request body before returning a response. Before this change this operation was done using Streams.readFully() which uses an 8kb buffer to read the input stream; it now uses a 1kb buffer for the same operation. This should reduce the allocations made during the tests and speed them up a bit on CI. Co-authored-by: Armin Braun <me@obrown.io>	2019-10-29 09:15:06 +01:00
Tim Brooks	45e42f4e18	Upgrade to Netty 4.1.43 (#48484 ) With this update we can remove the mitigation in our custom allocator which forces heap buffer allocations.	2019-10-25 10:17:25 -06:00
Tanguy Leroux	06d2cc5cef	Add missing azure error code (#48520 ) In #47176 we changed the internal HTTP server that emulates the Azure Storage service so that it includes a response body for injected errors. This fixed most of the issues reported in #47120 but sadly I failed to map one error to its Azure equivalent, and it triggered some CI failures today. Closes #47120	2019-10-25 16:50:51 +02:00
Tim Brooks	c0b545f325	Make BytesReference an interface (#48486 ) BytesReference is currently an abstract class which is extended by various implementations. This makes it very difficult to use the delegation pattern. The implication of this is that our releasable BytesReference is a PagedBytesReference type and cannot be used as a generic releasable bytes reference that delegates to any reference type. This commit makes BytesReference an interface and introduces an AbstractBytesReference for common functionality.	2019-10-24 15:39:30 -06:00
Tanguy Leroux	e1dd0e753d	Differentiate service account tokens in GCS tests (#48382 ) This commit changes the test so that each node use a specific service account and private key. It also changes how unique request ids are generated for refresh token request using the token itself, so that error count will be specific per node (each node should execute a single refresh token request as tokens are valid for 1 hour).	2019-10-23 16:57:35 +02:00
Tanguy Leroux	4790ee4c32	Reenable azure repository tests and remove some randomization in http servers (#48283 ) Relates #47948 Relates #47380	2019-10-23 09:06:50 +02:00
Ignacio Vera	b1224fca8c	upgrade to Lucene-8.3.0-snapshot-25968e3b75e (#48227 )	2019-10-21 08:21:09 +02:00
Armin Braun	5caa101345	Fix Bug in Azure Repo Exception Handling (#47968 ) (#48030 ) We were incorrectly handling `IOExceptions` thrown by the `InputStream` side of the upload operation, resulting in a `ClassCastException` as we expected to never get `IOException` from the Azure SDK code but we do in practice. This PR also sets an assertion on `markSupported` for the streams used by the SDK as adding the test for this scenario revealed that the SDK client would retry uploads for non-mark-supporting streams on `IOException` in the `InputStream`.	2019-10-15 12:10:19 +02:00
Tim Brooks	8814bf07f1	Upgrade to Netty 4.1.42 (#48015 ) Upgrades the netty version.	2019-10-14 13:54:02 -06:00
Nick Knize	7f01b0a670	Mute AzureBlobStoreRepositoryTests.testIndicesDeletedFromRepository (#47949 )	2019-10-11 14:07:24 -05:00
Jim Ferenczi	bd6e2592a7	Remove the SearchContext from the highlighter context (#47733 ) Today built-in highlighters and plugins have access to the SearchContext through the highlighter context. However, most of the information exposed in the SearchContext is not needed and a QueryShardContext would be enough to perform highlighting. This change replaces the SearchContext with the information that is absolutely required by highlighters: a QueryShardContext and the SearchContextHighlight. This change reduces the exposure of the complex SearchContext and removes the need to clone it in the percolator sub phase. Relates #47198 Relates #46523	2019-10-10 10:34:10 +02:00
Armin Braun	302e09decf	Simplify some Common ActionRunnable Uses (#47799 ) (#47828 ) Especially in the snapshot code there's a lot of logic chaining `ActionRunnables` in tricky ways now and the code is getting hard to follow. This change introduces two convenience methods that make it clear that a wrapped listener is invoked with certainty in some trickier spots and shortens the code a bit.	2019-10-09 23:29:50 +02:00
Ryan Ernst	f32692208e	Add explanations to script score queries (#46693 ) (#47548 ) While function scores using scripts do allow explanations, they are only creatable with an expert plugin. This commit improves the situation for the newer script score query by adding the ability to set the explanation from the script itself. To set the explanation, a user would check for `explanation != null` to indicate an explanation is needed, and then call `explanation.set("some description")`.	2019-10-03 21:05:05 -07:00
Alpar Torok	0a14bb174f	Remove eclipse conditionals (#44075 ) * Remove eclipse conditionals We used to have some meta projects with a `-test` prefix because historically eclipse could not distinguish between test and main source-sets and could only use a single classpath. This is no longer the case for the past few Eclipse versions. This PR adds the necessary configuration to correctly categorize source folders and libraries. With this change eclipse can import projects, and the visibility rules are correct, e.g. autocomplete doesn't offer classes from test code or `testCompile` dependencies when editing classes in `main`. Unfortunately the cyclic dependency detection in Eclipse doesn't seem to take the difference between test and non test source sets into account, but since we are checking this in Gradle anyhow, it's safe to set to `warning` in the settings. Unfortunately there is no setting to ignore it. This might cause problems when building since Eclipse will probably not know the right order to build things in so more work might be necessary.	2019-10-03 11:55:00 +03:00
Tanguy Leroux	f5c5411fe8	Differentiate base paths in repository integration tests (#47284 ) (#47300 ) This commit changes the repositories' base paths used in Azure/S3/GCS integration tests so that they don't conflict with each other when tests run in parallel on real storage services. Closes #47202	2019-10-01 08:39:55 +02:00
James Rodewig	e01465eb88	[DOCS] Correct typo in ICU Analysis plugin description (#47175 ) (#47219 )	2019-09-27 13:04:14 -04:00
Henning Andersen	a1e2e208ce	Mute Snapshot/Restore with repository-azure (#47204 ) Relates #47201	2019-09-27 12:13:01 +02:00
Tanguy Leroux	42ae76ab7c	Injected response errors in Azure repository tests should have a body (#47176 ) The Azure SDK client expects server errors to have a body, something that looks like: <?xml version="1.0" encoding="utf-8"?> <Error> <Code>string-value</Code> <Message>string-value</Message> </Error> I forgot to add such errors in Azure tests and that triggers some NPE in the client like the one reported in #47120. Closes #47120	2019-09-27 09:43:29 +02:00
Tanguy Leroux	b1bf05bb89	Add blob container retries tests for Azure SDK client (#47032 ) Similarly to what has been done for S3 and GCS, this commit adds unit tests that verify the retry logic of the Azure SDK client implementation when the remote service returns errors. It only tests the retry logic in case of errors and not in case of timeouts because Azure client timeout options are not exposed as settings.	2019-09-25 09:19:48 +02:00
Armin Braun	00f2e7f627	Update AWS SDK for repository-s3 plugin to support IAM Roles for Service Accounts (#46969 ) (#47004 ) * Update AWS SDK for repository-s3 and discovery-ec2 plugins	2019-09-24 17:15:11 +02:00
Tanguy Leroux	6986d7f968	Add blob container retries tests for Google Cloud Storage (#46968 ) Similarly to what has been done for S3 in #45383, this commit adds unit tests that verify the behavior of the SDK client and blob container implementation for Google Storage when the remote service returns errors. The main purpose was to add an extra test to the specific retry logic for 410-Gone errors added in #45963. Relates #45963	2019-09-24 08:58:24 +02:00
Alpar Torok	5fd7505efc	Testfixtures allow a single service only (#46780 ) This PR adds some restrictions around testfixtures to make sure the same service ( as defined in docker-compose.yml ) is not shared between multiple projects. Sharing would break running with --parallel. Projects can still share fixtures as long as each has its own service within. This is still useful to share some of the setup and configuration code of the fixture. Projects now also have to specify a service name when calling useCluster to refer to a specific service. If this is not the case all services will be claimed and the fixture can't be shared. For this reason fixtures have to explicitly specify if they are using themselves ( fixture and tests in the same project ).	2019-09-23 14:13:49 +03:00
Tanguy Leroux	add7148f3b	GCS deleteBlobsIgnoringIfNotExists should catch StorageException (#46832 ) GoogleCloudStorageBlobStore.deleteBlobsIgnoringIfNotExists() does not correctly catch StorageException thrown by batch.submit(). In the case a snapshot is deleted through BlobStoreRepository.deleteSnapshot() a storage exception is not caught (only IOException are) so the deletion is interrupted and indices cannot be cleaned up. The storage exception bubbles up to SnapshotService.deleteSnapshotFromRepository() but the listener that removes the deletion from the cluster state is not executed, leaving the deletion in the cluster state. This bug has been reported in #46772 where batch.submit() threw an exception in the test testIndicesDeletedFromRepository and following tests failed because a snapshot deletion was running. Relates #46772	2019-09-20 10:02:23 +02:00
Tanguy Leroux	3ae51f25dd	Move testSnapshotWithLargeSegmentFiles to ESMockAPIBasedRepositoryIntegTestCase (#46802 ) This commit moves the common test testSnapshotWithLargeSegmentFiles to the ESMockAPIBasedRepositoryIntegTestCase base class.	2019-09-18 15:41:30 +02:00
Tanguy Leroux	799f7def9f	Add block support to AzureBlobStoreRepositoryTests (#46664 ) This commit adds support for Put Block API to the internal HTTP server used in Azure repository integration tests. This allows to test the behavior of the Azure SDK client when the Azure Storage service returns errors when uploading Blob in multiple blocks or when downloading a blob using ranged downloads.	2019-09-18 09:43:08 +02:00
Tanguy Leroux	fd42358a6d	Add support for Multipart upload to S3 repository integration tests (#46704 ) This commit adds support for Multipart upload to the internal HTTP server used in S3 repository integration tests.	2019-09-18 09:40:25 +02:00
Tanguy Leroux	4db37801d0	Add resumable uploads support to GCS repository integration tests (#46562 ) This commit adds support for resumable uploads to the internal HTTP server used in GoogleCloudStorageBlobStoreRepositoryTests. This way we can also test the behavior of Google's client when the service returns server errors in response to resumable upload requests. The BlobStore implementation for GCS has the choice between 2 methods to upload a blob: resumable and multipart. In the current implementation, the client executes a resumable upload if the blob size is larger than LARGE_BLOB_THRESHOLD_BYTE_SIZE, otherwise it executes a multipart upload. This commit makes this logic overridable in tests, allowing to randomize the decision of using one method or the other. The commit adds support for single request resumable uploads and chunked resumable uploads (the blob is uploaded into multiple 2Mb chunks; each chunk being a resumable upload). For this last case, this PR also adds a test testSnapshotWithLargeSegmentFiles which makes it more probable that a chunked resumable upload is executed.	2019-09-18 09:33:05 +02:00
Armin Braun	371c355bca	Retry GCS Resumable Upload on Error 410 (#45963 ) (#46783 ) A resumable upload session can fail with a 410 error and should be retried in that case. I added retrying twice using resetting of the given `InputStream` as the retry mechanism since the same approach is used by the AWS S3 SDK already as well and relied upon by the S3 repository implementation. Related GCS documentation: https://cloud.google.com/storage/docs/json_api/v1/status-codes#410_Gone	2019-09-17 19:06:43 +02:00
Armin Braun	b00de8edf3	Ensure SAS Tokens in Test Use Minimal Permissions (#46112 ) (#46628 ) There were some issues with the Azure implementation requiring permissions to list all containers due to a container exists check. This was caught in CI this time, but going forward we should ensure that CI is executed using a token that does not allow listing containers. Relates #43288	2019-09-17 15:40:11 +02:00
David Turner	65dc888623	Resume partial download from S3 on connection drop (#46589 ) Today if the connection to S3 times out or drops after starting to download an object then the SDK does not attempt to recover or resume the download, causing the restore of the whole shard to fail and retry. This commit allows Elasticsearch to detect such a mid-stream failure and to resume the download from where it failed.	2019-09-17 13:11:36 +01:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
Mark Vieira	ccf656a9d0	Repository plugin test cacheability fixes (#46572 )	2019-09-11 08:24:55 -07:00
Tanguy Leroux	88bed09119	Mutualize code in cloud-based repository integration tests (#46483 ) This commit factors out some common code between the cloud-based repository integration tests that were recently improved. Relates #46376	2019-09-09 16:02:14 +02:00
Tanguy Leroux	023cf44025	Inject random server errors in AzureBlobStoreRepositoryTests (#46371 ) This commit modifies the HTTP server used in AzureBlobStoreRepositoryTests so that it randomly returns server errors for any type of request executed by the Azure client.	2019-09-09 10:00:09 +02:00
Tanguy Leroux	8e3dc68454	Inject random server errors in GoogleCloudStorageBlobStoreRepositoryTests (#46376 ) This commit modifies the HTTP server used in GoogleCloudStorageBlobStoreRepositoryTests so that it randomly returns server errors. The test does not inject server errors for the following types of request: batch request, resumable upload request.	2019-09-09 09:59:59 +02:00
David Turner	cc092b1be1	Add support for OneZoneInfrequentAccess storage (#46436 ) The `repository-s3` plugin has supported a storage class of `onezone_ia` since the SDK upgrade in #30723, but we do not test or document this fact. This commit adds this storage class to the docs and adds a test to ensure that the documented storage classes are all accepted by S3 too. Fixes #30474	2019-09-09 07:54:44 +01:00
Tanguy Leroux	2290865559	Fix usage of randomIntBetween() in testWriteBlobWithRetries (#46380 ) This commit fixes the usage of randomIntBetween() in the test testWriteBlobWithRetries, when the test generates a random array of a single byte.	2019-09-06 09:10:38 +02:00
Tanguy Leroux	28974b5723	Replace mocked client in GCSBlobStoreRepositoryTests by HTTP server (#46255 ) This commit removes the usage of MockGoogleCloudStoragePlugin in GoogleCloudStorageBlobStoreRepositoryTests and replaces it by a HttpServer that emulates the Storage service. This allows the repository tests to use the real Google's client under the hood in tests and will allow us to test the behavior of the snapshot/restore feature for GCS repositories by simulating random server-side internal errors. The HTTP server used to emulate the Storage service is intentionally simple and minimal to keep things understandable and maintainable. Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the GoogleCloudStorageFixture.	2019-09-05 10:37:37 +02:00
Tanguy Leroux	6d1a82134c	Add repository integration tests for Azure (#46263 ) Similarly to what had been done for S3 (#46081) and GCS (#46255) this commit adds repository integration tests for Azure, based on an internal HTTP server instead of mocks.	2019-09-05 09:26:42 +02:00
Tanguy Leroux	bd7a04cd55	Disable request throttling in S3BlobStoreRepositoryTests (#46226 ) When some high values are randomly picked up - for example the number of indices to snapshot or the number of snapshots to create - the tests in S3BlobStoreRepositoryTests can generate a high number of requests to the internal S3 server. In order to test the retry logic of the S3 client, the internal server is designed to randomly generate random server errors. When many requests are made, it is possible that the S3 client reaches its maximum number of successive retries capacity. Then the S3 client will stop retrying requests until enough retry attempts succeed, but it means that any request could fail before reaching the max retries count and make the test fail too. Closes #46217 Closes #46218 Closes #46219	2019-09-02 16:44:43 +02:00
Henning Andersen	d68e05aade	Mute 2 tests in S3BlobStoreRepositoryTests (#46221 ) Muted testSnapshotAndRestore and testMultipleSnapshotAndRollback Relates #46218 and #46219	2019-09-02 10:38:03 +02:00
Tanguy Leroux	0c1b263e8d	Inject random errors in S3BlobStoreRepositoryTests (#46125 ) This commit modifies the HTTP server used in S3BlobStoreRepositoryTests so that it randomly returns server errors for any type of request executed by the SDK client. It is now possible to verify that the repository tests are successfully completed even if one or more errors were returned by the S3 service in response to a blob upload, a blob deletion or an object listing request etc. Because injecting errors forces the SDK client to retry requests, the test limits the maximum errors to send in response for each request at 3 retries.	2019-08-30 11:58:09 +02:00
Tanguy Leroux	b526309fbd	Replace MockAmazonS3 usage in S3BlobStoreRepositoryTests by a HTTP server (#46081 ) This commit removes the usage of MockAmazonS3 in S3BlobStoreRepositoryTests and replaces it by a HttpServer that emulates the S3 service. This allows the repository tests to use the real Amazon's S3 client under the hood in tests and will allow to test the behavior of the snapshot/restore feature for S3 repositories by simulating random server-side internal errors. The HTTP server used to emulate the S3 service is intentionally simple and minimal to keep things understandable and maintainable. Testing full client options on the server side (like authentication, chunked encoding etc) remains the responsibility of the AmazonS3Fixture.	2019-08-29 13:16:59 +02:00
Tanguy Leroux	9e14ffa8be	Few clean ups in ESBlobStoreRepositoryIntegTestCase (#46068 )	2019-08-28 16:29:46 +02:00
Jason Tedor	3d64605075	Remove node settings from blob store repositories (#45991 ) This commit starts from the simple premise that the use of node settings in blob store repositories is a mistake. Here we see that the node settings are used to get default settings for store and restore throttle rates. Yet, since there are not any node settings registered to this effect, there can never be a default setting to fall back to there, and so we always end up falling back to the default rate. Since this was the only use of node settings in blob store repository, we move them. From this, several places fall out where we were chaining settings through only to get them to the blob store repository, so we clean these up as well. That leaves us with the changeset in this commit.	2019-08-26 16:26:13 -04:00
Tanguy Leroux	a3d918bddb	Refactor RepositoryCredentialsTests (#45919 ) This commit refactors the S3 credentials tests in RepositoryCredentialsTests so that it now uses a single node (ESSingleNodeTestCase) to test how secure/insecure credentials are overriding each other. Using a single node makes it much easier to understand what each test is actually testing and IMO better reflect how things are initialized. It also allows to fold into this class the test testInsecureRepositoryCredentials which was wrongly located in S3BlobStoreRepositoryTests. By moving this test away, the S3BlobStoreRepositoryTests class does not need the allow_insecure_settings option anymore and thus can be executed as part of the usual gradle test task.	2019-08-26 15:14:43 +02:00
Tanguy Leroux	aee92d573c	Allow partial request body reads in AWS S3 retries tests (#45847 ) This commit changes the tests added in #45383 so that the fixture that emulates the S3 service now sometimes consumes all the request body before sending an error, sometimes consumes only a part of the request body and sometimes consumes nothing. The idea here is to beef up a bit the tests that write blobs because the client's retry logic relies on marking and resetting the blob's input stream. This pull request also changes the testWriteBlobWithRetries() so that it (rarely) tests with a large blob (up to 1mb), which is more than the client's default read limit on input streams (131Kb). Finally, it optimizes the ZeroInputStream so that it is a bit more effective (now works using an internal buffer and System.arraycopy() primitives).	2019-08-23 13:43:31 +02:00
Tanguy Leroux	57a36eb373	Add tests to check that requests are retried when writing/reading blobs on S3 (#45383 ) This commit adds tests to verify the behavior of the S3BlobContainer and its underlying AWS SDK client when the remote S3 service is responding with errors or not responding at all. The expected behavior is that requests are retried multiple times before the client gives up and the S3BlobContainer bubbles up an exception. The test verifies the behavior of BlobContainer.writeBlob() and BlobContainer.readBlob(). In the case of S3, writing a blob can be executed as a single upload or using multipart requests; the test checks both scenarios by writing a small then a large blob.	2019-08-22 11:41:40 +02:00
Armin Braun	6aaee8aa0a	Repository Cleanup Endpoint (#43900 ) (#45780 ) * Repository Cleanup Endpoint (#43900) * Snapshot cleanup functionality via transport/REST endpoint. * Added all the infrastructure for this with the HLRC and node client * Made use of it in tests and resolved relevant TODO * Added new `Custom` CS element that tracks the cleanup logic. Kept it similar to the delete and in progress classes and gave it some (for now) redundant way of handling multiple cleanups but only allow one * Use the exact same mechanism used by deletes to have the combination of CS entry and increment in repository state ID provide some concurrency safety (the initial approach of just an entry in the CS was not enough, we must increment the repository state ID to be safe against concurrent modifications, otherwise we run the risk of "cleaning up" blobs that just got created without noticing) * Isolated the logic to the transport action class as much as I could. It's not ideal, but we don't need to keep any state and do the same for other repository operations (like getting the detailed snapshot shard status)	2019-08-21 17:59:49 +02:00
Jim Ferenczi	fe2a7523ec	Add support for inlined user dictionary in the Kuromoji plugin (#45489 ) This change adds a new option called user_dictionary_rules to Kuromoji's tokenizer. It can be used to set additional tokenization rules to the Japanese tokenizer directly in the settings (instead of using a file). This commit also adds a check that no rules are duplicated since this is not allowed in the UserDictionary. Closes #25343	2019-08-21 16:28:30 +02:00
Igor Motov	1818c5fa44	Ingest Attachment: Upgrade tika to v1.22 (#45575 ) Upgrades: Apache Tika: 1.19.1 -> 1.22. pdfbox : 2.0.12 -> 2.0.16 poi : 4.0.0 -> 4.0.1	2019-08-19 18:17:16 -04:00

2718 Commits