OpenSearch

Commit Graph

Author	SHA1	Message	Date
Vacha	d151082832	Upgrade hadoop dependencies for hdfs plugin (#1335 ) * Upgrade hadoop dependencies for hdfs plugin Signed-off-by: Vacha <vachshah@amazon.com> * Fixing gradle check failures Signed-off-by: Vacha <vachshah@amazon.com> * Upgrading htrace-core4 to 4.1.0 Signed-off-by: Vacha <vachshah@amazon.com>	2021-10-14 14:43:49 -04:00
Andriy Redko	3779576c51	Modernize and consolidate JDKs usage across all stages of the build. Use JDK-17 as bundled JDK distribution to run tests (#1358 ) * Modernize and consolidate JDKs usage across all stages of the build. Use JDK-17 as bundled JDK distribution to run tests Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Using -Djava.security.egd=file:/dev/urandom explicitly for cli tests Signed-off-by: Andriy Redko <andriy.redko@aiven.io>	2021-10-13 17:25:48 -04:00
Andriy Redko	cdbc84f09d	Update Jackson to 2.12.5 (#1247 ) Signed-off-by: Andriy Redko <andriy.redko@aiven.io>	2021-09-21 18:33:20 -04:00
Andriy Redko	b6c8bdf872	Drop mocksocket in favour of custom security manager checks (tests only) (#1205 ) * Drop mocksocket in favour of custom security manager checks (tests only) Signed-off-by: Andriy Redko <andriy.redko@aiven.io> * Slightly relaxed host checks to allow all local addresses Signed-off-by: Andriy Redko <andriy.redko@aiven.io>	2021-09-16 17:21:47 -04:00
Abbas Hussain	fa8126004c	Upgrade apache commons-compress to 1.21 (#1197 ) Signed-off-by: Abbas Hussain <abbas_10690@yahoo.com>	2021-09-02 08:35:42 +05:30
Nick Knize	5ae00456a0	Upgrade to Lucene 8.9 (#1080 ) This commit upgrades to the official lucene 8.9 release Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-08-20 11:28:06 -05:00
Nick Knize	ff7e7904ca	[DEPRECATE] SimpleFS in favor of NIOFS (#1073 ) Lucene 9 removes support for SimpleFS File System format. This commit deprecates the SimpleFS format in favor of NIOFS. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-08-19 17:56:55 -05:00
Sven R	dcd9cef56c	alt bash path support (#1047 ) Signed-off-by: hackacad <admin@hackacad.net>	2021-08-06 11:09:29 -04:00
Vacha	c7617b03e8	Replacing docs-beta links with /docs (#957 ) Signed-off-by: Vacha Shah <vachshah@amazon.com>	2021-07-13 07:46:05 -07:00
Vacha	e17ce53eb7	Adding broken links checker (#877 ) * Adding broken links checker Signed-off-by: Vacha Shah <vachshah@amazon.com> * Adding exclusions for links Signed-off-by: Vacha Shah <vachshah@amazon.com> * Correcting broken link Signed-off-by: Vacha Shah <vachshah@amazon.com> * Removing the benchmarks link Signed-off-by: Vacha Shah <vachshah@amazon.com>	2021-07-12 14:07:56 -07:00
Tianli Feng	18625952a9	update external library 'pdfbox' version to 2.0.24 to reduce vulnerability (#883 )	2021-06-25 13:18:15 -07:00
Abbas Hussain	3e92821c82	[CVE] Upgrade dependencies for Azure related plugins to mitigate CVEs (#688 ) * Update commons-io-2.4.jar to 2.7 for plugins/discovery-azure-classic module * Remove unused jackson dependency and respective LICENSE and NOTICE * Update guava dependency to mitigate CVE for repository-azure plugin Signed-off-by: Abbas Hussain <abbas_10690@yahoo.com>	2021-05-26 03:27:36 +05:30
Rabi Panda	50abf6d066	[CVE] Upgrade dependencies to mitigate CVEs (#657 ) This PR upgrades the following dependencies to fix CVEs. - commons-codec:1.12 (->1.13) apache/commons-codec@48b6157 - ant:1.10.8 (->1.10.9) https://ant.apache.org/security.html - jackson-databind:2.10.4 (->2.11.0) FasterXML/jackson-databind#2589 - jackson-dataformat-cbor:2.10.4 (->2.11.0) https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-28491 - apache-httpclient:4.5.10 (->4.5.13) https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2020-13956 - checkstyle:8.20 (->8.29) https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-10782 - junit:4.12 (->4.13.1) https://github.com/junit-team/junit4/security/advisories/GHSA-269g-pwp5-87pp - netty:4.1.49.Final (->4.1.59) https://github.com/netty/netty/security/advisories/GHSA-5mcr-gq6c-3hq2 Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-05-18 11:37:24 -07:00
Rabi Panda	943c778a7f	[CVE-2018-11765] Upgrade hadoop dependencies for hdfs plugin (#654 ) Hadoop 2.8.5 has been reported to have CVEs (https://bugzilla.redhat.com/show_bug.cgi?id=1883549). We need to upgrade this to 2.10.1. This also updates the hadoop-minicluster version to 2.10.1 as well. This upgrade also brings in two additional dependencies, woodstox-core and stax2-api that are added along with the sha1s, licenses and notices. Also upgrade guava to the latest as per the CVE https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8908 Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-05-13 14:56:47 -07:00
Rabi Panda	6550e099b3	[CVE-2020-7692] Upgrade google-oauth clients for google cloud plugins (#662 ) For discovery-gce and repository-gcs plugins update the google-oauth-client library to version 1.31.0. See CVE details at https://nvd.nist.gov/vuln/detail/CVE-2020-7692 Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-05-13 12:19:57 -07:00
Rabi Panda	0e180f4703	Update dependencies for ingest-attachment plugin. (#666 ) This PR resolves the CVEs for dependencies in the ingest-attachment plugin. tika : '1.24' -> '1.24.1' (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-9489) pdfbox : '2.0.19' -> '2.0.23' (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-27807) commons-io:commons-io : '2.6' -> '2.7' (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-29425) Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-05-11 10:40:33 -07:00
Nick Knize	c5a3c3cb41	Update lucene version to 8.8.2 (#557 ) This commit updates the codebase to the latest released version of Lucene. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-04-23 09:48:41 -05:00
Rabi Panda	3fede8be3c	Rename the distribution used in test clusters. (#603 ) For test clusters, we are using the archive(zip, tar), so we rename the distribution accordingly. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-04-22 14:21:32 -07:00
Nick Knize	0ba0e7cc26	[Versioning] Rebase to OpenSearch version 1.0.0 (#555 ) This commit rebases the versioning to OpenSearch 1.0.0 Co-authored-by: Rabi Panda <adnapibar@gmail.com> Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-04-15 17:06:47 -05:00
Nick Knize	ee6d15e26a	[License] Add SPDX License Header to security policies (#531 ) This commit adds the SPDX license header and modifications copyright to security policy files. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-04-12 22:59:36 -05:00
Rabi Panda	8727afbcd3	Use the correct domain to fix failing integration tests. (#519 ) This commit fixes a renaming issue (opensearch.co -> opensearch.org) which was causing a few integration test failures. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-04-10 09:42:39 -07:00
Rabi Panda	2a3ce0bb75	Fix rename issues and failing repository-hdfs tests. (#518 ) This commit fixes some partial rename issues and as a result fixes the failing secure repository-hdfs tests. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-04-09 17:51:27 -07:00
Nick Knize	9168f1fb43	[License] Add SPDX and OpenSearch Modification license header (#509 ) This commit adds the SPDX Apache-2.0 license header along with an additional copyright header for all modifications. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-04-09 14:28:18 -05:00
Rabi Panda	2dca3462f2	Fix stragglers from renaming to OpenSearch work. (#483 ) This commit fixes more instances where we missed renaming to OpenSearch. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-04-05 11:51:20 -07:00
Harold Wang	5971a518d0	Replace nio and netty test endpoint (#475 ) Signed-off-by: Harold Wang <harowang@amazon.com>	2021-03-31 13:37:22 -07:00
Harold Wang	fd4c3968ab	[Rename] org.opensearch.ingest.attachment.IngestAttachmentClientYamlTestSuiteIT (#463 ) * Change "Test elasticsearch" back * Update content, language and size of test attachment * Regenerate test attachment content with updated date and author Signed-off-by: Harold Wang <harowang@amazon.com>	2021-03-26 21:59:23 -07:00
Rabi Panda	3460a8c213	Fix a few more renaming issues. (#464 ) This commit fixes some more missed instances where we can perform the renaming to OpenSearch. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-26 12:05:16 -07:00
Rabi Panda	0bdd1293c1	Use alternate example data in OpenSearch test cases. (#454 ) This commit updates some of the sample test data used in test cases in OpenSearch. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-25 08:52:07 -07:00
Rabi Panda	2e3055c9e2	Fix more failing tests as a result of renaming (#457 ) This commit fixes some more renaming issues and as a result fixes the failing tests, * :qa:logging-config:test * :example-plugins:painless-whitelist:yamlRestTest * :modules:reindex:test Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-24 09:33:05 -07:00
Rabi Panda	8469519413	Fix Checkstyle issues. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	8bba6603da	[Rename] Replace more instances of Elasticsearch with OpenSearch. (#432 ) This commit replaces more replaceable instances of Elasticsearch with OpenSearch. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Nick Knize	7051167c83	[Rename] remaining elasticsearch pass 1 (#416 ) This commit refactors instances of 'elasticsearch' with opensearch everywhere except references to issues, and other places needed to test compatibility with old elasticsearch clusters. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-03-21 20:56:34 -05:00
Rabi Panda	597b52992d	[Rename] File names replace elasticsearch with opensearch. (#419 ) This commit renames several files that contain the name elasticsearch and replace that with opensearch. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	eddfe6760d	[Rename] Fix issues for gradle precommit task. (#418 ) Fix miscellaneous issues identified during `gradle precommit`. These issues are the side effects of the renaming to OpenSearch work. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	df11cc9de4	[Rename] Fix gradle build as part of the renaming process. (#397 ) This commit fixes the currently broken gradle build resulting from the renaming work. It reverts a few dependencies and comments out the `opensearch_distibutions` task which is currently failing for some builds. We will address these separately in the future once we have a working build. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	13f6d23e40	[Rename] Property and metadata keys with prefix es. (#389 ) Rename all property and metadata keys with prefix 'es.' to 'opensearch.'. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Nick Knize	5b46a05702	[Rename] remaining packages and resources in test/fixture (#364 ) This commit refactors the remaining o.e.index and o.e.test packages in the test/fixtures module. References throughout the codebase are also refactored. Signed-off-by: Nicholas Walter Knize <nknize@apache.org>	2021-03-21 20:56:34 -05:00
Harold Wang	82f9ff93cb	[Rename] plugins (#193 ) * [Rename] plugins (#193) This PR refactors files under "plugins" folders part of the Elasticsearch to OpenSearch renaming effort. Signed-off-by: Harold Wang <harowang@amazon.com>	2021-03-21 20:56:34 -05:00
Nick Knize	923ea001f5	[Rename] o.e.action.support classes (#253 ) This commit refactors the classes in o.e.action.support to o.opensearch.action.support. The remaining directories will be refactored in a separate commit. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	991b3650b6	[Rename] refactor server/snapshots package. (#251 ) Refactor `server/snapshots` to rename the package names from `org.elasticsearch.snapshots` to `org.opensearch.snapshots` as part of the rename to OpenSearch work. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	584efd7970	[Rename] modules/lang-painless (#210 ) Refactor lang-painless module as part of the rename to OpenSearch work. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	3eee5183d1	[Rename] server/rest (#229 ) This commit refactors the `server/rest` package as part of the Elasticsearch to OpenSearch renaming. Signed-off-by: Rabi Panda <adnapibar@gmail.com>	2021-03-21 20:56:34 -05:00
Nick Knize	8aa818e93e	[Rename] refactor o.e.action.admin.cluster (#207 ) This commit refactors all classes in o.e.action.admin.cluster to org.opensearch.action.admin.cluster. References are updated throughout the codebase. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Nick Knize	1203aa7302	[Rename] refactor o.e.action classes (#203 ) This commit refactors top level classes in o.e.action to o.opensearch.action. References throughout the rest of the codebase have been updated. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Nick Knize	0c81a5cf65	[Rename] refactor o.e.action.admin.indices (#209 ) This commit refactors o.e.action.admin.indices package to o.opensearch.action.admin.indices. References through out the codebase have been updated to reflect the new package location. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Nick Knize	2aa9906c42	[Rename] ElasticsearchParseException class in server module (#169 ) This commit refactors ElasticsearchParseException class in the server module to OpenSearchParseException. References and usages throughout the rest of the codebase are fully refactored. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Nick Knize	ccceb381db	[Rename] ElasticsearchException class in server module (#165 ) This commit refactors the ElasticsearchException class located in the server module to OpenSearchException. References and usages throughout the rest of the codebase are fully refactored. Signed-off-by: Nicholas Knize <nknize@amazon.com>	2021-03-21 20:56:34 -05:00
Rabi Panda	38e9c9750a	[PURIFY] Remove the AuthorizationEnginePlugin from examples. (#26 ) Signed-off-by: Peter Nied <petern@amazon.com>	2021-03-13 10:36:09 -06:00
Rabi Panda	c856534394	[PURIFY] Remove remaining x-pack license. (#25 ) Signed-off-by: Peter Nied <petern@amazon.com>	2021-03-13 10:36:09 -06:00
Nick Knize	a0b91cb230	Cleanup build script to exclude security-authorization-engine (#8 ) (#8 ) * Cleanup build-scan, remove publish scan to elastic server * Cleanup build script to exclude security-authorization-engine which test has dependency on xpack * Cleanup build script to exclude security-authorization-engine which test has dependency on xpack Co-authored-by: Huan Jiang <huanji@amazon.com> Signed-off-by: Peter Nied <petern@amazon.com>	2021-03-13 10:36:06 -06:00
Nick Knize	3a52e9ddc1	[PURIFY] update build.gradle files to ensure build completes; gradle check fails (#7 ) Signed-off-by: Peter Nied <petern@amazon.com>	2021-03-13 10:36:06 -06:00
Alan Woodward	fb84b6710d	Restore use of default search and search_quote analyzers (#65491 ) (#65562 ) In the refactoring of TextFieldMapper, we lost the ability to define a default search or search_quote analyzer in index settings. This commit restores that ability, and adds some more comprehensive testing. Fixes #65434	2020-11-26 18:34:59 +00:00
Armin Braun	51e9d6f227	Revert Serializing Outbound Transport Messages on IO Threads (#64632 ) (#64654 ) Serializing outbound transport message on the IO loop was introduced in https://github.com/elastic/elasticsearch/pull/56961. Unfortunately it turns out that this is incompatible with assumptions made by CCR code here: `f22ddf822e/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/repositories/GetCcrRestoreFileChunkAction.java (L60-L61)` and that are not easy to work around on short notice. Raising reverting this move (as a temporary solution, it's still a valuable change long-term) as a blocker therefore as this seriously affects the stability of the initial phase of the CCR following by causing corrupted bytes to be sent to the follower.	2020-11-05 16:29:12 +01:00
Ignacio Vera	4851bc7bae	Upgrade to Lucene-8.7.0 (#64532 ) (#64537 )	2020-11-03 16:57:04 +01:00
Ignacio Vera	d0f5066310	Upgrade to lucene-8.7.0-snapshot-72d8528c3a6 (#63912 ) (#63928 ) (#63933 )	2020-10-20 15:08:06 +02:00
Julie Tibshirani	ae2fc4118d	Add factory methods for common value fetchers. (#63438 ) This PR adds factory methods for the most common implementations: * `SourceValueFetcher.identity` to pass through the source value untouched. * `SourceValueFetcher.toString` to simply convert the source value to a string.	2020-10-08 12:14:53 -07:00
Mayya Sharipova	e022b78198	Upgrade to lucene-8.7.0-snapshot-5c4168d (#63466 ) This disables sort optim on _doc, which may still be unstable. Backport for #63444	2020-10-08 08:20:43 -04:00
Mayya Sharipova	e236ea43e9	Upgrade to lucene-8.7.0-snapshot-e914862 (#63401 ) Backport for: #63395	2020-10-07 09:45:14 -04:00
Alan Woodward	88b45dfa61	Convert TextFieldMapper to parametrized form (#63269 ) (#63392 ) As a result of this, we can remove a chunk of code from TypeParsers as well. Tests for search/index mode analyzers have moved into their own file. This commit also rationalises the serialization checks for parameters into a single SerializerCheck interface that takes the values includeDefaults, isConfigured and the value itself. Relates to #62988	2020-10-07 13:26:25 +01:00
Mayya Sharipova	f2ba62b894	Upgrade to lucene-8.7.0-snapshot-66c49a35402 (#63372 ) This includes fixing a bug in doc iteration during sort optimization Backport for #63349	2020-10-06 22:38:58 -04:00
Julie Tibshirani	f17ca18dfa	Make array value parsing flag more robust. (#63371 ) When constructing a value fetcher, the 'parsesArrayValue' flag must match `FieldMapper#parsesArrayValue`. However there is nothing in code or tests to help enforce this. This PR reworks the value fetcher constructors so that `parsesArrayValue` is 'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must explicitly set it to true and ensure the behavior is covered by tests. Follow-up to #62974.	2020-10-06 17:49:25 -07:00
Nhat Nguyen	1a6837883a	Upgrade to Lucene-8.7.0-snapshot-77396dbf339 (#63222 ) Includes LUCENE-9554, which exposes the pendingNumDocs from IndexWriter.	2020-10-05 14:39:30 -04:00
Rene Groeschke	f58ebe58ee	Use services for archive and file operations in tasks (#62968 ) (#63201 ) Referencing a project instance during task execution is discouraged by Gradle and should be avoided. E.g. It is incompatible with Gradle's incubating configuration cache. Instead there are services available to handle archive and filesystem operations in task actions. Brings us one step closer to #57918	2020-10-05 15:52:15 +02:00
Alan Woodward	01950bc80f	Move FieldMapper#valueFetcher to MappedFieldType (#62974 ) (#63220 ) For runtime fields, we will want to do all search-time interaction with a field definition via a MappedFieldType, rather than a FieldMapper, to avoid interfering with the logic of document parsing. Currently, fetching values for runtime scripts and for building top hits responses need to call a method on FieldMapper. This commit moves this method to MappedFieldType, incidentally simplifying the current call sites and freeing us up to implement runtime fields as pure MappedFieldType objects.	2020-10-04 14:54:59 +01:00
Alan Woodward	de08ba58bf	Convert percolator, murmur3 and histogram mappers to parametrized form (#63004 ) Relates to #62988	2020-09-29 14:42:26 +01:00
Mayya Sharipova	4c8c3c8df6	Upgrade lucene to lucene-8.7.0-snapshot-3b59906 (#62978 ) Backport for #62970	2020-09-28 16:52:31 -04:00
Tim Brooks	59dd889c10	Split up large HTTP responses in outbound pipeline (#62666 ) Currently Netty will batch compress an entire HTTP response regardless of its content size. It allocates a byte array at least of the same size as the uncompressed content. This causes issues with our attempts to remove humongous G1GC allocations. This commit resolves the issue by splitting responses into 128KB chunks. This has the side-effect of making large outbound HTTP responses that are compressed be sent as chunked transfer-encoding.	2020-09-24 16:35:52 -06:00
Tim Brooks	43a4882951	Move CorsHandler to server (#62007 ) Currently we duplicate our specialized cors logic in all transport plugins. This is unnecessary as it could be implemented in a single place. This commit moves the logic to server. Additionally it fixes a bug where we are incorrectly closing http channels on early Cors responses.	2020-09-24 16:32:59 -06:00
Alan Woodward	e28750b001	Add parameter update and conflict tests to MapperTestCase (#62828 ) (#62902 ) This commit adds a mechanism to MapperTestCase that allows implementing test classes to check that their parameters can be updated, or throw conflict errors as advertised. Child classes override the registerParameters method and tell the passed-in UpdateChecker class about their parameters. Simple conflicts can be checked, using the existing minimal mappings as a base to compare against, or alternatively a particular initial mapping can be provided to check edge cases (eg, norms can be updated from true to false, but not vice versa). Updates are registered with a predicate that checks that the update has in fact been applied to the resulting FieldMapper. Fixes #61631	2020-09-24 20:38:12 +01:00
Armin Braun	83ec8dd4e2	Upgrade GCS SDK to 1.113.1 (#62848 ) (#62864 ) Just staying on top of upgrades to the SDK and its dependencies.	2020-09-24 15:43:21 +02:00
Luca Cavanna	862fab06d3	Share same existsQuery impl throughout mappers (#57607 ) Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers. There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available. This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method. At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.	2020-09-23 11:00:53 +02:00
Luca Cavanna	5ca86d541c	Move stored flag from TextSearchInfo to MappedFieldType (#62717 ) (#62770 )	2020-09-23 09:40:34 +02:00
markharwood	a0df0fb074	Search - add case insensitive flag for "term" family of queries #61596 (#62661 ) Backport of fe9145f Closes #61546	2020-09-22 13:56:51 +01:00
Luca Cavanna	9ae29713fd	Dense vector field type minor fixes (#62631 ) The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. Same for the sparse vector field (only in 7.x). This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.	2020-09-22 10:40:51 +02:00
Christos Soulios	6a298970fd	[7.x] Allow metadata fields in the _source (#62616 ) Backports #61590 to 7.x So far we don't allow metadata fields in the document _source. However, in the case of the _doc_count field mapper (#58339) we want to be able to set This PR adds a method to the metadata field parsers that exposes if the field can be included in the document source or not. This way each metadata field can configure if it can be included in the document _source	2020-09-18 19:56:41 +03:00
Adrien Grand	4de8579455	Upgrade to lucene-8.7.0-snapshot-830bd186a8d. (#62596 )	2020-09-18 09:51:34 +02:00
David Turner	0a3f2c453f	Hide c.a.s.s.i.UseArnRegionResolver noise (#62522 ) A recent AWS SDK upgrade has introduced a new source of spurious `WARN` logs when the security manager prevents access to the user's home directory and therefore to `$HOME/.aws/config`. This is the behaviour we want, and it's harmless and handled by the SDK as if the config doesn't exist, so this log message is unnecessary noise. This commit suppresses this noisy logging by default. Relates #20313, #56346, #53962 Closes #62493	2020-09-18 08:30:39 +01:00
Tanguy Leroux	e6777810ba	Fix S3BlobContainerRetriesTests (#62464 ) (#62551 ) The AssertingInputStream in S3BlobContainerRetriesTests verifies that InputStreams are either fully consumed or aborted, but the eof flag is only set when the underlying stream returns it. When buffered reads are executed and the exact number of remaining bytes is read, the eof flag is not set to true. Instead the test should rely on the total number of bytes read to know if the stream has been fully consumed. Close #62390	2020-09-17 17:12:34 +02:00
Adrien Grand	9a8225bbc1	Upgrade to lucene-8.7.0-snapshot-9cd3af50f80. (#62450 ) (#62476 ) This new snapshot contains the following JIRAs that we're interested in: - [LUCENE-9525](https://issues.apache.org/jira/browse/LUCENE-9525) Better handling of small documents. This should improve retrieval times when documents are less than ~1kB. - [LUCENE-9510](https://issues.apache.org/jira/browse/LUCENE-9510) Faster flushes when index sorting is enabled by not compressing the temporary files that store stored fields and term vectors.	2020-09-17 10:28:20 +02:00
Nik Everett	24a24d050a	Implement fields fetch for runtime fields (backport of #61995 ) (#62416 ) This implements the `fields` API in `_search` for runtime fields using doc values. Most of that implementation is stolen from the `docvalue_fields` fetch sub-phase, just moved into the same API that the `fields` API uses. At this point the `docvalue_fields` fetch phase looks like a special case of the `fields` API. While I was at it I moved the "which doc values sub-implementation should I use for fetching?" question from a bunch of `instanceof`s to a method on `LeafFieldData` so we can be much more flexible with what is returned and we're not forced to extend certain classes just to make the fetch phase happy. Relates to #59332	2020-09-15 20:24:10 -04:00
Armin Braun	98f525f8a7	Faster Azure Blob InputStream (#61812 ) (#62387 ) Building our own that should perform better than the one in the SDK. Also, as a result saving a HEAD call for each ranged read on Azure.	2020-09-15 18:27:22 +02:00
Adrien Grand	6db8afefc2	Upgrade to lucene-8.7.0-snapshot-cdfdc1e0851. (#62376 ) Upgrade to a new Lucene snapshot that (at least partially) addresses the indexing rate regression when index sorting is enabled. Backport of #62334.	2020-09-15 17:48:07 +02:00
Tanguy Leroux	faf96c175e	Abort non-fully consumed S3 input stream (#62167 ) (#62370 ) Today when an S3RetryingInputStream is closed the remaining bytes that were not consumed are drained right before closing the underlying stream. In some contexts it might be more efficient to not consume the remaining bytes and just drop the connection. This is for example the case with snapshot backed indices prewarming, where there is no point in reading potentially large blobs if we know the cache file we want to write the content of the blob has already been evicted. Draining all bytes here takes a slot in the prewarming thread pool for nothing.	2020-09-15 14:33:37 +02:00
Francisco Fernández Castaño	21303e8e15	Take into account sas tokens while metering put object requests on azure (#62244 ) Backport of #62225 Closes #62208	2020-09-10 19:47:58 +02:00
Ignacio Vera	c8981ea93d	upgrade to lucene-8.7.0-snapshot-b313618cc1d (#62213 ) (#62222 )	2020-09-10 16:23:18 +02:00
Jake Landis	d8dad9ab2c	[7.x] Remove integTest task from PluginBuildPlugin (#61879 ) (#62135 ) This commit removes `integTest` task from all es-plugins. Most relevant projects have been converted to use yamlRestTest, javaRestTest, or internalClusterTest in prior PRs. A few projects needed to be adjusted to allow complete removal of this task * x-pack/plugin - converted to use yamlRestTest and javaRestTest * plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task * qa/die-with-dignity - convert to javaRestTest * x-pack/qa/security-example-spi-extension - convert to javaRestTest * multiple projects - remove the integTest.enabled = false (yay!) related: #61802 related: #60630 related: #59444 related: #59089 related: #56841 related: #59939 related: #55896	2020-09-09 14:25:41 -05:00
Nik Everett	b8e9a7125f	Speed up empty highlighting many fields (backport of #61860 ) (#62122 ) Kibana often highlights everything like this: ``` POST /_search { "query": ..., "size": 500, "highlight": { "fields": { "*": { ... } } } } ``` This can get slow when there are hundreds of mapped fields. I tested this locally and unscientifically and it took a request from 20ms to 150ms when there are 100 fields. I've seen clusters with 2000 fields where simple searches go from 500ms to 1500ms just by turning on this sort of highlighting. Even when the query is just a `range` and the fields are all numbers and stuff so it won't highlight anything. This speeds up the `unified` highlighter in this case in a few ways: 1. Build the highlighting infrastructure once per field rather than once per document per field. This cuts out a ton of work analyzing the query over and over and over again. 2. Bail out of the highlighter before loading values if we can't produce any results. Combined these take that local 150ms case down to 65ms. This is unlikely to be really useful when there are only a few fetched docs and only a few fields, but we often end up having many fields with many fetched docs.	2020-09-08 15:49:50 -04:00
Francisco Fernández Castaño	2bb5716b3d	Add repositories metering API (#62088 ) This pull request adds a new set of APIs that allows tracking the number of requests performed by the different registered repositories. In order to avoid losing data, the repository statistics are archived after the repository is closed for a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the statistics for the active repositories as well as the modified/closed repositories. Backport of #60371	2020-09-08 14:01:04 +02:00
Ignacio Vera	31c026f25c	upgrade to Lucene-8.7.0-snapshot-61ea26a (#61957 ) (#61974 )	2020-09-04 13:46:20 +02:00
Ryan Ernst	d6e17170c3	Simplify adding plugins and modules to testclusters (#61886 ) There are currently half a dozen ways to add plugins and modules for test clusters to use. All of them require the calling project to peek into the plugin or module they want to use to grab its bundlePlugin task, and then both depend on that task, as well as extract the archive path the task will produce. This creates cross project dependencies that are difficult to detect, and if the dependent plugin/module has not yet been configured, the build will fail because the task does not yet exist. This commit makes the plugin and module methods for testclusters symmetric, and simply adding a file provider directly, or a project path that will produce the plugin/module zip. Internally this new variant uses normal configuration/dependencies across projects to get the zip artifact. It also has the added benefit of no longer needing the caller to add to the test task a dependsOn for bundlePlugin task.	2020-09-03 19:37:46 -07:00
Alan Woodward	e2f006eeb4	Merge FetchSubPhase hitsExecute and hitExecute methods (#60907 ) (#61893 ) FetchSubPhase has two 'execute' methods, one which takes all hits to be examined, and one which takes a single HitContext. It's not obvious which one should be implemented by a given sub-phase, or if implementing both is a possibility; nor is it obvious that we first run the hitExecute methods of all subphases, and then subsequently call all the hitsExecute methods. This commit reworks FetchSubPhase to replace these two variants with a processor class, `FetchSubPhaseProcessor`, that is returned from a single `getProcessor` method. This processor class has two methods, `setNextReader()` and `process`. FetchPhase collects processors from all its subphases (if a subphase does not need to execute on the current search context, it can return `null` from `getProcessor`). It then sorts its hits by docid, and groups them by lucene leaf reader. For each reader group, it calls `setNextReader()` on all non-null processors, and then passes each doc id to `process()`. Implementations of fetch sub phases can divide their concerns into per-request, per-reader and per-document sections, and no longer need to worry about sorting docs or dealing with reader slices. FetchSubPhase now provides a FetchSubPhaseExecutor that exposes two methods, setNextReader(LeafReaderContext) and execute(HitContext). The parent FetchPhase collects all these executors together (if a phase should not be executed, then it returns null here); then it sorts hits, and groups them by reader; for each reader it calls setNextReader, and then execute for each hit in turn. Individual sub phases no longer need to concern themselves with sorting docs or keeping track of readers; global structures can be built in getExecutor(SearchContext), per-reader structures in setNextReader and per-doc in execute.	2020-09-03 12:20:55 +01:00
Tim Brooks	e573fa9abc	Add data.path fast path for FilePermission (#61302 ) The recursive data.path FilePermission check is an extremely hot codepath in Elasticsearch. Unfortunately the FilePermission check in Java is extremely allocation heavy. As it iterates through different file permissions, it allocates byte arrays for each Path component that must be compared. This PR improves the situation by adding the recursive data.path FilePermission in its own PermissionsCollection object which is checked first.	2020-09-01 12:03:22 -06:00
Jason Tedor	64cd229b35	Upgrade to Lucene 8.6.2 (#61688 ) This commit upgrades the Lucene dependencies to 8.6.2.	2020-08-31 09:54:07 -04:00
Armin Braun	0da20579ca	Cleanly Handle S3 SDK Exceptions in Request Counting (#61686 ) (#61698 ) It looks like it is possible for a request to throw an exception early before any API interaction has happened. This can lead to the request count map containing a `null` for the request count key. The assertion is not correct and we should not NPE here (as that might also hide the original exception since we are running this code in a `finally` block from within the S3 SDK). Closes #61670	2020-08-31 11:05:59 +02:00
Luca Cavanna	f769821bc8	Pass SearchLookup supplier through to fielddataBuilder (#61430 ) (#61638 ) Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not. To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method. As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors. With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition. Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch. This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>. Relates to #59332 Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-08-27 18:09:56 +02:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435 ) (#61530 ) DeprecationLogger's constructor should not create two loggers. It was taking parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Nik Everett	87cf81e179	Migrate some more mapper test cases (#61507 ) (#61552 ) Migrate some more mapper test cases from `ESSingleNodeTestCase` to `MapperTestCase`.	2020-08-25 15:27:26 -04:00
markharwood	8b56441d2b	Search - add case insensitive support for regex queries. (#59441 ) (#61532 ) Backport to add case insensitive support for regex queries. Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master. This can be removed when 8.7 Lucene is released. Closes #59235	2020-08-25 17:18:59 +01:00
Przemyslaw Gomulka	f3f7d25316	Header warning logging refactoring backport(#55941 ) (#61515 ) Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog). Introducing a ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning headers and logging throttling were needed. relates #55699 relates #52369 backports #55941	2020-08-25 16:35:54 +02:00
Julie Tibshirani	997c73ec17	Correct how field retrieval handles multifields and copy_to. (#61391 ) Before when a value was copied to a field through a parent field or `copy_to`, we parsed it using the `FieldMapper` from the source field. Instead we should parse it using the target `FieldMapper`. This ensures that we apply the appropriate mapping type and options to the copied value. To implement the fix cleanly, this PR refactors the value parsing strategy. Now instead of looking up values directly, field mappers produce a helper object `ValueFetcher`. The value fetchers are responsible for almost all aspects of fetching, including looking up the right paths in the _source. The PR is fairly big but each commit can be reviewed individually. Fixes #61033.	2020-08-20 15:53:35 -07:00
Rory Hunter	be4ebfbf46	Remove old test mute code (#61277 ) It seems that some old test mute code, added as part of #31498, was never removed. This meant that the HDFS tests would fail when run under JDK 11.	2020-08-19 09:40:59 +01:00
Jake Landis	cb9f4cdae2	Fix the REST FIPS tests (#61001 ) Adds bouncycastle to classpath for tests and testclusters	2020-08-13 16:23:54 -07:00
Alan Woodward	54279212cf	Make MetadataFieldMapper extend ParametrizedFieldMapper (#59847 ) (#60924 ) This commit cuts over all metadata field mappers to parametrized format.	2020-08-11 09:02:28 +01:00
Armin Braun	3e2dfc6eac	Remove GCS Bucket Exists Check (#60899 ) (#60914 ) Same as https://github.com/elastic/elasticsearch/pull/43288 for GCS. We don't need to do the bucket exists check before using the repo, that just needlessly increases the necessary permissions for using the GCS repository.	2020-08-11 09:54:27 +02:00
Rene Groeschke	bdd7347bbf	Merge test runner task into RestIntegTest (7.x backport) (#60600 ) * Merge test runner task into RestIntegTest (#60261) * Merge test runner task into RestIntegTest * Reorganizing Standalone runner and RestIntegTest task * Rework general test task configuration and extension * Fix merge issues * use former 7.x common test configuration	2020-08-04 14:46:32 +02:00
Armin Braun	7ae9dc2092	Unify Stream Copy Buffer Usage (#56078 ) (#60608 ) We have various ways of copying between two streams and handling thread-local buffers throughout the codebase. This commit unifies a number of them and removes buffer allocations in many spots.	2020-08-04 09:54:52 +02:00
Rene Groeschke	ed4b70190b	Replace immediate task creations by using task avoidance api (#60071 ) (#60504 ) - Replace immediate task creations by using task avoidance api - One step closer to #56610 - Still many tasks are created during configuration phase. Tackled in separate steps	2020-07-31 13:09:04 +02:00
Julie Tibshirani	dfd7f226f0	Clarify SourceLookup sharing across fetch subphases. (#60484 ) The `SourceLookup` class provides access to the _source for a particular document, specified through `SourceLookup#setSegmentAndDocument`. Previously the search context contained a single `SourceLookup` that was shared between different fetch subphases. It was hard to reason about its state: is `SourceLookup` set to the expected document? Is the _source already loaded and available? Instead of using a global source lookup, the fetch hit context now provides access to a lookup that is set to load from the hit document. This refactor closes #31000, since the same `SourceLookup` is no longer shared between the 'fetch _source phase' and script execution.	2020-07-30 13:22:31 -07:00
Julie Tibshirani	5359417ec3	Minor clean-up around search highlight context. (#60422 ) * Rename SearchContextHighlight -> SearchHighlightContext. * Rename HighlighterContext to FieldHighlightContext. * Make the search highlight context immutable. * Avoid storing SearchHighlightContext on HighlighterContext.	2020-07-29 11:39:17 -07:00
Jake Landis	6ce30bea08	[7.x] Convert most OSS plugins from integTest to [yaml | java]RestTest or internalClusterTest (#59444 ) (#60343 ) For all OSS plugins (except repository-* and discovery-*) integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. This commit does NOT convert the discovery-* and repository-* since they are a bit more complex than the rest of the tests and this PR is large enough. Those plugins will be addressed in a future PR(s). This commit also fixes a minor issue that did not copy the rest api for projects that only had YAML TEST tests. related: #56841	2020-07-29 13:06:13 -05:00
Jake Landis	f6abd67029	[7.x] Convert discovery-* from integTest to [yaml | java]RestTest or internalClusterTest (#60084 ) (#60344 ) For OSS plugins that begin with discovery-*, the integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. related: #56841 related: #59444	2020-07-29 11:20:19 -05:00
Jake Landis	96b7122917	[7.x] Convert repository-* from integTest to [yaml | java]RestTest or internalClusterTest (#60085 ) (#60404 ) For OSS plugins that begin with repository-*, the integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. related: #56841 related: #59444	2020-07-29 11:19:44 -05:00
David Turner	bbacad648a	Fix network logging test failures (#60334 ) In #60297 we added some tests related to logging from the transport layer, but these tests failed occasionally since the cluster was kept alive between test invocations but the logging framework expected it only to be used for a single test. With this commit we reduce the scope of the internal test cluster to `TEST` to solve this problem. Closes #60321.	2020-07-29 08:29:09 +01:00
Julie Tibshirani	c7bfb5de41	Add search `fields` parameter to support high-level field retrieval. (#60258 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-28 10:58:20 -07:00
David Turner	9c62b5cb96	Mute tests for #60321	2020-07-28 18:12:54 +01:00
David Turner	9450ea08b4	Log and track open/close of transport connections (#60297 ) Transport connections between nodes remain in place until one or other node shuts down or the connection is disrupted by a flaky network. Today it is very difficult to demonstrate that transient failures and cluster instability are caused by the network even though this is often the case. In particular, transport connections open and close without logging anything, even at `DEBUG` level, making it very hard to quantify the scale of the problem or to correlate the networking problems with external events. This commit adds the missing `DEBUG`-level logging when transport connections open and close, and also tracks the total number of transport connections a node has opened as a measure of the stability of the underlying network.	2020-07-28 17:08:04 +01:00
Yannick Welsch	ffe114b890	Set specific keepalive options by default on supported platforms (#59278 ) keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of distributed systems such as Elasticsearch. This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5 minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings unless they are deemed to be set too high by providing better out-of-the-box defaults.	2020-07-28 11:10:04 +02:00
Armin Braun	ebb6677815	Formalize and Streamline Buffer Sizes used by Repositories (#59771 ) (#60051 ) Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient. By the same token, the use of stream copying with the default 8k buffer size for blob writes was inefficient as well. We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`. This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs. This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HDFS repositories.	2020-07-22 21:06:31 +02:00
Nik Everett	6f6076e208	Drop some params from IndexFieldData.Builder (backport of #59934 ) (#59972 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 10:28:59 -04:00
Ignacio Vera	f8037abf47	upgrade to lucene-8.6.0 release (#59596 ) (#59599 )	2020-07-15 12:40:57 +02:00
Armin Braun	2dd086445c	Enable Fully Concurrent Snapshot Operations (#56911 ) (#59578 ) Enables fully concurrent snapshot operations: * Snapshot create- and delete operations can be started in any order * Delete operations wait for snapshot finalization to finish, are batched as much as possible to improve efficiency and once enqueued in the cluster state prevent new snapshots from starting on data nodes until executed * We could be even more concurrent here in a follow-up by interleaving deletes and snapshots on a per-shard level. I decided not to do this for now since it seemed not worth the added complexity yet. Due to batching+deduplicating of deletes the pain of having a delete stuck behind a long-running snapshot seemed manageable (dropped client connections + resulting retries don't cause issues due to deduplication of delete jobs, batching of deletes allows enqueuing more and more deletes even if a snapshot blocks for a long time that will all be executed in essentially constant time (due to bulk snapshot deletion, deleting multiple snapshots is mostly about as fast as deleting a single one)) * Snapshot creation is completely concurrent across shards, but per shard snapshots are linearized for each repository as are snapshot finalizations See updated JavaDoc and added test cases for more details and illustration on the functionality. Some notes: The queuing of snapshot finalizations and deletes and the related locking/synchronization is a little awkward in this version but can be much simplified with some refactoring. The problem is that snapshot finalizations resolve their listeners on the `SNAPSHOT` pool while deletes resolve the listener on the master update thread. With some refactoring both of these could be moved to the master update thread, effectively removing the need for any synchronization around the `SnapshotService` state. I didn't do this refactoring here because it's a fairly large change and not necessary for the functionality but plan to do so in a follow-up. This change allows for completely removing any trickery around synchronizing deletes and snapshots from SLM and 100% does away with SLM errors from collisions between deletes and snapshots. Snapshotting a single index in parallel to a long running full backup will execute without having to wait for the long running backup as required by the ILM/SLM use case of moving indices to "snapshot tier". Finalizations are linearized but ordered according to which snapshot saw all of its shards complete first	2020-07-15 03:42:31 +02:00
Armin Braun	e1014038e9	Simplify Repository.finalizeSnapshot Signature (#58834 ) (#59574 ) Many of the parameters we pass into this method were only used to build the `SnapshotInfo` instance to write. This change simplifies the signature. Also, it seems less error prone to build `SnapshotInfo` in `SnapshotsService` instead of relying on the fact that each repository implementation will build the correct `SnapshotInfo`.	2020-07-15 00:14:28 +02:00
Armin Braun	d18b434e62	Remove Artificially Low Chunk Size Limits from GCS + Azure Blob Stores (#59279 ) (#59564 ) Removing these limits as they cause unnecessarily many objects in the blob stores. We do not have to worry about BwC of this change since we do not support any 3rd party implementations of Azure or GCS. Also, since there is no valid reason to set a maximum chunk size different from the default at this point, removing the documentation (which was incorrect in the case of Azure to begin with) for the setting from the docs. Closes #56018	2020-07-14 22:31:07 +02:00
Armin Braun	d456f7870a	Deduplicate Index Metadata in BlobStore (#50278 ) (#59514 ) This PR introduces two new fields into `RepositoryData` (index-N) to track the blob name of `IndexMetaData` blobs and their content via setting generations and uuids. This is used to deduplicate the `IndexMetaData` blobs (`meta-{uuid}.dat` in the indices folders under `/indices`) so that new metadata for an index is only written to the repository during a snapshot if that same metadata can't be found in another snapshot. This saves one write per index in the common case of unchanged metadata thus saving cost and making snapshot finalization drastically faster if many indices are being snapshotted at the same time. The implementation is mostly analogous to that for shard generations in #46250 and piggy backs on the BwC mechanism introduced in that PR (which means this PR needs adjustments if it doesn't go into `7.6`). Relates to #45736 as it improves the efficiency of snapshotting unchanged indices Relates to #49800 as it has the potential to make loading the index metadata for multiple snapshots of the same index concurrently much more efficient, speeding up future concurrent snapshot deletes	2020-07-14 22:18:42 +02:00
Armin Braun	64c5f70a2d	Remove Needless Context Switches on Loading RepositoryData (#56935 ) (#59452 ) We don't need to switch to the generic or snapshot pool for loading cached repository data (i.e. most of the time in normal operation). This makes `executeConsistentStateUpdate` less heavy if it has to retry and lowers the chance of having to retry in the first place. Also, this change allowed simplifying a few other spots in the codebase where we would fork off to another pool just to load repository data.	2020-07-13 21:38:29 +02:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Armin Braun	9268b25789	Add Check for Metadata Existence in BlobStoreRepository (#59141 ) (#59216 ) In order to ensure that we do not write a broken piece of `RepositoryData` because the physical repository generation was moved ahead more than one step by erroneous concurrent writing to a repository we must check whether or not the current assumed repository generation exists in the repository physically. Without this check we run the risk of writing on top of stale cached repository data. Relates #56911	2020-07-08 14:25:01 +02:00
Rene Groeschke	a896df53ac	Remove misc dependency related deprecation warnings (7.x backport) (#59122 ) * Fix dependency related deprecations (#58892) * Fix classpath setup for forbiddenapi usage	2020-07-07 17:10:31 +02:00
Ignacio Vera	5cc6457ed8	upgrade to lucene-8.6.0-snapshot-6a715e2ecc3 (#59091 ) (#59120 )	2020-07-07 12:07:41 +02:00
Jake Landis	604c6dd528	7.x - Create plugin for yamlTest task (#56841 ) (#59090 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip, for which the testing has been moved to :rest-api-spec since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). Which source set is the target for the test resources depends on whether this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred). As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests under a yamlRestTest folder (instead of test)	2020-07-06 14:16:26 -05:00
Tim Brooks	605e24ed7c	Use `getPortRange` in http server tests (#58794 ) Currently we are leaving the settings to default port range in the nio and netty4 http server test. This has recently led to tests failing due to what appears to be a port conflict with other processes. This commit modifies these tests to use the test case helper method to generate port ranges. Fixes #58433 and #58296.	2020-07-02 13:21:45 -06:00
Alan Woodward	3ba16e0f39	Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58830 ) Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on the generic MappedFieldType. Backport of #58639	2020-07-01 14:52:14 +01:00
Yannick Welsch	15c85b29fd	Account for recovery throttling when restoring snapshot (#58658 ) (#58811 ) Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account (i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to configure throttling in a single place. The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to `40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`. This means that no behavioral change will be observed by clusters where the recovery and restore settings were not adapted. Relates https://github.com/elastic/elasticsearch/issues/57023 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-07-01 12:19:29 +02:00
Rene Groeschke	d952b101e6	Replace compile configuration usage with api (7.x backport) (#58721 ) * Replace compile configuration usage with api (#58451) - Use java-library instead of plugin to allow api configuration usage - Remove explicit references to runtime configurations in dependency declarations - Make test runtime classpath input for testing convention - required as java library will by default not have build jar file - jar file is now explicit input of the task and gradle will ensure it's properly built * Fix compile usages in 7.x branch	2020-06-30 15:57:41 +02:00
Tim Brooks	5efec3a517	Add error logging when http test fails (#58505 ) Netty4HttpServerTransportTests has started to fail intermittently. It seems like unexpected successful responses are being received when the test is simulating errors. This commit adds logging to the test to provide additional information when there is an unexpected success. It also adds the logging to the nio http test.	2020-06-24 11:02:20 -06:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Alan Woodward	4b8cf2af6a	Add serialization test for FieldMappers when include_defaults=true (#58235 ) (#58328 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 15:46:04 +01:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Rene Groeschke	abc72c1a27	Unify dependency licenses task configuration (#58116 ) (#58274 ) - Remove duplicate dependency configuration - Use task avoidance api across the build - Remove redundant licensesCheck config	2020-06-18 08:15:50 +02:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Tal Levy	69d5e044af	Add optional description parameter to ingest processors. (#57906 ) (#58152 ) This commit adds an optional field, `description`, to all ingest processors so that users can explain the purpose of the specific processor instance. Closes #56000.	2020-06-15 19:27:57 -07:00
Rene Groeschke	01e9126588	Remove deprecated usage of testCompile configuration (#57921 ) (#58083 ) * Remove usage of deprecated testCompile configuration * Replace testCompile usage by testImplementation * Make testImplementation non transitive by default (as we did for testCompile) * Update CONTRIBUTING about using testImplementation for test dependencies * Fail on testCompile configuration usage	2020-06-14 22:30:44 +02:00
Alan Woodward	16e230dcb8	Update to lucene snapshot e7c625430ed (#57981 ) Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.	2020-06-11 14:51:53 +01:00
Jun Ohtani	c75c8b6e9d	Expose discard_compound_token option to kuromoji_tokenizer (#57421 ) This commit exposes the new Lucene option `discard_compound_token` to the Elasticsearch Kuromoji plugin.	2020-06-05 15:41:01 +02:00
Tanguy Leroux	0e57528d5d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:34:06 +02:00
Mark Vieira	9b0f5a1589	Include vendored code notices in distribution notice files (#57017 ) (#57569 ) (cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)	2020-06-04 10:34:24 -07:00
markharwood	e2c0c4197f	Mute GoogleCloudStorageRepositoryClientYamlTestSuiteIT For #57115	2020-06-03 13:25:31 +01:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Armin Braun	ba2d70d8eb	Serialize Outbound Messages on IO Threads (#56961 ) (#57080 ) Almost every outbound message is serialized to buffers of 16k pagesize. We were serializing these messages off the IO loop (and retaining the concrete message instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the channel is ready. 1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average. 2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically. With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable. Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC. This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain). I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once. Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.	2020-06-02 16:15:18 +02:00
Tanguy Leroux	b4a2cd810a	Use 3rd party task to run integration tests on external service (#56588 ) Backport of #56587 for 7.x	2020-06-02 11:26:58 +02:00
Armin Braun	be6fa72432	Fix GCS Mock Behavior for Missing Bucket (#57283 ) (#57310 ) * Fix GCS Mock Behavior for Missing Bucket We were throwing a 500 instead of a 404 for a missing bucket. This would make yaml tests needlessly wait for multiple seconds, retrying the 500 response with backoff, in the test checking behavior for missing buckets.	2020-05-29 10:01:20 +02:00
Francisco Fernández Castaño	42a15c9b80	Track PUT/PUT_BLOCK operations on AzureBlobStore. (#57121 ) Backport of #56936	2020-05-25 17:24:34 +02:00
Armin Braun	56401d3f66	Release HTTP Request Body Earlier (#57094 ) (#57110 ) We don't need to hold on to the request body past the beginning of sending the response. There is no need to keep a reference to it until after the response has been sent fully and we can eagerly release it here. Note, this can be optimized further to release the contents even earlier but for now this is an easy increment to saving some memory on the IO pool.	2020-05-25 13:00:19 +02:00
Armin Braun	a4eb3edf46	Fix GCS Repository YAML Test Build (#57073 ) (#57101 ) A few relatively obvious issues here: * We cannot run the different IT runs (large blob setting one and normal integ run) concurrently * We need to set the dependency tasks up correctly for the large blob run so that it works in isolation * We can't use the `localAddress` for the location header of the resumable upload (this breaks in YAML tests because GCS is using a loopback port forward for the initial request and the local address will be chosen as the actual Docker container host) Closes #57026	2020-05-25 11:10:39 +02:00
Rene Groeschke	28920a45f1	Improvement usage of gradle task avoidance api (#56627 ) (#56981 ) Use gradle task avoidance api wherever it is possible as a drop in replacement in the es build	2020-05-25 09:37:33 +02:00
markharwood	eb8cb31d46	Update Lucene version to 8.6.0-snapshot-9d6c738ffce (#57024 ) Same version as master	2020-05-21 11:28:16 +01:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Francisco Fernández Castaño	9e870ec3af	Track GET/LIST Azure Storage API calls (#56937 ) Adds tracking for the API calls performed by the Azure Storage underlying SDK. It relies on the ability to hook a request listener into the OperationContext. Backport of #56773	2020-05-19 13:49:23 +02:00
Tim Brooks	57c3a61535	Create HttpRequest earlier in pipeline (#56393 ) Elasticsearch requires that a HttpRequest abstraction be implemented by http modules before server processing. This abstraction controls when underlying resources are released. This commit moves this abstraction to be created immediately after content aggregation. This change will enable follow-up work including moving Cors logic into the server package and tracking bytes as they are aggregated from the network level.	2020-05-18 14:54:01 -06:00
Francisco Fernández Castaño	60c7832141	Track upload requests on S3 repositories (#56904 ) Add tracking for regular and multipart uploads. Regular uploads are categorized as PUT. Multi part uploads are categorized as POST. The number of documents created for the test #testRequestStats has been increased so all upload methods are exercised. Backport of #56826	2020-05-18 19:05:17 +02:00
Francisco Fernández Castaño	8ab9fc10c1	Track multipart/resumable uploads GCS API calls (#56892 ) Add tracking for multipart and resumable uploads for GoogleCloudStorage. For resumable uploads only the last request is taken into account for billing, so that's the only request that's tracked. Backport of #56821	2020-05-18 13:39:26 +02:00
Armin Braun	c02850f335	Fix S3ClientSettings Leak (#56703 ) (#56862 ) Fixes the fact that repository metadata with the same settings still results in multiple settings instances being cached as well as leaking settings on closing a repository. Closes #56702	2020-05-17 09:18:20 +02:00
Armin Braun	cac85a6f18	Shorter Path in Netty ByteBuf Unwrap (#56740 ) (#56857 ) In most cases we are seeing a `PooledHeapByteBuf` here now. No need to redundantly create a new `ByteBuffer` and single element array for it here when we can just directly unwrap its internal `byte[]`.	2020-05-16 11:54:36 +02:00
Ioannis Kakavas	239ada1669	Test adjustments for FIPS 140 (#56526 ) This change aims to fix our setup in CI so that we can run 7.x in FIPS 140 mode. The major issue that we have in 7.x and did not have in master is that we can't use the diagnostic trust manager in FIPS mode in Java 8 with SunJSSE in FIPS approved mode as it explicitly disallows the wrapping of X509TrustManager. Previous attempts like #56427 and #52211 focused on disabling the setting in all of our tests when creating a Settings object or on setting fips_mode.enabled accordingly (which implicitly disables the diagnostic trust manager). The attempts weren't future proof though as nothing would forbid someone to add new tests without setting the necessary setting and forcing this would be very inconvenient for any other case ( see #56427 (comment) for the full argumentation). This change introduces a runtime check in SSLService that overrides the configuration value of xpack.security.ssl.diagnose.trust and disables the diagnostic trust manager when we are running in Java 8 and the SunJSSE provider is set in FIPS mode.	2020-05-15 18:10:45 +03:00
Alan Woodward	d33d13f2be	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:14:49 +01:00
Francisco Fernández Castaño	1530bff0cb	Move azure client logic from AzureStorageService to AzureBlobStore (#56806 ) Backport of #56782	2020-05-15 11:30:15 +02:00
Ryan Ernst	9fb80d3827	Move publishing configuration to a separate plugin (#56727 ) This is another part of the breakup of the massive BuildPlugin. This PR moves the code for configuring publications to a separate plugin. Most of the time these publications are jar files, but this also supports the zip publication we have for integ tests.	2020-05-14 20:23:07 -07:00
Mark Vieira	0fd756d511	Enforce strict license distribution requirements (#56642 )	2020-05-14 13:57:56 -07:00
Armin Braun	14a042fbe5	Make No. of Transport Threads == Available CPUs (#56488 ) (#56780 ) We never do any file IO or other blocking work on the transport threads so no tangible benefit can be derived from using more threads than CPUs for IO. There are however significant downsides to using more threads than necessary with Netty in particular. Since we use the default setting for `io.netty.allocator.useCacheForAllThreads` which is `true` we end up using up to `16MB` of thread local buffer cache for each transport thread. Meaning we potentially waste CPUs * 16MB of heap for unnecessary IO threads in addition to obvious inefficiencies of artificially adding extra context switches.	2020-05-14 21:33:46 +02:00
Mark Tozzi	b718193a01	Clean up DocValuesIndexFieldData (#56372 ) (#56684 )	2020-05-14 12:42:37 -04:00
Francisco Fernández Castaño	97bf47f5b9	Track GET/LIST GoogleCloudStorage API calls (#56758 ) Backporting #56585 to 7.x branch. Adds tracking for the API calls performed by the GoogleCloudStorage underlying SDK. It hooks an HttpResponseInterceptor to the SDK transport layer and does http request filtering based on the URI paths that we are interested to track. Unfortunately we cannot hook a wrapper into the ServiceRPC interface since we're using different levels of abstraction to implement retries during reads (GoogleCloudStorageRetryingInputStream).	2020-05-14 14:03:21 +02:00
Nik Everett	b98b260048	Merge significant_terms into the terms package (backport of #56699 ) (#56715 ) This merges the code for the `significant_terms` agg into the package for the code for the `terms` agg. They are super entangled already, this mostly just admits that to ourselves. Precondition for the terms work in #56487	2020-05-13 17:36:21 -04:00
Ignacio Vera	b4521d5183	upgrade to Lucene 8.6.0 snapshot (#56661 )	2020-05-13 14:25:16 +02:00
Armin Braun	0a879b95d1	Save Bounds Checks in BytesReference (#56577 ) (#56621 ) Two spots that allow for some optimization: * We are often creating a composite reference of just a single item in the transport layer => special cased via static constructor to make sure we never do that * Also removed the pointless case of an empty composite bytes ref * `ByteBufferReference` is practically always created from a heap buffer these days so there is no point in dealing with all the bounds checks and extra references to sliced buffers from that and we can just use the underlying array directly	2020-05-12 20:33:45 +02:00
David Turner	8f4af292a7	Hide c.a.a.p.i.BasicProfileConfigFileLoader noise (#56346 ) A recent AWS SDK upgrade has introduced a new source of spurious `WARN` logs when the security manager prevents access to the user's home directory and therefore to their shared client configuration. This is actually the behaviour we want, and it's harmless and handled by the SDK as if the profile config doesn't exist, so this log message is unnecessary noise. This commit suppresses this noisy logging by default. Relates #20313 Closes #56333	2020-05-07 17:00:58 +01:00
Armin Braun	60b6d4eddc	Increase Timeout in S3 Cooldown Test (#56267 ) (#56323 ) Moving from `5s` to `10s` here because of #56095. This adds `10s` to the overall runtime of the test which should be a reasonable tradeoff for stability. Closes #56095	2020-05-07 11:23:07 +02:00
Jason Tedor	33669c0420	Upgrade to Jackson 2.10.4 (#56188 ) Another Jackson release is available. There are some CVEs addressed, none of which impact us, but since we can now bump Jackson easily, let us move along with the train to avoid the false positives from security scanners.	2020-05-06 17:20:23 -04:00
Julie Tibshirani	e852bb29b7	Simplify signature of FieldMapper#parseCreateField. (#56144 ) `FieldMapper#parseCreateField` accepts the parse context, plus a list of fields as an output parameter. These fields are immediately added to the document through `ParseContext#doc()`. This commit simplifies the signature by removing the list of fields, and having the mappers add the fields directly to `ParseContext#doc()`. I think this is nicer for implementors, because previously fields could be added either through the list, or the context (through `add`, `addWithKey`, etc.)	2020-05-06 11:12:09 -07:00
Tim Brooks	6a51017cb2	Upgrade netty to 4.1.49.Final (#56059 )	2020-05-05 10:40:23 -06:00
Armin Braun	3a64ecb6bf	Allow Deleting Multiple Snapshots at Once (#55474 ) (#56083 ) * Allow Deleting Multiple Snapshots at Once (#55474) Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise. This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.	2020-05-03 20:30:58 +02:00
Tim Brooks	80662f31a1	Introduce mechanism to stub request handling (#55832 ) Currently there is a clear mechanism to stub sending a request through the transport. However, this is limited to testing exceptions on the sender side. This commit reworks our transport related testing infrastructure to allow stubbing request handling on the receiving side.	2020-04-27 16:57:15 -06:00
Rory Hunter	d66af46724	Always use deprecateAndMaybeLog for deprecation warnings (#55319 ) Backport of #55115. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages can potentially be deduplicated.	2020-04-23 09:20:54 +01:00
Armin Braun	db7eb8e8ff	Remove Redundant CS Update on Snapshot Finalization (#55276 ) (#55528 ) This change folds the removal of the in-progress snapshot entry into setting the safe repository generation. Outside of removing an unnecessary cluster state update, this also has the advantage of removing a somewhat inconsistent cluster state where the safe repository generation points at `RepositoryData` that contains a finished snapshot while it is still in-progress in the cluster state, making it easier to reason about the state machine of upcoming concurrent snapshot operations.	2020-04-21 15:33:17 +02:00
Yannick Welsch	ba39c261e8	Use streaming reads for GCS (#55506 ) To read from GCS repositories we're currently using Google SDK's official BlobReadChannel, which issues a new request every 2MB (default chunk size for BlobReadChannel) using range requests, and fully downloads the chunk before exposing it to the returned InputStream. This means that the SDK issues an awfully high number of requests to download large blobs. Increasing the chunk size is not an option, as that will mean that an awfully high amount of heap memory will be consumed by the download process. The Google SDK does not provide the right abstractions for a streaming download. This PR uses the lower-level primitives of the SDK to implement a streaming download, similar to what S3's SDK does. Also closes #55505	2020-04-21 13:22:26 +02:00
Ignacio Vera	4783f1894c	mute test testReadRangeBlobWithRetries (#55507 ) (#55508 )	2020-04-21 10:59:35 +02:00
Yannick Welsch	b9da307cd1	Add GCS support for searchable snapshots (#55403 ) Adds ranged read support for GCS repositories in order to enable searchable snapshot support for GCS. As part of this PR, I've extracted some of the test infrastructure to make sure that GoogleCloudStorageBlobContainerRetriesTests and S3BlobContainerRetriesTests are covering similar tests (as I saw those diverging in what they cover)	2020-04-20 13:02:59 +02:00
Armin Braun	5550d8f3f6	Fix Path Style Access Setting Priority (#55439 ) (#55444 ) * Fix Path Style Access Setting Priority Fixing obvious bug in handling path style access if it's the only setting overridden by the repository settings. Closes #55407	2020-04-20 11:47:41 +02:00
Jason Tedor	0a1b566c65	Fix security manager bug writing large blobs to GCS (#55421 ) * Fix security manager bug writing large blobs to GCS This commit addresses a security manager permissions issue writing large blobs (on the resumable upload path) to GCS. The underlying issue here is that we need to wrap the close and write calls on the channel. It is not enough to do this: SocketAccess.doPrivilegedVoidIOException( () -> Streams.copy( inputStream, Channels.newOutputStream(client().writer(blobInfo, writeOptions)))); The reason that this is not enough is that Streams#copy will be in the stacktrace and it is not granted the security manager permissions needed to close or write this channel. We only grant those permissions to classes loaded in the plugin classloader, and Streams#copy is from the parent classloader. This is why we must wrap the close and write calls as privileged, to truncate the Streams#copy call out of the stacktrace. The reason that this issue is not caught in testing is that the size of data that we use in testing is too small to trigger the large blob resumable upload path. Therefore, we address this by adding a system property to control the threshold, which we can then set in tests to exercise this code path. Prior to rewriting the writeBlobResumable method to wrap the close and write calls as privileged, with this additional test, we are able to reproduce the security manager permissions issue. After adding the wrapping, this test now passes. * Fix forbidden APIs issue * Remove leftover debugging	2020-04-17 18:49:10 -04:00
William Brafford	49e30b15a2	Deprecate disabling basic-license features (#54816 ) (#55405 ) We believe there's no longer a need to be able to disable basic-license features completely using the "xpack..enabled" settings. If users don't want to use those features, they simply don't need to use them. Having such features always available lets us build more complex features that assume basic-license features are present. This commit deprecates settings of the form "xpack..enabled" for basic-license features, excluding "security", which is a special case. It also removes deprecated settings from integration tests and unit tests where they're not directly relevant; e.g. monitoring and ILM are no longer disabled in many integration tests.	2020-04-17 15:04:17 -04:00
Armin Braun	73ab3719e8	Mute GCS Retry Tests on JDK8 (#55372 ) Same as #53119 but for the retries tests. Closes #55317	2020-04-17 12:19:35 +02:00
William Brafford	2ba3be9db6	Remove deprecated third-party methods from tests (#55255 ) (#55269 ) I've noticed that a lot of our tests are using deprecated static methods from the Hamcrest matchers. While this is not a big deal in any objective sense, it seems like a small good thing to reduce compilation warnings and be ready for a new release of the matcher library if we need to upgrade. I've also switched a few other methods in tests that have drop-in replacements.	2020-04-15 17:54:47 -04:00
Ryan Ernst	29b70733ae	Use task avoidance with forbidden apis (#55034 ) Currently forbidden apis accounts for 800+ tasks in the build. These tasks are aggressively created by the plugin. In forbidden apis 3.0, we will get task avoidance (https://github.com/policeman-tools/forbidden-apis/pull/162), but we need to use the same task avoidance mechanisms ourselves to not trigger these task creations. This commit does that for our forbidden apis usages, in preparation for upgrading to 3.0 when it is released.	2020-04-15 13:27:53 -07:00
Ignacio Vera	a677b63daa	Upgrade to lucene 8.5.1 release (#55229 ) (#55235 ) Upgrade to lucene 8.5.1 release that contains a bug fix for a bug that might introduce index corruption when deleting data from an index that was previously shrunk.	2020-04-15 17:35:42 +02:00
Armin Braun	2f91e2aab7	Fix Race in Snapshot Abort (#54873 ) (#55233 ) We can be a little more efficient when aborting a snapshot. Since we know the new repository data after finalizing the aborted snapshot we can pass it down to the snapshot completion listeners. This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.	2020-04-15 15:42:15 +02:00
Mark Vieira	ce85063653	[7.x] Re-add origin url information to publish POM files (#55173 )	2020-04-14 13:24:15 -07:00
Yannick Welsch	a610513ec7	Provide repository-level stats for searchable snapshots (#55051 ) Provides basic repository-level stats that will allow us to get some insight into how many requests are actually being made by the underlying SDK. Currently only tracks GET and LIST calls for S3 repositories. Most of the code is unfortunately boiler plate to add a new endpoint that will help us better understand some of the low-level dynamics of searchable snapshots.	2020-04-14 14:34:08 +02:00
Jake Landis	a2fafa6af4	[7.x] Lazy test cluster module and plugins (#54852 ) (#55087 ) This change converts the module and plugin parameters for testClusters to be lazy. Meaning that the values are not resolved until they are actually used. This removes the requirement to use project.afterEvaluate to be able to resolve the bundle artifact. Note - this does not completely remove the need for afterEvaluate since it is still needed for the custom resource extension.	2020-04-13 10:53:35 -05:00
Jason Tedor	9eeae59a83	Clarify available processors (#54907 ) The use of available processors, the terminology, and the settings around it have evolved over time. This commit cleans up some places in the codes and in the docs to adjust to the current terminology.	2020-04-10 08:48:27 -04:00
Armin Braun	f6bdd30165	Fix S3 Blob Container Retries Test Range Handling (#55000 ) (#55002 ) The ranges in HTTP headers use inclusive values for the start and end of the range. The math we used was off insofar as a range whose start equals its end resulted in a length of `0` instead of the correct value of `1`. Closes #54981 Closes #54995	2020-04-09 10:58:42 +02:00
Mark Vieira	ac6d1f7b24	Mute S3BlobContainerRetriesTests.testReadRangeBlobWithRetries	2020-04-08 16:45:38 -07:00

2879 Commits