Commit Graph

2779 Commits

Author SHA1 Message Date
Vacha d151082832
Upgrade hadoop dependencies for hdfs plugin (#1335)
* Upgrade hadoop dependencies for hdfs plugin

Signed-off-by: Vacha <vachshah@amazon.com>

* Fixing gradle check failures

Signed-off-by: Vacha <vachshah@amazon.com>

* Upgrading htrace-core4 to 4.1.0

Signed-off-by: Vacha <vachshah@amazon.com>
2021-10-14 14:43:49 -04:00
Andriy Redko 3779576c51
Modernize and consolidate JDKs usage across all stages of the build. Use JDK-17 as bundled JDK distribution to run tests (#1358)
* Modernize and consolidate JDKs usage across all stages of the build. Use JDK-17 as bundled JDK distribution to run tests

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Using -Djava.security.egd=file:/dev/urandom explicitly for cli tests

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
2021-10-13 17:25:48 -04:00
Andriy Redko cdbc84f09d
Update Jackson to 2.12.5 (#1247)
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
2021-09-21 18:33:20 -04:00
Andriy Redko b6c8bdf872
Drop mocksocket in favour of custom security manager checks (tests only) (#1205)
* Drop mocksocket in favour of custom security manager checks (tests only)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Slightly relaxed host checks to allow all local addresses

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
2021-09-16 17:21:47 -04:00
Abbas Hussain fa8126004c
Upgrade apache commons-compress to 1.21 (#1197)
Signed-off-by: Abbas Hussain <abbas_10690@yahoo.com>
2021-09-02 08:35:42 +05:30
Nick Knize 5ae00456a0
Upgrade to Lucene 8.9 (#1080)
This commit upgrades to the official lucene 8.9 release

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-08-20 11:28:06 -05:00
Nick Knize ff7e7904ca
[DEPRECATE] SimpleFS in favor of NIOFS (#1073)
Lucene 9 removes support for SimpleFS File System format. This commit deprecates
the SimpleFS format in favor of NIOFS.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-08-19 17:56:55 -05:00
Sven R dcd9cef56c
alt bash path support (#1047)
Signed-off-by: hackacad <admin@hackacad.net>
2021-08-06 11:09:29 -04:00
Vacha c7617b03e8
Replacing docs-beta links with /docs (#957)
Signed-off-by: Vacha Shah <vachshah@amazon.com>
2021-07-13 07:46:05 -07:00
Vacha e17ce53eb7
Adding broken links checker (#877)
* Adding broken links checker

Signed-off-by: Vacha Shah <vachshah@amazon.com>

* Adding exclusions for links

Signed-off-by: Vacha Shah <vachshah@amazon.com>

* Correcting broken link

Signed-off-by: Vacha Shah <vachshah@amazon.com>

* Removing the benchmarks link

Signed-off-by: Vacha Shah <vachshah@amazon.com>
2021-07-12 14:07:56 -07:00
Tianli Feng 18625952a9
update external library 'pdfbox' version to 2.0.24 to reduce vulnerability (#883) 2021-06-25 13:18:15 -07:00
Abbas Hussain 3e92821c82
[CVE] Upgrade dependencies for Azure related plugins to mitigate CVEs (#688)
* Update commons-io-2.4.jar to 2.7 for plugins/discovery-azure-classic module
* Remove unused jackson dependency and respective LICENSE and NOTICE
* Update guava dependency to mitigate CVE for repository-azure plugin

Signed-off-by: Abbas Hussain <abbas_10690@yahoo.com>
2021-05-26 03:27:36 +05:30
Rabi Panda 50abf6d066
[CVE] Upgrade dependencies to mitigate CVEs (#657)
This PR upgrades the following dependencies to fix CVEs.

- commons-codec:1.12 (->1.13) apache/commons-codec@48b6157
- ant:1.10.8 (->1.10.9) https://ant.apache.org/security.html
- jackson-databind:2.10.4 (->2.11.0) FasterXML/jackson-databind#2589
- jackson-dataformat-cbor:2.10.4 (->2.11.0) https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-28491
- apache-httpclient:4.5.10 (->4.5.13) https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2020-13956
- checkstyle:8.20 (->8.29) https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-10782
- junit:4.12 (->4.13.1) https://github.com/junit-team/junit4/security/advisories/GHSA-269g-pwp5-87pp
- netty:4.1.49.Final (->4.1.59) https://github.com/netty/netty/security/advisories/GHSA-5mcr-gq6c-3hq2

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-05-18 11:37:24 -07:00
Rabi Panda 943c778a7f
[CVE-2018-11765] Upgrade hadoop dependencies for hdfs plugin (#654)
Hadoop 2.8.5 has been reported to have CVEs (https://bugzilla.redhat.com/show_bug.cgi?id=1883549). We need to upgrade this to 2.10.1. This also updates the hadoop-minicluster version to 2.10.1. The upgrade brings in two additional dependencies, woodstox-core and stax2-api, which are added along with their sha1s, licenses and notices.

Also upgrade guava to the latest as per the CVE https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8908

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-05-13 14:56:47 -07:00
Rabi Panda 6550e099b3
[CVE-2020-7692] Upgrade google-oauth clients for google cloud plugins (#662)
For discovery-gce and repository-gcs plugins update the google-oauth-client library to version 1.31.0. See CVE details at https://nvd.nist.gov/vuln/detail/CVE-2020-7692

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-05-13 12:19:57 -07:00
Rabi Panda 0e180f4703
Update dependencies for ingest-attachment plugin. (#666)
This PR resolves the CVEs for dependencies in the ingest-attachment plugin.

tika : '1.24' -> '1.24.1' (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-9489)
pdfbox : '2.0.19' -> '2.0.23' (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-27807)
commons-io:commons-io : '2.6' -> '2.7' (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-29425)

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-05-11 10:40:33 -07:00
Nick Knize c5a3c3cb41
Update lucene version to 8.8.2 (#557)
This commit updates the codebase to the latest released version of Lucene.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-04-23 09:48:41 -05:00
Rabi Panda 3fede8be3c
Rename the distribution used in test clusters. (#603)
For test clusters, we are using the archive(zip, tar), so we rename the distribution accordingly.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-04-22 14:21:32 -07:00
Nick Knize 0ba0e7cc26
[Versioning] Rebase to OpenSearch version 1.0.0 (#555)
This commit rebases the versioning to OpenSearch 1.0.0

Co-authored-by: Rabi Panda <adnapibar@gmail.com>

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-04-15 17:06:47 -05:00
Nick Knize ee6d15e26a
[License] Add SPDX License Header to security policies (#531)
This commit adds the SPDX license header and modifications copyright to security
policy files.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-04-12 22:59:36 -05:00
Rabi Panda 8727afbcd3
Use the correct domain to fix failing integration tests. (#519)
This commit fixes a renaming issue (opensearch.co -> opensearch.org) which was causing a few integration test failures.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-04-10 09:42:39 -07:00
Rabi Panda 2a3ce0bb75
Fix rename issues and failing repository-hdfs tests. (#518)
This commit fixes some partial rename issues and as a result fixes the failing secure repository-hdfs tests.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-04-09 17:51:27 -07:00
Nick Knize 9168f1fb43
[License] Add SPDX and OpenSearch Modification license header (#509)
This commit adds the SPDX Apache-2.0 license header along with an additional
copyright header for all modifications.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-04-09 14:28:18 -05:00
Rabi Panda 2dca3462f2
Fix stragglers from renaming to OpenSearch work. (#483)
This commit fixes more instances where we missed renaming to OpenSearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-04-05 11:51:20 -07:00
Harold Wang 5971a518d0
Replace nio and netty test endpoint (#475)
Signed-off-by: Harold Wang <harowang@amazon.com>
2021-03-31 13:37:22 -07:00
Harold Wang fd4c3968ab
[Rename] org.opensearch.ingest.attachment.IngestAttachmentClientYamlTestSuiteIT (#463)
* Change "Test elasticsearch" back

* Update content, language and size of test attachment

* Regenerate test attachment content with updated date and author

Signed-off-by: Harold Wang <harowang@amazon.com>
2021-03-26 21:59:23 -07:00
Rabi Panda 3460a8c213
Fix a few more renaming issues. (#464)
This commit fixes some more missed instances where we can perform the renaming to OpenSearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-26 12:05:16 -07:00
Rabi Panda 0bdd1293c1
Use alternate example data in OpenSearch test cases. (#454)
This commit updates some of the sample test data used in test cases in OpenSearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-25 08:52:07 -07:00
Rabi Panda 2e3055c9e2
Fix more failing tests as a result of renaming (#457)
This commit fixes some more renaming issues and as a result fixes the following failing tests:

* :qa:logging-config:test 
* :example-plugins:painless-whitelist:yamlRestTest
* :modules:reindex:test

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-24 09:33:05 -07:00
Rabi Panda 8469519413 Fix Checkstyle issues.
Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda 8bba6603da [Rename] Replace more instances of Elasticsearch with OpenSearch. (#432)
This commit replaces more replaceable instances of Elasticsearch with OpenSearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Nick Knize 7051167c83 [Rename] remaining elasticsearch pass 1 (#416)
This commit refactors instances of 'elasticsearch' with opensearch everywhere
except references to issues, and other places needed to test compatibility with
old elasticsearch clusters.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-03-21 20:56:34 -05:00
Rabi Panda 597b52992d [Rename] File names replace elasticsearch with opensearch. (#419)
This commit renames several files that contain the name elasticsearch, replacing it with opensearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda eddfe6760d [Rename] Fix issues for gradle precommit task. (#418)
Fix miscellaneous issues identified during `gradle precommit`. These issues are the side effects of the renaming to OpenSearch work.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda df11cc9de4 [Rename] Fix gradle build as part of the renaming process. (#397)
This commit fixes the currently broken gradle build that resulted from the renaming work. It reverts a few dependencies and comments out the `opensearch_distibutions` task which is currently failing for some builds. We will address these separately in the future once we have a working build.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda 13f6d23e40 [Rename] Property and metadata keys with prefix es. (#389)
Rename all property and metadata keys with prefix 'es.' to 'opensearch.'.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Nick Knize 5b46a05702 [Rename] remaining packages and resources in test/fixture (#364)
This commit refactors the remaining o.e.index and o.e.test packages in the
test/fixtures module. References throughout the codebase are also refactored.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-03-21 20:56:34 -05:00
Harold Wang 82f9ff93cb [Rename] plugins (#193)
* [Rename] plugins (#193)

This PR refactors files under "plugins" folders part of the Elasticsearch to OpenSearch renaming effort.

Signed-off-by: Harold Wang <harowang@amazon.com>
2021-03-21 20:56:34 -05:00
Nick Knize 923ea001f5 [Rename] o.e.action.support classes (#253)
This commit refactors the classes in o.e.action.support to
o.opensearch.action.support. The remaining directories will be refactored in a
separate commit.

Signed-off-by: Nicholas Knize <nknize@amazon.com>
2021-03-21 20:56:34 -05:00
Rabi Panda 991b3650b6 [Rename] refactor server/snapshots package. (#251)
Refactor `server/snapshots` to rename the package names from `org.elasticsearch.snapshots` to `org.opensearch.snapshots` as part of the rename to OpenSearch work.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda 584efd7970 [Rename] modules/lang-painless (#210)
Refactor lang-painless module as part of the rename to OpenSearch work.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda 3eee5183d1 [Rename] server/rest (#229)
This commit refactors the `server/rest` package as part of the Elasticsearch to OpenSearch renaming.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Nick Knize 8aa818e93e [Rename] refactor o.e.action.admin.cluster (#207)
This commit refactors all classes in o.e.action.admin.cluster to 
org.opensearch.action.admin.cluster. References are updated 
throughout the codebase.

Signed-off-by: Nicholas Knize <nknize@amazon.com>
2021-03-21 20:56:34 -05:00
Nick Knize 1203aa7302 [Rename] refactor o.e.action classes (#203)
This commit refactors top level classes in o.e.action to o.opensearch.action.
References throughout the rest of the codebase have been updated.

Signed-off-by: Nicholas Knize <nknize@amazon.com>
2021-03-21 20:56:34 -05:00
Nick Knize 0c81a5cf65 [Rename] refactor o.e.action.admin.indices (#209)
This commit refactors o.e.action.admin.indices package to
o.opensearch.action.admin.indices. References through out the codebase have been
updated to reflect the new package location.

Signed-off-by: Nicholas Knize <nknize@amazon.com>
2021-03-21 20:56:34 -05:00
Nick Knize 2aa9906c42 [Rename] ElasticsearchParseException class in server module (#169)
This commit refactors ElasticsearchParseException class in the server module to
OpenSearchParseException. References and usages throughout the rest of the
codebase are fully refactored.

Signed-off-by: Nicholas Knize <nknize@amazon.com>
2021-03-21 20:56:34 -05:00
Nick Knize ccceb381db [Rename] ElasticsearchException class in server module (#165)
This commit refactors the ElasticsearchException class located in the server module
to OpenSearchException. References and usages throughout the rest of the
codebase are fully refactored.

Signed-off-by: Nicholas Knize <nknize@amazon.com>
2021-03-21 20:56:34 -05:00
Rabi Panda 38e9c9750a [PURIFY] Remove the AuthorizationEnginePlugin from examples. (#26)
Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:09 -06:00
Rabi Panda c856534394 [PURIFY] Remove remaining x-pack license. (#25)
Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:09 -06:00
Nick Knize a0b91cb230 Cleanup build script to exclude security-authorization-engine (#8) (#8)
* Cleanup build-scan, remove publish scan to elastic server

* Cleanup build script to exclude security-authorization-engine which test has dependency on xpack

Co-authored-by: Huan Jiang <huanji@amazon.com>
Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:06 -06:00
Nick Knize 3a52e9ddc1 [PURIFY] update build.gradle files to ensure build completes; gradle check fails (#7)
Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:06 -06:00
Alan Woodward fb84b6710d
Restore use of default search and search_quote analyzers (#65491) (#65562)
In the refactoring of TextFieldMapper, we lost the ability to define
a default search or search_quote analyzer in index settings. This
commit restores that ability, and adds some more comprehensive
testing.

Fixes #65434
2020-11-26 18:34:59 +00:00
Armin Braun 51e9d6f227
Revert Serializing Outbound Transport Messages on IO Threads (#64632) (#64654)
Serializing outbound transport message on the IO loop was introduced in https://github.com/elastic/elasticsearch/pull/56961. Unfortunately it turns out that this is incompatible with assumptions made by CCR code here: f22ddf822e/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/repositories/GetCcrRestoreFileChunkAction.java (L60-L61) and that are not easy to work around on short notice.

Raising the revert of this move as a blocker (as a temporary solution; it is still a valuable change long-term), because it seriously affects the stability of the initial phase of CCR following by causing corrupted bytes to be sent to the follower.
2020-11-05 16:29:12 +01:00
Ignacio Vera 4851bc7bae
Upgrade to Lucene-8.7.0 (#64532) (#64537) 2020-11-03 16:57:04 +01:00
Ignacio Vera d0f5066310
Upgrade to lucene-8.7.0-snapshot-72d8528c3a6 (#63912) (#63928) (#63933) 2020-10-20 15:08:06 +02:00
Julie Tibshirani ae2fc4118d Add factory methods for common value fetchers. (#63438)
This PR adds factory methods for the most common implementations:
* `SourceValueFetcher.identity` to pass through the source value untouched.
* `SourceValueFetcher.toString` to simply convert the source value to a string.
2020-10-08 12:14:53 -07:00
Mayya Sharipova e022b78198
Upgrade to lucene-8.7.0-snapshot-5c4168d (#63466)
This disables sort optim on _doc, which may still be unstable.
Backport for #63444
2020-10-08 08:20:43 -04:00
Mayya Sharipova e236ea43e9 Upgrade to lucene-8.7.0-snapshot-e914862 (#63401)
Backport for: #63395
2020-10-07 09:45:14 -04:00
Alan Woodward 88b45dfa61
Convert TextFieldMapper to parametrized form (#63269) (#63392)
As a result of this, we can remove a chunk of code from TypeParsers as well. Tests
for search/index mode analyzers have moved into their own file. This commit also
rationalises the serialization checks for parameters into a single SerializerCheck
interface that takes the values includeDefaults, isConfigured and the value
itself.

Relates to #62988
2020-10-07 13:26:25 +01:00
Mayya Sharipova f2ba62b894
Upgrade to lucene-8.7.0-snapshot-66c49a35402 (#63372)
This includes fixing a bug in doc iteration during sort optimization

Backport for #63349
2020-10-06 22:38:58 -04:00
Julie Tibshirani f17ca18dfa
Make array value parsing flag more robust. (#63371)
When constructing a value fetcher, the 'parsesArrayValue' flag must match
`FieldMapper#parsesArrayValue`. However there is nothing in code or tests to
help enforce this.

This PR reworks the value fetcher constructors so that `parsesArrayValue` is
'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must
explicitly set it to true and ensure the behavior is covered by tests.

Follow-up to #62974.
2020-10-06 17:49:25 -07:00
Nhat Nguyen 1a6837883a Upgrade to Lucene-8.7.0-snapshot-77396dbf339 (#63222)
Includes LUCENE-9554, which exposes the pendingNumDocs from IndexWriter.
2020-10-05 14:39:30 -04:00
Rene Groeschke f58ebe58ee
Use services for archive and file operations in tasks (#62968) (#63201)
Referencing a project instance during task execution is discouraged by
Gradle and should be avoided. For example, it is incompatible with Gradle's
incubating configuration cache. Instead there are services available to handle
archive and filesystem operations in task actions.

Brings us one step closer to #57918
2020-10-05 15:52:15 +02:00
Alan Woodward 01950bc80f
Move FieldMapper#valueFetcher to MappedFieldType (#62974) (#63220)
For runtime fields, we will want to do all search-time interaction with
a field definition via a MappedFieldType, rather than a FieldMapper, to
avoid interfering with the logic of document parsing. Currently, fetching
values for runtime scripts and for building top hits responses need to
call a method on FieldMapper. This commit moves this method to
MappedFieldType, incidentally simplifying the current call sites and freeing
us up to implement runtime fields as pure MappedFieldType objects.
2020-10-04 14:54:59 +01:00
Alan Woodward de08ba58bf Convert percolator, murmur3 and histogram mappers to parametrized form (#63004)
Relates to #62988
2020-09-29 14:42:26 +01:00
Mayya Sharipova 4c8c3c8df6
Upgrade lucene to lucene-8.7.0-snapshot-3b59906 (#62978)
Backport for #62970
2020-09-28 16:52:31 -04:00
Tim Brooks 59dd889c10
Split up large HTTP responses in outbound pipeline (#62666)
Currently Netty will batch compress an entire HTTP response
regardless of its content size. It allocates a byte array at least of
the same size as the uncompressed content. This causes issues with our
attempts to remove humongous G1GC allocations. This commit resolves the
issue by splitting responses into 128KB chunks.

This has the side-effect of making large outbound HTTP responses that
are compressed be sent with chunked transfer-encoding.
2020-09-24 16:35:52 -06:00
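The chunking idea in the "Split up large HTTP responses" entry above can be illustrated with a minimal sketch. This is not the Netty pipeline code from the commit; `split_response` and `CHUNK_SIZE` are hypothetical names, and the sketch only shows how a large body can be handed on in slices of at most 128KB so that no single buffer has to hold the whole payload.

```python
from typing import Iterator

# 128KB, the chunk size named in the commit message (the constant name is an assumption).
CHUNK_SIZE = 128 * 1024


def split_response(body: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of `body`, each at most `chunk_size` bytes long."""
    for start in range(0, len(body), chunk_size):
        yield body[start:start + chunk_size]


if __name__ == "__main__":
    # A 300KB body is emitted as two full chunks plus a shorter tail.
    body = b"x" * (300 * 1024)
    for chunk in split_response(body):
        print(len(chunk))
```

Each slice could then be compressed and written independently, which is what lets the response go out as a series of chunks rather than one large allocation.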
Tim Brooks 43a4882951
Move CorsHandler to server (#62007)
Currently we duplicate our specialized cors logic in all transport
plugins. This is unnecessary as it could be implemented in a single
place. This commit moves the logic to server. Additionally it fixes a
bug where we are incorrectly closing http channels on early Cors
responses.
2020-09-24 16:32:59 -06:00
Alan Woodward e28750b001
Add parameter update and conflict tests to MapperTestCase (#62828) (#62902)
This commit adds a mechanism to MapperTestCase that allows implementing
test classes to check that their parameters can be updated, or throw conflict
errors as advertised. Child classes override the registerParameters method
and tell the passed-in UpdateChecker class about their parameters. Simple
conflicts can be checked, using the existing minimal mappings as a base to
compare against, or alternatively a particular initial mapping can be provided
to check edge cases (eg, norms can be updated from true to false, but not
vice versa). Updates are registered with a predicate that checks that the update
has in fact been applied to the resulting FieldMapper.

Fixes #61631
2020-09-24 20:38:12 +01:00
Armin Braun 83ec8dd4e2
Upgrade GCS SDK to 1.113.1 (#62848) (#62864)
Just staying on top of upgrades to the SDK and its dependencies.
2020-09-24 15:43:21 +02:00
Luca Cavanna 862fab06d3
Share same existsQuery impl throughout mappers (#57607)
Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers.

There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available.

This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method.

At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.
2020-09-23 11:00:53 +02:00
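The fallback order described in the "Share same existsQuery impl" entry above can be sketched as a small decision function. This is an illustration only, not the `MappedFieldType` code; `exists_query_kind` and its string return values are hypothetical stand-ins for the actual query objects.

```python
def exists_query_kind(has_doc_values: bool, has_norms: bool) -> str:
    """Pick the data structure an exists query would rely on.

    Order follows the commit message: doc_values if present,
    otherwise norms if available, otherwise the _field_names meta field.
    """
    if has_doc_values:
        return "doc_values"
    if has_norms:
        return "norms"
    return "_field_names"


if __name__ == "__main__":
    print(exists_query_kind(has_doc_values=True, has_norms=True))
    print(exists_query_kind(has_doc_values=False, has_norms=True))
    print(exists_query_kind(has_doc_values=False, has_norms=False))
```

Putting the decision in one place is the point of the change: a field type only needs its own override when it deviates from this order.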
Luca Cavanna 5ca86d541c
Move stored flag from TextSearchInfo to MappedFieldType (#62717) (#62770) 2020-09-23 09:40:34 +02:00
markharwood a0df0fb074
Search - add case insensitive flag for "term" family of queries #61596 (#62661)
Backport of fe9145f

Closes #61546
2020-09-22 13:56:51 +01:00
Luca Cavanna 9ae29713fd
Dense vector field type minor fixes (#62631)
The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. Same for the sparse vector field (only in 7.x).

This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.
2020-09-22 10:40:51 +02:00
Christos Soulios 6a298970fd
[7.x] Allow metadata fields in the _source (#62616)
Backports #61590 to 7.x

    So far we don't allow metadata fields in the document _source. However, in the case of the _doc_count field mapper (#58339) we want to be able to set

    This PR adds a method to the metadata field parsers that exposes if the field can be included in the document source or not.
    This way each metadata field can configure if it can be included in the document _source
2020-09-18 19:56:41 +03:00
Adrien Grand 4de8579455
Upgrade to lucene-8.7.0-snapshot-830bd186a8d. (#62596) 2020-09-18 09:51:34 +02:00
David Turner 0a3f2c453f Hide c.a.s.s.i.UseArnRegionResolver noise (#62522)
A recent AWS SDK upgrade has introduced a new source of spurious `WARN`
logs when the security manager prevents access to the user's home
directory and therefore to `$HOME/.aws/config`. This is the behaviour we
want, and it's harmless and handled by the SDK as if the config doesn't
exist, so this log message is unnecessary noise.  This commit suppresses
this noisy logging by default.

Relates #20313, #56346, #53962
Closes #62493
2020-09-18 08:30:39 +01:00
Tanguy Leroux e6777810ba
Fix S3BlobContainerRetriesTests (#62464) (#62551)
The AssertingInputStream in S3BlobContainerRetriesTests verifies
that InputStreams are either fully consumed or aborted, but the
eof flag is only set when the underlying stream returns it.

When buffered reads are executed and the exact number
of remaining bytes is read, the eof flag is not set to true. Instead
the test should rely on the total number of bytes read to know if
the stream has been fully consumed.

Close #62390
2020-09-17 17:12:34 +02:00
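The pitfall described in the "Fix S3BlobContainerRetriesTests" entry above can be reproduced with a small sketch. This is not the test's Java `AssertingInputStream`; `CountingStream` is a hypothetical wrapper that tracks both an end-of-stream flag and a running byte count, to show why only the count is reliable when a read consumes exactly the remaining bytes.

```python
import io


class CountingStream:
    """Wrap an in-memory stream and record how much of it has been read."""

    def __init__(self, data: bytes) -> None:
        self._inner = io.BytesIO(data)
        self._length = len(data)
        self.bytes_read = 0
        self.saw_eof = False

    def read(self, n: int) -> bytes:
        chunk = self._inner.read(n)
        if n > 0 and not chunk:
            # Only set once the underlying stream actually signals end of stream.
            self.saw_eof = True
        self.bytes_read += len(chunk)
        return chunk

    def fully_consumed(self) -> bool:
        # Relies on the byte count, not on having observed end of stream.
        return self.bytes_read == self._length


if __name__ == "__main__":
    stream = CountingStream(b"abcdef")
    stream.read(6)  # reads exactly the remaining bytes
    print(stream.saw_eof, stream.fully_consumed())
```

After the single six-byte read the stream is exhausted, yet no read has returned an empty result, so a check based on the flag alone would wrongly report the stream as not fully consumed.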
Adrien Grand 9a8225bbc1
Upgrade to lucene-8.7.0-snapshot-9cd3af50f80. (#62450) (#62476)
This new snapshot contains the following JIRAs that we're interested in:
 - [LUCENE-9525](https://issues.apache.org/jira/browse/LUCENE-9525)
Better handling of small documents. This should improve retrieval times
when documents are less than ~1kB.
 - [LUCENE-9510](https://issues.apache.org/jira/browse/LUCENE-9510)
Faster flushes when index sorting is enabled by not compressing the
temporary files that store stored fields and term vectors.
2020-09-17 10:28:20 +02:00
Nik Everett 24a24d050a
Implement fields fetch for runtime fields (backport of #61995) (#62416)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 20:24:10 -04:00
Armin Braun 98f525f8a7
Faster Azure Blob InputStream (#61812) (#62387)
Building our own that should perform better than the one in the SDK.
Also, as a result saving a HEAD call for each ranged read on Azure.
2020-09-15 18:27:22 +02:00
Adrien Grand 6db8afefc2
Upgrade to lucene-8.7.0-snapshot-cdfdc1e0851. (#62376)
Upgrade to a new Lucene snapshot that (at least partially) addresses the
indexing rate regression when index sorting is enabled.

Backport of #62334.
2020-09-15 17:48:07 +02:00
Tanguy Leroux faf96c175e
Abort non-fully consumed S3 input stream (#62167) (#62370)
Today when an S3RetryingInputStream is closed the remaining bytes 
that were not consumed are drained right before closing the underlying 
stream. In some contexts it might be more efficient to not consume the 
remaining bytes and just drop the connection.

This is for example the case with snapshot backed indices prewarming,
where there is no point in reading potentially large blobs if we know
the cache file we want to write the content of the blob to has already
been evicted. Draining all bytes here takes a slot in the prewarming
thread pool for nothing.
2020-09-15 14:33:37 +02:00
Francisco Fernández Castaño 21303e8e15
Take into account sas tokens while metering put object requests on azure (#62244)
Backport of #62225
Closes #62208
2020-09-10 19:47:58 +02:00
Ignacio Vera c8981ea93d
upgrade to lucene-8.7.0-snapshot-b313618cc1d (#62213) (#62222) 2020-09-10 16:23:18 +02:00
Jake Landis d8dad9ab2c
[7.x] Remove integTest task from PluginBuildPlugin (#61879) (#62135)
This commit removes `integTest` task from all es-plugins.  
Most relevant projects have been converted to use yamlRestTest, javaRestTest, 
or internalClusterTest in prior PRs. 

A few projects needed to be adjusted to allow complete removal of this task
* x-pack/plugin - converted to use yamlRestTest and javaRestTest 
* plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task
* qa/die-with-dignity - convert to javaRestTest
* x-pack/qa/security-example-spi-extension - convert to javaRestTest
* multiple projects - remove the integTest.enabled = false (yay!)

related: #61802
related: #60630
related: #59444
related: #59089
related: #56841
related: #59939
related: #55896
2020-09-09 14:25:41 -05:00
Nik Everett b8e9a7125f
Speed up empty highlighting many fields (backport of #61860) (#62122)
Kibana often highlights *everything* like this:
```
POST /_search
{
  "query": ...,
  "size": 500,
  "highlight": {
    "fields": {
      "*": { ... }
    }
  }
}
```

This can get slow when there are hundreds of mapped fields. I tested
this locally and unscientifically and it took a request from 20ms to
150ms when there are 100 fields. I've seen clusters with 2000 fields
where simple search go from 500ms to 1500ms just by turning on this sort
of highlighting. Even when the query is just a `range` that and the
fields are all numbers and stuff so it won't highlight anything.

This speeds up the `unified` highlighter in this case in a few ways:
1. Build the highlighting infrastructure once field rather than once pre
   document per field. This cuts out a *ton* of work analyzing the query
   over and over and over again.
2. Bail out of the highlighter before loading values if we can't produce
   any results.

Combined these take that local 150ms case down to 65ms. This is unlikely
to be really useful when there are only a few fetched docs and only a
few fields, but we often end up having many fields with many fetched
docs.
2020-09-08 15:49:50 -04:00
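The two optimisations listed in the "Speed up empty highlighting many fields" entry above can be sketched roughly as follows. This is not the `unified` highlighter; `highlight_hits`, `build` and the tuple result shape are hypothetical. The sketch builds the per-field machinery once up front, and skips a field entirely (before touching document values) when the builder reports that it cannot produce results.

```python
from typing import Callable, Dict, List, Optional, Tuple

Highlighter = Callable[[str], str]


def highlight_hits(
    docs: List[Dict[str, str]],
    fields: List[str],
    build: Callable[[str], Optional[Highlighter]],
) -> List[Tuple[int, str, str]]:
    # 1. Build once per field, not once per document per field.
    per_field: Dict[str, Optional[Highlighter]] = {f: build(f) for f in fields}

    results: List[Tuple[int, str, str]] = []
    for doc_index, doc in enumerate(docs):
        for field in fields:
            highlighter = per_field[field]
            if highlighter is None:
                # 2. Bail out before loading values if nothing can be produced.
                continue
            value = doc.get(field)
            if value is not None:
                results.append((doc_index, field, highlighter(value)))
    return results


if __name__ == "__main__":
    def build(field: str) -> Optional[Highlighter]:
        # Assumption for the demo: only "title" can ever be highlighted.
        return (lambda v: "<em>" + v + "</em>") if field == "title" else None

    docs = [{"title": "a", "count": "1"}, {"title": "b", "count": "2"}]
    print(highlight_hits(docs, ["title", "count"], build))
```

With many fields and many fetched documents, the number of `build` calls grows with the field count only, which is the saving the commit message describes.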
Francisco Fernández Castaño 2bb5716b3d
Add repositories metering API (#62088)
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.

In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.

Backport of #60371
2020-09-08 14:01:04 +02:00
Ignacio Vera 31c026f25c
upgrade to Lucene-8.7.0-snapshot-61ea26a (#61957) (#61974) 2020-09-04 13:46:20 +02:00
Ryan Ernst d6e17170c3
Simplify adding plugins and modules to testclusters (#61886)
There are currently half a dozen ways to add plugins and modules for
test clusters to use. All of them require the calling project to peek
into the plugin or module they want to use to grab its bundlePlugin
task, and then both depend on that task, as well as extract the archive
path the task will produce. This creates cross project dependencies that
are difficult to detect, and if the dependent plugin/module has not yet
been configured, the build will fail because the task does not yet
exist.

This commit makes the plugin and module methods for testclusters
symmetric: callers simply add a file provider directly, or a project
path that will produce the plugin/module zip. Internally this new
variant uses normal configuration/dependencies across projects to get
the zip artifact. It also has the added benefit of no longer needing the
caller to add to the test task a dependsOn for bundlePlugin task.
2020-09-03 19:37:46 -07:00
Alan Woodward e2f006eeb4
Merge FetchSubPhase hitsExecute and hitExecute methods (#60907) (#61893)
FetchSubPhase has two 'execute' methods, one which takes all hits to be examined,
and one which takes a single HitContext. It's not obvious which one should be implemented
by a given sub-phase, or if implementing both is a possibility; nor is it obvious that we first
run the hitExecute methods of all subphases, and then subsequently call all the
hitsExecute methods.

This commit reworks FetchSubPhase to replace these two variants with a processor class,
`FetchSubPhaseProcessor`, that is returned from a single `getProcessor` method.  This
processor class has two methods, `setNextReader()` and `process`.  FetchPhase collects
processors from all its subphases (if a subphase does not need to execute on the current
search context, it can return `null` from `getProcessor`).  It then sorts its hits by docid, and
groups them by lucene leaf reader.  For each reader group, it calls `setNextReader()` on
all non-null processors, and then passes each doc id to `process()`.

Implementations of fetch sub phases can divide their concerns into per-request, per-reader
and per-document sections, and no longer need to worry about sorting docs or dealing with
reader slices.
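
The flow described above can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the Elasticsearch code: `Processor`, `Hit` and the integer `readerOrd` are hypothetical stand-ins for `FetchSubPhaseProcessor`, `HitContext` and Lucene's leaf reader context, and the sketch assumes that sorting by doc id also groups hits by reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class FetchFlowSketch {
    /** Hypothetical stand-in for a hit: a doc id plus the ordinal of the leaf reader holding it. */
    static final class Hit {
        final int docId;
        final int readerOrd;

        Hit(int docId, int readerOrd) {
            this.docId = docId;
            this.readerOrd = readerOrd;
        }
    }

    /** Hypothetical stand-in for FetchSubPhaseProcessor. */
    interface Processor {
        void setNextReader(int readerOrd);

        void process(Hit hit);
    }

    /** Sorts hits by doc id, then walks them reader by reader, skipping null processors. */
    static void fetch(List<Hit> hits, List<Processor> processors) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((Hit h) -> h.docId));
        int currentReader = -1;
        for (Hit hit : sorted) {
            if (hit.readerOrd != currentReader) {
                // First hit of a new reader group: tell every active processor.
                currentReader = hit.readerOrd;
                for (Processor p : processors) {
                    if (p != null) p.setNextReader(currentReader);
                }
            }
            for (Processor p : processors) {
                if (p != null) p.process(hit);
            }
        }
    }

    /** Runs fetch with one recording processor and one null (opted-out) processor. */
    static List<String> trace(List<Hit> hits) {
        final List<String> events = new ArrayList<>();
        Processor recorder = new Processor() {
            @Override
            public void setNextReader(int readerOrd) {
                events.add("reader " + readerOrd);
            }

            @Override
            public void process(Hit hit) {
                events.add("doc " + hit.docId);
            }
        };
        fetch(hits, Arrays.asList(recorder, null));
        return events;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit(12, 1), new Hit(3, 0), new Hit(10, 1));
        System.out.println(trace(hits));
    }
}
```

In this sketch the recording processor plays the role of a sub phase: per-reader state would be set up in `setNextReader`, and per-document work would happen in `process`.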

FetchSubPhase now provides a FetchSubPhaseExecutor that exposes two methods,
setNextReader(LeafReaderContext) and execute(HitContext). The parent FetchPhase collects all
these executors together (if a phase should not be executed, then it returns null here); then
it sorts hits, and groups them by reader; for each reader it calls setNextReader, and then
execute for each hit in turn. Individual sub phases no longer need to concern themselves with
sorting docs or keeping track of readers; global structures can be built in
getExecutor(SearchContext), per-reader structures in setNextReader and per-doc in execute.
2020-09-03 12:20:55 +01:00
Tim Brooks e573fa9abc
Add data.path fast path for FilePermission (#61302)
The recursive data.path FilePermission check is an extremely hot
codepath in Elasticsearch. Unfortunately the FilePermission check in
Java is extremely allocation heavy. As it iterates through different
file permissions, it allocates byte arrays for each Path component that
must be compared. This PR improves the situation by adding the recursive
data.path FilePermission to its own PermissionCollection object which
is checked first.
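
A minimal sketch of the fast-path idea, using only JDK classes. The class name `FastPathPermissionsSketch`, the `/data` path and the chosen actions are assumptions for illustration; the actual change in Elasticsearch's policy code is not shown in this message.

```java
import java.io.FilePermission;
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;

class FastPathPermissionsSketch {
    // Holds only the recursive data path permission, so the common case is a single comparison.
    private final PermissionCollection fastPath;
    // Every other granted permission (hypothetical: supplied by the caller).
    private final PermissionCollection everythingElse;

    FastPathPermissionsSketch(String dataPath, PermissionCollection everythingElse) {
        // "<dir>/-" is FilePermission's syntax for all files under dir, recursively.
        FilePermission recursive = new FilePermission(dataPath + "/-", "read,write,delete");
        this.fastPath = recursive.newPermissionCollection();
        this.fastPath.add(recursive);
        this.everythingElse = everythingElse;
    }

    /** Checks the small data path collection first and falls back to the full collection. */
    boolean implies(Permission permission) {
        if (permission instanceof FilePermission && fastPath.implies(permission)) {
            return true;
        }
        return everythingElse.implies(permission);
    }

    public static void main(String[] args) {
        FastPathPermissionsSketch policy = new FastPathPermissionsSketch("/data", new Permissions());
        System.out.println(policy.implies(new FilePermission("/data/nodes/0/segments_1", "read")));
        System.out.println(policy.implies(new FilePermission("/etc/passwd", "read")));
    }
}
```

The design point is ordering: the hottest check hits a one-element collection before the larger, allocation-heavy collection is consulted.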
2020-09-01 12:03:22 -06:00
Jason Tedor 64cd229b35
Upgrade to Lucene 8.6.2 (#61688)
This commit upgrades the Lucene dependencies to 8.6.2.
2020-08-31 09:54:07 -04:00
Armin Braun 0da20579ca
Cleanly Handle S3 SDK Exceptions in Request Counting (#61686) (#61698)
It looks like it is possible for a request to throw an exception early
before any API interaction has happened. This can lead to the request count
map containing a `null` for the request count key.
The assertion is not correct and we should not NPE here
(as that might also hide the original exception since we are running this code in
a `finally` block from within the S3 SDK).
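
The failure mode can be illustrated with a small sketch. All names here (`RequestCountSketch`, `REQUEST_COUNT`, the plain `Map` of metrics) are hypothetical and do not reflect the S3 SDK's API; the point is only that a null-tolerant read inside `finally` leaves the request's own exception intact.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class RequestCountSketch {
    /** Hypothetical metric key; the real key is not shown in the commit message. */
    static final String REQUEST_COUNT = "RequestCount";

    /** Treats a missing or null count as zero instead of unboxing it blindly. */
    static long requestCount(Map<String, Long> metrics) {
        Long count = metrics.get(REQUEST_COUNT);
        return count == null ? 0L : count;
    }

    /** Runs the request and always records its count, even when the request throws. */
    static void execute(Runnable request, Map<String, Long> metrics, AtomicLong total) {
        try {
            request.run();
        } finally {
            // A NullPointerException thrown here would replace the request's own exception.
            total.addAndGet(requestCount(metrics));
        }
    }

    public static void main(String[] args) {
        AtomicLong total = new AtomicLong();
        Map<String, Long> metrics = new HashMap<>(); // nothing recorded: the request failed early
        try {
            execute(() -> {
                throw new IllegalStateException("failed before any API call");
            }, metrics, total);
        } catch (IllegalStateException e) {
            System.out.println("original exception preserved: " + e.getMessage());
        }
        System.out.println("requests counted: " + total.get());
    }
}
```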

Closes #61670
2020-08-31 11:05:59 +02:00
Luca Cavanna f769821bc8
Pass SearchLookup supplier through to fielddataBuilder (#61430) (#61638)
Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not.

To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method.

As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors.

With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition.
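
A rough sketch of how cycle detection and a depth limit can be combined. Everything here is an assumption for illustration: `FieldReferenceSketch`, the `MAX_DEPTH` value of 5 and the map of field references are hypothetical, and the real check happens while fields are looked up rather than over a prebuilt graph.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FieldReferenceSketch {
    /** Hypothetical limit on how long a chain of field references may be. */
    static final int MAX_DEPTH = 5;

    /** Walks the references of a field, failing on a cycle or on a chain longer than MAX_DEPTH. */
    static void check(String field, Map<String, List<String>> references) {
        check(field, references, new LinkedHashSet<String>());
    }

    private static void check(String field, Map<String, List<String>> references, Set<String> chain) {
        // The chain holds the fields currently being resolved; seeing one again means a cycle.
        if (!chain.add(field)) {
            throw new IllegalArgumentException("Cyclic dependency detected: " + chain + " -> " + field);
        }
        if (chain.size() > MAX_DEPTH) {
            throw new IllegalArgumentException("Field reference chain exceeds depth " + MAX_DEPTH + ": " + chain);
        }
        List<String> next = references.get(field);
        if (next != null) {
            for (String referenced : next) {
                check(referenced, references, chain);
            }
        }
        // Done with this field: remove it so sibling branches may reference it without a false alarm.
        chain.remove(field);
    }

    public static void main(String[] args) {
        Map<String, List<String>> references = new HashMap<>();
        references.put("a", Collections.singletonList("b"));
        references.put("b", Collections.singletonList("a"));
        try {
            check("a", references);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Tracking the active chain in a set catches cycles directly, while the size check bounds legitimate but very long chains before they can exhaust the stack.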

Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch.

This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name) to also accepting a Supplier<SearchLookup>.

Relates to #59332

Co-authored-by: Nik Everett <nik9000@gmail.com>
2020-08-27 18:09:56 +02:00
Przemyslaw Gomulka 9f566644af
Do not create two loggers for DeprecationLogger backport(#58435) (#61530)
DeprecationLogger's constructor should not create two loggers. It was
taking the parent logger instance, changing its name with a .deprecation
prefix and creating a new logger.
Most of the time the parent logger was not needed. It was causing Log4j to
unnecessarily cache the unused parent logger instance.

depends on #61515
backports #58435
2020-08-26 16:04:02 +02:00
Nik Everett 87cf81e179
Migrate some more mapper test cases (#61507) (#61552)
Migrate some more mapper test cases from `ESSingleNodeTestCase` to
`MapperTestCase`.
2020-08-25 15:27:26 -04:00
markharwood 8b56441d2b
Search - add case insensitive support for regex queries. (#59441) (#61532)
Backport to add case insensitive support for regex queries.
Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master.
This can be removed when Lucene 8.7 is released.

Closes #59235
2020-08-25 17:18:59 +01:00
Przemyslaw Gomulka f3f7d25316
Header warning logging refactoring backport(#55941) (#61515)
Splitting DeprecationLogger into two: HeaderWarningLogger, responsible for adding response warning headers, and ThrottlingLogger, responsible for limiting duplicated log entries for the same key (previously deprecateAndMaybeLog).
Introducing a ThrottlingAndHeaderWarningLogger, which is a base for other common logging usages where both response warning headers and log throttling are needed.

relates #55699
relates #52369
backports #55941
2020-08-25 16:35:54 +02:00
Julie Tibshirani 997c73ec17
Correct how field retrieval handles multifields and copy_to. (#61391)
Previously, when a value was copied to a field through a parent field or `copy_to`,
we parsed it using the `FieldMapper` from the source field. Instead we should
parse it using the target `FieldMapper`. This ensures that we apply the
appropriate mapping type and options to the copied value.

To implement the fix cleanly, this PR refactors the value parsing strategy. Now
instead of looking up values directly, field mappers produce a helper object
`ValueFetcher`. The value fetchers are responsible for almost all aspects of
fetching, including looking up the right paths in the _source.

The PR is fairly big but each commit can be reviewed individually.

Fixes #61033.
2020-08-20 15:53:35 -07:00