Commit Graph

2521 Commits

Author SHA1 Message Date
Jason Tedor 5fcda57b37
Rename MetaData to Metadata in all of the places (#54519)
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
2020-03-31 17:24:38 -04:00
Zachary Tong c9db2de41d
[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451)
* Comprehensively test supported/unsupported field type:agg combinations (#52493)

This adds a test to AggregatorTestCase that allows us to programmatically
verify that an aggregator supports or does not support a particular
field type.  It fetches the list of registered field type parsers,
creates a MappedFieldType from the parser and then attempts to run
a basic agg against the field.

A supplied list of supported VSTypes is then compared against the
output (success or exception), and the test succeeds or fails accordingly.

Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com>
* Skip fields that are not aggregatable

* Use newIndexSearcher() to avoid incompatible readers (#52723)

Lucene's `newSearcher()` can generate readers like ParallelCompositeReader
which we can't use. We need to use our helper `newIndexSearcher` instead.
2020-03-31 14:35:03 -04:00
Martijn van Groningen 4b4fbc160d
Refactor AliasOrIndex abstraction. (#54394)
Backport of #53982

In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams,
the abstraction needs to be made more flexible, because currently it really can be only
an alias or an index.

* Renamed `AliasOrIndex` to `IndexAbstraction`.
* Introduced an `IndexAbstraction.Type` enum to indicate what an `IndexAbstraction` instance is.
* Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum.
* Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface.
* Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`.
* Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method.

Relates to #53100
2020-03-30 10:12:16 +02:00
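
The shape of the `IndexAbstraction` refactoring in 4b4fbc160d can be illustrated with a minimal sketch. This is not the Elasticsearch source: the `ConcreteIndex` and `Alias` classes, the enum constants, and the `describe` helper are simplified assumptions for illustration. Only `IndexAbstraction`, its `Type` enum, `getType()`, `getName()` and `getWriteIndex()` are named in the commit message.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical sketch of the abstraction described in the commit message.
interface IndexAbstraction {
    // Replaces the old boolean isAlias(): new kinds can be added later.
    enum Type { CONCRETE_INDEX, ALIAS }

    Type getType();

    // Formerly getAliasName() on the alias-only subtype.
    String getName();

    // Moved up to the interface; returns null when no write index exists (assumption).
    String getWriteIndex();
}

// Hypothetical implementation for a plain index.
final class ConcreteIndex implements IndexAbstraction {
    private final String name;

    ConcreteIndex(String name) { this.name = name; }

    public Type getType() { return Type.CONCRETE_INDEX; }
    public String getName() { return name; }
    public String getWriteIndex() { return name; } // assumption: an index writes to itself
}

// Hypothetical implementation for an alias pointing at several indices.
final class Alias implements IndexAbstraction {
    private final String name;
    private final List<String> indices;
    private final String writeIndex;

    Alias(String name, List<String> indices, String writeIndex) {
        this.name = name;
        this.indices = indices;
        this.writeIndex = writeIndex;
    }

    public Type getType() { return Type.ALIAS; }
    public String getName() { return name; }
    public String getWriteIndex() { return writeIndex; }
    List<String> getIndices() { return indices; }
}

class IndexAbstractionDemo {
    // Callers branch on getType() instead of casting to a subtype.
    static String describe(IndexAbstraction abstraction) {
        switch (abstraction.getType()) {
            case ALIAS:
                return "alias " + abstraction.getName();
            case CONCRETE_INDEX:
                return "index " + abstraction.getName();
            default:
                throw new AssertionError("unknown type: " + abstraction.getType());
        }
    }

    public static void main(String[] args) {
        IndexAbstraction alias = new Alias("logs", Arrays.asList("logs-1", "logs-2"), "logs-2");
        System.out.println(describe(alias) + " writes to " + alias.getWriteIndex());
        System.out.println(describe(new ConcreteIndex("logs-1")));
    }
}
```

Switching on an enum rather than a boolean is what leaves room for a further kind of abstraction, such as the data streams the commit message mentions, without touching every caller's type checks.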
Tim Brooks 2ccddbfa88
Move transport decoding and aggregation to server (#54360)
Currently all of our transport protocol decoding and aggregation occurs
in the individual transport modules. This means that each implementation
(test, netty, nio) must implement this logic. Additionally, it means
that the entire message has been read from the network before the server
package receives it.

This commit creates a pipeline in server which can be passed arbitrary
bytes to handle. Internally, the pipeline will decode, decompress, and
aggregate the messages. Additionally, this allows us to run many
megabytes of bytes through the pipeline in tests to ensure that the
logic works.

This work will enable future work:

- Circuit breaking or backoff logic based on message type and byte
  in the content aggregator.
- Sharing bytes with the application layer using the ref counted
  releasable network bytes.
- Improved network monitoring based specifically on channels.

Finally, this fixes the bug where we do not circuit break on the correct
message size when compression is enabled.
2020-03-27 14:13:10 -06:00
Tim Brooks f5b4020819
Remove netty BytesReference implementations (#54355)
Elasticsearch has a number of different BytesReference implementations.
These implementations can all implement the interface in different ways
with subtly different behavior and performance characteristics. On the
other hand, the JVM only represents bytes as an array or a direct byte
buffer. This commit deletes the specialized Netty implementations and
moves to using a generic ByteBuffer reference type. This will allow us
to focus on standardizing performance and behavior around a smaller number
of implementations that can be used by all components in Elasticsearch.
2020-03-27 11:01:33 -06:00
Armin Braun d9d11f6d16
Remove Unused Apache Http Dependency from GCS Repo Plugin (#54331) (#54342)
We are not using the Apache HTTP client backed http transport
with the GCS repo. Same as with the app engine type transport
we can save ourselves the dependency on the http client here
and ignore the missing classes.
2020-03-27 15:10:19 +01:00
Armin Braun 70b378cd1b
Upgrade GCS Dependency to 1.106.0 (#54092) (#54112)
* Upgrade GCS Dependency to 1.106.0 (#54092)

Upgrading GCS Dep + related dependencies as it seems some more retry bugs were fixed between .104 and .106
2020-03-25 19:05:01 +01:00
James Baiera b84c74cf70
Update the HDFS version used by HDFS Repo (#53693) (#54125) 2020-03-25 14:01:29 -04:00
Mark Vieira 7728ccd920
Enforce consistent compile options across all projects (#54120)
(cherry picked from commit ddd068a7e92dc140774598664efdc15155ab05c2)
2020-03-25 08:24:21 -07:00
Armin Braun 4271963462
Revert "Use Azure Bulk Deletes in Azure Repository (#53919)" (#54089) (#54111)
This reverts commit 23cccf088810b8416ed278571352393cc2de9523.
Unfortunately SAS token auth still doesn't work with bulk deletes so we can't use them yet.

Closes #54080
2020-03-25 12:13:25 +01:00
Ioannis Kakavas 7d4ae7d982
Upgrade Tika to 1.24 (#54130) (#54150)
Also updates commons-compress to 1.19, pdfbox to 2.0.19 and
POI to 4.1.2. Adds a compile dependency to commons-math3
3.6.1 and SparseBitSet 1.2
2020-03-25 11:03:26 +02:00
Alan Woodward 39d7d0dc10 Upgrade to lucene 8.5.0 release (#54077)
Upgrades our lucene dependency to the released 8.5.0 version.
2020-03-24 13:45:50 +00:00
Mark Vieira 70cfedf542
Refactor global build info plugin to leverage JavaInstallationRegistry (#54026)
This commit removes the configuration time vs execution time distinction
with regard to certain BuildParams properties. Because of the cost of
determining Java versions for configuration JDK locations we deferred
this until execution time. This had two main downsides. First, we had
to implement all this build logic in tasks, which required a bunch of
additional plumbing and complexity. Second, because some information
wasn't known during configuration time, we had to nest any build logic
that depended on this in awkward callbacks.

We now defer to the JavaInstallationRegistry recently added in Gradle.
This utility uses a much more efficient method for probing Java
installations than our jrunscript implementation. Combined with some
optimizations to avoid probing the current JVM, as well as deferring
some evaluation via Providers when probing installations for BWC builds,
this lets us maintain effectively the same configuration time performance
while removing a bunch of complexity and runtime cost (snapshotting
inputs for the GenerateGlobalBuildInfoTask was very expensive). The end
result should be a much more responsive build execution in almost all
scenarios.

(cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)
2020-03-23 15:30:10 -07:00
Namgyu Kim bc2289c258 Add nori_number token filter in analysis-nori (#53583)
This change adds the `nori_number` token filter.
It also adds a `discard_punctuation` option in nori_tokenizer that should be used in conjunction with the new filter.
2020-03-23 19:53:34 +01:00
Armin Braun 754d071c4e
Upgrade to AWS SDK 1.11.749 (#53962) (#53974)
Upgrading AWS SDK to v1.11.749.
This required building clients inside privileged contexts, because some class loading that requires privileges now happens there, and working around a new SDK bug in the S3 client builder.

Closes #53191
2020-03-23 15:31:29 +01:00
Armin Braun b51ea25a00
Use Azure Bulk Deletes in Azure Repository (#53919) (#53967)
Now that we upgraded the Azure SDK to 8.6.2 in #53865 we can make use of
bulk deletes.
2020-03-23 13:35:05 +01:00
Armin Braun 69a35158ce
Fix Azure Repository with HTTPs Endpoint (#53903) (#53963)
Upgrading to 8.6.2 in #53865 broke running against HTTPs endpoints (and hence real azure)
because the https url connection needs the newly added permission to work.
2020-03-23 12:16:33 +01:00
Armin Braun 41301d74b0
Upgrade to Azure SDK 8.6.2 (#53865) (#53886)
This fixes some bugs around retrying and URL encoding and should enable a follow-up
that finally adds bulk deletes on Azure.
2020-03-20 18:27:02 +01:00
Armin Braun a70ebef366
Longer Timeout in S3 Retries Test (#53841) (#53847)
The lower end of the timeout range of 100ms is prone to time out
on CI before the mock REST server gets to sending a response that
is not supposed to be a timeout.
Using 1-3s here should make this safe at the cost of randomly making
this test take a few seconds.

Closes #53506
2020-03-20 12:23:40 +01:00
Jake Landis db3420d757
[7.x] Optimize which Rest resources are used by the Rest tests… (#53766)
This should help with Gradle's incremental compile such that projects
only depend upon the resources they use.

related #52114
2020-03-19 12:28:59 -05:00
Ryan Ernst 5c472fcb47 Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642)
Re-applies the change from #53523 along with test fixes.

closes #53626
closes #53624
closes #53622
closes #53625

Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
2020-03-17 10:28:51 -07:00
Alan Woodward 71b703edd1 Rename AtomicFieldData to LeafFieldData (#53554)
This conforms with lucene's LeafReader naming convention, and
matches other per-segment structures in elasticsearch.
2020-03-17 12:30:12 +00:00
Mark Vieira 2f0aca992b
Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)"
This reverts commit b7dbadeea0.
2020-03-15 18:10:40 -07:00
Jason Tedor b7dbadeea0
Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)
This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2
dependency to 2.13.1.

Relates #53523
2020-03-14 13:28:06 -04:00
Jason Tedor 32dd852210
Update jackson-databind to 2.8.11.6 (#53522)
This commit upgrades the jackson-databind dependency to
2.8.11.6. Additionally, we revert a previous change that put
ingest-geoip on the version of jackson-databind from the version
properties file. This is because upgrading ingest-geoip to a later
version of jackson-databind also requires an upgrade to the geoip2
dependency which is currently blocked. Therefore, if we can get to a
point where we otherwise upgrade our Jackson dependencies, we do not
want ingest-geoip to automatically come along with it.
2020-03-12 20:15:13 -04:00
Alan Woodward 5c861cfe6e Upgrade to final lucene 8.5.0 snapshot (#53293)
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
2020-03-10 09:32:59 +00:00
Nhat Nguyen 5476a49833 Revert "upgrade to lucene-snapshot-fa75139efea (#53150) (#53151)"
This reverts commit 058113aa42.
2020-03-05 17:33:00 -05:00
Armin Braun 204c366a4e
Upgrade GCS SDK to 1.104.0 (#52839) (#53152)
Upgrading the GCS SDK to the most recent version.
Adjusting (i.e. improving) the REST mock accordingly.
This should significantly boost performance by pulling in
https://github.com/googleapis/java-core/issues/86 in some cases.
2020-03-05 11:18:18 +01:00
Ignacio Vera 058113aa42
upgrade to lucene-snapshot-fa75139efea (#53150) (#53151) 2020-03-05 10:04:05 +01:00
Tanguy Leroux 52d4807f8d
Mute GoogleCloudStorageBlobStoreRepositoryTests on jdk8 (#53119)
Tests in GoogleCloudStorageBlobStoreRepositoryTests are known
to be flaky on JDK 8 (#51446, #52430) and we suspect a JDK
bug (https://bugs.openjdk.java.net/browse/JDK-8180754) that triggers
an assertion in the server-side logic that emulates the Google
Cloud Storage service.

Sadly we were not able to reproduce the failures, even when using
the same OS (Debian 9, Ubuntu 16.04) and JDK (Oracle Corporation
1.8.0_241 [Java HotSpot(TM) 64-Bit Server VM 25.241-b07]) as
almost all the test failures on CI. While we spent some time fixing
code (#51933, #52431) to circumvent the JDK bug, the tests are still flaky
on JDK 8. This commit mutes these tests for JDK 8 only.

Closes #52906
2020-03-05 09:18:05 +01:00
Nhat Nguyen e6755afeeb
Upgrade to Lucene 8.5.0-snapshot-c4475920b08 (#52950) (#52977)
To give LUCENE-9228 more CI cycles
2020-02-29 09:29:16 -05:00
Lee Hinman a47e404732 Mute GoogleCloudStorageBlobStoreRepositoryTests (#52926)
These intermittently fail due to an assertion triggered by a JDK bug.

Relates to #52906
2020-02-27 15:16:48 -07:00
Mark Vieira f46b370e7a
Fix cacheability of repository-hdfs integ tests (#52858) 2020-02-27 09:53:51 -08:00
Mark Vieira bc9c3f0135
Ignore test seed in third party test system property inputs (#52849) 2020-02-26 14:29:34 -08:00
Mark Vieira f06d692706
[Backport] Consolidate docker availability logic (#52656) 2020-02-21 15:24:05 -08:00
markharwood 96d603979b
Upgrade Lucene to 8.5.0-snapshot-b01d7cb (#52584)
Upgrading 7x to same Lucene 8.5 version used in master
2020-02-21 10:25:03 +00:00
Armin Braun 5a7db0c520
Fix GCS Test testReadLargeBlobWithRetries (#52619) (#52624)
The countdown didn't work well here because it only returns `true` once the countdown reaches `0`
but can on subsequent executions return `false` again if a countdown at `0` is counted down again,
leading to more than the expected number of simulated failures.

Closes #52607
2020-02-21 10:34:53 +01:00
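
The countdown pitfall described in 5a7db0c520 can be reproduced with a small sketch. `SimpleCountDown` below is a hypothetical stand-in written for this illustration, not the class used by the test, and the guarded variant is only one possible way to avoid the problem; the commit message does not say how the actual fix was made.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a countdown helper with the behavior the commit describes.
final class SimpleCountDown {
    private final AtomicInteger remaining;

    SimpleCountDown(int count) {
        this.remaining = new AtomicInteger(count);
    }

    // Returns true only for the single call that moves the counter from 1 to 0.
    // Any later call finds the counter at 0 and returns false again.
    boolean countDown() {
        for (;;) {
            int current = remaining.get();
            if (current == 0) {
                return false;
            }
            if (remaining.compareAndSet(current, current - 1)) {
                return current == 1;
            }
        }
    }

    boolean isCountedDown() {
        return remaining.get() == 0;
    }
}

class CountDownDemo {
    // Pitfall: treats every 'false' as "simulate a failure", so failures resume
    // once the counter has already reached zero.
    static int failuresUsingReturnValue(int count, int requests) {
        SimpleCountDown countDown = new SimpleCountDown(count);
        int failures = 0;
        for (int i = 0; i < requests; i++) {
            if (countDown.countDown() == false) {
                failures++;
            }
        }
        return failures;
    }

    // One possible guard: stop counting down once zero has been reached.
    static int failuresWithGuard(int count, int requests) {
        SimpleCountDown countDown = new SimpleCountDown(count);
        int failures = 0;
        for (int i = 0; i < requests; i++) {
            if (countDown.isCountedDown() == false && countDown.countDown() == false) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("return value only: " + failuresUsingReturnValue(3, 6));
        System.out.println("with guard:        " + failuresWithGuard(3, 6));
    }
}
```

With a count of 3 and 6 requests, the first variant simulates failures on every request except the third, while the guarded variant fails only the first two.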
Armin Braun 1662cd45a4
Add Region and Signer Algorithm Overrides to S3 Repos (#52112) (#52562)
Exposes S3 SDK signing region and algorithm override settings as requested in #51861.

Closes #51861
2020-02-21 10:21:20 +01:00
Armin Braun 0a09e15959
Add Caching for RepositoryData in BlobStoreRepository (#52341) (#52566)
Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode).

`RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO.

The benefits of this move are:
* Much faster repository status API calls
   * listing all snapshot names becomes instant
   * Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data
* Additional cloud cost savings
* Better resiliency, saving another spot where an IO issue could break the snapshot
* We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.
2020-02-21 10:20:07 +01:00
Armin Braun 4bb780bc37
Refactor Inflexible Snapshot Repository BwC (#52365) (#52557)
* Refactor Inflexible Snapshot Repository BwC (#52365)

Transport the version to use for a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere.
Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.
2020-02-21 09:14:34 +01:00
Mark Vieira 4bce9984e6
Mute GoogleCloudStorageBlobContainerRetriesTests.testReadLargeBlobWithRetries
Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-02-20 15:13:34 -08:00
Armin Braun aeb7b777e6
Add Blob Download Retries to GCS Repository (#52479) (#52521)
* Add Blob Download Retries to GCS Repository

Exactly as #46589 (and kept as close to it as possible code wise so we can dry things up in a follow-up potentially) but for GCS.

Closes #52319
2020-02-19 18:29:13 +01:00
Tim Brooks e752221fc6
Upgrade netty to 4.1.45.Final (#51689)
Upgrade netty.
2020-02-18 09:11:29 -07:00
Ioannis Kakavas d9ce0e6733
Update BouncyCastle to 1.64 (#52185) (#52464)
This commit upgrades the bouncycastle dependency from 1.61 to 1.64.
2020-02-18 14:11:34 +02:00
Armin Braun a9c7557ac4
Fix Failure to Drain Stream in GCS Repo Tests (#52431) (#52454)
Same as #51933 but for the custom handler just used in this test.

Closes #52430
2020-02-18 11:37:34 +01:00
Marios Trivyzas dac720d7a1
Add a cluster setting to disallow expensive queries (#51385) (#52279)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled)
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
(cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)
2020-02-12 22:56:14 +01:00
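
The guard behind `search.allow_expensive_queries` in dac720d7a1 can be sketched roughly as follows. This is a hypothetical illustration, not the Elasticsearch implementation: the `ExpensiveQueryGuard` class, its method names, the exception type and the error text are all assumptions. Only the setting name and its default of `true` come from the commit message.

```java
// Hypothetical sketch of a guard consulted before building an expensive query.
final class ExpensiveQueryGuard {
    // Mirrors the documented default of the cluster setting.
    private volatile boolean allowExpensiveQueries = true;

    // Would be invoked when the cluster setting is updated (assumption).
    void setAllowExpensiveQueries(boolean allow) {
        this.allowExpensiveQueries = allow;
    }

    // Rejects the named query type when expensive queries are disallowed.
    void ensureAllowed(String queryName) {
        if (allowExpensiveQueries == false) {
            throw new IllegalStateException(
                "[" + queryName + "] queries cannot be executed when "
                    + "'search.allow_expensive_queries' is set to false");
        }
    }
}

class ExpensiveQueryGuardDemo {
    public static void main(String[] args) {
        ExpensiveQueryGuard guard = new ExpensiveQueryGuard();
        guard.ensureAllowed("wildcard"); // passes with the default setting

        guard.setAllowExpensiveQueries(false);
        try {
            guard.ensureAllowed("wildcard");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch each query type listed above would call `ensureAllowed` with its own name before doing any work, so the request fails fast with an error message instead of running slowly.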
Armin Braun 6ea3f5ada1
Move EC2 Discovery Tests to Mock Rest API (#50605) (#52270)
Move EC2 discovery tests to using the mock REST API introduced in
https://github.com/elastic/elasticsearch/pull/50550 instead of mocking
the AWS SDK classes manually.
Move the trivial remaining AWS SDK mocks to the single test suite that
was using them.
2020-02-12 18:35:50 +01:00
Ignacio Vera 80e3c97210 Upgrade to lucene-8.5.0-snapshot-d62f6307658 (#52039) (#52130) 2020-02-10 10:13:22 +01:00
Ioannis Kakavas 343fb36c7f Test modifications for FIPS 140 mode (#51832) (#52128)
- Enable SunJGSS provider for Kerberos tests
- Handle the fact that the decrypt method in KeyStoreWrapper might
not throw immediately when the GCM cipher is from BouncyCastle FIPS,
so we end up with a DataInputStream that has reached its end.
- Disable tests, jarHell, testingConventions for ingest attachment
plugin. We don't support this plugin (and document this) in FIPS
mode.
- Don't attempt to install ingest-attachment in smoke-test-plugins
2020-02-10 10:57:03 +02:00
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
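
The registration change described in 3edadfefd0 can be sketched as follows. The types here (`HttpMethod`, `Route`, `Handler`, `Controller`, `HealthHandler`) are simplified, hypothetical stand-ins rather than the Elasticsearch classes; the sketch only illustrates a handler declaring its routes so that the controller performs the registration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical HTTP method enum for the sketch.
enum HttpMethod { GET, POST, PUT, DELETE }

// A method and path combination handled by a handler.
final class Route {
    final HttpMethod method;
    final String path;

    Route(HttpMethod method, String path) {
        this.method = method;
        this.path = path;
    }
}

// Handlers describe what they serve; they never touch the controller themselves.
interface Handler {
    List<Route> routes();

    String handle(String path);
}

// The controller pulls the routes from the handler at registration time.
final class Controller {
    private final Map<String, Handler> handlers = new HashMap<>();

    void registerHandler(Handler handler) {
        for (Route route : handler.routes()) {
            handlers.put(key(route.method, route.path), handler);
        }
    }

    // Returns the handler for the combination, or null if none is registered.
    Handler lookup(HttpMethod method, String path) {
        return handlers.get(key(method, path));
    }

    private static String key(HttpMethod method, String path) {
        return method + " " + path;
    }
}

// Example handler: the constructor takes no controller and leaks no 'this' reference.
final class HealthHandler implements Handler {
    public List<Route> routes() {
        return Arrays.asList(
            new Route(HttpMethod.GET, "/_health"),
            new Route(HttpMethod.GET, "/_status"));
    }

    public String handle(String path) {
        return "ok:" + path;
    }
}

class RouteDemo {
    public static void main(String[] args) {
        Controller controller = new Controller();
        controller.registerHandler(new HealthHandler());
        System.out.println(controller.lookup(HttpMethod.GET, "/_health").handle("/_health"));
    }
}
```

Because the handler is fully constructed before `registerHandler` sees it, no partially initialized object is ever published, which is the safety property the commit message points out.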