Commit Graph

2294 Commits

Author SHA1 Message Date
Alan Woodward 44c3418531 Simplify handling of keyword field normalizers (#42002)
We have a number of places in analysis-handling code where we check
if a field type is a keyword field, and if so then extract the normalizer rather
than pulling the index-time analyzer. However, a keyword normalizer is
really just a special case of an analyzer, so we should be able to simplify this
by setting the normalizer as the index-time analyzer at construction time.
2019-05-10 14:38:46 +01:00
Alpar Torok 711ace0533 Testclusters: support for security and convert example plugins (#41864)
testclusters detect from settings that security is enabled
if a user is not specified using the DSL introduced in this PR, a default one is created
the appropriate wait conditions are used authenticating with the first user defined in the DSL ( or the default user ).
an example DSL to create a user is user username:"test_user" password:"x-pack-test-password" role: "superuser" all keys are optional and default to the values shown in this example
2019-05-08 14:04:00 +03:00
Yannick Welsch 5b71baa100 Upgrade SDK and test discovery-ec2 credential providers (#41732)
Upgrades the AWS SDK to the same version that we're using for the repository-s3 plugin, providing
testing capabilities to override certain SDK endpoints in order to point them to localhost for testing.
Adds tests for the various credential providers.
2019-05-08 09:38:36 +02:00
David Turner 4c909e93bb
Reject port ranges in `discovery.seed_hosts` (#41905)
Today Elasticsearch accepts, but silently ignores, port ranges in the
`discovery.seed_hosts` setting:

```
discovery.seed_hosts: 10.1.2.3:9300-9400
```

Silently ignoring part of a setting like this is trappy. With this change we
reject seed host addresses of this form.

Closes #40786
Backport of #41404
2019-05-08 08:34:32 +01:00
Alan Woodward 4cca1e8fff Correct spelling of MockLogAppender.PatternSeenEventExpectation (#41893)
The class was called PatternSeenEventExcpectation. This commit
is a straight class rename to correct the spelling.
2019-05-07 17:28:51 +01:00
Tim Brooks 927013426a
Read multiple TLS packets in one read call (#41820)
This is related to #27260. Currently we have a single read buffer that
is no larger than a single TLS packet. This prevents us from reading
multiple TLS packets in a single socket read call. This commit modifies
our TLS work to support reading similar to the plaintext case. The data
will be copied to a (potentially) recycled TLS packet-sized buffer for
interaction with the SSLEngine.
2019-05-06 09:51:32 -06:00
Tim Brooks 24484ae227
Fix http read timeout test by releasing response (#41801)
This fixes #41794. Currently the read timeout test queues up responses
in the netty pipeline. These responses are immediately returned in the
write call, but they are not released. This commit releases the
responses. This will cause the leak detector to quit throwing
exceptions.
2019-05-03 16:18:26 -06:00
Tim Brooks b4bcbf9f64
Support http read timeouts for transport-nio (#41466)
This is related to #27260. Currently there is a setting
http.read_timeout that allows users to define a read timeout for the
http transport. This commit implements support for this functionality
with the transport-nio plugin. The behavior here is that a repeating
task will be scheduled for the interval defined. If there have been
no requests received since the last run and there are no inflight
requests, the channel will be closed.
2019-05-02 09:48:52 -06:00
Armin Braun 7cc4b9a8b3
Implement Bulk Deletes for GCS Repository (#41368) (#41681)
* Implement Bulk Deletes for GCS Repository (#41368)

* Just like #40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
  * Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations

back port of #41368
2019-04-30 17:03:57 +02:00
Armin Braun 08c0ecb90e
Upgrade to Netty 4.1.35 (#41499) (#41651)
* Some fixes and possible performance fixes in the last 3 versions ->
upgrading
2019-04-30 09:27:51 +02:00
Tim Brooks df3ef66294
Remove dedicated SSL network write buffer (#41654)
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.

This commit also backports the following commit:

Handle WRAP ops during SSL read

It is possible that a WRAP operation can occur while decrypting
handshake data in TLS 1.3. The SSLDriver does not currently handle this
well as it does not have access to the outbound buffer during read call.
This commit moves the buffer into the Driver to fix this issue. Data
wrapped during a read call will be queued for writing after the read
call is complete.
2019-04-29 17:59:13 -06:00
Alpar Torok 335f2bf102 Testclsuters: convert plugins qa projects (#41496)
Add testclusters support for files in keystore and convert qa subprojects within plugins.
2019-04-26 08:57:52 -07:00
Armin Braun aad33121d8
Async Snapshot Repository Deletes (#40144) (#41571)
Motivated by slow snapshot deletes reported in e.g. #39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete.

* Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread poll
   * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable.
* See https://github.com/elastic/elasticsearch/pull/39656#issuecomment-470492106
* Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore).
* By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in #39657
2019-04-26 15:36:09 +02:00
Tim Brooks 1f8ff052a1
Revert "Remove dedicated SSL network write buffer (#41283)"
This reverts commit f65a86c258.
2019-04-25 18:39:25 -06:00
Tim Brooks f65a86c258
Remove dedicated SSL network write buffer (#41283)
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
2019-04-25 14:30:54 -06:00
Armin Braun 23b3741618
Remove Exists Check from S3 Repository Deletes (#40931) (#41534)
* The check doesn't add much if anything practically, since the S3 repository is eventually consistent and we only log the non-existence of a blob anyway
  * We don't do the check on writes for this very reason and documented it as such
  * Removing the check saves one API call per single delete speeding up the deletion process and lowering costs
2019-04-25 18:25:03 +02:00
Armin Braun 2e4ac178e2
Fix Repository Base Path Matching in Azure ITs (#41457) (#41469)
* Added quotes so that "regexy" base paths like `7.0` that we use on CI
don't break matching
* closes #41405
2019-04-24 12:24:05 +02:00
clement-tourriere c80f86e3e4 Add ignore_above in ICUCollationKeywordFieldMapper (#40414)
Add the possibility to use ignore_above parameter in ICUCollationKeywordFieldMapper.

Close #40413
2019-04-19 14:19:35 -07:00
Alpar Torok 4ef4ed66b9 Convert repository-hdfs to testclusters (#41252)
* Convert repository-hdfs to testclusters

Relates #40862
2019-04-19 09:47:14 +03:00
Armin Braun c4e84e2b34
Add Bulk Delete Api to BlobStore (#40322) (#41253)
* Adds Bulk delete API to blob container
* Implement bulk delete API for S3
* Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs
* Closes #40250
2019-04-16 17:19:05 +02:00
Mark Vieira 1287c7d91f
[Backport] Replace usages RandomizedTestingTask with built-in Gradle Test (#40978) (#40993)
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)

This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.

(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)

* Fix forking JVM runner

* Don't bump shadow plugin version
2019-04-09 11:52:50 -07:00
Jay Modi f34663282c
Update apache httpclient to version 4.5.8 (#40875)
This change updates our version of httpclient to version 4.5.8, which
contains the fix for HTTPCLIENT-1968, which is a bug where the client
started re-writing paths that contained encoded reserved characters
with their unreserved form.
2019-04-05 13:48:10 -06:00
Martijn van Groningen e5cec87697
Remove -Xlint exclusions in all plugins. (#40721)
The xlint exclusions of the following plugins were removed:
* ingest-attachment.
* mapper-size.
* transport-nio. Removing the -try exclusion required some work, because
  the NettyAdaptor implements AutoCloseable and NettyAdaptor#close() method
  could throw an InterruptedException (ChannelFuture#await() and a generic
  Exception is re-thrown, which maybe an ChannelFuture). The easiest way
  around this to me seemed that NettyAdaptor should not implement AutoCloseable,
  because it is not directly used in a try-with-resources statement.

Relates to #40366
2019-04-04 08:30:34 +02:00
Christoph Büscher 09ba3ec677 Small refactorings to analysis components (#40745)
This change adds the following internal refactorings:

* wraps input analyzers into an unmodifiable map in IndexAnalyzers ctor
* removes duplicated indexSetting in IndexAnalyzers
* removes references to IndexAnalyzers from DocumentMapperParser and TypeParser.ParserContext.
  It can always be retrieve it from MapperService directly in those cases
2019-04-03 14:22:16 +02:00
Alpar Torok 293297ae3d Fix repository-hdfs when no docker and unnecesary fixture
The hdfs-fixture is actually executed in plugin/repository-hdfs as a
dependency. The fixture is not needed and actually causes a failure
because we have two copies now and both use the same ports.
2019-03-29 16:55:12 +02:00
Alpar Torok 35d96c22c0 Fix 3rd pary S3 tests (#40588)
* Fix 3rd pary S3 tests

This is allready excluded on line 186, by doing this again here, the
other exclusion from arround that line are removed causing the tests to
fail.

* Fix blacklisting with the fixture
2019-03-29 08:04:16 +02:00
Alpar Torok d791e08932 Test fixtures krb5 (#40297)
Replaces the vagrant based kerberos fixtures with docker based test fixtures plugin.
The configuration is now entirely static on the docker side and no longer driven by Gradle,
also two different services are being configured since there are two different consumers of the fixture that can run in parallel and require different configurations.
2019-03-28 17:26:58 +02:00
Andy Bristol 23395a9b9f
search as you type fieldmapper (#35600)
Adds the search_as_you_type field type that acts like a text field optimized
for as-you-type search completion. It creates a couple subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyze terms as the largest shingle size used and
edge-ngrams, against which partial terms are queried

Adds a match_bool_prefix query type that creates a boolean clause of a term
query for each term except the last, for which a boolean clause with a prefix
query is created.

The match_bool_prefix query is the recommended way of querying a search as you
type field, which will boil down to term queries for each shingle of the input
text on the appropriate shingle field, and the final (possibly partial) term
as a term query on the prefix field. This field type also supports phrase and
phrase prefix queries however
2019-03-27 13:29:13 -07:00
Tim Brooks ab44f5fd5d
Add InboundHandler for inbound message handling (#40430)
This commit adds an InboundHandler to handle inbound message processing.
With this commit, this code is moved out of the TcpTransport.
Additionally, finer grained unit tests are added to ensure that the
inbound processing works as expected
2019-03-27 12:33:26 -06:00
Tim Brooks 3860ddd1a4
Move outbound message handling to OutboundHandler (#40336)
Currently there are some components of message serializer and sending
that still occur in TcpTransport. This commit makes it possible to
send a message without the TcpTransport by moving all of the remaining
application logic to the OutboundHandler. Additionally, it adds unit
tests to ensure that this logic works as expected.
2019-03-27 11:47:36 -06:00
Alpar Torok 524e0273ae Testclusters: convert plugin repository-s3 (#40399)
* Add support for setting and keystore settings
* system properties and env var config
* use testclusters for repository-s3
* Some cleanup of the build.gradle file for plugin-s3
* add runner {} to rest integ test task
2019-03-27 08:40:16 +02:00
Henning Andersen bf444b9f02 Store Pending Deletions Fix (#40345)
FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).

Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
2019-03-26 15:30:44 +01:00
Armin Braun 13d76239a0
Use Netty ByteBuf Bulk Operations for Faster Deserialization (#40158) (#40339)
* Use bulk methods to read numbers faster from byte buffers
2019-03-24 19:08:51 +01:00
Yannick Welsch 8f84f455c3 Remove note about Azure ARM plugin (#40219)
Relates to #22679
2019-03-20 16:31:37 +01:00
Tim Brooks 0b50a670a4
Remove transport name from tcp channel (#40074)
Currently, we maintain a transport name ("mock-nio", "nio", "netty")
that is passed to a `TcpTransportChannel` when a request is received.
The value of this name is to associate with the task when we register a
task with the task manager. However, it is only possible to run ES with
one transport, so having an implementation specific name is unnecessary.
This commit removes the name and replaces it with the generic
"transport".
2019-03-15 12:04:13 -06:00
Ryan Ernst 8f09c77777 Add no-jdk distributions (#39882)
This commit adds a variant for every official distribution that omits
the bundled jdk. The "no-jdk" naming is conveyed through the package
classifier, alongside the platform. Package tests are also added for
each new distribution.
2019-03-15 00:55:57 -07:00
Jason Tedor d02bca1314
Upgrade the bouncycastle dependency to 1.61 (#40017)
This commit upgrades the bouncycastle dependency from 1.59 to 1.61.
2019-03-14 08:54:47 -04:00
Jim Ferenczi 7a7658707a
Upgrade to Lucene release 8.0.0 (#39998)
This commit upgrades to the GA release of Lucene 8

Closes #39640
2019-03-13 18:11:50 +01:00
Jay Modi ba40230c7f
Add client jar for transport-nio (#39860)
This change marks the transport-nio plugin as having a client jar. The
nio transport can be used from a transport client and the
x-pack-transport artifact depends on the transport-nio jar but this jar
is not published. This change marks the transport-nio project as having
a client jar so that the jar may be published in the same way that we
publish the netty4 transport artifact.

The change to actually publish the jar will be handled separately as an
update to the release manager.
2019-03-12 09:27:08 -06:00
David Emanuel Buchmann b5ed039160
plugins/repository-gcs: Update google-cloud-storage/core to 1.59.0 (#39748)
* plugins/repository-gcs: Update google-cloud-storage /
google-cloud-core to 1.59.0

* plugins: Update sha1 for google-cloud-core & google-cloud-storage
2019-03-10 11:04:52 -04:00
Alpar Torok 6c75a2f2b0 Testclusters: start using it for testing some plugins (#39693)
* enable testclusters for some plugins
2019-03-07 17:52:50 +02:00
markharwood 1873de5240
Bug fix for AnnotatedTextHighlighter - port of 39525 (#39749)
Bug fix for AnnotatedTextHighlighter - port of 39525

Relates to #39395
2019-03-06 19:02:04 +00:00
Armin Braun aaecaf59a4
Optimize Bulk Message Parsing and Message Length Parsing (#39634) (#39730)
* Optimize Bulk Message Parsing and Message Length Parsing

* findNextMarker took almost 1ms per invocation during the PMC rally track
  * Fixed to be about an order of magnitude faster by using Netty's bulk `ByteBuf` search
* It is unnecessary to instantiate an object (the input stream wrapper) and throw it away, just to read the `int` length from the message bytes
  * Fixed by adding bulk `int` read to BytesReference
2019-03-06 08:13:15 +01:00
Armin Braun 65732d707f
Add Support for S3 Intelligent Tiering (#39376) (#39620)
* Add support for S3 intelligent tiering
* Closes #38836
2019-03-04 10:32:37 +01:00
Alpar Torok 813351fe26 Un-mute and fix BuildExamplePluginsIT (#38899)
* Un-mute and fix BuildExamplePluginsIT

There doesn't seem to be anything wrong with the test iteself.
I think the failure were CI performance related, but while it was muted,
some failures managed to sneak in.

Closes #38784

* PR review
2019-03-04 08:50:55 +02:00
Alan Woodward 71b8494181
Upgrade to lucene 8.0.0-snapshot-ff9509a8df (#39444)
Backport of #39350

Contains the following:

* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
2019-02-27 14:36:08 +00:00
Jason Tedor 224600f370
Bump jackson-databind version for AWS SDK (#39183)
This commit bumps the jackson-databind version for discovery-ec2 and
repository-s3 to 2.8.11.3.
2019-02-20 13:04:50 -05:00
Henning Andersen 00a26b9dd2 Blob store compression fix (#39073)
Blob store compression was not enabled for some of the files in
snapshots due to constructor accessing sub-class fields. Fixed to
instead accept compress field as constructor param. Also fixed chunk
size validation to work.

Deprecated repositories.fs.compress setting as well to be able to unify
in a future commit.
2019-02-20 09:24:41 +01:00
Alan Woodward 176013e23c Avoid double term construction in DfsPhase (#38716)
DfsPhase captures terms used for scoring a query in order to build global term statistics across
multiple shards for more accurate scoring. It currently does this by building the query's `Weight`
and calling `extractTerms` on it to collect terms, and then calling `IndexSearcher.termStatistics()`
for each collected term. This duplicates work, however, as the various `Weight` implementations 
will already have collected these statistics at construction time.

This commit replaces this round-about way of collecting stats, instead using a delegating
IndexSearcher that collects the term contexts and statistics when `IndexSearcher.termStatistics()`
is called from the Weight.

It also fixes a bug when using rescorers, where a `QueryRescorer` would calculate distributed term
statistics, but ignore field statistics.  `Rescorer.extractTerms` has been removed, and replaced with
a new method on `RescoreContext` that returns any queries used by the rescore implementation.
The delegating IndexSearcher then collects term contexts and statistics in the same way described
above for each Query.
2019-02-15 16:00:38 +00:00
Tanguy Leroux 510829f9f7
TransportVerifyShardBeforeCloseAction should force a flush (#38401)
This commit changes the `TransportVerifyShardBeforeCloseAction` so that it 
always forces the flush of the shard. It seems that #37961 is not sufficient to 
ensure that the translog and the Lucene commit share the exact same max 
seq no and global checkpoint information in case of one or more noop 
operations have been made.

The `BulkWithUpdatesIT.testThatMissingIndexDoesNotAbortFullBulkRequest` 
and `FrozenIndexTests.testFreezeEmptyIndexWithTranslogOps` test this trivial 
situation and they both fail 1 on 10 executions.

Relates to #33888
2019-02-06 13:22:54 +01:00