This commit reintroduces 31251c9 and 63a5799. These commits introduced a
memory leak and were reverted. This commit brings those commits back
and fixes the memory leak by removing unnecessary retain method calls.
This reverts commit 31251c9 introduced in #30695.
We suspect this commit is causing the OOME's reported in #30811 and we will use this PR to test this assertion.
* master:
QA: Add xpack tests to rolling upgrade (#30795)
Modify state of VerifyRepositoryResponse for bwc (#30762)
Reduce CLI scripts to one-liners on Windows (#30772)
Simplify number of shards setting (#30783)
Replace Request#setHeaders with addHeader (#30588)
[TEST] remove endless wait in RestClientTests (#30776)
[Docs] Fix script-fields snippet execution (#30693)
Upgrade to Lucene-7.4.0-snapshot-cc2ee23050 (#30778)
[DOCS] Add SAML configuration information (#30548)
[DOCS] Remove X-Pack references from SQL CLI (#30694)
Make http pipelining support mandatory (#30695)
[Docs] Fix typo in circuit breaker docs (#29659)
[Feature] Adding a char_group tokenizer (#24186)
[Docs] Fix broken cross link in documentation
Test: wait for netty threads in a JUnit ClassRule (#30763)
Increase the maximum number of filters that may be in the cache. (#30655)
[Security] Include an empty json object in an json array when FLS filters out all fields (#30709)
[TEST] Wait for CS to be fully applied in testDeleteCreateInOneBulk
Add more yaml tests for get alias API (#29513)
Ignore empty completion input (#30713)
[DOCS] fixed incorrect default
[ML] Filter undefined job groups from update calendar actions (#30757)
Fix docs failure on language analyzers (#30722)
[Docs] Fix inconsistencies in snapshot/restore doc (#30480)
Enable installing plugins from snapshots.elastic.co (#30765)
Remove fedora 26, add 28 (#30683)
Accept Gradle build scan agreement (#30645)
Remove logging from elasticsearch-nio jar (#30761)
Add Delete Repository High Level REST API (#30666)
The new snapshot includes LUCENE-8324 which fixes missing checkpoint
after a fully deletes segment is dropped on flush. This snapshot should
resolves failed tests in the CorruptedFileIT suite.
Closes#30741Closes#30577
This is related to #29500 and #28898. This commit removes the abilitiy
to disable http pipelining. After this commit, any elasticsearch node
will support pipelined requests from a client. Additionally, it extracts
some of the http pipelining work to the server module. This extracted
work is used to implement pipelining for the nio plugin.
This is related to #27260. The elasticsearch-nio jar is supposed to be
a library opposed to a framework. Currently it internally logs certain
exceptions. This commit modifies it to not rely on logging. Instead
exception handlers are passed by the applications that use the jar.
* master:
Reduce CLI scripts to one-liners (#30759)
SQL: Preserve scoring in bool queries (#30730)
QA: Switch rolling upgrade to 3 nodes (#30728)
[TEST] Enable DEBUG logging on testAutoQueueSizingWithMax
[ML] Don't install empty ML metadata on startup (#30751)
Add assertion on removing copy_settings (#30748)
bump lucene version for 6_3_0
[DOCS] Mark painless execute api as experimental (#30710)
disable annotation processor for docs (#30610)
Add more script contexts (#30721)
Fix default shards count in create index docs (#30747)
Mute testCorruptFileThenSnapshotAndRestore
Added dedicated script contexts for:
* script function score
* script sorting
* terms_set query
Scripts for these contexts will either have a specific return value or
use scoring and therefor in the future will need their own scripting classes.
Relates to #30511
* master:
Scripting: Remove getDate methods from ScriptDocValues (#30690)
Upgrade to Lucene-7.4.0-snapshot-59f2b7aec2 (#30726)
[Docs] Fix single page :docs:check invocation (#30725)
Docs: Add uptasticsearch to list of clients (#30738)
[DOCS] Removes out-dated x-pack/docs/en/index.asciidoc
[DOCS] Removes redundant index.asciidoc files (#30707)
[TEST] Reduce forecast overflow to disk test memory limit (#30727)
Plugins: Remove meta plugins (#30670)
[DOCS] Moves X-Pack configurationg pages in table of contents (#30702)
TEST: Add engine log to testCorruptFileThenSnapshotAndRestore
[ML][TEST] Fix bucket count assertion in ModelPlotsIT (#30717)
[ML][TEST] Make AutodetectMemoryLimitIT less fragile (#30716)
Default copy settings to true and deprecate on the REST layer (#30598)
[Build] Add test admin when starting gradle run with trial license and
This implementation lazily (on 1st forecast request) checks for available diskspace and creates a subfolder for storing data outside of Lucene indexes, but as part of the ES data paths.
Tests: Fail if test watches could not be triggered (#30392)
[ML] add version information in case of crash of native ML process (#30674)
Make TransportClusterStateAction abide to our style (#30697)
Change required version for Get Settings transport API changes to 6.4.0 (#30706)
[DOCS] Fixes edit URLs for stack overview (#30583)
Silence sleep based watcher test
[TEST] Adjust version skips for movavg/movfn tests
[DOCS] Replace X-Pack terms with attributes
[ML] Clean left behind model state docs (#30659)
Correct typos
filters agg docs duplicated 'bucket' word removal (#30677)
top_hits doc example description update (#30676)
[Docs] Replace InetSocketTransportAddress with TransportAdress (#30673)
[TEST] Account for increase in ML C++ memory usage (#30675)
User proper write-once semantics for GCS repository (#30438)
Remove bogus file accidentally added
Add detailed assert message to IndexAuditUpgradeIT (#30669)
Adjust fast forward for token expiration test (#30668)
Improve explanation in rescore (#30629)
Deprecate `nGram` and `edgeNGram` names for ngram filters (#30209)
Watcher: Fix watch history template for dynamic slack attachments (#30172)
Fix _cluster/state to always return cluster_uuid (#30656)
[Tests] Add debug information to CorruptedFileIT
# Conflicts:
# test/framework/src/main/java/org/elasticsearch/indices/analysis/AnalysisFactoryTestCase.java
Meta plugins existed only for a short time, in order to enable breaking
up x-pack into multiple plugins. However, now that x-pack is no longer
installed as a plugin, the need for them has disappeared. This commit
removes the meta plugins infrastructure.
There's no need for an extra blobExists() call when writing a blob to the GCS service. GCS provides
an option (with stronger consistency guarantees) on the insert method that guarantees that the
blob that's uploaded does not already exist.
Relates to #19749
* es/master: (74 commits)
Preserve REST client auth despite 401 response (#30558)
[test] packaging: add windows boxes (#30402)
Make xpack modules instead of a meta plugin (#30589)
Mute ShrinkIndexIT
[ML] DeleteExpiredDataAction should use client with origin (#30646)
Reindex: Fixed typo in assertion failure message (#30619)
[DOCS] Fixes list of unconverted snippets in build.gradle
[DOCS] Reorganizes RBAC documentation
SQL: Remove dependency for server's version from JDBC driver (#30631)
Test: increase search logging for LicensingTests
Adjust serialization version in IndicesOptions
[TEST] Fix compilation
Remove version argument in RangeFieldType (#30411)
Remove unused DirectoryUtils class. (#30582)
Mitigate date histogram slowdowns with non-fixed timezones. (#30534)
Add a MovingFunction pipeline aggregation, deprecate MovingAvg agg (#29594)
Removes AwaitsFix on IndicesOptionsTests
Template upgrades should happen in a system context (#30621)
Fix bug in BucketMetrics path traversal (#30632)
Fixes IndiceOptionsTests to serialise correctly (#30644)
...
This commit is related to #28898. It adds an nio driven http server
transport. Currently it only supports basic http features. Cors,
pipeling, and read timeouts will need to be added in future PRs.
This does away with the deprecated `com.google.api-client:google-api-client:1.23`
and replaces it with `com.google.cloud:google-cloud-storage:1.28.0`.
It also changes security permissions for the repository-gcs plugin.
* master:
Default to one shard (#30539)
Unmute IndexUpgradeIT tests
Forbid expensive query parts in ranking evaluation (#30151)
Docs: Update HighLevelRestClient migration docs (#30544)
Clients: Switch to new performRequest (#30543)
[TEST] Fix typo in MovAvgIT test
Add missing dependencies on testClasses (#30527)
[TEST] Mute ML test that needs updating to following ml-cpp changes
Document woes between auto-expand-replicas and allocation filtering (#30531)
Moved tokenizers to analysis common module (#30538)
Adjust copy settings versions
Mute ShrinkIndexIT suite
SQL: SYS TABLES ordered according to *DBC specs (#30530)
Deprecate not copy settings and explicitly disallow (#30404)
[ML] Improve state persistence log message
Build: Add mavenPlugin cluster configuration method (#30541)
Re-enable FlushIT tests
Bump Gradle heap to 2 GB (#30535)
SQL: Use request flavored methods in tests (#30345)
Suppress hdfsFixture if there are spaces in the path (#30302)
Delete temporary blobs before creating index file (#30528)
Watcher: Remove TriggerEngine.getJobCount() (#30395)
[ML] Fix wire BWC for JobUpdate (#30512)
Use simpler write-once semantics for FS repository (#30435)
Derive max composite buffers from max content len
Use simpler write-once semantics for HDFS repository (#30439)
SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
Mute two tests in FlushIT with @AwaitsFix.
Fix incorrect template name in test case
Build: Remove legacy bwc files from xpack (#30485)
Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
Security: cleanup code in file stores (#30348)
Security: fix TokenMetaData equals and hashcode (#30347)
Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
SQL: Improve compatibility with MS query (#30516)
SQL: Fix parsing of dates with milliseconds (#30419)
HDFS sets its thread-name format based partly on a URL-encoded version of the
path, but the URL-encoding of spaces as `%20` is interpreted as a field in the
formatted string of type `2`, which is nonsensical. This change simply skips
these tests in this case.
There's no need for an extra `blobExists()` call when writing a blob to the HDFS service. The writeBlob implementation for the HDFS repository already uses the `CreateFlag.CREATE` option on the file creation, which ensures that the blob that's uploaded does not already exist. This saves one network roundtrip.
* master:
Upgrade to Lucene-7.4-snapshot-6705632810 (#30519)
add version compatibility from 6.4.0 after backport, see #30319 (#30390)
Security: Simplify security index listeners (#30466)
Add proper longitude validation in geo_polygon_query (#30497)
Remove Discovery.AckListener.onTimeout() (#30514)
Build: move generated-resources to build (#30366)
Reindex: Fold "with all deps" project into reindex (#30154)
Isolate REST client single host tests (#30504)
Solve Gradle deprecation warnings around shadowJar (#30483)
SAML: Process only signed data (#30420)
Remove BWC repository test (#30500)
Build: Remove xpack specific run task (#30487)
AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
LLClient: Add setJsonEntity (#30447)
Expose CommonStatsFlags directly in IndicesStatsRequest. (#30163)
Silence IndexUpgradeIT test failures. (#30430)
Bump Gradle heap to 1792m (#30484)
[docs] add warning for read-write indices in force merge documentation (#28869)
Avoid deadlocks in cache (#30461)
Test: remove hardcoded list of unconfigured ciphers (#30367)
mute SplitIndexIT due to https://github.com/elastic/elasticsearch/issues/30416
Docs: Test examples that recreate lang analyzers (#29535)
BulkProcessor to retry based on status code (#29329)
Add GET Repository High Level REST API (#30362)
add a comment explaining the need for RetryOnReplicaException on missing mappings
Add `coordinating_only` node selector (#30313)
Stop forking groovyc (#30471)
Avoid setting connection request timeout (#30384)
Use date format in `date_range` mapping before fallback to default (#29310)
Watcher: Increase HttpClient parallel sent requests (#30130)
# Conflicts:
# x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/LocalStateCompositeXPackPlugin.java
* elastic-master:
Watcher: Mark watcher as started only after loading watches (#30403)
Pass the task to broadcast actions (#29672)
Disable REST default settings testing until #29229 is back-ported
Correct wording in log message (#30336)
Do not fail snapshot when deleting a missing snapshotted file (#30332)
AwaitsFix testCreateShrinkIndexToN
DOCS: Correct mapping tags in put-template api
DOCS: Fix broken link in the put index template api
Add put index template api to high level rest client (#30400)
Relax testAckedIndexing to allow document updating
[Docs] Add snippets for POS stop tags default value
Move respect accept header on no handler to 6.3.1
Respect accept header on no handler (#30383)
[Test] Add analysis-nori plugin to the vagrant tests
[Docs] Fix bad link
[Docs] Fix end of section in the korean plugin docs
Expose the Lucene Korean analyzer module in a plugin (#30397)
Docs: remove transport_client from CCS role example (#30263)
[Rollup] Validate timezone in range queries (#30338)
Use readFully() to read bytes from CipherInputStream (#28515)
Fix docs Recently merged #29229 had a doc bug that broke the doc build. This commit fixes.
Test: remove cluster permission from CCS user (#30262)
Add Get Settings API support to java high-level rest client (#29229)
Watcher: Remove unneeded index deletion in tests
This change adds a new plugin called `analysis-nori` that exposes
Korean text analysis in es using the new Lucene Korean analyzer module named (`nori`).
The plugin adds:
* a Korean analyzer: `nori`
* a Korean tokenizer: `nori_tokenizer`
* a part of speech stop filter: `nori_part_of_speech`
* a filter that can replace Hanja characters with their Hangul transcription: `nori_readingform`
* master:
Set the new lucene version for 6.4.0
[ML][TEST] Clean up jobs in ModelPlotIT
Upgrade to 7.4.0-snapshot-1ed95c097b (#30357)
Watcher: Ensure trigger service pauses execution (#30363)
[DOCS] Added coming qualifiers in changelog
[DOCS] Commented out empty sections in the changelog to fix the doc build. (#30372)
Security: reduce garbage during index resolution (#30180)
Make RepositoriesMetaData contents unmodifiable (#30361)
Change quad tree max levels to 29. Closes#21191 (#29663)
Test: use trial license in qa tests with security
[ML] Add integration test for model plots (#30359)
SQL: Fix bug caused by empty composites (#30343)
[ML] Account for gaps in data counts after job is reopened (#30294)
InternalEngineTests.testConcurrentOutOfOrderDocsOnReplica should use two documents (#30121)
Change signature of Get Repositories Response (#30333)
Tests: Use different watch ids per test in smoke test (#30331)
[Docs] Add term query with normalizer example
Adds Eclipse config for xpack licence headers (#30299)
Watcher: Make start/stop cycle more predictable and synchronous (#30118)
[test] add debug logging for packaging test
[DOCS] Removed X-Pack Breaking Changes
[DOCS] Fixes link to TLS LDAP info
Update versions for start_trial after backport (#30218)
Packaging: Set elasticsearch user to have non-existent homedir (#29007)
[DOCS] Fixes broken links to bootstrap user (#30349)
Fix NPE when CumulativeSum agg encounters null/empty bucket (#29641)
Make licensing FIPS-140 compliant (#30251)
[DOCS] Reorganizes authentication details in Stack Overview (#30280)
Network: Remove http.enabled setting (#29601)
Fix merging logic of Suggester Options (#29514)
[DOCS] Adds LDAP realm configuration details (#30214)
[DOCS] Adds native realm configuration details (#30215)
ReplicationTracker.markAllocationIdAsInSync may hang if allocation is cancelled (#30316)
[DOCS] Enables edit links for X-Pack pages (#30278)
Packaging: Unmark systemd service file as a config file (#29004)
SQL: Reduce number of ranges generated for comparisons (#30267)
Tests: Simplify VersionUtils released version splitting (#30322)
Cancelling a peer recovery on the source can leak a primary permit (#30318)
Added changelog entry for deb prerelease version change (#30184)
Convert server javadoc to html5 (#30279)
Create default ES_TMPDIR on Windows (#30325)
[Docs] Clarify `fuzzy_like_this` redirect (#30183)
Post backport of #29658.
Fix docs of the `_ignored` meta field.
Remove MapperService#types(). (#29617)
Remove useless version checks in REST tests. (#30165)
Add a new `_ignored` meta field. (#29658)
Move repository-azure fixture test to QA project (#30253)
# Conflicts:
# buildSrc/version.properties
# server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
Upgrade to lucene-7.4.0-snapshot-1ed95c097b
This version contains:
* An Analyzer for Korean
* An IntervalQuery and IntervalsSource that retrieve minimum intervals of positional queries.
* A new API to retrieve matches (offsets and positions) of a query for a single document.
* Support for soft deletes in the index writer.
* A fixed shingle filter that handles index time synonyms.
* Support for emoji sequence in ICUTokenizer (with an upgrade to icu 61.1)
Similarly to what has been done in for the repository-s3 plugin, this
pull request moves the fixture test into a dedicated
repository-azure/qa/microsoft-azure-storage project.
It also exposes some environment variables which allows to execute the
integration tests against the real Azure Storage service. When the
environment variables are not defined, the integration tests are
executed using the fixture added in #29347.
Closes#29349
Similarly to what has been done in for the repository-s3 plugin,
this commit moves the fixture test into a dedicated
repository-gcs/qa/google-cloud-storage project.
It also exposes some environment variables which allows to
execute the integration tests against the real Google Cloud
Storage service. When the environment variables are not
defined, the integration tests are executed using the fixture
added in #28788. Related to #29349.
This *mostly* silences `javadoc`'s warning about defaulting to
generating html4 files by enabling generating html5 file for the
projects for which that works. It didn't work in a half dozen projects,
about half of which I've fixed in this PR, entirely by replacing
`<tt>thing</tt>` with `{@code thing}`.
There are a few remaining projects that contain javadoc with invalid
html5. I'll fix those projects in a followup.
This commit moves the repository-s3 fixture test added in #29296 in a
new `repository-s3/qa/amazon-s3` project. This new project allows the
REST integration tests to be executed using the real S3 service when
all the required environment variables are provided. When no env var
is provided, then the tests are executed using the fixture added
in #29296.
The REST tests located at the `repository-s3`plugin project now only
verify that the plugin is correctly loaded.
The REST tests have been adapted to allow a bucket name and a base
path to be specified as env vars. This way it is possible to run the tests
with different base paths (could be anything, like a CI job name or a
branch name) without multiplicating buckets.
Related to #29349
This commit moves the apache and elastic license files into a new
root level `licenses` directory and rewrites the top level LICENSE.txt
to clarify the repository has a mix of apache and elastic licensed code.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
* es/master:
Add remote cluster client (#29495)
Ensure flush happens on shard idle
Adds SpanGapQueryBuilder in the query DSL (#28636)
Control max size and count of warning headers (#28427)
Make index APIs work without types. (#29479)
Deprecate filtering on `_type`. (#29468)
Fix auto-generated ID example format (#29461)
Fix typo in max number of threads check docs (#29469)
Add primary term to translog header (#29227)
Add a helper method to get a random java.util.TimeZone (#29487)
Move TimeValue into elasticsearch-core project (#29486)
Fix NPE in InternalGeoCentroidTests#testReduceRandom (#29481)
Build: introduce keystoreFile for cluster config (#29491)
test: Index more docs, so that it is less likely the search request does not time out.
This commit introduces built in support for adding files to the
keystore when configuring the integration test cluster for a project.
In order to use this support, simply add `keystoreFile` followed by the
secure setting name and the path to the source file inside the
integTestCluster closure for a project. The built in support will
handle the creation of the keystore and the addition of the file to the
keystore.
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
* Move Streams.copy into elasticsearch-core and make a multi-release jar
This moves the method `Streams.copy(InputStream in, OutputStream out)` into the
`elasticsearch-core` project (inside the `o.e.core.internal.io` package). It
also makes this class into a multi-release class where the Java 9 equivalent
uses `InputStream#transferTo`.
This is a followup from
https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495
This commit adds a new fixture that emulates an
Azure Storage service in order to improve the
existing integration tests. This is very similar
to what has been made for Google Cloud Storage
in #28788 and for Amazon S3 in #29296, and it
would have helped a lot to catch bugs like #22534.
The repository-gcs unit tests rely on the GoogleCloudStorageTestServer
but it would be better if they rely on a mocked Storage client instead.
That would also help to extract the GoogleCloudStorageFixture and the
GoogleCloudStorageTestServer classes in a QA third party project.
Closes#28960
This commit adds the S3BlobStoreRepositoryTests class that extends the
base testing class for S3. It also removes some usage of socket servers
that emulate socket connections in unit tests. It was added to trigger
security exceptions, but this won't be needed anymore since #29296
is merged.
Today when you input a byte size setting that is out of bounds for the
setting, you get an error message that indicates the maximum value of
the setting. The problem is that because we use ByteSize#toString, we
end up with a representation of the value that does not really tell you
what the bound is. For example, if the bound is 2^31 - 1 bytes, the
output would be 1.9gb which does not really tell you want the limit as
there are many byte size values that we format to the same 1.9gb with
ByteSize#toString. We have a method ByteSize#getStringRep that uses the
input units to the value as the output units for the string
representation, so we end up with no loss if we use this to report the
bound. This commit does this.
This commit adds a new fixture that emulates a S3 service in order to
improve the existing integration tests. This is very similar to what has
been made for Google Cloud Storage in #28788, and such tests would
have helped a lot to catch bugs like #22534.
The AmazonS3Fixture is brittle and only implements the very necessary
stuff for the S3 repository to work, but at least it works and can be
adapted for specific tests needs.
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
* Decouple XContentBuilder from BytesReference
This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.
While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).
Relates to #28504
We today support a global `indexed_chars` processor parameter. But in some cases, users would like to set this limit depending on the document itself.
It used to be supported in mapper-attachments plugin by extracting the limit value from a meta field in the document sent to indexation process.
We add an option which reads this limit value from the document itself
by adding a setting named `indexed_chars_field`.
Which allows running:
```
PUT _ingest/pipeline/attachment
{
"description" : "Extract attachment information. Used to parse pdf and office files",
"processors" : [
{
"attachment" : {
"field" : "data",
"indexed_chars_field" : "size"
}
}
]
}
```
Then index either:
```
PUT index/doc/1?pipeline=attachment
{
"data": "BASE64"
}
```
Which will use the default value (or the one defined by `indexed_chars`)
Or
```
PUT index/doc/2?pipeline=attachment
{
"data": "BASE64",
"size": 1000
}
```
Closes#28942
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation that some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
With this commit we skip all GeoIpProcessorFactoryTests on Windows.
These tests use a MappedByteBuffer which will keep its file mappings
until it is garbage-collected. As a consequence, the corresponding
file appears to be still in use, Windows cannot delete it and the test
will fail in teardown.
Closes#29001
Windows has some strong limitations on command line arguments,
specially when it's too long. In the googlecloudstoragefixture anttask
the classpath argument is very long and the command fails. This commit
removes the classpath as an argument and uses the CLASSPATH
environment variable instead.
With this commit we reduce heap usage of the ingest-geoip plugin by
memory-mapping the database files. Previously, we have stored these
files gzip-compressed but this has resulted that data are loaded on the
heap.
Closes#28782
This commit adds a GoogleCloudStorageFixture that uses the
logic of a GoogleCloudStorageTestServer (added in #28576)
to emulate a remote Google Cloud Storage service.
By adding this fixture and a more complete integration test, we
should be able to catch more bugs when upgrading the client library.
The fixture is started by the googleCloudStorageFixture task
and a custom Service Account file is created and added to the
Elasticsearch keystore for each test.
This is related to #27260. The transport-nio plugin needs socket
permissions to operate as a transport. This commit gives it these
permissions in the policy file.
This commit is related to #27260. Currently there is a weird
relationship between channel contexts and nio channels. The selectors
use the context for read and writing. But the selector operates directly
on the nio channel for registering, closing, and connecting.
This commit works on improving this relationship. The selector operates
directly on the context which wraps the low level java.nio.channels. The
NioChannel class is simply an API that is used to interact with the
channel (sending messages from outside the selector event loop,
scheduling a close, adding listeners, etc). The context is only used
internally by the channel to implement these apis and by the selector to
perform these operations.
Similarly to what has been done for s3 and azure, this commit removes
the repository settings `application_name` and `connect/read_timeout`
in favor of client settings. It introduce a GoogleCloudStorageClientSettings
class (similar to S3ClientSettings) and a bunch of unit tests for that,
it aligns the documentation to be more coherent with the S3 one, it
documents the connect/read timeouts that were not documented at all and
also adds a new client setting that allows to define a custom endpoint.
This is related to #28662. It wraps the azure repository inputstream in
an inputstream that ensures `read` calls have socket permissions. This
is because the azure inputstream internally makes service calls.
This pull request extracts in a dedicated class the request/response
logic that "emulates" a Google Cloud Storage service in our
repository-gcs tests.
The idea behind this is to make the logic more reusable. The class
MockHttpTransport has been renamed to MockStorage which now
only takes care of instantiating a Storage client and does the low-level
request/response plumbing needed by this client.
The "Google Cloud Storage" logic has been extracted from
MockHttpTransport and put in a new GoogleCloudStorageTestServer
that is now independent from the google client testing framework.
GceDiscoverTests can be simplified in a similar manner than #27945. It
now uses a mocked GceInstancesService that exposes internal test cluster
nodes as if they were real GCE nodes. It should also make the test more
robust by not using a HTTP server anymore.
closes#24313
The TikaImpl#parse method comment sounds like this method is only used
in the same package for testing, but AttachmentProcessor uses it outside
of testing, so we should remove this comment.
Tika parsers need accessDeclaredMembers because ZipFile needs
accessDeclaredMembers on JDK 10. This commit guards adding this
permission to parsers so that the permission is only granted on JDK
10. Additionally, we add an assertion that forces us to check if the
permission is still needed in JDK 11.
Relates #28603
Tests on jdk10 were failing because of a change in its ZipFile implementation
that now needs `accessDeclaredMembers` permissions. This change adds
the missing permission to the plugins security policy and TikaImpl.
Closes#28568
* Move to non-deprecated XContentHelper.createParser(...)
This moves away from one of the now-deprecated XContentHelper.createParser
methods in favor of specifying the deprecation logger at parser creation time.
Relates to #28449
Note that this doesn't move all the `createParser` calls because some of them
use the already-deprecated method that doesn't specify the XContentType.
* Remove the deprecated (and now non-needed) createParser method
This pull request replaces the jvm-example plugin (from the jvm/site plugins era) by two new plugins: a custom-settings that shows how to register and use custom settings (including secured settings) in a plugin, and rest-handler plugin that shows how to register a rest handler.
The two plugins now reside in the plugins/examples project. They can serve as sample plugins for users, a special attention has been put on documentation. The packaging tests have been adapted to use the custom-settings plugin.
This commit is related to #27260. Currently have a channel context that
implements reading and writing logic for socket channels. Additionally,
we have exception contexts to handle exceptions. And accepting contexts
to handle accepted channels. This PR introduces a ChannelContext that
handles close and exception handling for all channel types.
Additionally, it has implementers that provide specific functionality
for socket channels (read and writing). And specific functionality for
server channels (accepting).
This commit adds a gradle plugin to ease development of meta plugins.
Applying the plugin will generated the meta plugin properties based on
the es_meta_plugin configuration object, which includes name and
description. The plugins to include within the meta plugin are
configured through the `plugins` list. An integ test task is also
automatically added.
This commit is related to #27260. Right now we have separate read and
write contexts for implementing specific protocol logic. However, some
protocols require a closer relationship between read and write
operations than is allowed by our current model. An example is HTTP
which might require a write if some problem with request parsing was
encountered.
Additionally, some protocols require close messages to be sent when a
channel is shutdown. This is also problematic in our current model,
where we assume that channels should simply be queued for close and
forgotten.
This commit transitions to a single ChannelContext which implements
all read, write, and close logic for protocols. It is the job of the
context to tell the selector when to close the channel. A channel can
still be manually queued for close with a selector. This is how server
channels are closed for now. And this route allows timeout mechanisms on
normal channel closes to be implemented.
This one is interesting. The third party audit task runs inside the
Gradle JVM. This means that if Gradle is started on JDK 8, the third
party audit tasks will fail as a result of the changes to support
building Elasticsearch with the JDK 9 compiler. This commit reverts the
third party audit changes to support running this task when Gradle is
started with JDK 8.
Relates #28256
This commit modifies the build to require JDK 9 for
compilation. Henceforth, we will compile with a JDK 9 compiler targeting
JDK 8 as the class file format. Optionally, RUNTIME_JAVA_HOME can be set
as the runtime JDK used for running tests. To enable this change, we
separate the meaning of the compiler Java home versus the runtime Java
home. If the runtime Java home is not set (via RUNTIME_JAVA_HOME) then
we fallback to using JAVA_HOME as the runtime Java home. This enables:
- developers only have to set one Java home (JAVA_HOME)
- developers can set an optional Java home (RUNTIME_JAVA_HOME) to test
on the minimum supported runtime
- we can test compiling with JDK 9 running on JDK 8 and compiling with
JDK 9 running on JDK 9 in CI
This commit adds a PainlessExtension which may be plugged in via SPI to
add additional classes, methods and members to the painless whitelist on
a per context basis. An example plugin adding and using a whitelist is
also added.
This commit changes the phonetic filter factory to use a DaitchMokotoffSoundexFilter
instead of a PhoneticFilter with a daitch_mokotoff encoder when daitch_mokotoff is selected.
The latter does not hanlde branching when computing the soundex and fails to encode multiple
variations when possible.
Closes#28211
The method `initiateChannel` on `TcpTransport` is explicit in that
channels can be connect asynchronously. All production implementations
do connect asynchronously. Only the blocking `MockTcpTransport`
connects in a synchronous manner. This avoids testing some of the
blocking code in `TcpTransport` that waits on connections to complete.
Additionally, it requires a more extensive method signature than
required for other transports.
This commit modifies the `MockTcpTransport` to make these connections
asynchronously on a different thread. Additionally, it simplifies that
`initiateChannel` method signature.
* This change makes sure that we don't detect a file path containing a ':' as
a maven coordinate (e.g.: `file:C:\path\to\zip`)
* restore test muted on master
This commit adds the ability to package multiple plugins in a single zip.
The zip file for a meta plugin must contains the following structure:
|____elasticsearch/
| |____ <plugin1> <-- The plugin files for plugin1 (the content of the elastisearch directory)
| |____ <plugin2> <-- The plugin files for plugin2
| |____ meta-plugin-descriptor.properties <-- example contents below
The meta plugin properties descriptor is mandatory and must contain the following properties:
description: simple summary of the meta plugin.
name: the meta plugin name
The installation process installs each plugin in a sub-folder inside the meta plugin directory.
The example above would create the following structure in the plugins directory:
|_____ plugins
| |____ <name_of_the_meta_plugin>
| | |____ meta-plugin-descriptor.properties
| | |____ <plugin1>
| | |____ <plugin2>
If the sub plugins contain a config or a bin directory, they are copied in a sub folder inside the meta plugin config/bin directory.
|_____ config
| |____ <name_of_the_meta_plugin>
| | |____ <plugin1>
| | |____ <plugin2>
|_____ bin
| |____ <name_of_the_meta_plugin>
| | |____ <plugin1>
| | |____ <plugin2>
The sub-plugins are loaded at startup like normal plugins with the same restrictions; they have a separate class loader and a sub-plugin
cannot have the same name than another plugin (or a sub-plugin inside another meta plugin).
It is also not possible to remove a sub-plugin inside a meta plugin, only full removal of the meta plugin is allowed.
Closes#27316
This commit is related to #27260. It moves the TcpChannelFactory into
NioTransport so that consumers do not have to be passed around.
Additionally it deletes an unused read handler.
This is related to #27260. This commit moves the NioTransport from
:test:framework to a new nio-transport plugin. Additionally, supporting
tcp decoding classes are moved to this plugin. Generic byte reading and
writing contexts are moved to the nio library.
Additionally, this commit adds a basic MockNioTransport to
:test:framework that is a TcpTransport implementation for testing that
is driven by nio.
This commit adds the infrastructure to plugin building and loading to
allow one plugin to extend another. That is, one plugin may extend
another by the "parent" plugin allowing itself to be extended through
java SPI. When all plugins extending a plugin are finished loading, the
"parent" plugin has a callback (through the ExtensiblePlugin interface)
allowing it to reload SPI.
This commit also adds an example plugin which uses as-yet implemented
extensibility (adding to the painless whitelist).
This commit changes some Azure tests so that they do not rely on
MockZenPing and TestZenDiscovery anymore, but instead use a mocked
AzureComputeService that exposes internal test cluster nodes as if
they were real Azure nodes.
Related to #27859Closes#27917, #11533
TestZenDiscovery is used to allow discovery based on in memory structures. This isn't a relevant for the cloud providers tests (but isn't a problem at the moment either)
* Fixes ByteSizeValue to serialise correctly
This fix makes a few fixes to ByteSizeValue to make it possible to perform round-trip serialisation:
* Changes wire serialisation to use Zlong methods instead of VLong methods. This is needed because the value `-1` is accepted but previously if `-1` is supplied it cannot be serialised using the wire protocol.
* Limits the supplied size to be no more than Long.MAX_VALUE when converted to bytes. Previously values greater than Long.MAX_VALUE bytes were accepted but would be silently interpreted as Long.MAX_VALUE bytes rather than erroring so the user had no idea the value was not being used the way they had intended. I consider this a bug and so fine to include this bug fix in a minor version but I am open to other points of view.
* Adds a `getStringRep()` method that can be used when serialising the value to JSON. This will print the bytes value if the size is positive, `”0”` if the size is `0` and `”-1”` if the size is `-1`.
* Adds logic to detect fractional values when parsing from a String and emits a deprecation warning in this case.
* Modifies hashCode and equals methods to work with long values rather than doubles so they don’t run into precision problems when dealing with large values. Previous to this change the equals method would not detect small differences in the values (e.g. 1-1000 bytes ranges) if the actual values where very large (e.g. PBs). This was due to the values being in the order of 10^18 but doubles only maintaining a precision of ~10^15.
Closes#27568
* Fix bytes settings default value to not use fractional values
* Fixes test
* Addresses review comments
* Modifies parsing to preserve unit
This should be bwc since in the case that the input is fractional it reverts back to the old method of parsing it to the bytes value.
* Addresses more review comments
* Fixes tests
* Temporarily changes version check to 7.0.0
This will be changed to 6.2 when the fix has been backported
This pull request changes the S3BlobContainer.blobExists() method implementation
to make it use the AmazonS3.doesObjectExist() method instead of
AmazonS3.getObjectMetadata(). The AmazonS3 implementation takes care of
catching any thrown AmazonS3Exception and compares its response code with 404,
returning false (object does not exist) or lets the exception be propagated.
Add support for filtering fields returned as part of mappings in get index, get mappings, get field mappings and field capabilities API.
Plugins can plug in their own function, which receives the index as argument, and return a predicate which controls whether each field is included or not in the returned output.
This commit adds the node name to the names of thread pool executors so
that the node name is visible in rejected execution exception messages.
Relates #27663
Using custom rules in the icu_collation filter can fail on Windows. If the rules are interpreted
as a file location, this leads to an InvalidPathException when trying to read the rules from a file.
This new snapshot mostly brings a change to TopFieldCollector which can now
early terminate collection when trackTotalHits is `false`.
As a follow-up, we should replace our usage of
`EarlyTerminatingSortingCollector` with this new option.
* Sense HA HDFS settings and remove permission restrictions during regular execution.
This PR adds integration tests for HA-Enabled HDFS deployments, both regular and secured.
The Mini HDFS fixture has been updated to optionally run in HA-Mode. A new test suite has
been added for reproducing the effects of a Namenode failing over during regular repository
usage. Going forward, the HDFS Repository will still be subject to its self imposed permission
restrictions during normal use, but will no longer restrict them when running against an HA
enabled HDFS cluster. Instead, the plugin will rely on the provided security policy and not
further restrict the permissions so that the transparent operation to failover to a different
Namenode in the client does not raise security exceptions. Additionally, we are now testing the
secure mode with SASL based wire encryption of data between Elasticsearch and HDFS. This
includes a missing library (commons codec) in order to support this change.
This awaits fix has been there forever and no one seems to know what to
do with this test. I say let CI churn on it because it passed for me
three out of three times. If there is something wrong with it, we will
know quickly and can then address with the new information that we have.
The main highlight of this new snapshot is that it introduces the opportunity
for queries to opt out of caching. In case a query opts out of caching, not only
will it never be cached, but also no compound query that wraps it will be
cached.
This commit changes the DefaultHttpRequestInitializer in order to make
it create new HttpIOExceptionHandler and HttpUnsuccessfulResponseHandler
for every new HTTP request instead of reusing the same two handlers for
all requests.
Closes#27092
The AWS SDK has a transitive dependency on Jackson Databind. While the
AWS SDK was recently upgraded, the Jackson Databind dependency was not
pulled along with it to the version that the AWS SDK depends on. This
commit upgrades the dependencies for discovery-ec2 and repository-s3
plugins to match versions on the AWS SDK transitive dependencies.
Relates #27361
We use affix settings to group settings / values under a certain namespace.
In some cases like login information for instance a setting is only valid if
one or more other settings are present. For instance `x.test.user` is only valid
if there is an `x.test.passwd` present and vice versa. This change allows to specify
such a dependency to prevent settings updates that leave settings in an inconsistent
state.
Now the blob size information is available before writing anything,
the repository implementation can know upfront what will be the
more suitable API to upload the blob to S3.
This commit removes the DefaultS3OutputStream and S3OutputStream
classes and moves the implementation of the upload logic directly in the
S3BlobContainer.
related #26993closes#26969
Gradle 5.0 will remove support for colons in configuration and task
names. This commit fixes this for our build by removing all current uses
of colons in configuration and task names.
Relates #27305
Only tests should use the single argument Environment constructor. To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.
Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns. This Environment
is also available via dependency injection.
For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and do not create directories if they don't exist.
Closes#21495
* Enhances exists queries to reduce need for `_field_names`
Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.
This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.
Closes#26770
* Addresses review comments
* Addresses more review comments
* implements existsQuery explicitly on every mapper
* Reinstates ability to perform term query on `_field_names`
* Added bwc depending on index created version
* Review Comments
* Skips tests that are not supported in 6.1.0
These values will need to be changed after backporting this PR to 6.x
Currently, when we create a BeiderMorseFilter with an unspecified `languageset`,
the filter will not guess the language, which should be the default behaviour.
This change fixes this and adds a simple test for the cases with and without
provided `languageset` settings.
Closes#26771
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.
With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting. Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.
Relates to #26723
While working on #26751, I found that we are passing the container name on every single method although we don't need it as it is stored within the blobstore object already.
This commit simplifies a bit that part of the code.
It also removes `repositoryName` from AzureBlobStore which was not used anymore.
Also we move some properties in AzureBlobContainer to `private` members.
Since `#getAsMap` exposes internal representation we are trying to remove it
step by step. This commit is cleaning up some xcontent writing as well as
usage in tests
We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the usecases needs to access a key, value map based on the _leave node_ of the setting ie. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which is loosing type information and uses internal representation of the settings. Using prefix settings allows now to access such a method type-safe and natively.
Even though you annotate the Test class with `@ThirdParty` the static
code is initialized.
In that case it fails with:
```
==> Test Info: seed=529C3C6977F695FC; jvms=3; suites=6
Suite: org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests
ERROR 0.00s J2 | AzureSnapshotRestoreTests (suite) <<< FAILURES!
> Throwable #1: java.lang.IllegalStateException: to run integration tests, you need to set -Dtests.thirdparty=true and -Dtests.azure.account=azure-account -Dtests.azure.key=azure-key
> at org.elasticsearch.cloud.azure.AzureTestUtils.generateMockSecureSettings(AzureTestUtils.java:37)
> at org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests.generateMockSettings(AzureSnapshotRestoreTests.java:81)
> at org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests.<clinit>(AzureSnapshotRestoreTests.java:84)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
Completed [1/6] on J2 in 2.21s, 0 tests, 1 error <<< FAILURES!
```
Closes#26812.
(cherry picked from commit eb6d714 for master branch)
* Use Azure upload method instead of our own implementation
We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container
Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote.
And well... Let's just read the doc!
* Adapt integration tests to secure settings
That was a missing part in #23405.
* Simplify all the integration tests and *extends ESBlobStoreRepositoryIntegTestCase tests
* removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository
* merges 2 IT classes so all IT tests are ran from one single class
* We don't remove/create anymore the container between each single test but only for the test suite
While working on #26751 and doing some manual integration testing I found that this #22858 removed an important line of our code:
`AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff.
But with PR #22858, I wrote:
```java
@Override
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
if (blobStore.doesContainerExist(blobStore.container()) == false) {
throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
" creating an azure snapshot repository backed by it.");
}
}
```
instead of
```java
@Override
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
if (blobStore.doesContainerExist(blobStore.container()) == false) {
throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
" creating an azure snapshot repository backed by it.");
}
super.initializeSnapshot(snapshotId, indices, clusterMetadata);
}
```
As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved.
Closes#26777.
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
Add checks for special permissions before reading hdfs stream data. Also adds test from
readonly repository fix. MiniHDFS will now start with an existing repository with a single snapshot
contained within. Readonly Repository is created in tests and attempts to list the snapshots
within this repo.
When adding file based discovery, we added a fallback when the discovery
type was set to zen (the default, so everyone got this warning). This
commit removes the fallback for 6.0. Setting file discovery should now
happen explicitly through the hosts_provider setting.
closes#26661
The discovery-file plugin was not config path aware, so it always picked
up the default config path (from Elasticsearch home) rather than a
custom config path. This commit fixes the discovery-file plugin to
respect a custom config path.
Relates #26662
Initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the
`stoptags` are not given in the config. Also adding a test which checks that
part-of-speech tokens are removed when using the kuromoji_part_of_speech
filter.
Removing several occurrences of this typo in the docs and javadocs, seems to be
a common mistake. Corrections turn up once in a while in PRs, better to correct
some of this in one sweep.