Commit Graph

1514 Commits

Author SHA1 Message Date
Luca Cavanna a17d6cab98
Replace Request#setHeaders with addHeader (#30588)
Adding headers rather than setting them all at once seems more
user-friendly and we already do it in a similar way for parameters
(see Request#addParameter).
2018-05-22 20:32:30 +02:00
Nhat Nguyen 1918a30237
Upgrade to Lucene-7.4.0-snapshot-cc2ee23050 (#30778)
The new snapshot includes LUCENE-8324 which fixes missing checkpoint
after a fully deletes segment is dropped on flush. This snapshot should
resolves failed tests in the CorruptedFileIT suite.

Closes #30741
Closes #30577
2018-05-22 13:11:48 -04:00
Tim Brooks 31251c9a6d
Make http pipelining support mandatory (#30695)
This is related to #29500 and #28898. This commit removes the abilitiy
to disable http pipelining. After this commit, any elasticsearch node
will support pipelined requests from a client. Additionally, it extracts
some of the http pipelining work to the server module. This extracted
work is used to implement pipelining for the nio plugin.
2018-05-22 09:29:31 -06:00
Tim Brooks abf8c56a37
Remove logging from elasticsearch-nio jar (#30761)
This is related to #27260. The elasticsearch-nio jar is supposed to be
a library opposed to a framework. Currently it internally logs certain
exceptions. This commit modifies it to not rely on logging. Instead
exception handlers are passed by the applications that use the jar.
2018-05-21 20:18:12 -06:00
Nhat Nguyen 3245e78b78 Merge branch 'master' into ccr
* master:
  Scripting: Remove getDate methods from ScriptDocValues (#30690)
  Upgrade to Lucene-7.4.0-snapshot-59f2b7aec2 (#30726)
  [Docs] Fix single page :docs:check invocation (#30725)
  Docs: Add uptasticsearch to list of clients (#30738)
  [DOCS] Removes out-dated x-pack/docs/en/index.asciidoc
  [DOCS] Removes redundant index.asciidoc files (#30707)
  [TEST] Reduce forecast overflow to disk test memory limit (#30727)
  Plugins: Remove meta plugins (#30670)
  [DOCS] Moves X-Pack configurationg pages in table of contents (#30702)
  TEST: Add engine log to testCorruptFileThenSnapshotAndRestore
  [ML][TEST] Fix bucket count assertion in ModelPlotsIT (#30717)
  [ML][TEST] Make AutodetectMemoryLimitIT less fragile (#30716)
  Default copy settings to true and deprecate on the REST layer (#30598)
  [Build] Add test admin when starting gradle run with trial license and
  This implementation lazily (on 1st forecast request) checks for available diskspace and creates a subfolder for storing data outside of Lucene indexes, but as part of the ES data paths.
  Tests: Fail if test watches could not be triggered (#30392)
  [ML] add version information in case of crash of native ML process (#30674)
  Make TransportClusterStateAction abide to our style (#30697)
  Change required version for Get Settings transport API changes to 6.4.0 (#30706)
  [DOCS] Fixes edit URLs for stack overview (#30583)
  Silence sleep based watcher test
  [TEST] Adjust version skips for movavg/movfn tests
  [DOCS] Replace X-Pack terms with attributes
  [ML] Clean left behind model state docs (#30659)
  Correct typos
  filters agg docs duplicated 'bucket' word removal (#30677)
  top_hits doc example description update (#30676)
  [Docs] Replace InetSocketTransportAddress with TransportAdress (#30673)
  [TEST] Account for increase in ML C++ memory usage (#30675)
  User proper write-once semantics for GCS repository (#30438)
  Remove bogus file accidentally added
  Add detailed assert message to IndexAuditUpgradeIT (#30669)
  Adjust fast forward for token expiration test  (#30668)
  Improve explanation in rescore (#30629)
  Deprecate `nGram` and `edgeNGram` names for ngram filters (#30209)
  Watcher: Fix watch history template for dynamic slack attachments (#30172)
  Fix _cluster/state to always return cluster_uuid (#30656)
  [Tests] Add debug information to CorruptedFileIT

# Conflicts:
#	test/framework/src/main/java/org/elasticsearch/indices/analysis/AnalysisFactoryTestCase.java
2018-05-19 07:38:17 -04:00
Nhat Nguyen 67d8fc222d
Upgrade to Lucene-7.4.0-snapshot-59f2b7aec2 (#30726)
This snapshot resolves issues related to ShrinkIndexIT.
2018-05-18 18:21:39 -04:00
Ryan Ernst b3f3a4312b
Plugins: Remove meta plugins (#30670)
Meta plugins existed only for a short time, in order to enable breaking
up x-pack into multiple plugins. However, now that x-pack is no longer
installed as a plugin, the need for them has disappeared. This commit
removes the meta plugins infrastructure.
2018-05-18 10:56:08 -07:00
Martijn van Groningen 5298237847
Merge remote-tracking branch 'es/master' into ccr
* es/master: (74 commits)
  Preserve REST client auth despite 401 response (#30558)
  [test] packaging: add windows boxes (#30402)
  Make xpack modules instead of a meta plugin (#30589)
  Mute ShrinkIndexIT
  [ML] DeleteExpiredDataAction should use client with origin (#30646)
  Reindex: Fixed typo in assertion failure message (#30619)
  [DOCS] Fixes list of unconverted snippets in build.gradle
  [DOCS] Reorganizes RBAC documentation
  SQL: Remove dependency for server's version from JDBC driver (#30631)
  Test: increase search logging for LicensingTests
  Adjust serialization version in IndicesOptions
  [TEST] Fix compilation
  Remove version argument in RangeFieldType (#30411)
  Remove unused DirectoryUtils class. (#30582)
  Mitigate date histogram slowdowns with non-fixed timezones. (#30534)
  Add a MovingFunction pipeline aggregation, deprecate MovingAvg agg (#29594)
  Removes AwaitsFix on IndicesOptionsTests
  Template upgrades should happen in a system context (#30621)
  Fix bug in BucketMetrics path traversal (#30632)
  Fixes IndiceOptionsTests to serialise correctly (#30644)
  ...
2018-05-17 10:34:44 +02:00
Adrien Grand 28d4685d72
Mitigate date histogram slowdowns with non-fixed timezones. (#30534)
Date histograms on non-fixed timezones such as `Europe/Paris` proved much slower
than histograms on fixed timezones in #28727. This change mitigates the issue by
using a fixed time zone instead when shard data doesn't cross a transition so
that all timestamps share the same fixed offset. This should be a common case
with daily indices.

NOTE: Rewriting the aggregation doesn't work since the timezone is then also
used on the coordinating node to create empty buckets, which might be out of the
range of data that exists on the shard.

NOTE: In order to be able to get a shard context in the tests, I reused code
from the base query test case by creating a new parent test case for both
queries and aggregations: `AbstractBuilderTestCase`.

Mitigates #28727
2018-05-16 17:06:52 +02:00
Zachary Tong df853c49c0
Add a MovingFunction pipeline aggregation, deprecate MovingAvg agg (#29594)
This pipeline aggregation gives the user the ability to script functions that "move" across a window
of data, instead of single data points.  It is the scripted version of MovingAvg pipeline agg.

Through custom script contexts, we expose a number of convenience methods:

 - MovingFunctions.max()
 - MovingFunctions.min()
 - MovingFunctions.sum()
 - MovingFunctions.unweightedAvg()
 - MovingFunctions.linearWeightedAvg()
 - MovingFunctions.ewma()
 - MovingFunctions.holt()
 - MovingFunctions.holtWinters()
 - MovingFunctions.stdDev()

The user can also define any arbitrary logic via their own scripting, or combine with the above methods.
2018-05-16 10:57:00 -04:00
Van0SS 4478f10a2a Rest High Level client: Add List Tasks (#29546)
This change adds a `listTasks` method to the high level java
ClusterClient which allows listing running tasks through the 
task management API.

Related to #27205
2018-05-16 13:31:37 +02:00
Nik Everett 9b47e0508b Fix compilation of test framework tests
We accidentally broke the compilation of the test frameworks tests. All
better now.
2018-05-15 22:56:41 -04:00
Jason Tedor 25c823da09
Skip shard deprecation messages in REST tests (#30630)
A 6.x node can send a deprecation message that the default number of
shards will change from five to one in 7.0.0. In a mixed cluster,
whether or not a create index request sees five or one shard and
produces a deprecation message depends on the version of the master
node. This means that during BWC tests a test can see this deprecation
message depending on the version of the master node. In 6.x when we
introduced this deprecation message we assumed that whereever we see
this deprecation message is expected. However, in a mixed cluster test
we need a similar mechanism but it would only apply if the version of
the master node is earlier than 7.0.0. This commit takes advantage of a
recent change to expose the version of the master node to do sections of
REST tests. With this in hand, we can skip asserting on the deprecation
message if the version of the master node is before 7.0.0 and otherwise
seeing that deprecation message would be completely unexpected.
2018-05-15 21:07:32 -04:00
Tim Brooks 99b9ab58e2
Add nio http server transport (#29587)
This commit is related to #28898. It adds an nio driven http server
transport. Currently it only supports basic http features. Cors,
pipeling, and read timeouts will need to be added in future PRs.
2018-05-15 16:37:14 -06:00
Nhat Nguyen b12c2f61c5
Use exact numDocs in synced-flush and metadata snapshot (#30228)
Since #29458, we use a searcher to calculate the number of documents for
a commit stats. Sadly, that approach is flawed. The searcher might no
longer point to the last commit if it's refreshed. As synced-flush
requires an exact numDocs to work correctly, we have to exclude all
soft-deleted docs.

This commit makes synced-flush stop using CommitStats but read an exact
numDocs directly from an index commit.

Relates #29458
Relates #29530
2018-05-15 18:00:53 -04:00
Jason Tedor abc06d5b79
Expose master version in REST test context (#30623)
This commit exposes the master version to the REST test context. This
will be needed in a follow-up where the master version will be used to
determine whether or not a certain warning header is expected.
2018-05-15 17:26:43 -04:00
Nik Everett 869b639d14
QA: System property to override distribution (#30591)
This configures all `qa` projects to use the distribution contained in
the `tests.distribution` system property if it is set. The goal is to
create a simple way to run tests against the default distribution which
has x-pack basic features enabled while not forcing these tests on all
contributors. You run these tests by doing something like:

```
./gradlew -p qa -Dtests.distribution=zip check
```

or

```
./gradlew -p qa -Dtests.distribution=zip bwcTest
```

x-pack basic *shouldn't* get in the way of any of these tests but
nothing is ever perfect so this we have to disable a few when running
with the zip distribution.
2018-05-15 17:16:16 -04:00
Julie Tibshirani 4f9dd37169
Add support for search templates to the high-level REST client. (#30473) 2018-05-15 13:07:58 -07:00
Nhat Nguyen 2a2c23be2f
Store the reason of noop in its document tombstone (#30570)
Relates #29530
2018-05-15 13:36:54 -04:00
Nhat Nguyen b971a81e70 Merge branch 'master' into ccr
* master:
  Default to one shard (#30539)
  Unmute IndexUpgradeIT tests
  Forbid expensive query parts in ranking evaluation (#30151)
  Docs: Update HighLevelRestClient migration docs (#30544)
  Clients: Switch to new performRequest (#30543)
  [TEST] Fix typo in MovAvgIT test
  Add missing dependencies on testClasses (#30527)
  [TEST] Mute ML test that needs updating to following ml-cpp changes
  Document woes between auto-expand-replicas and allocation filtering (#30531)
  Moved tokenizers to analysis common module (#30538)
  Adjust copy settings versions
  Mute ShrinkIndexIT suite
  SQL: SYS TABLES ordered according to *DBC specs (#30530)
  Deprecate not copy settings and explicitly disallow (#30404)
  [ML] Improve state persistence log message
  Build: Add mavenPlugin cluster configuration method (#30541)
  Re-enable FlushIT tests
  Bump Gradle heap to 2 GB (#30535)
  SQL: Use request flavored methods in tests (#30345)
  Suppress hdfsFixture if there are spaces in the path (#30302)
  Delete temporary blobs before creating index file (#30528)
  Watcher: Remove TriggerEngine.getJobCount() (#30395)
  [ML] Fix wire BWC for JobUpdate (#30512)
  Use simpler write-once semantics for FS repository (#30435)
  Derive max composite buffers from max content len
  Use simpler write-once semantics for HDFS repository (#30439)
  SQL: Improve correctness of SYS COLUMNS & TYPES (#30418)
  Mute two tests in FlushIT with @AwaitsFix.
  Fix incorrect template name in test case
  Build: Remove legacy bwc files from xpack (#30485)
  Mute UnicastZenPingTests#testSimplePings with @AwaitsFix.
  Security: cleanup code in file stores (#30348)
  Security: fix TokenMetaData equals and hashcode (#30347)
  Mute two tests from SmokeTestWatcherWithSecurityClientYamlTestSuiteIT.
  Mute SharedClusterSnapshotRestoreIT#testSnapshotSucceedsAfterSnapshotFailure with @AwaitsFix.
  SQL: Improve compatibility with MS query (#30516)
  SQL: Fix parsing of dates with milliseconds (#30419)
2018-05-14 13:23:23 -04:00
Jason Tedor 4a4e3d70d5
Default to one shard (#30539)
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.

Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
2018-05-14 12:22:35 -04:00
Martijn van Groningen 7b95470897
Moved tokenizers to analysis common module (#30538)
The following tokenizers were moved: classic, edge_ngram,
letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and
whitespace.

Left keyword tokenizer factory in server module, because
normalizers directly depend on it.This should be addressed on a
follow up change.

Relates to #23658
2018-05-14 07:55:01 +02:00
Yannick Welsch fc870fdb4c
Use simpler write-once semantics for HDFS repository (#30439)
There's no need for an extra `blobExists()` call when writing a blob to the HDFS service. The writeBlob implementation for the HDFS repository already uses the `CreateFlag.CREATE` option on the file creation, which ensures that the blob that's uploaded does not already exist. This saves one network roundtrip.
2018-05-11 09:50:37 +02:00
Jay Modi f733de8e67
Security: fix TokenMetaData equals and hashcode (#30347)
The TokenMetaData equals method compared byte arrays using `.equals` on
the arrays themselves, which is the equivalent of an `==` check. This
means that a seperate byte[] with the same contents would not be
considered equivalent to the existing one, even though it should be.

The method has been updated to use `Array#equals` and similarly the
hashcode method has been updated to call `Arrays#hashCode` instead of
calling hashcode on the array itself.
2018-05-10 13:12:11 -06:00
Nhat Nguyen a5be4149a3 Merge branch 'master' into ccr
* master:
  Upgrade to Lucene-7.4-snapshot-6705632810 (#30519)
  add version compatibility from 6.4.0 after backport, see #30319 (#30390)
  Security: Simplify security index listeners (#30466)
  Add proper longitude validation in geo_polygon_query (#30497)
  Remove Discovery.AckListener.onTimeout() (#30514)
  Build: move generated-resources to build (#30366)
  Reindex: Fold "with all deps" project into reindex (#30154)
  Isolate REST client single host tests (#30504)
  Solve Gradle deprecation warnings around shadowJar (#30483)
  SAML: Process only signed data (#30420)
  Remove BWC repository test (#30500)
  Build: Remove xpack specific run task (#30487)
  AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
  LLClient: Add setJsonEntity (#30447)
  Expose CommonStatsFlags directly in IndicesStatsRequest. (#30163)
  Silence IndexUpgradeIT test failures. (#30430)
  Bump Gradle heap to 1792m (#30484)
  [docs] add warning for read-write indices in force merge documentation (#28869)
  Avoid deadlocks in cache (#30461)
  Test: remove hardcoded list of unconfigured ciphers (#30367)
  mute SplitIndexIT due to https://github.com/elastic/elasticsearch/issues/30416
  Docs: Test examples that recreate lang analyzers  (#29535)
  BulkProcessor to retry based on status code (#29329)
  Add GET Repository High Level REST API (#30362)
  add a comment explaining the need for RetryOnReplicaException on missing mappings
  Add `coordinating_only` node selector (#30313)
  Stop forking groovyc (#30471)
  Avoid setting connection request timeout (#30384)
  Use date format in `date_range` mapping before fallback to default (#29310)
  Watcher: Increase HttpClient parallel sent requests (#30130)

# Conflicts:
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/LocalStateCompositeXPackPlugin.java
2018-05-10 13:04:08 -04:00
Jason Tedor bf2365d13b
Remove BWC repository test (#30500)
This commit removes a test that we can not restore from 1.x and 2.x
repository files. This test is not needed, the version of Elasticsearch
that this commit targets can not even read index files from those
versions.
2018-05-09 23:24:54 -04:00
Martijn van Groningen bb6586dc5f [CCR] Read changes from Lucene instead of translog (#30120)
This commit adds an API to read translog snapshot from Lucene,
then cut-over from the existing translog to the new API in CCR.

Relates #30086
Relates #29530
2018-05-09 17:35:27 -04:00
Nik Everett f9dc86836d
Docs: Test examples that recreate lang analyzers (#29535)
We have a pile of documentation describing how to rebuild the built in
language analyzers and, previously, our documentation testing framework
made sure that the examples successfully built *an* analyzer but they
didn't assert that the analyzer built by the documentation matches the
built in anlayzer. Unsuprisingly, some of the examples aren't quite
right.

This adds a mechanism that tests that the analyzers built by the docs.
The mechanism is fairly simple and brutal but it seems to be working:
build a hundred random unicode sequences and send them through the
`_analyze` API with the rebuilt analyzer and then again through the
built in analyzer. Then make sure both APIs return the same results.
Each of these calls to `_anlayze` takes about 20ms on my laptop which
seems fine.
2018-05-09 09:23:10 -04:00
Nhat Nguyen 75719ac71b
Make _id terms optional in segment containing only noop (#30409)
Previously only index and delete operations are indexed into Lucene,
therefore every segment should have both _id and _version terms as these
operations contain both terms. However, this is no longer guaranteed
after noop is also indexed into Lucene. A segment which contains only
no-ops does not have neither _id or _version because a no-op does not
contain these terms.

This change adds a dummy version to no-ops and makes _id terms optional
in PerThreadIDVersionAndSeqNoLookup.

Relates #30226
2018-05-08 13:38:40 -04:00
Nhat Nguyen d9b9d7d107 Merge branch 'master' into ccr
* elastic-master:
  Watcher: Mark watcher as started only after loading watches (#30403)
  Pass the task to broadcast actions (#29672)
  Disable REST default settings testing until #29229 is back-ported
  Correct wording in log message (#30336)
  Do not fail snapshot when deleting a missing snapshotted file (#30332)
  AwaitsFix testCreateShrinkIndexToN
  DOCS: Correct mapping tags in put-template api
  DOCS: Fix broken link in the put index template api
  Add put index template api to high level rest client (#30400)
  Relax testAckedIndexing to allow document updating
  [Docs] Add snippets for POS stop tags default value
  Move respect accept header on no handler to 6.3.1
  Respect accept header on no handler (#30383)
  [Test] Add analysis-nori plugin to the vagrant tests
  [Docs] Fix bad link
  [Docs] Fix end of section in the korean plugin docs
  Expose the Lucene Korean analyzer module in a plugin (#30397)
  Docs: remove transport_client from CCS role example (#30263)
  [Rollup] Validate timezone in range queries (#30338)
  Use readFully() to read bytes from CipherInputStream (#28515)
  Fix  docs Recently merged #29229 had a doc bug that broke the doc build. This commit fixes.
  Test: remove cluster permission from CCS user (#30262)
  Add Get Settings API support to java high-level rest client (#29229)
  Watcher: Remove unneeded index deletion in tests
2018-05-08 09:04:01 -04:00
Stéphane Campinas 2f8905839f Correct wording in log message (#30336) 2018-05-07 12:00:06 +02:00
Tanguy Leroux 1987d6261f
Do not fail snapshot when deleting a missing snapshotted file (#30332)
When deleting or creating a snapshot for a given shard, elasticsearch 
usually starts by listing all the existing snapshotted files in the repository. 
Then it computes a diff and deletes the snapshotted files that are not 
needed anymore. During this deletion, an exception is thrown if the file 
to be deleted does not exist anymore.

This behavior is challenging with cloud based repository implementations 
like S3 where a file that has been deleted can still appear in the bucket for 
few seconds/minutes (because the deletion can take some time to be fully 
replicated on S3). If the deleted file appears in the listing of files, then the 
following deletion will fail with a NoSuchFileException and the snapshot 
will be partially created/deleted.

This pull request makes the deletion of these files a bit less strict, ie not 
failing if the file we want to delete does not exist anymore. It introduces a 
new BlobContainer.deleteIgnoringIfNotExists() method that can be used 
at some specific places where not failing when deleting a file is 
considered harmless.

Closes #28322
2018-05-07 09:35:55 +02:00
Nhat Nguyen 2c73969505
Introduce soft-deletes retention policy based on global checkpoint (#30335)
This commit introduces a soft-deletes retention merge policy based on
the global checkpoint. Some notes on this simple retention policy:

- This policy keeps all operations whose seq# is greater than the 
persisted global checkpoint and configurable extra operations prior to
the global checkpoint. This is good enough for querying history changes.

- This policy is not watertight for peer-recovery. We send the 
safe-commit in peer-recovery, thus we need to also send all operations
after the local checkpoint of that commit. This is analog to the min
translog generation for recovery.

- This policy is too simple to support rollback.

Relates #29530
2018-05-04 23:19:01 -04:00
Nhat Nguyen 8fefa8a661 Update InternalEngine tests on ccr side for #30121
Relates #30121
2018-05-04 10:57:54 -04:00
Nhat Nguyen db14717098 Merge branch 'master' into ccr
* master:
  Set the new lucene version for 6.4.0
  [ML][TEST] Clean up jobs in ModelPlotIT
  Upgrade to 7.4.0-snapshot-1ed95c097b (#30357)
  Watcher: Ensure trigger service pauses execution (#30363)
  [DOCS] Added coming qualifiers in changelog
  [DOCS] Commented out empty sections in the changelog to fix the doc build. (#30372)
  Security: reduce garbage during index resolution (#30180)
  Make RepositoriesMetaData contents unmodifiable (#30361)
  Change quad tree max levels to 29. Closes #21191 (#29663)
  Test: use trial license in qa tests with security
  [ML] Add integration test for model plots (#30359)
  SQL: Fix bug caused by empty composites (#30343)
  [ML] Account for gaps in data counts after job is reopened (#30294)
  InternalEngineTests.testConcurrentOutOfOrderDocsOnReplica should use two documents (#30121)
  Change signature of Get Repositories Response (#30333)
  Tests: Use different watch ids per test in smoke test (#30331)
  [Docs] Add term query with normalizer example
  Adds Eclipse config for xpack licence headers (#30299)
  Watcher: Make start/stop cycle more predictable and synchronous (#30118)
  [test] add debug logging for packaging test
  [DOCS] Removed X-Pack Breaking Changes
  [DOCS] Fixes link to TLS LDAP info
  Update versions for start_trial after backport (#30218)
  Packaging: Set elasticsearch user to have non-existent homedir (#29007)
  [DOCS] Fixes broken links to bootstrap user (#30349)
  Fix NPE when CumulativeSum agg encounters null/empty bucket (#29641)
  Make licensing FIPS-140 compliant (#30251)
  [DOCS] Reorganizes authentication details in Stack Overview (#30280)
  Network: Remove http.enabled setting (#29601)
  Fix merging logic of Suggester Options (#29514)
  [DOCS] Adds LDAP realm configuration details (#30214)
  [DOCS] Adds native realm configuration details (#30215)
  ReplicationTracker.markAllocationIdAsInSync may hang if allocation is cancelled (#30316)
  [DOCS] Enables edit links for X-Pack pages (#30278)
  Packaging: Unmark systemd service file as a config file (#29004)
  SQL: Reduce number of ranges generated for comparisons (#30267)
  Tests: Simplify VersionUtils released version splitting (#30322)
  Cancelling a peer recovery on the source can leak a primary permit (#30318)
  Added changelog entry for deb prerelease version change (#30184)
  Convert server javadoc to html5 (#30279)
  Create default ES_TMPDIR on Windows (#30325)
  [Docs] Clarify `fuzzy_like_this` redirect (#30183)
  Post backport of #29658.
  Fix docs of the `_ignored` meta field.
  Remove MapperService#types(). (#29617)
  Remove useless version checks in REST tests. (#30165)
  Add a new `_ignored` meta field. (#29658)
  Move repository-azure fixture test to QA project (#30253)

# Conflicts:
#	buildSrc/version.properties
#	server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
2018-05-04 09:40:57 -04:00
Jim Ferenczi dbd857341f
Upgrade to 7.4.0-snapshot-1ed95c097b (#30357)
Upgrade to lucene-7.4.0-snapshot-1ed95c097b

This version contains:
* An Analyzer for Korean
* An IntervalQuery and IntervalsSource that retrieve minimum intervals of positional queries.
* A new API to retrieve matches (offsets and positions) of a query for a single document.
* Support for soft deletes in the index writer.
* A fixed shingle filter that handles index time synonyms.
* Support for emoji sequence in ICUTokenizer (with an upgrade to icu 61.1)
2018-05-04 11:44:22 +02:00
Zachary Tong 3c2d2a7d4a
Fix NPE when CumulativeSum agg encounters null/empty bucket (#29641)
Fix NPE when CumulativeSum agg encounters null/empty bucket

If the cusum agg encounters a null value, it's because the value is
missing (like the first value from a derivative agg), the path is
not valid, or the bucket in the path was empty.

Previously cusum would just explode on the null, but this changes it
so we only increment the sum if the value is non-null and finite.
This is safe because even if the cusum encounters all null or empty
buckets, the cumulative sum is still zero (like how the sum agg returns
zero even if all the docs were missing values)

I went ahead and tweaked AggregatorTestCase to allow testing pipelines,
so that I could delete the IT test and reimplement it as AggTests.

Closes #27544
2018-05-02 12:22:55 -07:00
Ryan Ernst fb0aa562a5
Network: Remove http.enabled setting (#29601)
This commit removes the http.enabled setting. While all real nodes (started with bin/elasticsearch) will always have an http binding, there are many tests that rely on the quickness of not actually needing to bind to 2 ports. For this case, the MockHttpTransport.TestPlugin provides a dummy http transport implementation which is used by default in ESIntegTestCase.

closes #12792
2018-05-02 11:42:05 -07:00
Ryan Ernst f0e92676b1
Tests: Simplify VersionUtils released version splitting (#30322)
This commit refactors VersionUtils.resolveReleasedVersions to be
simpler, and in the process fixes the behavior to match that of
VersionCollection.groovy.

closes #30133
2018-05-02 09:29:35 -07:00
Nhat Nguyen 2da0cd24b3 TEST: AwaitsFix assertSameSyncIdSameDocs
This is tracked by https://github.com/elastic/elasticsearch/pull/30228
2018-05-02 12:20:03 -04:00
Nhat Nguyen d621fc7a00
Add tombstone document into Lucene for Noop (#30226)
This commit adds a tombstone document into Lucene for every No-op. 
With this change, Lucene index is expected to have a complete history 
of operations like Translog. In fact, this guarantee is subjected to the
soft-deletes retention merge policy.

Relates #29530
2018-05-02 09:08:29 -04:00
Adrien Grand 368ddc408f
Remove MapperService#types(). (#29617)
This isn't be necessary with a single type per index.
2018-05-02 11:35:12 +02:00
Nhat Nguyen eb4281edef CCR side #30244
Relates #30244
2018-05-01 21:08:24 -04:00
Nhat Nguyen d52ca33bd9 Merge branch 'master' into ccr
* master: (68 commits)
  [DOCS] Removes X-Pack Elasticsearch release notes (#30272)
  Correct an example in the top-level suggester documentation. (#30224)
  [DOCS] Removes broken link
  [DOCS] Adds file realm configuration details (#30221)
  [DOCS] Adds PKI realm configuration details (#30225)
  Fix a reference to match_phrase_prefix in the match query docs. (#30282)
  Fix failure for validate API on a terms query (#29483)
  [DOCS] Fix 6.4-specific link in changelog (#30314)
  Remove RepositoriesMetaData variadic constructor (#29569)
  Test: increase authentication logging for debugging
  [DOCS] Removes redundant SAML realm settings (#30196)
  REST Client: Add Request object flavored methods (#29623)
  [DOCS] Adds changelog to Elasticsearch Reference (#30271)
  [DOCS] Fixes section error
  SQL: Teach the CLI to ignore empty commands (#30265)
  [DOCS] Adds Active Directory realm configuration details (#30223)
  [DOCS] Removes redundant file realm settings (#30192)
  [DOCS] Fixes users command name (#30275)
  Build: Move gradle wrapper jar to a dot dir (#30146)
  Build: Log a warning if disabling reindex-from-old (#30304)
2018-05-01 21:07:54 -04:00
Boaz Leskes 4a537ef03c
Bulk operation fail to replicate operations when a mapping update times out (#30244)
Starting with the refactoring in https://github.com/elastic/elasticsearch/pull/22778 (released in 5.3) we may fail to properly replicate operation when a mapping update on master fails. If a bulk
operations needs a mapping update half way, it will send a request to the master before continuing 
to index the operations. If that request times out or isn't acked (i.e., even one node in the cluster 
didn't process it within 30s), we end up throwing the exception and aborting the entire bulk. This is 
a problem because all operations that were processed so far are not replicated any more to the 
replicas.  Although these operations were never "acked" to the user (we threw an error) it cause the 
local checkpoint on the replicas to lag (on 6.x) and the primary and replica to diverge. 

This PR does a couple of things:
1) Most importantly, treat *any* mapping update failure as a document level failure, meaning only 
    the relevant indexing operation will fail.
2) Removes the mapping update callbacks from `IndexShard.applyIndexOperationOnPrimary` and 
    similar methods for simpler execution. We don't use exceptions any more when a mapping 
    update was successful.

I think we need to do more work here (the fact that a single slow node can prevent those mappings 
updates from being acked and thus fail operations is bad), but I want to keep this as small as I can 
(it is already too big).
2018-05-01 08:15:02 +02:00
Nik Everett 50945051b6
HTML5ify Javadoc for core and test framework (#30234)
`javadoc` will switch from detaulting to html4 to html5 in "a future
release". We should get ahead of it so we're not surprised. Also, HTML5
is the future! Er, the present. Anyway, this follows up from #30220 to
make the Javadoc for two of the four remaining projects HTML5
compatible.
2018-04-30 09:39:50 -04:00
Nhat Nguyen 8ebca76cf7
Index stale operations to Lucene to have complete history (#29679)
Today, when processing out of order operations, we only add it into
translog but skip adding into Lucene. Translog, therefore, has a
complete history of sequence numbers while Lucene does not.

Since we would like to have a complete history in Lucene, this change
makes sure that stale operations will be added to Lucene as soft-deleted
documents if required.

Relates #29530
2018-04-27 19:39:29 -04:00
Martijn van Groningen 56b61578bc
Merge remote-tracking branch 'es/master' into ccr
* es/master: (32 commits)
  TEST: Unmute testPrimaryRelocationWhileIndexing
  Remove remaining tribe node references (#29574)
  Never leave stale delete tombstones in version map (#29619)
  Do not serialize common stats flags using ordinal (#29600)
  Remove stale comment from JVM stats (#29625)
  TEST: Mute testPrimaryRelocationWhileIndexing
  Remove bulk fallback for write thread pool (#29609)
  Fix an incorrect reference to 'zero_terms_docs' in match_phrase queries.
  Update the version compatibility for zero_terms_query in match_phrase.
  Account translog location to ram usage in version map
  Remove extra spaces from changelog
  Add support to match_phrase query for zero_terms_query. (#29598)
  Fix incorrect references to 'zero_terms_docs' in query parsing error messages. (#29599)
  Build: Move java home checks to pre-execution phase (#29548)
  Avoid side-effect in VersionMap when assertion enabled (#29585)
  [Tests] Remove accidental logger usage
  Add tests for ranking evaluation with aliases (#29452)
  Deprecate use of `htmlStrip` as name for HtmlStripCharFilter (#27429)
  Update plan for the removal of mapping types. (#29586)
  [Docs] Add rankEval method for Jva HL client
  ...
2018-04-20 07:50:11 +02:00
Nhat Nguyen ac84879a71
Use soft deletes to maintain doc history (#29549)
Today we can use the soft-deletes feature from Lucene to maintain a
history of a document. This change simply replaces hard-deletes by
soft-deletes in Engine.

Besides marking a document as deleted, we also index a tombstone
associated with that delete operation. Storing delete tombstones allows
us to have a history of sequence-based operations which can serve in
recovery or rollback.

Relates #29530
2018-04-19 20:45:13 -04:00
Martijn van Groningen 621a1935b8
test: also assert deprecation warning after clusters have been closed. 2018-04-19 09:20:04 +02:00
Ryan Ernst 1cb3a9d9dc Test: Guard deprecation check when 0 nodes created
The internal test cluster can sometimes have 0 nodes. In this situation,
the http.enabled flag will never be read, and thus no deprecation
warning will be emitted. This commit guards the deprecation warning
check in this case.
2018-04-18 21:18:56 -07:00
Ryan Ernst 98d776edaf
Networking: Deprecate http.enabled setting (#29591)
This commit deprecates the http.enabled, in preparation for removing the
feature in 7.0.

relates #12792
2018-04-18 17:36:09 -07:00
Nhat Nguyen 4be1488324 Merge branch 'master' into ccr
* master:
  Remove the index thread pool (#29556)
  Remove extra copy in ScriptDocValues.Strings
  Fix full cluster restart test recovery (#29545)
  Fix binary doc values fetching in _search (#29567)
  Mutes failing MovAvgIT tests
  Fix the assertion message for an incorrect current version. (#29572)
  Fix the version ID for v5.6.10. (#29570)
  Painless Spec Documentation Clean Up (#29441)
  Add versions 5.6.10 and 6.2.5
  [TEST] test against scaled value instead of fixed epsilon in MovAvgIT
  Remove `flatSettings` support from request classes (#29560)
  MapperService to wrap a single DocumentMapper. (#29511)
  Fix dependency checks on libs when generating Eclipse configuration. (#29550)
  Add null_value support to geo_point type (#29451)
  Add documentation about the include_type_name option. (#29555)
  Enforce translog access via engine (#29542)
2018-04-18 11:41:08 -04:00
Julie Tibshirani c8209fa7b1
Fix the assertion message for an incorrect current version. (#29572) 2018-04-17 19:27:02 -07:00
Nhat Nguyen 67990121fc Merge branch 'master' into ccr 2018-04-17 12:34:31 -04:00
Nhat Nguyen 45c6c20467
Enforce translog access via engine (#29542)
Today the translog of an engine is exposed and can be accessed directly.
While this exposure offers much flexibility, it also causes these troubles:

- Inconsistent behavior between translog method and engine method.
For example, rolling a translog generation via an engine also trims
unreferenced files, but translog's method does not.

- An engine does not get notified when critical errors happen in translog
as the access is direct.

This change isolates translog of an engine and enforces all accesses to
translog via the engine.
2018-04-17 08:03:41 -04:00
Jason Tedor 1dd0fd4874
Deprecate the index thread pool (#29540)
The index thread pool is no longer needed as its primary use-case for
single-document indexing requests has been relieved now that
single-document indexing requests are converted to bulk indexing
requests (with a single document payload).
2018-04-17 06:47:30 -04:00
olcbean b3e3b80f1b REST high-level client: add support for Indices Update Settings API [take 2] (#29327)
Relates to #27205
2018-04-16 21:39:11 +02:00
Martijn van Groningen 9da3e739fb
Merge remote-tracking branch 'es/master' into ccr
* es/master:
  Add remote cluster client (#29495)
  Ensure flush happens on shard idle
  Adds SpanGapQueryBuilder in the query DSL (#28636)
  Control max size and count of warning headers (#28427)
  Make index APIs work without types. (#29479)
  Deprecate filtering on `_type`. (#29468)
  Fix auto-generated ID example format (#29461)
  Fix typo in max number of threads check docs (#29469)
  Add primary term to translog header (#29227)
  Add a helper method to get a random java.util.TimeZone (#29487)
  Move TimeValue into elasticsearch-core project (#29486)
  Fix NPE in InternalGeoCentroidTests#testReduceRandom (#29481)
  Build: introduce keystoreFile for cluster config (#29491)
  test: Index more docs, so that it is less likely the search request does not time out.
2018-04-13 15:31:43 +02:00
Simon Willnauer eab530ce11 Ensure flush happens on shard idle
This adds 2 testcases that test if a shard goes idle
pending (uncommitted) segments are committed and unreferenced
files will be freed.

Relates to #29482
2018-04-13 15:06:51 +02:00
Nhat Nguyen f96e00badf
Add primary term to translog header (#29227)
This change adds the current primary term to the header of the current
translog file. Having a term in a translog header is a prerequisite step
that allows us to trim translog operations given the max valid seq# for
that term.

This commit also updates tests to conform the primary term invariant 
which guarantees that all translog operations in a translog file have
its terms at most the term stored in the translog header.
2018-04-12 13:57:59 -04:00
Lee Hinman d72d3f996e
Add a helper method to get a random java.util.TimeZone (#29487)
* Add a helper method to get a random java.util.TimeZone

This adds a helper method to ESTestCase that returns a randomized
`java.util.TimeZone`. This can be used when transitioning code from Joda to the
JDK's time classes.
2018-04-12 11:56:42 -06:00
Nhat Nguyen b24ca650ad Merge branch 'master' into ccr 2018-04-11 11:35:15 -04:00
Adrien Grand 4918924fae
Remove legacy mapping code. (#29224)
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
2018-04-11 09:41:37 +02:00
Nhat Nguyen 530eff79fe Merge branch 'master' into ccr 2018-04-10 21:43:08 -04:00
tomcallahan 2574064e66
Enable rest tests via IDEs (#29439)
Currently rest-based tests do not work from the IDE, as the security
manager is configured to permit certain network operations when
using the snapshot jars compiled by gradle.  We have an existing
workaround that explicitly associates a codebase with the path
from which the classes are loaded (in this case, the IDE build
directory).  This PR adds the rest client to this workaround list.
2018-04-10 09:08:58 -04:00
Nhat Nguyen 30c06a6f8c
Upgrade to Lucene-7.4.0-snapshot-2b27dd846a (#29398)
This snapshot version supports soft delete and the merge policy.
2018-04-09 12:25:49 -04:00
Nhat Nguyen b5032eab80 Merge branch 'master' into ccr 2018-04-09 08:08:06 -04:00
Lee Hinman a07ba9e400
Move Streams.copy into elasticsearch-core and make a multi-release jar (#29322)
* Move Streams.copy into elasticsearch-core and make a multi-release jar

This moves the method `Streams.copy(InputStream in, OutputStream out)` into the
`elasticsearch-core` project (inside the `o.e.core.internal.io` package). It
also makes this class into a multi-release class where the Java 9 equivalent
uses `InputStream#transferTo`.

This is a followup from
https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495
2018-04-06 11:07:20 -06:00
Lee Hinman a93c942927
Move ObjectParser into the x-content lib (#29373)
* Move ObjectParser into the x-content lib

This moves `ObjectParser`, `AbstractObjectParser`, and
`ConstructingObjectParser` into the libs/x-content dependency. This decoupling
allows them to be used for parsing for projects that don't want to depend on the
entire Elasticsearch jar.

Relates to #28504
2018-04-06 09:41:14 -06:00
Colin Goodheart-Smithe 55c8e80532
Fixes query_string query equals timezone check (#29406)
* Fixes query_string query equals timezone check

This change fixes a bug where two `QueryStringQueryBuilder`s were found
to be equal if they had the same timezone set even if the query string
in the builders were different

Closes #29403

* Adds mutate function to QueryStringQueryBuilderTests

* iter
2018-04-06 11:45:34 +01:00
Martijn van Groningen 1f306c321e
Merge remote-tracking branch 'es/master' into ccr
* es/master: (68 commits)
  Allow using distance measure in the geo context precision (#29273)
  Disable failing query in QueryBuilderBWCIT.
  Fixed quote_field_suffix in query_string (#29332)
  Use fixture to test repository-url module (#29355)
  Remove undocumented action.master.force_local setting (#29351)
  Enhance error for out of bounds byte size settings (#29338)
  Fix QueryAnalyzerTests.
  Fix HasChildQueryBuilderTests to not use the `classic` similarity.
  [Docs] Correct javadoc of GetIndexRequest (#29364)
  Make TransportRankEvalAction members final
  Add awaits fix for a query analyzer test
  Check presence of multi-types before validating new mapping (#29316)
  Add awaits fix for HasChildQueryBuilderTests
  Remove silent batch mode from install plugin (#29359)
  Align cat thread pool info to thread pool config (#29195)
  Track Lucene operations in engine explicitly (#29357)
  Build: Fix Java9 MR build (#29312)
  Reindex: Fix error in delete-by-query rest spec (#29318)
  Improve similarity integration. (#29187)
  Fix some query extraction bugs. (#29283)
  ...
2018-04-05 08:47:07 +02:00
Adrien Grand 569d0c0e89
Improve similarity integration. (#29187)
This improves the way similarities are plugged in in order to:
 - reject the classic similarity on 7.x indices and emit a deprecation
   warning otherwise
 - reject unkwown parameters on 7.x indices and emit a deprecation
   warning otherwise

Even though this breaks the plugin API, I'd like to backport to 7.x so
that users can get deprecation warnings when they are doing something
that will become unsupported in the future.

Closes #23208
Closes #29035
2018-04-03 16:45:25 +02:00
Adrien Grand 3bdfc8f3fb
Upgrade to lucene-7.3.0-snapshot-98a6b3d. (#29298)
Most notable changes include:
 - this release doesn't have the 7.2.1 version constant so I had to create one
 - spatial4j and jts were upgraded
2018-04-03 09:27:14 +02:00
Lee Hinman 6b2167f462
Begin moving XContent to a separate lib/artifact (#29300)
* Begin moving XContent to a separate lib/artifact

This commit moves a large portion of the XContent code from the `server` project
to the `libs/xcontent` project. For the pieces that have been moved, some
helpers have been duplicated to allow them to be decoupled from ES helper
classes. In addition, `Booleans` and `CheckedFunction` have been moved to the
`elasticsearch-core`  project.

This decoupling is a move so that we can eventually make things like the
high-level REST client not rely on the entire ES jar, only the parts it needs.

There are some pieces that are still not decoupled, in particular some of the
XContent tests still remain in the server project, this is because they test a
large portion of the pluggable xcontent pieces through
`XContentElasticsearchException`. They may be decoupled in future work.
Additionally, there may be more piecese that we want to move to the xcontent lib
in the future that are not part of this PR, this is a starting point.

Relates to #28504
2018-04-02 15:58:31 -06:00
Mayya Sharipova e70cd35bda
Revert "REST high-level client: add support for Indices Update Settings API (#28892)" (#29323)
This reverts commit b67b5b1bbd.
2018-03-30 16:26:46 -07:00
Andy Bristol b7e6fb9ac5
[test] remove Streamable serde assertions (#29307)
Removes a set of assertions in the test framework that verified that
Streamable objects could be serialized and deserialized across different
versions. When this was discussed the consensus was that this approach
has not caught many bugs in a long time and that serialization testing of
objects was best left to their respective unit and integration tests.

This commit also removes a transport interceptor that was used in
ESIntegTestCase tests to make these assertions about objects coming in
or off the wire.
2018-03-30 14:09:26 -07:00
olcbean b67b5b1bbd REST high-level client: add support for Indices Update Settings API (#28892)
Relates to #27205
2018-03-30 10:53:29 +02:00
Jason Tedor 4ef3de40bc
Fix handling of bad requests (#29249)
Today we have a few problems with how we handle bad requests:
 - handling requests with bad encoding
 - handling requests with invalid value for filter_path/pretty/human
 - handling requests with a garbage Content-Type header

There are two problems:
 - in every case, we give an empty response to the client
 - in most cases, we leak the byte buffer backing the request!

These problems are caused by a broader problem: poor handling preparing
the request for handling, or the channel to write to when the response
is ready. This commit addresses these issues by taking a unified
approach to all of them that ensures that:
 - we respond to the client with the exception that blew us up
 - we do not leak the byte buffer backing the request
2018-03-28 16:25:01 -04:00
Simon Willnauer 13e19e7428
Allow _update and upsert to read from the transaction log (#29264)
We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree on the fact
that we should not hide these aspects but the impact on updates is quite significant
especially if the same documents is updated before it's written to disk and made serachable.

This change adds back the ability to read from the transaction log but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results ie.
if stored fields or DocValues Fields are requested.

Closes #26802
2018-03-28 18:03:34 +02:00
Yannick Welsch cacf759213
Remove RELOCATED index shard state (#29246)
as this information is already covered by ReplicationTracker.primaryMode.
2018-03-28 12:25:46 +02:00
Martijn van Groningen ffb5281cc0
Merge remote-tracking branch 'es/master' into ccr
* es/master: (22 commits)
  Fix building Javadoc JARs on JDK for client JARs (#29274)
  Require JDK 10 to build Elasticsearch (#29174)
  Decouple NamedXContentRegistry from ElasticsearchException (#29253)
  Docs: Update generating test coverage reports (#29255)
  [TEST] Fix issue with HttpInfo passed invalid parameter
  Remove all dependencies from XContentBuilder (#29225)
  Fix sporadic failure in CompositeValuesCollectorQueueTests
  Propagate ignore_unmapped to inner_hits (#29261)
  TEST: Increase timeout for testPrimaryReplicaResyncFailed
  REST client: hosts marked dead for the first time should not be immediately retried (#29230)
  TEST: Use different translog dir for a new engine
  Make SearchStats implement Writeable (#29258)
  [Docs] Spelling and grammar changes to reindex.asciidoc (#29232)
  Do not optimize append-only if seen normal op with higher seqno (#28787)
  [test] packaging: gradle tasks for groovy tests (#29046)
  Prune only gc deletes below local checkpoint (#28790)
  remove testUnassignedShardAndEmptyNodesInRoutingTable
  #28745: remove extra option in the composite rest tests
  Fold EngineDiskUtils into Store, for better lock semantics (#29156)
  Add file permissions checks to precommit task
  ...
2018-03-28 09:24:27 +02:00
Nhat Nguyen 87957603c0
Prune only gc deletes below local checkpoint (#28790)
Once a document is deleted and Lucene is refreshed, we will not be able 
to look up the `version/seq#` associated with that delete in Lucene. As
conflicting operations can still be indexed, we need another mechanism
to remember these deletes. Therefore deletes should still be stored in
the Version Map, even after Lucene is refreshed. Obviously, we can't
remember all deletes forever so a trimming mechanism is needed.
Currently, we remember deletes for at least 1 minute (the default GC
deletes cycle) and clean them periodically. This is, at the moment, the
best we can do on the primary for user facing APIs but this arbitrary
time limit is problematic for replicas. Furthermore, we can't rely on
the primary and replicas doing the trimming in a synchronized manner,
and failing to do so results in the replica and primary making different
decisions. 

The following scenario can cause inconsistency between
primary and replica.

1. Primary index doc (index, id=1, v2)
2. Network packet issue causes index operation to back off and wait
3. Primary deletes doc (delete, id=1, v3)
4. Replica processes delete (delete, id=1, v3)
5. 1+ minute passes (GC deletes runs replica)
6. Indexing op is finally sent to the replica which no processes it 
   because it forgot about the delete.

We can reply on sequence-numbers to prevent this issue. If we prune only 
deletes whose seqno at most the local checkpoint, a replica will
correctly remember what it needs. The correctness is explained as
follows:

Suppose o1 and o2 are two operations on the same document with seq#(o1) 
< seq#(o2), and o2 arrives before o1 on the replica. o2 is processed
normally since it arrives first; when o1 arrives it should be discarded:
 
1. If seq#(o1) <= LCP, then it will be not be added to Lucene, as it was
  already previously added.

2. If seq#(o1)  > LCP, then it depends on the nature of o2:
  - If o2 is a delete then its seq# is recorded in the VersionMap,
    since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and
    determine that o1 is stale.
  
  - If o2 is an indexing then its seq# is either in Lucene (if
    refreshed) or the VersionMap (if not refreshed yet), so a 
    real-time lookup can find it and determine that o1 is stale.

In this PR, we prefer to deploy a single trimming strategy, which 
satisfies both requirements, on primary and replicas because:

- It's simpler - no need to distinguish if an engine is running at
primary mode or replica mode or being promoted.

- If a replica subsequently is promoted, user experience is fully
maintained as that replica remembers deletes for the last GC cycle.

However, the version map may consume less memory if we deploy two 
different trimming strategies for primary and replicas.
2018-03-26 13:42:08 -04:00
Boaz Leskes f5d4550e93
Fold EngineDiskUtils into Store, for better lock semantics (#29156)
#28245 has introduced the utility class`EngineDiskUtils` with a set of methods to prepare/change
translog and lucene commit points. That util class bundled everything that's needed to create and
empty shard, bootstrap a shard from a lucene index that was just restored etc. 

In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That
would sometime fail due to concurrent shard store fetching or other short activities that require the
files not to be changed while they read from them. 

Since there is no way to wait on the index writer lock, the `Store` class has other locks to make
sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this
PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and
shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the
translog manipulations on their own.
2018-03-26 14:08:03 +02:00
Jim Ferenczi 5288235ca3
Optimize the composite aggregation for match_all and range queries (#28745)
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:

```
"composite" : {
  "sources" : [
    { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
  ],
  "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited.

This mode can execute iff:
 * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
 * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
 * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).

If these conditions are not met this aggregation visits each document like any other agg.
2018-03-26 09:51:37 +02:00
Martijn van Groningen 5498be7e96
Merge remote-tracking branch 'es/master' into ccr
* es/master: (50 commits)
  Reject updates to the `_default_` mapping. (#29165)
  Improve similarity docs. (#29089)
  [Docs] Update api.asciidoc (#29166)
  Docs: Add note about missing mapping for doc values field (#29036)
  Fix BWC issue for PreSyncedFlushResponse
  Remove BytesArray and BytesReference usage from XContentFactory (#29151)
  Add pluggable XContentBuilder writers and human readable writers (#29120)
  Add unreleased version 6.2.4 (#29171)
  Add unreleased version 6.1.5 (#29168)
  Add a note about using the `retry_failed` flag before accepting data loss (#29160)
  Fix typo in percolate-query.asciidoc (#29155)
  Require HTTP::Tiny 0.070 for release notes script
  Set Java 9 checkstyle to depend on checkstyle conf (#28383)
  REST high-level client: add clear cache API (#28866)
  Docs: Add example of resetting index setting (#29048)
  Plugins: Fix module name conflict check for meta plugins (#29146)
  Build: Fix meta plugin bundled plugin names (#29147)
  Build: Simplify rest spec hack configuration (#29149)
  Build: Fix meta modules to not install as plugin in tests (#29150)
  Fix javadoc warning in Strings for missing parameter description
  ...
2018-03-21 10:54:20 +01:00
Lee Hinman b4af451ec5
Remove BytesArray and BytesReference usage from XContentFactory (#29151)
* Remove BytesArray and BytesReference usage from XContentFactory

This removes the usage of `BytesArray` and `BytesReference` from
`XContentFactory`. Instead, a regular `byte[]` should be passed. To assist with
this a helper has been added to `XContentHelper` that will preserve the offset
and length from the underlying BytesReference.

This is part of ongoing work to separate the XContent parts from ES so they can
be factored into their own jar.

Relates to #28504
2018-03-20 11:52:26 -06:00
Nik Everett a813492fe3
Tests: Make $_path support dots in paths (#28917)
`$_path` is used by documentation tests to ignore a value from a
response, for example:

```
[source,js]
----
{
  "count": 1,
  "datafeeds": [
    {
      "datafeed_id": "datafeed-total-requests",
      "state": "started",
      "node": {
        ...
        "attributes": {
          "ml.machine_memory": "17179869184",
          "ml.max_open_jobs": "20",
          "ml.enabled": "true"
        }
      },
      "assignment_explanation": ""
    }
  ]
}
----
// TESTRESPONSE[s/"17179869184"/$body.$_path/]
```

That example shows `17179869184` in the compiled docs but when it runs
the tests generated by that doc it ignores `17179869184` and asserts
instead that there is a value in that field. This is required because we
can't predict things like "how many milliseconds will this take?" and
"how much memory will this take?".

Before this change it was impossible to use `$_path` when any component
of the path contained a `.`. This fixes the `$_path` evaluator to
properly escape `.`.

Closes #28770
2018-03-19 14:17:09 -04:00
Martijn van Groningen 50f48e5184
Merge remote-tracking branch 'es/master' into ccr
* es/master: (97 commits)
  Clarify requirements of strict date formats. (#29090)
  Clarify that dates are always rendered as strings. (#29093)
  Compilation fix for #29067
  [Docs] Fix link to Grok patterns (#29088)
  Store offsets in index prefix fields when stored in the parent field (#29067)
  Fix starting on Windows from another drive (#29086)
  Use removeTask instead of finishTask in PersistentTasksClusterService (#29055)
  Added minimal docs for reindex api in java-api docs
  Allow overriding JVM options in Windows service (#29044)
  Clarify how to set compiler and runtime JDKs (#29101)
  CLI: Close subcommands in MultiCommand (#28954)
  TEST: write ops should execute under shard permit (#28966)
  [DOCS] Add X-Pack upgrade details (#29038)
  Revert "Improve error message for installing plugin (#28298)"
  Docs: HighLevelRestClient#exists (#29073)
  Validate regular expressions in dynamic templates. (#29013)
  [Tests] Fix GetResultTests and DocumentFieldTests failures (#29083)
  Reenable LiveVersionMapTests.testRamBytesUsed on Java 9. (#29063)
  Mute failing GetResultTests and DocumentFieldTests
  Improve error message for installing plugin (#28298)
  ...
2018-03-16 15:54:10 +01:00
Christoph Büscher 312ccc05d5
[Tests] Fix GetResultTests and DocumentFieldTests failures (#29083)
Changes made in #28972 seems to have changed some assumptions about how
SMILE and CBOR write byte[] values and how this is tested. This changes
the generation of the randomized DocumentField values back to BytesArray
while expecting the JSON and YAML deserialisation to produce Base64
encoded strings and SMILE and CBOR to parse back BytesArray instances.

Closes #29080
2018-03-15 16:42:26 +01:00
Boaz Leskes bf65cb4914
Untangle Engine Constructor logic (#28245)
Currently we have a fairly complicated logic in the engine constructor logic to deal with all the 
various ways we want to mutate the lucene index and translog we're opening.

We can:
1) Create an empty index
2) Use the lucene but create a new translog
3) Use both
4) Force a new history uuid in all cases.

This leads complicated code flows which makes it harder and harder to make sure we cover all the 
corner cases. This PR tries to take another approach. Constructing an InternalEngine always opens 
things as they are and all needed modifications are done by static methods directly on the 
directory, one at a time.
2018-03-14 20:59:47 +01:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
Jason Tedor 5904d936fa
Copy Lucene IOUtils (#29012)
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation that some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
2018-03-13 12:49:33 -04:00
Jason Tedor 8b6fbe2c11
Add test for dying with dignity (#28987)
I have long wanted an actual test that dying with dignity works. It is
tricky because if dying with dignity works, it means the test JVM dies
which is usually an abnormal condition. And anyway, how does one force a
fatal error to be thrown. I was motivated to investigate this again by
the fact that I missed a backport to one branch leading to an issue
where Elasticsearch would not successfully die with dignity. And now we
have a solution: we install a plugin that throws an out of memory error
when it receives a request. We hack the standalone test infrastructure
to prevent this from failing the test. To do this, we bypass the
security manager and remove the PID file for the node; this tricks the
test infrastructure into thinking that it does not need to stop the
node. We also bypass seccomp so that we can fork jps to make sure that
Elasticsearch really died. And to be extra paranoid, we parse the logs
of the dead Elasticsearch process to make sure it died with
dignity. Never forget.
2018-03-12 23:20:07 -04:00
Martijn van Groningen bfd587d66c Merge remote-tracking branch 'es/master' into ccr
* es/master: (48 commits)
  Update bucket-sort-aggregation.asciidoc (#28937)
  [Docs] REST high-level client: Fix code for most basic search request (#28916)
  Improved percolator's random candidate query duel test and fixed bugs that were exposed by this:
  Revert "Rescore collapsed documents (#28521)"
  Build: Fix test logger NPE when no tests are run (#28929)
  [TEST] AwaitsFix QueryRescorerIT.testRescoreAfterCollapse
  Decouple XContentType from StreamInput/Output (#28927)
  Remove BytesRef usage from XContentParser and its subclasses (#28792)
  [DOCS] Correct typo in configuration (#28903)
  Fix incorrect datemath example (#28904)
  Add a usage example of the JLH score (#28905)
  Wrap stream passed to createParser in try-with-resources (#28897)
  Rescore collapsed documents (#28521)
  Fix (simple)_query_string to ignore removed terms (#28871)
  [Docs] Fix typo in composite aggregation (#28891)
  Try if tombstone is eligable for pruning before locking on it's key (#28767)
  Limit analyzed text for highlighting (improvements) (#28808)
  Missing `timeout` parameter from the REST API spec JSON files (#28328)
  Clarifies how query_string splits textual part (#28798)
  Update outdated java version reference (#28870)
  ...
2018-03-08 16:55:29 +01:00
Luca Cavanna 184a8718d8
REST high-level client: add flush API (#28852)
Relates to #27205
2018-03-01 10:56:03 +01:00
Luca Cavanna cd3d9c9f80
[TEST] share code between streamable/writeable/xcontent base test classes (#28785)
Today we have two test base classes that have a lot in common when it comes to testing wire and xcontent serialization: `AbstractSerializingTestCase` and `AbstractXContentStreamableTestCase`. There are subtle differences though between the two, in the way they work, what can be overridden and features that they support (e.g. insertion of random fields).

This commit introduces a new base class called `AbstractWireTestCase` which holds all of the serialization test code in common between `Streamable` and `Writeable`. It has two minimal subclasses called `AbstractWireSerializingTestCase` and `AbstractStreamableTestCase` which are specialized for `Writeable` and `Streamable`.

This commit also introduces a new test class called `AbstractXContentTestCase` for all of the xContent testing, which holds a testFromXContent method for parsing and rendering to xContent. This one can be delegated to from the existing `AbstractStreamableXContentTestCase` and `AbstractSerializingTestCase` so that we avoid code duplicate as much as possible and all these base classes offer the same functionalities in the same way. Having this last base class decoupled from the serialization testing may also help with the REST high-level client testing, as there are some classes where it's hard to implement equals/hashcode and this makes it possible to override `assertEqualInstances` for custom equality comparisons (also this base class doesn't require implementing equals/hashcode as it doesn't test such methods.
2018-02-23 10:48:48 +01:00
Tim Brooks 5a8ec9b762
Selectors operate on channel contexts (#28468)
This commit is related to #27260. Currently there is a weird
relationship between channel contexts and nio channels. The selectors
use the context for read and writing. But the selector operates directly
on the nio channel for registering, closing, and connecting.

This commit works on improving this relationship. The selector operates
directly on the context which wraps the low level java.nio.channels. The
NioChannel class is simply an API that is used to interact with the
channel (sending messages from outside the selector event loop,
scheduling a close, adding listeners, etc). The context is only used
internally by the channel to implement these apis and by the selector to
perform these operations.
2018-02-22 09:44:52 -07:00
Martijn van Groningen f6b46db90c
Merge remote-tracking branch 'es/master' into ccr
* es/master: (143 commits)
  Revert "Disable BWC tests for build issues"
  Remove AcknowledgedRestListener in favour of RestToXContentListener (#28724)
  Build: Consolidate archives and packages configuration (#28760)
  Skip some plugins service tests on Windows
  Migrate some *ResponseTests to AbstractStreamableXContentTestCase (#28749)
  Disable BWC tests for build issues
  Ensure that azure stream has socket privileges (#28751)
  [DOCS] Fixed broken link.
  Pass InputStream when creating XContent parser (#28754)
  [DOCS] Changed to use transient setting to reenabled allocation. Closes #27677
  Delay path expansion on Windows
  [TEST] replace randomAsciiAlphanumOfLengthBetween with randomAsciiLettersOfLengthBetween
  [Tests] Extract the testing logic for Google Cloud Storage (#28576)
  [Docs] Update links to java9 docs (#28750)
  version set in ingest pipeline (#27573)
  Revert "Add startup logging for standalone tests"
  Tests: don't wait for completion while trying to get completed task
  Add 5.6.9 snapshot version
  [Docs] Java high-level REST client : clean up (#28703)
  Updated distribution outputs in contributing docs
  ...
2018-02-22 14:52:08 +01:00
Luca Cavanna 8b4a298874
Migrate some *ResponseTests to AbstractStreamableXContentTestCase (#28749)
This allows us to save a bit of code, but also adds more coverage as it tests serialization which was missing in some of the existing tests. Also it requires implementing equals/hashcode and we get the corresponding tests for them for free from the base test class.
2018-02-21 20:04:12 +01:00