Commit Graph

972 Commits

Author SHA1 Message Date
Koen De Groote 878ae8eb3c Size lists in advance when known
When constructing an array list, if we know the size of the list in
advance (because we are adding objects to it derived from another list),
we should size the array list to the appropriate capacity in advance (to
avoid resizing allocations). This commit does this in various places.

Relates #24439
2017-05-12 10:36:13 -04:00
Jim Ferenczi 279a18a527 Add parent-join module (#24638)
* Add parent-join module

This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.

Relates #20257
2017-05-12 15:58:06 +02:00
Simon Willnauer be2a6ce80b Notify onConnectionClosed rather than onNodeDisconnect to prune transport handlers (#24639)
Today we prune transport handlers in TransportService when a node is disconnected.
This can cause connections to starve in the TransportService if the connection is
opened as a short living connection ie. without sharing the connection to a node
via registering in the transport itself. This change now moves to pruning based
on the connections cache key to ensure we notify handlers as soon as the connection
is closed for all connections not just for registered connections.

Relates to #24632
Relates to #24575
Relates to #24557
2017-05-12 15:40:40 +02:00
Yannick Welsch 04e08f5e49 Simplify Discovery interface (#24608)
- Removes clusterState, getInitialClusterState and getMinimumMasterNodes methods from Discovery interface.
- Sets PingContextProvider in ZenPing constructor
- Renames state in ZenDiscovery to committedState
2017-05-12 14:08:14 +02:00
Ryan Ernst f477a6472d Settings: Deprecate settings in .yml and .json (#24059)
This commit adds a deprecation warning when elasticsearch.yml or
elasticsearch.json is read during startup.

relates #19391
2017-05-11 13:11:18 -07:00
Simon Willnauer 1155615536 Move DeleteByQuery and Reindex requests into core (#24578)
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.

This is re-adds this commit that was revered in 952feb58e4
2017-05-11 20:22:30 +02:00
qwerty4030 e7d352b489 Compound order for histogram aggregations. (#22343)
This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key.

Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR).

Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time.

Closes #23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.
2017-05-11 18:06:26 +01:00
Simon Willnauer 952feb58e4 Revert "Move DeleteByQuery and Reindex requests into core (#24578)"
This reverts commit 6ea2ae32b8.
2017-05-11 18:26:40 +02:00
Lee Hinman 57fddce8c4 [TEST] Use at least 1ms for FunctionScoreQueryBuilderTests
Previously micros or nanoseconds could be used, which was reduced to 0
milliseconds and `scale` must be higher than 0.
2017-05-11 10:10:55 -06:00
Simon Willnauer 6ea2ae32b8 Move DeleteByQuery and Reindex requests into core (#24578)
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.
2017-05-11 16:20:40 +02:00
Nik Everett 8188569fd1 Add qa module that tests reindex-from-remote against pre-5.0 versions of Elasticsearch (#24561)
Adds tests for reindex-from-remote for the latest 2.4, 1.7, and
0.90 releases. 2.4 and 1.7 are fairly popular versions but 0.90
is a point of pride.

This fixes any issues those tests revealed.

Closes #23828
Closes #24520
2017-05-11 10:06:20 -04:00
Nik Everett 65f2717ab7 Make PreConfiguredTokenFilter harder to misuse (#24572)
There are now three public static method to build instances of
PreConfiguredTokenFilter and the ctor is private. I chose static
methods instead of constructors because those allow us to change
out the implementation returned if we so desire.

Relates to #23658
2017-05-10 22:39:43 -04:00
Jack Conradson 6ac8a1eb85 Deprecate Fine Grain Settings for Scripts (#24573) 2017-05-10 13:09:31 -07:00
Christoph Büscher fbc8345db5 Tests: Fix VersionUtilsTests after version bump 2017-05-10 17:36:12 +02:00
Martijn van Groningen 51c74ce547
Added unit tests for InternalMatrixStats.
Also moved InternalAggregationTestCase to test-framework module in order to make use of it from other modules than core.

Relates to #22278
2017-05-10 11:06:18 +02:00
Matt Weber b24326271e Add ICUCollationFieldMapper (#24126)
Adds a new "icu_collation" field type that exposes lucene's
ICUCollationDocValuesField.  ICUCollationDocValuesField is the replacement
for ICUCollationKeyFilter which has been deprecated since Lucene 5.
2017-05-10 10:35:11 +02:00
Ryan Ernst 9ca7d28552 Scripting: Remove "service" from ScriptEngine interface name (#24574)
This commit renames ScriptEngineService to ScriptEngine.  It is often
confusing because we have the ScriptService, and then
ScriptEngineService implementations, but the latter are not services as
we see in other places in elasticsearch.
2017-05-10 00:47:33 -07:00
Ryan Ernst ebd3e5f73f Scripting: Deprecate file script settings (#24555)
File scripts have 2 related settings: the path of file scripts, and
whether they can be dynamically reloaded. This commit deprecates those
settings.

relates #21798
2017-05-09 16:14:57 -07:00
Jason Tedor 8f873620ee Inline global checkpoints
Today we rely on background syncs to relay the global checkpoint under
the mandate of the primary to its replicas. This means that the global
checkpoint on a replica can lag far behind the primary. The commit moves
to inlining global checkpoints with replication requests. When a
replication operation is performed, the primary will send the latest
global checkpoint inline with the replica requests. This keeps the
replicas closer in-sync with the primary.

However, consider a replication request that is not followed by another
replication request for an indefinite period of time. When the replicas
respond to the primary with their local checkpoint, the primary will
advance its global checkpoint. During this indefinite period of time,
the replicas will not be notified of the advanced global
checkpoint. This necessitates a need for another sync. To achieve this,
we perform a global checkpoint sync when a shard falls idle.

Relates #24513
2017-05-09 15:08:11 -04:00
Nik Everett bb06d8ec4f Allow plugins to build pre-configured token filters (#24223)
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.

The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.

This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.

This is a part of #23658
2017-05-09 14:50:49 -04:00
Jim Ferenczi b6c714ccc8 Fix BWC for query_and_fetch 2017-05-09 18:52:53 +02:00
Adrien Grand a72eaa8e0f Identify documents by their `_id`. (#24460)
Now that indices have a single type by default, we can move to the next step
and identify documents using their `_id` rather than the `_uid`.

One notable change in this commit is that I made deletions implicitly create
types. This helps with the live version map in the case that documents are
deleted before the first type is introduced. Otherwise there would be no way
to differenciate `DELETE index/foo/1` followed by `PUT index/foo/1` from
`DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are
different if versioning is involved.
2017-05-09 16:33:52 +02:00
Hendrik Muhs f41ddb3607 Move MockLogAppender to elasticsearch test (#24542)
In order to make MockLogAppender (utility to test logging) available outside
of es-core move MockLogAppender from test core-tests to test framework. As
package names do not change, no need to change clients.
2017-05-08 13:02:27 +02:00
Koen De Groote 13c17c75b5 Remove unneeded empty string concatentation
This commit removes concatenation by empty string in places where it
is simply not needed to obtain a string representation.

Relates #24411
2017-05-06 00:28:53 -04:00
Yannick Welsch c8712e9531 Limit AllocationService dependency injection hack (#24479)
Changes the scope of the AllocationService dependency injection hack so that it is at least contained to the AllocationService and does not leak into the Discovery world.
2017-05-05 08:39:18 +02:00
Jason Tedor 61d5eddbd6 Fix typo in comment in IndexShardTestCase
This commit fixes a silly typo in IndexShardTestCase.java.
2017-05-04 21:04:35 -04:00
Yannick Welsch be19ccef57 Discard stale node responses from async shard fetching (#24434)
Async shard fetching only uses the node id to correlate responses to requests. This can lead to a situation where a response from an earlier request is mistaken as response from a new request when a node is restarted. This commit adds unique round ids to correlate responses to requests.
2017-05-03 09:47:21 +02:00
Simon Willnauer 2f9e9460d4 Move RemoteClusterService into TransportService (#24424)
TransportService and RemoteClusterService are closely coupled already today
and to simplify remote cluster integration down the road it can be a direct
dependency of TransportService. This change moves RemoteClusterService into
TransportService with the goal to make it a hidden implementation detail
of TransportService in followup changes.
2017-05-02 18:09:32 +02:00
Koen De Groote 0fef5acd01 Cleanup collections construction
This commit cleans up some cases where a list or map was being
constructed, and then an existing collection was copied into the new
collection. The clean is to instead use an appropriate constructor to
directly copy the existing collection in during collection
construction. The advantage of this is that the new collection is sized
appropriately.

Relates #24409
2017-04-30 21:26:51 -04:00
Yannick Welsch 35f78d098a Separate publishing from applying cluster states (#24236)
Separates cluster state publishing from applying cluster states:

- ClusterService is split into two classes MasterService and ClusterApplierService. MasterService has the responsibility to calculate cluster state updates for actions that want to change the cluster state (create index, update shard routing table, etc.). ClusterApplierService has the responsibility to apply cluster states that have been successfully published and invokes the cluster state appliers and listeners.
- ClusterApplierService keeps track of the last applied state, but MasterService is stateless and uses the last cluster state that is provided by the discovery module to calculate the next prospective state. The ClusterService class is still kept around, which now just delegates actions to ClusterApplierService and MasterService.
- The discovery implementation is now responsible for managing the last cluster state that is used by the consensus layer and the master service. It also exposes the initial cluster state which is used by the ClusterApplierService. The discovery implementation is also responsible for adding the right cluster-level blocks to the initial state.
- NoneDiscovery has been renamed to TribeDiscovery as it is exclusively used by TribeService. It adds the tribe blocks to the initial state.
- ZenDiscovery is synchronized on state changes to the last cluster state that is used by the consensus layer and the master service, and does not submit cluster state update tasks anymore to make changes to the disco state (except when becoming master).

Control flow for cluster state updates is now as follows:

- State updates are sent to MasterService
- MasterService gets the latest committed cluster state from the discovery implementation and calculates the next cluster state to publish
- MasterService submits the new prospective cluster state to the discovery implementation for publishing
- Discovery implementation publishes cluster states to all nodes and, once the state is committed, asks the ClusterApplierService to apply the newly committed state.
- ClusterApplierService applies state to local node.
2017-04-28 09:34:31 +02:00
Yannick Welsch 2fa1c9fff1 Provide target allocation id as part of start recovery request (#24333)
This makes it possible for the recovery source to verify that it is talking to the shard it thinks it is talking to.

Closes #24167
2017-04-27 14:45:44 +02:00
Adrien Grand 1be2800120 Only allow one type on 7.0 indices (#24317)
This adds the `index.mapping.single_type` setting, which enforces that indices
have at most one type when it is true. The default value is true for 6.0+ indices
and false for old indices.

Relates #15613
2017-04-27 08:43:20 +02:00
Jason Tedor 74acc594a9 Fix inconsistencies in long GC disruption
This commit fixes some inconsistencies in long GC disruption where we
mixed stopping and suspending when the action we are performing on
threads is suspending which is distinct from stopping a thread.
2017-04-26 21:23:19 -04:00
Nik Everett bc45d10e82 Remove most usages of 1-arg Script ctor (#24325)
The one argument ctor for `Script` creates a script with the
default language but most usages of are for testing and either
don't care about the language or are for use with
`MockScriptEngine`. This replaces most usages of the one argument
ctor on `Script` with calls to `ESTestCase#mockScript` to make
it clear that the tests don't need the default scripting language.

I've also factored out some copy and pasted script generation
code into a single place. I would have had to change that code
to use `mockScript` anyway, so it was easier to perform the
refactor.

Relates to #16314
2017-04-26 16:04:38 -04:00
Jason Tedor 2ed1f7a339 Avoid leaks in Long GC disruption tests
We can leak disrupted threads here since we never wait for them to
complete after freeing them from their loops. This commit addresses this
by joining on disrupted threads, and addresses fallout from trying to
join here.

Relates #24338
2017-04-26 15:26:36 -04:00
Nik Everett 7c3efb829b Move char filters into analysis-common (#24261)
Another step down the road to dropping the
lucene-analyzers-common dependency from core.

Note that this removes some tests that no longer compile from
core. I played around with adding them to the analysis-common
module where they would compile but we already test these in
the tests generated from the example usage in the documentation.

I'm not super happy with the way that `requriesAnalysisSettings`
works with regards to plugins. I think it'd be fairly bug-prone
for plugin authors to use. But I'm making it visible as is for
now and I'll rethink later.

A part of #23658
2017-04-26 13:25:34 -04:00
Ryan Ernst 51b33f1fd5 S3 Repository: Deprecate remaining `repositories.s3.*` settings (#24144)
Most of these settings should always be pulled from the repository
settings. A couple were leftover that should be moved to client
settings. The path style access setting should be removed altogether.
This commit adds deprecations for all of these existing settings, as
well as adding new client specific settings for max retries and
throttling.

relates #24143
2017-04-25 23:43:20 -07:00
Nik Everett fc97e25b56 Add task to look for tests in src/main (#24298)
Creates a new task `namingConventionsMain`, that runs on the
`buildSrc` and `test:framework` projects and fails the build if
any of the classes in the main artifacts are named like tests or
are non-abstract subclasses of ESTestCase.

It also fixes the three tests that would cause it to fail.
2017-04-25 21:11:47 -04:00
Simon Willnauer e69147a870 Add support for `tests.enable_mock_modules` to ESIntegTestCase (#24309)
`tests.enable_mock_modules` is a documented but unrespected / unused
option to disable all mock modules / pluings during test runs. This
will basically site-step mock assertions like check-index on shard closing.
This can speed up test-execution dramatically on nodes with slow disks etc.

Relates to #24304
2017-04-25 17:34:25 +02:00
Koen De Groote 88de33d43d Minor changes to collection creation from enums (#24274)
These changes are mainly cosmetic with minor perf advantages drawn from checkstyle.
2017-04-25 13:13:55 +02:00
Jason Tedor 1500beafc7 Check for default.path.data included in path.data
If the user explicitly configured path.data to include
default.path.data, then we should not fail the node if we find indices
in default.path.data. This commit addresses this.

Relates #24285
2017-04-24 09:31:54 -04:00
Ryan Ernst aadc33d260 Scripts: Remove unwrap method from executable scripts (#24263)
The unwrap method was leftover from support javascript and python. Since
those languages are removed in 6.0, this commit removes the unwrap
feature from scripts.
2017-04-21 17:50:22 -07:00
Simon Willnauer 2ca7072b24 Fill missing sequence IDs up to max sequence ID when recovering from store (#24238)
Today we might promote a primary and recover from store where after translog
recovery the local checkpoint is still behind the maximum sequence ID seen.
To fill the holes in the sequence ID history this PR adds a utility method
that fills up all missing sequence IDs up to the maximum seen sequence ID
with no-ops.

Relates to #10708
2017-04-21 20:28:00 +02:00
Adrien Grand 2b8fa64cf7 ESIntegTestCase.indexRandom should not introduce types. (#24202)
Since we plan on removing types, `indexRandom` should not introduce new types.
This commit refactors `indexRandom` to reuse existing types.
2017-04-21 10:38:36 +02:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Ali Beyad 3c82eea5fb Wait for cluster to become quiescent between REST tests (#24148)
[TEST] ensures REST tests wait for cluster state updates to finish
processing before moving to the next test
2017-04-19 13:17:09 -04:00
Jim Ferenczi f05af0a382 Enable index-time sorting (#24055)
This change adds an index setting to define how the documents should be sorted inside each Segment.
It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk.
It is not allowed to use a `nested` fields inside an index that defines an index sorting since `nested` fields relies on the original sort of the index.
This change does not add early termination capabilities in the search layer. This will be added in a follow up.

Relates #6720
2017-04-19 14:36:11 +02:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Jason Tedor 8033c576b7 Detect remnants of path.data/default.path.data bug
In Elasticsearch 5.3.0 a bug was introduced in the merging of default
settings when the target setting existed as an array. When this bug
concerns path.data and default.path.data, we ended up in a situation
where the paths specified in both settings would be used to write index
data. Since our packaging sets default.path.data, users that configure
multiple data paths via an array and use the packaging are subject to
having shards land in paths in default.path.data when that is very
likely not what they intended.

This commit is an attempt to rectify this situation. If path.data and
default.path.data are configured, we check for the presence of indices
there. If we find any, we log messages explaining the situation and fail
the node.

Relates #24099
2017-04-17 07:03:46 -04:00