This commit effectively reverts e1aa91d , as it is not needed anymore to add the original listed nodes. The cluster state local call made will in fact always return at least the local node (see #6811).
There were a couple of downsides caused by putting the original listed nodes among the connected nodes:
1) in the following retries, they weren't seen as listed nodes anymore, thus the light connect wasn't used
2) among the connected nodes some were "bad" duplicates as they are already there and don't contain all needed info for each node. This was causing serialization problems for instance given that the node version was missing on the `DiscoveryNode` object.
Closes#7067
The RetryListener was notified twice for each single failure, which caused some additional retries, but more importantly was making the client reach the maximum number of retries (number of connected nodes) too quickly, meanwhile ongoing retries which could succeed were not completed yet.
The TransportService used to throw ConnectTransportException due to throwConnectException set to true, and also notify the listener of any exception received from a separate thread through the request holder.
Simplified exception handling by just removing the throwConnectException option from the TransportService, used only in the transport client. The transport client now relies solely on the request holder to notify of failures and eventually retry.
Closes#6829
Adds additional version checks in NodeStats for older versions
When using an external cluster (backwards compatibility tests), the act
of checking the request breaker requires a network buffer, which
increments the breaker. This change only checks the request breaker in
InternalTestCluster and uses Guice to retrieve it instead of
a (possible) network request.
Also removed the now unused InternalCircuitBreakerService class
Also improved filter chain tests to not rely on execution time, and made filter chain tests look more similar to what happens in reality by removing multiple threads creation in testTooManyContinueProcessing (something we don't support anyway, makes little sense to test it).
Closes#7021
With commit 07c632a2d4dbefe44e8f25dc4ded6cf143d60e41, we now have a new Lucene.parseVersionLenient(String, Version) method which tries to find an existing Lucene version based on the two first digits X.Y of X.Y.Z String.
Pull Request #7055 fixed Version parsing for bugfix releases
causing problems with minor version in segments files. Even though
we never release anything with lucene in alpha / beta status this
commit fixes lenient parsing for these cases.
Relates to #7055
We parse the version that is shipped with the Lucene segments in order
to find the version of lucene that wrote a particular segment. Yet, some lucene
version ie:
* 4.3.1 (Elasticsearch 0.90.2)
* 4.5.1 (Elasticsearch 0.90.7)
* 3.6.1 (pre Elasticsearch 0.90.0)
wrote illegal strings containing the minor version which causes IAE exceptions
being thrown from lucenes parsing method.
Closes#7055
Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.
This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.
There are a number of new settings for breakers now:
`indices.breaker.total.limit`: starting limit for all memory-use breaker,
defaults to 70%
`indices.breaker.fielddata.limit`: starting limit for fielddata breaker,
defaults to 60%
`indices.breaker.fielddata.overhead`: overhead for fielddata breaker
estimations, defaults to 1.03
(the fielddata breaker settings also use the backwards-compatible
setting `indices.fielddata.breaker.limit` and
`indices.fielddata.breaker.overhead`)
`indices.breaker.request.limit`: starting limit for request breaker,
defaults to 40%
`indices.breaker.request.overhead`: request breaker estimation overhead,
defaults to 1.0
The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.
Fixes#6129
Conflicts:
src/main/java/org/elasticsearch/index/fielddata/IndexFieldData.java
src/main/java/org/elasticsearch/index/fielddata/IndexFieldDataService.java
src/main/java/org/elasticsearch/index/fielddata/RamAccountingTermsEnum.java
src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
src/main/java/org/elasticsearch/index/fielddata/ordinals/InternalGlobalOrdinalsBuilder.java
src/main/java/org/elasticsearch/index/fielddata/plain/AbstractIndexOrdinalsFieldData.java
src/main/java/org/elasticsearch/index/fielddata/plain/DisabledIndexFieldData.java
src/main/java/org/elasticsearch/index/fielddata/plain/IndexIndexFieldData.java
src/main/java/org/elasticsearch/index/fielddata/plain/NonEstimatingEstimator.java
src/main/java/org/elasticsearch/index/fielddata/plain/PackedArrayIndexFieldData.java
src/main/java/org/elasticsearch/index/fielddata/plain/ParentChildIndexFieldData.java
src/main/java/org/elasticsearch/index/fielddata/plain/SortedSetDVOrdinalsIndexFieldData.java
src/main/java/org/elasticsearch/node/internal/InternalNode.java
src/test/java/org/elasticsearch/index/aliases/IndexAliasesServiceTests.java
src/test/java/org/elasticsearch/index/codec/CodecTests.java
src/test/java/org/elasticsearch/index/fielddata/AbstractFieldDataTests.java
src/test/java/org/elasticsearch/index/fielddata/IndexFieldDataServiceTests.java
src/test/java/org/elasticsearch/index/mapper/MapperTestUtils.java
src/test/java/org/elasticsearch/index/query/IndexQueryParserFilterCachingTests.java
src/test/java/org/elasticsearch/index/query/SimpleIndexQueryParserTests.java
src/test/java/org/elasticsearch/index/query/guice/IndexQueryParserModuleTests.java
src/test/java/org/elasticsearch/index/search/FieldDataTermsFilterTests.java
src/test/java/org/elasticsearch/index/search/child/ChildrenConstantScoreQueryTests.java
src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java
No that we are using the index created version to make index-time decisions,
assuming that the version is the current version when settings are null is
very error-prone. Instead we should ensure that settings are always non-null
and contain the version when the index was created.
Close#7032
In context of mapper attachment and other mapper plugins, when dealing with multi fields, sub fields never get the `externalValue` although it was set.
Here is a full script which reproduce the issue when used with mapper attachment plugin:
```
DELETE /test
PUT /test
{
"mappings": {
"test": {
"properties": {
"f": {
"type": "attachment",
"fields": {
"f": {
"analyzer": "english",
"fields": {
"no_stemming": {
"type": "string",
"store": "yes",
"analyzer": "standard"
}
}
}
}
}
}
}
}
}
PUT /test/test/1
{
"f": "VGhlIHF1aWNrIGJyb3duIGZveGVz"
}
GET /test/_search
{
"query": {
"match": {
"f": "quick"
}
}
}
GET /test/_search
{
"query": {
"match": {
"f.no_stemming": "quick"
}
}
}
GET /test/test/1?fields=f.no_stemming
```
Related to https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/57Closes#5402.
This is only applicable when the order is set to _count. The upper bound of the error in the doc count is calculated by summing the doc count of the last term on each shard which did not return the term. The implementation calculates the error by summing the doc count for the last term on each shard for which the term IS returned and then subtracts this value from the sum of the doc counts for the last term from ALL shards.
Closes#6696
This commit adds regular expression support for the allow-origin
header depending on the value of the request `Origin` header.
The existing HttpRequestBuilder is also extended to support the
OPTIONS HTTP method.
Relates #5601Closes#6891
The HTTP client implementation used by the Elasticsearch REST tests is
backed by apache http client instead of a self written helper class,
that uses HttpUrlConnection. This commit removes the old simple HttpClient
class and uses the more powerful and reliable one for all tests.
It also fixes a minor bug, that when sending a 301 redirect, a Location
header needs to be added as well, which was uncovered by the switching
to the new client.
Closes#7003
The bug reproduces when the point under test for the placement of the hole of the polygon has an x coordinate which only intersects with the ends of edges in the main polygon. The previous code threw out these cases as not relevant but an intersect at 1.0 of the distance from the start to the end of an edge is just as valid as an intersect at any other point along the edge. The fix corrects this and adds a test.
Closes#5773
This commit adds the ability to force blocking on the flush operaition
to make sure all files have been written and synced to disk. Without
this option a flush might be executing at the same time causing the
current flush to fail and return before all files being synced.
Closes#6996
Added two new interfaces:
1) IndicesRequest that allows to retrieve the indices the request relates to in a generic manner, together with the indices options that tell how they are going to get resolved and expanded
2) CompositeIndicesRequest for compound requests that hold multiple indices request like MultiSearchRequest, MultiGetRequest, MultiTermVectorsRequest, BulkRequest, BenchmarkRequest, PercolateRequest, MultiPercolateRequest and MoreLikeThisRequest
Taken the chance to streamline the indices options and add them to every request where it makes sense (although they can't be changed from the outside), rather than leaving them implicit in the related TransportAction when indices get expanded (tipycally MetaData#concreteIndices or MetaData#concreteSingleIndex). Added IndicesOptions parameter to MetaData#concreteSingleIndex to make sure it is taken from the request, where the information belongs, instead of hardcoded within MetaData. The concreteSingleIndex method remains but it's just a utility method that returns a single index instead of an array and complains otherwise.
Also made sure NPE is never thrown when setting indices(null) to IndicesAliasesRequest, similar to what SearchRequest does.
Closes#6933
The benchmark indexes 200 unique full-width longs. For uninverted field data
we try to use the most memory-efficient storage, and in that case it would use
two arrays: one for the doc->ordinals mapping and one for the ordinal->value
mapping. Which is slower than what doc values do by storing directly the
mapping from docs to values.
Before this change each aggregation had to output an object field with its name and write its JSON inside that object. This allowed for badly behaved aggregations which could write JSON content in the root of the 'aggs' object. this change move the writing of the aggregation name to a level above the aggregation itself, ensuring that aggregations can only write within there own scope in the JSON output.
Closes#7004
Today if the user supplies a custom missing value for a string sort,
we do it in an extremely slow way, not using ordinals but dereferencing
bytes for every document. Ordinals are only used if the missing value
is _first or _last.
Instead, use ordinals with custom missing values too.
Closes#7005
Single index operations to use the newly added IndexClosedException introduced with #6475. This way we can also fail faster when we are trying to execute operations on closed indices and their use is not allowed (depending on indices options). Indices blocks are still checked but we can already throw error while resolving indices (MetaData#concreteIndices).
Effectively this change also affects what we return when using one of the following apis: analyze, bulk, index, update, delete, explain, get, multi_get, mlt, term vector, multi_term vector. We now return `{"error":"IndexClosedException[[test] closed]","status":403}` instead of `{"error":"ClusterBlockException[blocked by: [FORBIDDEN/4/index closed];]","status":403}`.
Closes#6988
The default shard size in the terms aggregation now uses BucketUtils.suggestShardSideQueueSize() to set the shard size if the user does not specify it as a parameter.
Closes#6857
Allow users to control document collection termination, if a specified terminate_after number is
set. Upon setting the newly added parameter, the response will include a boolean terminated_early
flag, indicating if the document collection for any shard terminated early.
closes#6876
instead of a custom encoding in BINARY.
In low level benchmarks this is 2x to 5x faster: its also optimized
for the common case where fields actually only contain at most one
value for each document.
Additionally SORTED_NUMERIC doesn't lose values if they appear more
than once, so mathematical computations such as averages are correct.
Closes#6967
The recycling happening in facets is done manually and arrays are sometimes not
released. Aggregations do it in a less error-prone way by registering on to the
SearchContext.
This commit removes custom comparators in favor of the ones that are in Lucene.
The major change is for nested documents: instead of having a comparator wrapper
that deals with nested documents, this is done at the fielddata level by having
a selector that returns the value to use for comparison.
Sorting with custom missing string values might be slower since it is using
TermValComparator since Lucene's TermOrdValComparator only supports sorting
missing values first or last. But other than this particular case, this change
will allow us to benefit from improvements on comparators from the Lucene side.
Close#5980
This change just changes the default for index.codec.bloom.load to
false: with recent performance improvements to ID lookup, such as
#6298, bloom filters don't give much of a performance gain anymore,
and they can consume non-trivial RAM when there are many tiny
documents.
For now, we still index the bloom filters, so if a given app wants
them back, it can just update the index.codec.bloom.load to true.
Closes#6959
GET only returned null even when stored if requested with GET like this:
`curl -XGET "http://localhost:9200/test/test/1?fields=_all"`
Instead, it should simply behave like a String field and return the
concatenated fields as String.
closes#6924
A new option `prune` has been added to allow users to control phrase suggestion pruning when `collate`
is set. If the new option is set, the phrase suggestion option will contain a boolean `collate_match`
indicating whether the respective result had hits in collation.
CLoses#6927
Today we only do count searches to ensure sane results are returned
after upgrading etc. This change adds sorting to the picture asserting
on simple numeric sorting that uses field data etc. after upgrading.
Relates to #6967
In unicast discovery, we try to reuse existing discovery nodes based on the node address they have. If we find an existing node based on its address, and for some reason its not connected, don't add it to the list of nodes to disconnect from, as that (full) connection is useful down the road
closes#6966
Looking at the connect code, if 2 threads at the same time try and connect to a node, and both enter sequentially the connectLock code block, the second one would try and put the connection in the map, and close the replaced channels, which will cause the existing connection to close as well (since it removes the node from the connectedNodes map)
To fix this, simply make sure we properly check the existence of the connection within the connectionLock block, so there won't be concurrent connections going on.
While doing this, also went over all the mutation code that handles disconnections, and made sure they are properly done only within a connection lock.
closes#6964
This commits removes BytesValues/LongValues/DoubleValues/... and tries to use
Lucene's APIs such as NumericDocValues or RandomAccessOrds instead whenever
possible.
The next step would be to take advantage of the fact that APIs are the same in
Lucene and Elasticsearch in order to remove our custom comparators and use
Lucene's.
There are a few side-effects to this change:
- GeoDistanceComparator has been removed, DoubleValuesComparator is used instead
on top of dynamically computed values (was easier than migrating
GeoDistanceComparator).
- SortedNumericDocValues doesn't guarantee uniqueness so long/double terms
aggregators have been updated to make sure a document cannot fall twice in
the same bucket.
- Sorting by maximum value of a field or running a `max` aggregation is
potentially significantly faster thanks to the random-access API.
Our aggs and p/c aggregations benchmarks don't report differences with this
change on uninverted field data. However the fact that doc values don't need
to be wrapped anymore seems to help a lot. For example
TermsAggregationSearchBenchmark reports ~30% faster terms aggregations on doc
values on string fields with this change, which are now only ~18% slower than
uninverted field data although stored on disk.
Close#6908
The assertion on binary equality for streamable serialization
sometimes fails due to the usage of identify hashmaps inside
the InternalSearchHits serialization. This only happens if
the number of shards the result set is composed of is very high.
This commit makes the serialziation deterministic and removes
the need to serialize the ordinal due to in-order serialization.
The change introduced in #6949 (do not serialize the cluster state) also means master now responds with an empty response rather then a JoinResponse. However, sendJoinRequestBlocking still expected a JoinRequest.
At the moment we serialize the cluster state in JoinResponse and ValidateJoinRequest. However this state is not used anywhere and can be removed to save on network overhead
Closes#6949
This test deletes and updates using upserts documents over several threads in a
tight loop. It counts the number of responses and verifies that the versions at
the end are correct.
today if a snapshot is corrupted the restore operation
never terminates. Yet, if the snapshot is corrupted there
is no way to restore it anyway. If such a snapshot is restored
today the only way to cancle it is to delete the entire index which
might cause dataloss. This commit also fixes an issue in InternalEngine
where a deadlock can occur if a corruption is detected during flush
since the InternalEngine#snapshotIndex aqcuires a topLevel read lock
which prevents closing the engine.
Closes#6938
The `index.fail_on_corruption` was not updateable via the index settings
API. This commit also fixed the setting prefix to be consistent with other
setting on the engine. Yet, this feature is unreleased so this won't break anything.
Closes#6941
Small refactorings to make the MessageChannelHandler more extensible.
Also allowed access to the different netty pipelines
This is the fix after the first version had problems with the HTTP
transport due to wrong reusing channel handlers, which is the reason
why tests failed.
Relates #6889Closes#6915
It's now possible to inject action filters from plugins via `ActionModule#registerFilter` through the following code:
```
public void onModule(ActionModule actionModule) {
actionModule.registerFilter(MyFilter.class);
}
```
Also made `TransportAction#execute` methods final to enforce the execution of the filter chain. By default the chain is empty though.
Note that the action filter chain is executed right after the request validation, as the filters might rely on a valid request to do their work.
Closes#6921
Today when we start a `TransportClient` we use the given transport
addresses and create a `DiscoveryNode` from it without knowing the
actual nodes version. We just use the `Version.CURRENT` which is an
upper bound. Yet, the other node might be a version less than the
currently running and serialisation of the nodes info might break. We
should rather use a lower bound here which is the version of the first
release with the same major version as `Version.CURRENT` since this is
what we officially support.
This commit moves to use the minimum major version or an RC / Snapshot
if the current version is a snapshot.
Closes#6894
Adds the ability to the Term Vector API to generate term vectors for some
chosen fields, even though they haven't been explicitely stored in the index.
Relates to #5184Closes#6567
The bulk processor tries to acquire all leases for the semaphore to wait
for all pending requests. Yet, we should release them afterwards again to
ensure we don't ever deadlock if there is a bug in the processor.
This commit also adds a testcase for this method
It's now possible to register watchers along with a specified check frequency. There are three frequencies: low, medium, high. Each one is associated with a check interval that determines how frequent the watchers will check for changes and notify listeners if needed. By default, the intervals are 5s, 30s and 60s respectively, but they can also be customized in the settings. also:
- Added the WatcherHandle construct by which one can stop it (remove it) and resume it (re add it). Also provices access to the watchers itself and the frequency by which it's checked
- Change the default frequency to 30 seconds interval (used to be 60 seconds). The only watcher that is currently effected by this is the script watcher (now auto-loading scripts will auto-load every 30 seconds if changed)
This is to prevent a rare racing condition where the very same shard gets allocated to the node after our sanity check that the cluster state didn't check and the actual deletion of the files.
Closes#6902
This results in unstable tests, most likely due to Channels being mixed
up by wrongly creating the pipelines. Needs investigation and a test.
This reverts commit db7f0d36af.
The IndexingMemoryController determines the amount of indexing buffer size and translog buffer size each shard should have. It takes memory from inactive shards (indexing wise) and assigns it to other shards. To do so it needs to know about the addition and closing of shards. The current implementation hooks into the indicesService.indicesLifecycle() mechanism to receive call backs, such shard entered the POST_RECOVERY state. Those call backs are typically run on the thread that actually made the change. A mutex was used to synchronize those callbacks with IndexingMemoryController's background thread, which updates the internal engines memory usage on a regular interval. This introduced a dependency between those threads and the locks of the internal engines hosted on the node. In a *very* rare situation (two tests runs locally) this can cause recovery time outs where two nodes are recovering replicas from each other.
This commit introduces a a lock free approach that updates the internal data structures during iterations in the background thread.
Closes#6892
FieldMapper has two methods
`Filter termsFilter(List values, @Nullable QueryParseContext)` which is supposed
to work on the inverted index and
`Filter termsFilter(QueryParseContext, List, QueryParseContext)` which is
supposed to work on field data. Let's rename the second one to
`fieldDataTermsFilter` and remove the unused `QueryParseContext`.
Close#6888
If you call `bin/plugin --remove es-plugin` the plugin got removed but the file `bin/plugin` itself was also deleted.
We now don't allow the following plugin names:
* elasticsearch
* plugin
* elasticsearch.bat
* plugin.bat
* elasticsearch.in.sh
* service.bat
Closes#6745
With the removal of setNextScore in #6864, script engines must use
the Scorer to find the score of a document. The DocLookup is updated
appropriately to do this, but most script engines require a Number to be
bound for numeric variables. Groovy already had an encapsulation for
this funtionality, and this moves it out to be shared with other script
engines.
closes#6898
While it would be nice to do this all the way up the chain (into
score functions), this at least removes the weird dual
setNextScore/setScorer for SearchScripts.
closes#6864
This fixes an issue introduced by the serialization changes in #6486
which are not needed at all. Node that the serialization itself is not broken
but the TransportClient uses its own version on initial connect and getting
the NodeInfos.
Today we synchronize when updating the IndexFieldDataService
datastructures. This might unnecessarily block progress if multiple
request need different fielddata instance for different fields.
This commit also fixes clear calls to actually consistently clear
the caches in the case of an exception.
Closes#6855
As a SizeValue is used for serializing the thread pool size, a negative number
resulted in throwing an exception when deserializing (using -ea an assertionerror
was thrown).
This fixes a check for changing the serialization logic, so that negative numbers are read correctly, by adding an internal UNBOUNDED value.
Closes#6325Closes#5357
Due to change introduced in #6825, we now start a local gateway recovery for replicas, if the source node can not be found. The recovery then fails because we never recover replicas from disk.
Closes#6879
In rare cases we may fail to send a shard failure event to the master, or there is no known master when the shard has failed (ex. a couple of node leave the cluster canceling recoveries and causing a master to step down at the same time). When that happens and a cluster state arrives from the (new) master we should resend the shard failure in order for the master to remove the shard from this node.
Closes#6881
make the builder releasable (auto closeable), and use it in shards state
also make XContentParser releasable (AutoCloseable) and not closeable since it doesn't throw an IOException
closes#6869
use similar mechanism that shares numeric analyzers for long/double/... for dates as well. This has nice memory save properties with many date fields mapping case, as well as analysis saves (thread local resources)
closes#6843
Enviorment variables might override the tests settings even if
they are explicitly set. Other base classes like InternalTestCluster
also specify `config.ignore_system_properties: true` to ensure `what
we set is what we get`
These are javascript expressions, which can only access numeric
fielddata, parameters, and _score. They can only be used for searches (not document updates).
closes#6818
This test makes it easy to create a lightweight node (no http, indices stored
in RAM, ...) whose main purpose is to get an instance of the Guice injector
for unit tests.
This should help not have to update lots of unit tests when we add a new
Guice dependency.
Throwables thrown on update retries are now caught and handled via
the provided callback. This commit also contains an integration test
demonstrating the bug and validating the fix.
Closes#6355Closes#6724
Each transport action is associated with at least an action name, which is the action name that gets serialized together with the request and identifies what to do with the request itself. Also, the action name is the name of the registered transport handler that handles incoming request for the transport action.
This commit makes the action name available in a generic manner in the TransportAction base class, so that it can be used when needed by subclasses, or in the base class for instance for action filtering.
Closes#6860
Today we close/reopen IW when we change the RAM buffer but that's
costly because it means the next NRT reader is a full reopen. The RAM
buffer size setting is a live one in IndexWriter, even if there are no
buffered docs in RAM when you call it.
Separately it would be nice if Lucene let you manage a "reader pool"
that could outlive individual IW instances ...
Closes#6856
This commit cleans up some code around the indexed script/templates feature.
Remove dead code in ScriptService.
Remove setXScript methods for UpdateRequestBuilder and use setScript(script,type) instead
CorruptFileTest sometimes hits conditions where lots of rebalancing
happens. In such a case the default timeout is just not enough - this
timeout just makes sure that the cluster has enough time to balance
itself.