Fixes #4334
The deadlock occurs between the monitor object of EsThreadPoolExecutor and the mainLock of ThreadPoolExecutor. EsThreadPoolExecutor#shutdown acquires the monitor first and then waits for mainLock inside ThreadPoolExecutor#shutdown for part of its processing, while EsThreadPoolExecutor#terminated runs under mainLock and tries to acquire the monitor to notify listeners.
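The lock-order inversion described above can be illustrated in isolation. The following is a minimal sketch, not the actual EsThreadPoolExecutor code: `LockOrderSketch`, its fields, and its methods are hypothetical stand-ins. It shows one common way to avoid the inversion, which is to release the monitor before acquiring the other lock so that the two are only ever nested in the mainLock -> monitor order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in; not the real EsThreadPoolExecutor.
final class LockOrderSketch {
    private final Object monitor = new Object();                 // stands in for the subclass monitor
    private final ReentrantLock mainLock = new ReentrantLock();  // stands in for ThreadPoolExecutor's mainLock
    private final List<Runnable> listeners = new ArrayList<>();
    private boolean terminated;

    void shutdown(Runnable listener) {
        // Register the listener under the monitor, then release the monitor
        // BEFORE taking mainLock, so monitor -> mainLock nesting never happens.
        synchronized (monitor) {
            listeners.add(listener);
        }
        mainLock.lock();
        try {
            terminated();
        } finally {
            mainLock.unlock();
        }
    }

    // Runs under mainLock and takes the monitor (mainLock -> monitor order only).
    private void terminated() {
        List<Runnable> toNotify;
        synchronized (monitor) {
            terminated = true;
            toNotify = new ArrayList<>(listeners);
            listeners.clear();
        }
        for (Runnable listener : toNotify) {
            listener.run();
        }
    }

    boolean isTerminated() {
        synchronized (monitor) {
            return terminated;
        }
    }
}
```

Because every code path takes the two locks in the same order, two threads calling `shutdown` concurrently cannot block each other in a cycle.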
When the ValuesSource has ordinals, terms ordinals are used as a cache key to
bucket ordinals. This can make terms aggregations on String terms significantly
faster.
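As a rough illustration of the idea, the sketch below caches the bucket ordinal for each term ordinal in a plain array, so that repeated occurrences of the same term skip any lookup by term bytes. `OrdinalBucketCache` and its methods are hypothetical names, and the bucket assignment is simplified to a counter; the real aggregator is assumed to resolve the bucket from the term itself on a cache miss.

```java
import java.util.Arrays;

// Hypothetical sketch: term ordinal -> bucket ordinal cache.
final class OrdinalBucketCache {
    private final long[] bucketByTermOrd; // -1 means "not assigned yet"
    private long nextBucket = 0;

    OrdinalBucketCache(int maxTermOrd) {
        bucketByTermOrd = new long[maxTermOrd];
        Arrays.fill(bucketByTermOrd, -1L);
    }

    long bucketFor(int termOrd) {
        long bucket = bucketByTermOrd[termOrd];
        if (bucket < 0) {
            // Cache miss: here a real implementation would look the term up
            // by its bytes; this sketch simply hands out the next bucket.
            bucket = nextBucket++;
            bucketByTermOrd[termOrd] = bucket;
        }
        return bucket;
    }
}
```

The array lookup replaces a per-document hash of the term bytes, which is where the speed-up for String terms would come from.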
Close #4350
The percolator uses this option to deal with the fact that the MemoryIndex doesn't support stored fields.
This is possible because the _source of the document being percolated is always present.
Closes #4348
This adds support for Lucene's SimpleQueryParser by adding a new type
of query called the `simple_query_string`. The `simple_query_string`
query is designed to be able to parse human-entered queries without
throwing any exceptions.
Resolves #4159
It could happen although we internally use IgnoreIndices.MISSING, due to MetaData#concreteIndices contract, which throws IndexMissingException anyway if all requested indices are missing.
In case all the indices specified in the query/filter are missing, we just execute the no_match query/filter, no need to throw any error.
Closes #3428
Also increased the default publish state timeout to 30 seconds (from 5 seconds) and introduced a constant for it.
Introduced AcknowledgedRequest.DEFAULT_ACK_TIMEOUT constant.
Removed misleading default values coming from the REST layer.
Removed (in a backwards-compatible manner) the timeout support in put/delete index template, as the timeout parameter was ignored.
Closes #4395
Also made the calls to PercolatorQueriesRegistry#enableRealTimePercolator and #disableRealTimePercolator synchronized, so that for the same shard the RealTimePercolatorOperationListener can't be registered twice.
When we bulk changes, we need to use the same index metadata builder across the tasks, otherwise we might remove mappings erroneously.
Also, when we check if we can use a higher order mapping, we need to verify that it's for the same mapping type.
When sending a request, mainly to multiple nodes, if we already have the "body" of the request in bytes, we can share it instead of copying it over to a new buffer. Also, it helps a lot when sending a relatively large body to multiple nodes, since it will use the same body buffer across all nodes
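The sharing idea can be sketched with plain `java.nio` buffers. This is only an illustration under stated assumptions: the real transport layer has its own buffer types, and `SharedBodySketch` and `viewsFor` are hypothetical names. Each node gets a read-only view with its own position and limit, while all views are backed by the same byte array, so nothing is copied.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one body, many independent read-only views.
final class SharedBodySketch {
    static List<ByteBuffer> viewsFor(byte[] body, int nodeCount) {
        ByteBuffer shared = ByteBuffer.wrap(body);
        List<ByteBuffer> views = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            // asReadOnlyBuffer shares the content but has its own
            // position/limit, so each send can advance independently.
            views.add(shared.asReadOnlyBuffer());
        }
        return views;
    }
}
```

With a large body and many target nodes, memory use stays at one copy of the body plus a small view object per node.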
When the search preference is set to a single node only, but that node is not a
data node (or does not exist), we return a search exception, which suggests
that this is a server problem.
However, specifying a non-existent node id is a client problem
and should return a more useful error message than
{"error":"SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]","status":503}
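A minimal sketch of the kind of up-front validation this implies is shown below. The class, method, exception type, and message wording are all assumptions made for illustration and are not taken from the actual change; the point is only that a bad node id is rejected as a client error before any shard is contacted.

```java
import java.util.Map;

// Hypothetical sketch: validate the node id before executing the search.
final class OnlyNodePreferenceSketch {
    /**
     * @param nodeId         node id taken from the search preference
     * @param isDataNodeById known node ids mapped to whether they hold data
     */
    static String resolve(String nodeId, Map<String, Boolean> isDataNodeById) {
        Boolean isData = isDataNodeById.get(nodeId);
        if (isData == null) {
            // Client error: the node id does not exist in the cluster.
            throw new IllegalArgumentException("No node found for id [" + nodeId + "]");
        }
        if (!isData) {
            // Client error: the node exists but cannot serve shard-level searches.
            throw new IllegalArgumentException("Node [" + nodeId + "] is not a data node");
        }
        return nodeId;
    }
}
```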
The explain output for function_score queries with score_mode=max or
score_mode=min was incorrect, returning instead the value of the last
function. This change fixes this.
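To make the expected behaviour concrete, here is a small sketch of how the explained value should be derived for these two modes. `ScoreModeSketch` and `combine` are hypothetical names, not the function_score implementation itself.

```java
// Hypothetical sketch: the value an explanation should report for max/min.
final class ScoreModeSketch {
    enum ScoreMode { MAX, MIN }

    static double combine(ScoreMode mode, double[] functionScores) {
        if (functionScores.length == 0) {
            throw new IllegalArgumentException("no function scores");
        }
        double result = functionScores[0];
        for (int i = 1; i < functionScores.length; i++) {
            // Fold over every function, not just the last one.
            result = mode == ScoreMode.MAX
                    ? Math.max(result, functionScores[i])
                    : Math.min(result, functionScores[i]);
        }
        return result;
    }
}
```

For function scores of 1.0, 3.0 and 2.0, the explanation should report 3.0 for max and 1.0 for min, whereas the bug described above would have reported 2.0 in both cases.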
The postings highlighter now uses a searcher that only encapsulates the view of the segment that the document being highlighted is in.
This should be better than using the top-level engine searcher.
Closes #4385
When searching with a query containing query_strings inside a bool query, the specified _name is randomly missing from the results due to caching.
Closes #4361.
Closes #4371.
Instead of processing the whole bulk of mapping updates we have per index/node, we can process only the last ordered one of those (because the order is incremented at the node/index level). This will improve the processing time for an index that has a large number of mapping updates.
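The reduction step can be sketched as follows. `UpdateTask` and `latestPerIndexAndNode` are hypothetical names introduced for illustration; the sketch only assumes that each pending update carries an index name, a node id, and a monotonically increasing order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: keep only the highest-order update per index/node.
final class LatestMappingUpdateSketch {
    static final class UpdateTask {
        final String index;
        final String nodeId;
        final long order;

        UpdateTask(String index, String nodeId, long order) {
            this.index = index;
            this.nodeId = nodeId;
            this.order = order;
        }
    }

    static Map<String, UpdateTask> latestPerIndexAndNode(List<UpdateTask> tasks) {
        Map<String, UpdateTask> latest = new LinkedHashMap<>();
        for (UpdateTask task : tasks) {
            String key = task.index + "/" + task.nodeId;
            UpdateTask seen = latest.get(key);
            // Later orders supersede earlier ones for the same index/node.
            if (seen == null || task.order > seen.order) {
                latest.put(key, task);
            }
        }
        return latest;
    }
}
```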
closes #4373
If a phrase query is wrapped in a filtered query due to type filtering,
slop was not applied correctly. Also, if the default field required a
type filter, the filter was not applied.
Closes #4356
_all boosting used to rely on the fact that the TokenStream doesn't eagerly
consume the input java.io.Reader. This fixes the issue by using binary search
in order to find the right boost given a token's start offset.
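The lookup can be sketched with `Arrays.binarySearch`. This is an illustration only: `AllFieldBoostLookup` is a hypothetical name, and the sketch assumes that each entry contributing to `_all` is recorded with the offset at which it starts in the concatenated text, in strictly ascending order.

```java
import java.util.Arrays;

// Hypothetical sketch: find the boost of the entry a token falls into.
final class AllFieldBoostLookup {
    private final int[] startOffsets; // strictly ascending entry start offsets
    private final float[] boosts;     // boosts[i] applies from startOffsets[i] onward

    AllFieldBoostLookup(int[] startOffsets, float[] boosts) {
        if (startOffsets.length != boosts.length) {
            throw new IllegalArgumentException("offsets and boosts must have the same length");
        }
        this.startOffsets = startOffsets;
        this.boosts = boosts;
    }

    float boostAt(int tokenStartOffset) {
        int idx = Arrays.binarySearch(startOffsets, tokenStartOffset);
        if (idx < 0) {
            // Not an exact hit: binarySearch returns -(insertionPoint) - 1,
            // and the owning entry is the one just before the insertion point.
            idx = -idx - 2;
        }
        if (idx < 0) {
            throw new IllegalArgumentException("offset precedes the first entry: " + tokenStartOffset);
        }
        return boosts[idx];
    }
}
```

Because the boost is derived from the token's offset alone, the result no longer depends on how far ahead the TokenStream has read from the underlying Reader.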
Close #4315
When a node processes an index request that causes it to update its own mapping, it sends that mapping to the master. While the master processes it, that node may receive a cluster state that includes an older version of the mapping. Now there is a conflict. It's not bad (i.e. the cluster state will eventually have the correct mapping), but we send a refresh from that node to the master just in case.
With a system that has extreme cases of updates and frequent mapping changes, it might make sense to disable this feature. The indices.cluster.send_refresh_mapping setting can be introduced to support that (note, this setting needs to be set on the data nodes).
Note, sending refresh mapping is more important when the reverse happens, and for some reason the mapping in the master is ahead of, or in conflict with, the actual parsing of it on the node the index exists on. In this case, the refresh mapping will result in a warning being logged on the master node.
closes #4342
We have the situation that some tests fail since they don't handle
EsRejectedExecutionException which gets thrown when a node shuts
down. It is OK to ignore this exception rather than fail the test.
We also suffer from OOMs that can't create native threads but don't
get thread dumps for those failures. This patch prints the thread
stacks once we catch an OOM which can't create native threads.
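A minimal sketch of such a dump, using only the standard `Thread.getAllStackTraces()` call, might look like this. `ThreadDumpOnOom` and its methods are hypothetical names, and matching on the text "native thread" is an assumption: the exact OutOfMemoryError message differs between JVM versions.

```java
import java.util.Map;

// Hypothetical sketch: detect the native-thread OOM and dump all thread stacks.
final class ThreadDumpOnOom {
    static boolean isNativeThreadOom(Throwable t) {
        // Assumption: the JVM reports this failure with a message that
        // mentions "native thread"; the wording varies across versions.
        return t instanceof OutOfMemoryError
                && t.getMessage() != null
                && t.getMessage().contains("native thread");
    }

    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append("\" state=")
              .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

A test harness could call `dumpAllThreads()` from its failure handler whenever `isNativeThreadOom` returns true, so the thread count and the stacks of leaked threads end up in the log.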
The newly added afterIndexShardPostRecovery method in InternalIndicesLifecycle, which the percolator now uses to trigger the loading of the registered queries, makes sure that a shard doesn't go to the started state before the queries have been loaded.
The percolate API (like other APIs) will retry the execution on a different shard copy if a shard isn't in a started state, preventing empty results if there are registered queries still to be loaded. Percolator tests sometimes failed for this reason.
The Fast Vector Highlighter can combine matches on multiple fields to
highlight a single field using `matched_fields`. This is most
intuitive for multifields that analyze the same string in different
ways. Example:
    {
        "query": {
            "query_string": {
                "query": "content.plain:running scissors",
                "fields": ["content"]
            }
        },
        "highlight": {
            "order": "score",
            "fields": {
                "content": {
                    "matched_fields": ["content", "content.plain"],
                    "type" : "fvh"
                }
            }
        }
    }
Closes #3750
For example, this happens when a has_child is wrapped in a filtered query as the query and the wrapped filter is cached.
The short circuit mechanism in that case counts down based on deleted docs, which then yields fewer results than expected.
Relates to #4306
* Removed the applyAcceptedDocs in ChildrenConstantScoreQuery; accepted docs need to be applied at all times (because of the short circuit mechanism).
* Moved ParentDocSet to FilteredDocIdSetIterator, because it fits better than MatchDocIdSet.
* Made similar changes to ParentConstantScoreQuery for consistency between the two queries. The accepted docs bug didn't occur in the ParentConstantScoreQuery.
* Updated random p/c tests to randomly update parent or child docs during the test run.
Closes #4306
* Force the cat API classes to have simple help available by extending from an AbstractCatAction
* Use a set binder on guice creation to create the help on node start up
* Make sure that the help/field info is returned without querying any data
This happens when reverting the transient transaction log on failure, and when that happens, we might actually have failed on the transient translog creation to begin with.
fixes #4223
The use of freq() instead of sloppyFreq() and the fact that `numMatches` was
not updated in `setFreqCurrentDoc` could lead to an inaccurate score in the
explanation.
Close #4298
Note: we were previously waiting for ack only from the nodes that contain shards for the indices that the mapping update was applied to. This change introduces a wait for ack from all nodes, consistent with other APIs, as the ack relates more to the cluster state itself, which is held by all nodes and needs to be updated on all nodes anyway.
Closes #4228
FilterDirectory has been ported to Lucene in LUCENE-5204, which makes
the class in elasticsearch obsolete. This commit removes the class and
moves the static utils that are not in Lucene to `DirectoryUtils`.
* Hole character can now change with new releases
* Fixed bug where the SEP_LABEL constant was used instead of the sepLabel instance variable
* Replaced if- with switch-statement
When we relocate a shard we might still have pending SearchContext
instances hanging around that will be used in "in-flight" searches
on the already relocated shard. This is a valid operation, but if
we have already closed the underlying directory, which happens
concurrently during cleanup, the close call on the IndexReader can trigger
an AlreadyClosedException when the NRT reader tries to clean up files
via the IndexWriter.
Closes #4273
This method has 2 signatures and one of them is dangerous since it allows to
discard fielddata configuration of the field mapper. This commit changes the
percolator so that it uses fielddata configuration of the _id field mapper
instead of forcing the paged_bytes format.
Closes #4270