The `text` query was replaced by the `match` query and has been
deprecated for quite a while.
The `field` query should be replaced by a `query_string` query with
the `default_field` specified.
Fixes#4033
If the term suggester is used the results are merged depending on
the number of terms produced by the tokenizer / tokenfilter. If a
term suggester is executed across multiple indices that share the
same field but with different analysis chains we can't merge the
result anymore sicne tokens are our of order or have a different size.
This commit throws ESIllegalArgumentException if the number of entries
are not the same across all results.
Closes#3196
This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. On
the next time that Elasticsearch will reload field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.
To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed slower, as segments get
merged).
Close#4430Close#4431
Currently we miss to reset the source shards status to ACTIVE if we cancel
a relocation. If the shard is RELOCATING we need to reset to state ACTIVE.
Closes#4457
Currently the RoutingNodes API allows modification of it's internal state outside of the class.
This commit improves the APIs of `RoutingNode` and `RoutingNode` to change internal state
only within the classes itself.
Closes#4458
Fixes#4334
The deadlock occurs between monitor object of EsThreadPoolExecutor and mainLock of ThreadPoolExecutor. The shutdown method of EsThreadPoolExecutor obtains the lock on monitor first and waits for mainLock of ThreadPoolExecutor in ThreadPoolExecutor#shutdown for part of the processing, while EsThreadPoolExecutor#terminated is executed under mainLock and tries to obtain monitor to notify listeners.
if the MockDirWrapper checks the index on close it closes all
closeables that is has to crash the index if something is not flushed yet.
For us this is a problem since the input is still used. We need to fix this in
lucene first.
When the ValuesSource has ordinals, terms ordinals are used as a cache key to
bucket ordinals. This can make terms aggregations on String terms significantly
faster.
Close#4350
The percolator uses this option to deal with the fact that the MemoryIndex doesn't support stored fields,
this is possible b/c the _source of the document being percolated is always present.
Closes#4348
This adds support for Lucene's SimpleQueryParser by adding a new type
of query called the `simple_query_string`. The `simple_query_string`
query is designed to be able to parse human-entered queries without
throwing any exceptions.
Resolves#4159
It could happen although we internally use IgnoreIndices.MISSING, due to MetaData#concreteIndices contract, which throws IndexMissingException anyway if all requested indices are missing.
In case all the indices specified in the query/filter are missing, we just execute the no_match query/filter, no need to throw any error.
Closes#3428
Increased also default publish state timeout to 30 seconds (from 5 seconds) and introduced constant for it.
Introduced AcknowledgedRequest.DEFAULT_ACK_TIMEOUT constant.
Removed misleading default values coming from the REST layer.
Removed (in a bw compatible manner) the timeout support in put/delete index template as the timeout parameter was ignored.
Closes#4395
Also made the call to PercolatorQueriesRegistry#enableRealTimePercolator and #disableRealTimePercolator synchronized, so that for the same shard the RealTimePercolatorOperationListener can't registered twice.
when we bulk changes, we need to use the same index metadata builder across the tasks, otherwise we might remove mappings erroneously
also, when we check if we can use a higher order mapping, we need to verify that its for the same mapping type
In order to be sure that memory mapped lucene directories are working
one can configure the kernel about how many memory mapped areas
a process may have. This setting ensure for the debian and redhat initscripts
as well as the systemd startup, that this setting is set high enough.
Closes#4397
When sending a request, mainly to multiple nodes, if we already have the "body" of the request in bytes, we can share it instead of copying it over to a new buffer. Also, it helps a lot when sending a relatively large body to multiple nodes, since it will use the same body buffer across all nodes
When the search preference is set to only node, but this node is not a
data (or does not exist), we return a search exception, which indicates,
that this is actually a server problem.
However specifying a non-existing node id is a client problem
and should return a more useful error message than
{"error":"SearchPhaseExecutionException[Failed to execute phase [query_fetch], all shards failed]","status":503}
The explain output for function_score queries with score_mode=max or
score_mode=min was incorrect, returning instead the value of the last
function. This change fixes this.
Currently we sometimes see test failures that fail because not all replicase are
`searchable` which means they are not started yet or still recovering. Yet, the usual
situation is where two nodes have the same clusterstate but the one that acts as
the search target has not yet processed that clusterstate. The requester sees the
shard as started but it's not mark as such on the target node. For now the #ensureSearchable()
just delegates to #ensureYellow() to make sure the cluster is not red. In the future if we have
the possibilty to recover from situations like this in the search logic we can easily test
this by making the impl a no-op. Note: this problem only occurs if you have low number of docs
and the indexing is really quick such that first request are exectued but shards are not
fully `started`
The postings hl now uses a searcher that only encapsulate the view of segment the document being highlighted is in,
this should be better than using the top level engine searcher.
Closes#4385
Adding functionality to call cluster().transportClient() in tests in order
to get an arbitrary TransportClient object back, independently if the
transport client ratio in returning the normal clients is configured.
Also made sure, that if the normal client is already a transport client
(or a node client) we do not generate another one.
When searching with a query containing query_strings inside a bool query, the specified _name is randomly missing from the results due to caching.
Closes#4361.
Closes#4371.
Instead of processing all the bulk of update mappings we have per index/node, we can only update the last ordered one out of those (cause they are incremented on the node/index level). This will improve the processing time of an index that have large updates of mappings.
closes#4373
If a phrase query is wrapped in a filtered query due to type filtering
slop was not applied correctly. Also if the default field required a
type filter the filter was not applied.
Closes#4356
_all boosting used to rely on the fact that the TokenStream doesn't eagerly
consume the input java.io.Reader. This fixes the issue by using binary search
in order to find the right boost given a token's start offset.
Close#4315
When a node processed an index request, which caused it to update its own mapping, then it sends that mapping to the master. While the master process it, that node receives a state that includes an older version of the mapping. Now, there is a conflict, its not bad (i.e. the cluster state will eventually have the correct mapping), but we send for a refresh just in case form that node to the master.
With a system that has extreme cases of updates and frequent mapping changes, it might make sense to disable this feature. The indices.cluster.send_refresh_mapping setting can be introduced to support that (note, this setting need to be set on the data nodes)
Note, sending refresh mapping is more important when the reverse happens, and for some reason, the mapping in the master is ahead, or in conflict, with the actual parsing of it in the actual node the index exists on. In this case, the refresh mapping will result in warning being logged on the master node.
closes#4342
We have the situation that some tests fail since they don't handle
EsRejectedExecutionException which gets thrown when a node shuts
down. That is ok to ignore this exception and not fail.
We also suffer from OOMs that can't create native threads but don't
get threaddumps for those failures. This patch prints the thread
stacks once we catch a OOM which can' create native threads.