This change adds an option called `split_on_whitespace` which prevents the query parser to split free text part on whitespace prior to analysis. Instead the queryparser would parse around only real 'operators'. Default to true.
For instance the query `"foo bar"` would let the analyzer of the targeted field decide how the tokens should be splitted.
Some options are missing in this change but I'd like to add them in a follow up PR in order to be able to simplify the backport in 5.x. The missing options (changes) are:
* A `type` option which similarly to the `multi_match` query defines how the free text should be parsed when multi fields are defined.
* Simple range query with additional tokens like ">100 50" are broken when `split_on_whitespace` is set to false. It should be possible to preserve this syntax and make the parser aware of this special syntax even when `split_on_whitespace` is set to false.
* Since all this options would make the `query_string_query` very similar to a match (multi_match) query we should be able to share the code that produce the final Lucene query.
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
This commit introduces a single-shard balance step for deciding on
rebalancing a single shard (without taking any other shards in the
cluster into account). This method will be used by the cluster
allocation explain API to explain in detail the decision process for
finding a more optimal location for a started shard, if one exists.
* Add docs with up to date instructions on updating default similarity
The default similarity can no longer be set in the configuration file
(you will get an error on startup). Update the docs with the method
that works.
* Add instructions for changing similarity on index creation
Today when installing Elasticsearch from an archive distribution (tar.gz
or zip), an empty plugins folder is not included. This means that if you
install Elasticsearch and immediately run elasticsearch-plugin list, you
will receive an error message about the plugins directory missing. While
the plugins directory would be created when starting Elasticsearch for
the first time, it would be better to just include an empty plugins
directory in the archive distributions. This commit makes this the
case. Note that the package distributions already include an empty
plugins folder.
Relates #21204
This doesn't make much sense to have at all, since a user can do a `GET`
request without a version of they want to get it unconditionally.
Relates to #20995
It was 10mb and that was causing trouble when folks reindex-from-remoted
with large documents.
We also improve the error reporting so it tells folks to use a smaller
batch size if they hit a buffer size exception. Finally, adds some docs
to reindex-from-remote mentioning the buffer and giving an example of
lowering the size.
Closes#21185
There may be cases where the XContentBuilder is not used and therefore it never gets
closed, which can cause a leak of bytes. This change moves the creation of the builder
into a try with resources block and adds an assertion to verify that we always consume
the bytes in our code; the try-with resources provides protections against memory leaks
caused by plugins, which do not test this.
Previously Elasticsearch would only use the package name for logging
levels, truncating the package prefix and the class name. This meant
that logger names for Netty were just prefixed by netty3 and netty. We
changed this for Elasticsearch so that it's the fully-qualified class
name now, but never corrected this for Netty. This commit fixes the
logger names for the Netty modules so that their levels are controlled
by the fully-qualified class name.
Relates #21223
Refactored ScriptType to clean up some of the variable and method names. Added more documentation. Deprecated the 'in' ParseField in favor of 'stored' to match the indexed scripts being replaced by stored scripts.
We throw this exception in some cases that the shard is closed, so we have to be consistent here. Otherwise we get logs like:
```
1> [2016-10-30T21:06:22,529][WARN ][o.e.i.IndexService ] [node_s_0] [test] failed to run task refresh - suppressing re-occurring exceptions unless the exception changes
1> org.elasticsearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] operation only allowed when not closed
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:1147) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.verifyNotClosed(IndexShard.java:1141) ~[main/:?]
```
Since Lucene 6.2. the UkrainianMorfologikAnalyzer is available through the
lucene-analyzers-morfologik jar. This change exposes it to be used as an
elasticsearch plugin.
Setting `discovery.initial_state_timeout: 0s` to make `discovery.zen.minimum_master_nodes: N`
work reliably can cause issues in clusters that rely on state recovery once the cluster is available.
This change makes the use or `discovery.zen.minimum_master_nodes` optional for clusters where this behavior is desirable.
When installing the Windows service, certain settings like the minimum
heap, maximum heap and thread stack size setting must be set. While
there is an error message making mention of this fact, the error message
is not explicit exactly what setting needs to be set. This commit makes
these settings explicit.
Relates #21200
Before publishing a cluster state the master connects to the nodes that are added in the cluster state. When publishing fails, however, it does not disconnect from these nodes, leaving NodeConnectionsService out of sync with the currently applied cluster state.
The assertion assertMaster checks if all nodes have each other in the cluster state and the correct master set.
It is usually called after a disruption has been healed and ensureStableCluster been called. In presence of a low
publish timeout of 1s in this test class, publishing might not be fully done even after ensureStableCluster returns.
This commit adds an assertBusy to assertMaster so that the node has a bit more time to apply the cluster state from
the master, even if it's a bit slow.
Replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append only use case.
This commit fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates.
This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/
Relates #20211
Java 9's exception message when lists have an out of bounds index
is much better than java 8 but the painless code asserted on the
java 8 message. Now it'll accept either.
I'm tempted to weaken the assertion but I like asserting that the
message is readable.
Adds support for indexing into lists and arrays with negative
indexes meaning "counting from the back". So for if
`x = ["cat", "dog", "chicken"]` then `x[-1] == "chicken"`.
This adds an extra branch to every array and list access but
some performance testing makes it look like the branch predictor
successfully predicts the branch every time so there isn't a
in execution time for this feature when the index is positive.
When the index is negative performance testing showed the runtime
is the same as writing `x[x.length - 1]`, again, presumably thanks
to the branch predictor.
Those performance metrics were calculated for lists and arrays but
`def`s get roughly the same treatment though instead of inlining
the test they need to make a invoke dynamic so we don't screw up
maps.
Closes#20870