Multiple nodes are now started when running REST tests against the `TestCluster` (default randomized settings are now used instead of the hardcoded `1`)
Added also randomized round-robin based on all available nodes, and ability to provide multiple addresses when running tests against an external cluster to have the same behaviour
When an analysis plugins provides default index settings using `PreBuiltAnalyzerProviderFactory`, `PreBuiltTokenFilterFactoryFactory` or `PreBuiltTokenizerFactoryFactory` it fails when upgrading it with elasticsearch superior or equal to 0.90.5.
Related issue: #4936
Fix is needed in core. But, in the meantime, analysis plugins developers can fix that issue by overloading default prebuilt factories.
For example:
```java
public class StempelAnalyzerProviderFactory extends PreBuiltAnalyzerProviderFactory {
private final PreBuiltAnalyzerProvider analyzerProvider;
public StempelAnalyzerProviderFactory(String name, AnalyzerScope scope, Analyzer analyzer) {
super(name, scope, analyzer);
analyzerProvider = new PreBuiltAnalyzerProvider(name, scope, analyzer);
}
@Override
public AnalyzerProvider create(String name, Settings settings) {
return analyzerProvider;
}
public Analyzer analyzer() {
return analyzerProvider.get();
}
}
```
And instead of:
```java
@Inject
public PolishIndicesAnalysis(Settings settings, IndicesAnalysisService indicesAnalysisService) {
super(settings);
indicesAnalysisService.analyzerProviderFactories().put("polish", new PreBuiltAnalyzerProviderFactory("polish", AnalyzerScope.INDICES, new PolishAnalyzer(Lucene.ANALYZER_VERSION)));
}
```
do
```java
@Inject
public PolishIndicesAnalysis(Settings settings, IndicesAnalysisService indicesAnalysisService) {
super(settings);
indicesAnalysisService.analyzerProviderFactories().put("polish", new StempelAnalyzerProviderFactory("polish", AnalyzerScope.INDICES, new PolishAnalyzer(Lucene.ANALYZER_VERSION)));
}
```
Closes#5030
The byte[] array that was used to store the term was owned by the BytesRefHash
which is used to compute counts. However, the BytesRefHash is released at some
point and its content may be recycled.
MockPageCacheRecycler has been improved to expose this issue (putting random
content into the arrays upon release).
Number of documents/terms have been increased in RandomTests to make sure page
recycling occurs.
Close#5021
`cross_fields` attemps to treat fields with the same analysis
configuration as a single field and uses maximum score promotion or
combination of the scores based depending on the `use_dis_max` setting.
By default scores are combined. `cross_fields` can also search across
fields of hetrogenous types for instance if numbers can be part of
the query it makes sense to search also on numeric fields if an analyzer
is provided in the reqeust.
Relates to #2959
This commit removes FilterBytesValues which is very trappy as the default
implementation forwards all method calls to the delegate. So if you do any
non-trivial modification to the terms or to the order of the terms, you need
to remember to override currentValueHash, copyShared, and this is very
error-prone.
FieldDataSource.WithScript.BytesValues and ScriptBytesValues now return correct
hash codes, future bugs here would be catched by the new assertion in
SortedUniqueBytesValues.
This bug was causing performance issues with scripts as all terms were assumed
to have the same hash code.
Close#5004
* Mostly minor things like typos and grammar stuff
* Some clarifications
* The note on the deprecation was ambiguous. I've removed the problematic part so that it now definitely says it's deprecated