AndDocIdSet#IteratorBasedIterator was potentially initialized with
NO_MORE_DOCS which violates the initial state of DocIdSetIterator and
could lead to undefined behavior when used in a search context.
Closes#5049
REST tests get run against either 1 node or multiple nodes. Wait for yellow with replicas>0 is not enough when running against multiple nodes as replicas shard might get initialized during testing, which can cause timing issues.
Replaced also wait for yellow with wait for green when using no replicas.
get_source/60_realtime_refresh tests per shard refresh using refresh:true and realtime:true in get api. We might run into troubles though if we have a replica that gets initialized after a doc was indexed without a refresh, as that doc will be found when searching against that specific replica shard (as a refresh happens automatically before a replica gets exposed as started).
delete/50_refresh tests per shard refresh using refresh:true in delete api. We might run into troubles though if we have a replica that gets initialized after a doc was indexed and deleted, without a refresh, as that doc won't be found when searching against that specific replica shard (as a refresh happens automatically before a replica gets exposed as started).
Multiple nodes are now started when running REST tests against the `TestCluster` (default randomized settings are now used instead of the hardcoded `1`)
Added also randomized round-robin based on all available nodes, and ability to provide multiple addresses when running tests against an external cluster to have the same behaviour
When an analysis plugins provides default index settings using `PreBuiltAnalyzerProviderFactory`, `PreBuiltTokenFilterFactoryFactory` or `PreBuiltTokenizerFactoryFactory` it fails when upgrading it with elasticsearch superior or equal to 0.90.5.
Related issue: #4936
Fix is needed in core. But, in the meantime, analysis plugins developers can fix that issue by overloading default prebuilt factories.
For example:
```java
public class StempelAnalyzerProviderFactory extends PreBuiltAnalyzerProviderFactory {
private final PreBuiltAnalyzerProvider analyzerProvider;
public StempelAnalyzerProviderFactory(String name, AnalyzerScope scope, Analyzer analyzer) {
super(name, scope, analyzer);
analyzerProvider = new PreBuiltAnalyzerProvider(name, scope, analyzer);
}
@Override
public AnalyzerProvider create(String name, Settings settings) {
return analyzerProvider;
}
public Analyzer analyzer() {
return analyzerProvider.get();
}
}
```
And instead of:
```java
@Inject
public PolishIndicesAnalysis(Settings settings, IndicesAnalysisService indicesAnalysisService) {
super(settings);
indicesAnalysisService.analyzerProviderFactories().put("polish", new PreBuiltAnalyzerProviderFactory("polish", AnalyzerScope.INDICES, new PolishAnalyzer(Lucene.ANALYZER_VERSION)));
}
```
do
```java
@Inject
public PolishIndicesAnalysis(Settings settings, IndicesAnalysisService indicesAnalysisService) {
super(settings);
indicesAnalysisService.analyzerProviderFactories().put("polish", new StempelAnalyzerProviderFactory("polish", AnalyzerScope.INDICES, new PolishAnalyzer(Lucene.ANALYZER_VERSION)));
}
```
Closes#5030
The byte[] array that was used to store the term was owned by the BytesRefHash
which is used to compute counts. However, the BytesRefHash is released at some
point and its content may be recycled.
MockPageCacheRecycler has been improved to expose this issue (putting random
content into the arrays upon release).
Number of documents/terms have been increased in RandomTests to make sure page
recycling occurs.
Close#5021
`cross_fields` attemps to treat fields with the same analysis
configuration as a single field and uses maximum score promotion or
combination of the scores based depending on the `use_dis_max` setting.
By default scores are combined. `cross_fields` can also search across
fields of hetrogenous types for instance if numbers can be part of
the query it makes sense to search also on numeric fields if an analyzer
is provided in the reqeust.
Relates to #2959