At the moment plain highligher only uses an analyzer defined for on the type
level. However, during the indexing stage it is possible to define analyzer on
per document level, for example mapping '_analyzer' to another field, containing
required name. This commit attempts to make sure that highlighting works
correctly in this scenario.
Closes#5497
Because the NetworkExceptionHelper class relies on the english language in
order to extract information and decide whether a certain exception is a
network problem, we need to set the english locale on startup in order
to prevent other locales to circumvent this check.
Guava's caches have overflow issues around 32GB with our default segment
count of 16 and weight of 1 unit per byte. We give them 100MB of headroom
so 31.9GB.
This limits the sizes of both the field data and filter caches, the two
large guava caches.
Closes#6268
Because json objects are unordered this also adds an explicit order syntax
that looks like
"highlight": {
"fields": [
{"title":{ /*params*/ }},
{"text":{ /*params*/ }}
]
}
This is not useful for any of the builtin highlighters but will be useful
in plugins.
Closes#4649
When multiple date formats are specified using the || syntax in the field mappings the date_histogram aggregation breaks. This is because we are getting a parser rather than a printer from the date formatter for the object we use to convert the DateTime values back into Strings. Simple fix to get the printer from the date format and test to back it up
Closes#6239
Assertion was triggered for percolating documents with nested object
in mapping if the document did not actually contain a nested object.
Reason:
MultiDocumentPercolatorIndex checks if the number of documents is
actualu >1. Instead we can just use the SingleDocumentPercolatorIndex
in this case.
closes#6263
The `FieldDataWeighter` allowed to use a concrete subclass of the caches
generic type to be used that causes ClassCastException and also trips the
CirciutBreaker to not be decremented appropriately.
This was tripped by settings randomization also part of this commit.
Closes#6260
Today we check if the DocIdSet we filter by is `fast` but the check fails
if the DocIdSet if wrapped in an `ApplyAcceptedDocsFilter` which is always
the case if the index has deleted documents. This commit unwraps
the original DocIdSet in the case of deleted documents.
Closes#6247
AllTokenStream, used to index the _all field, adds some overhead, but
it's not necessary when no fields were boosted or when positions are
not indexed the _all field.
Closes#6187Closes#6219