OpenSearch/server
Jim Ferenczi 691421f287 Fix invalid break iterator highlighting on keyword field (#49566)
By default the unified highlighter splits the input into passages using
a sentence break iterator. However we don't check if the field is tokenized
or not so `keyword` field also applies the break iterator even though they can
only match on the entire content. This means that by default we'll split the
content of a `keyword` field on sentence break if the requested number of fragments
is set to a value different than 0 (default to 5). This commit changes this behavior
to ignore the break iterator on non-tokenized fields (keyword) in order to always
highlight the entire values. The number of requested fragments control the number of
matched values are returned but the boundary_scanner_type is now ignored.
Note that this is the behavior in 6x but some refactoring of the Lucene's highlighter
exposed this bug in Elasticsearch 7x.
2019-12-04 11:14:44 +01:00
..
licenses Upgrade lucene to 8.4.0-snapshot-e648d601efb (#49641) 2019-11-28 11:59:58 -05:00
src Fix invalid break iterator highlighting on keyword field (#49566) 2019-12-04 11:14:44 +01:00
build.gradle Apply 2-space indent to all gradle scripts (#49071) 2019-11-14 11:01:23 +00:00