Relying on deleted documents when loading field data is dangerous because a
field data instance might be loaded for a given generation of a segment and
then loaded from the cache by an older generation of the same segment which
has fewer deleted documents. This could, for example, lead to under-estimated
facet counts. The same issue applies to the ID cache and filter caches.
Close#3224
The `parent` option was ignored in the delete api (rest only) and for delete actions in the bulk api.
This bug occurred in the case that the _parent field is enabled, and only the parent option was used. This resulted in a situation that documents are deleted even if the specified parent value is incorrect.
Closes#3257
The current implementation of parsing suggestions executed inside of the
the pull parser - which resulted in being reliable of the order of the
elements in the request. This fix changes the behaviour to parse the
relevant parts of the request first and then execute all the suggestions
afterwards, so we can be sure that every information has been extracted
from the request before execution.
Closes#3247
Even though proposed in the documentation, the realtime enabling/disabling of
index warmers was not supported. This commit adds support for
index.warmer.enabled as a dynamic setting.
Closes#3246
This commit merges field data implementations for byte, short, int and long
data into PackedArrayAtomicFieldData which uses Lucene's PackedInts API to
store data.
Close#3220
IndexUpgraderMergePolicy assumed that field numbers were dense and that
fieldInfos.size() was a free field number. This can however be wrong for a
segment which doesn't have one or more fields that some older segments have.
Close#3237
This enforces that settings are taken into account whichever mean is used to
import the project into Eclipse (manual import, m2e, mvn eclipse:eclipse, ...).
NumericTokenizer is a simple wrapper aroung a NumericTokenStream. However, its
implementations had a few issues: its reset() method was not idempotent,
causing exceptions if reset() was called twice (causing #3211) and it had no
attributes, meaning that the only thing it allowed to do is counting the number
of generated tokens. The reason why indexing numeric data worked is that
the mapper's parseCreateField directly generates a NumericTokenStream and
by-passes the analyzer.
This commit makes NumericTokenizer.reset idempotent and makes consuming a
NumericTokenizer behave the same way as consuming the underlying
NumericTokenStream.
##############
Previous versions of the GeoPointFieldMapper just stored the actual geohash
of a point. This commit changes the behavior of storing geohashes by storing
the geohash and all its prefixes in decreasing order in the same field. To
enable this functionality the option geohash_prefix must be set in the mapping.
This behavior allows to filter GeoPoints by their geohashes. Basically a
geohash prefix is defined by the filter and all geohashes that match this
prefix will be returned. The neighbors flag allows to filter geohashes
that surround the given geohash cell. In general the neighborhood of a
geohash is defined by its eight adjacent cells.
To enable this, the type of filtered fields must be geo_point with geohashes
and geohash_prefix enabled.
For example:
curl -XPUT 'http://127.0.0.1:9200/locations/?pretty=true' -d '{
"mappings" : {
"location": {
"properties": {
"pin": {
"type": "geo_point",
"geohash": true,
"geohash_prefix": true
}
}
}
}
}'
This example defines a mapping for a type location in an index locations
with a field pin. The option geohash arranges storing the geohash of
the pin field.
To filter the results by the geohash a geohash_cell needs to be defined.
For example
curl -XGET 'http://127.0.0.1:9200/locations/_search?pretty=true' -d '{
"query": {
"match_all":{}
},
"filter": {
"geohash_cell": {
"field": "pin",
"geohash": "u30",
"neighbors": true
}
}
}'
This filter will match all geohashes that start with one of the following
prefixes: u30, u1r, u32, u33, u1p, u31, u0z, u2b and u2c.
Internally the GeoHashFilter is either a simple TermFilter, in case no
neighbors should be filtered or a BooleanFilter combining the TermFilters
of the geohash and all its neighbors.
Closes#2778
Lucene 4.4 will feature new n-gram tokenizers and filters that should not
generate broken offsets (that cause highlighting bugs) anymore. They also
correctly handle supplementary characters and the tokenizers can work in a
streaming fashion (they are not limited to the first 1024 chars of the
stream anymore).
Suggesters might need access to the shard they run on as well as the
index they operate on. This patch adds indexname and shard ID to the
SuggestionContext
Closes#3199