This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.
Closes#43599
The `string` type (with option `analyzed`) has been replaced by `text` after `6.0`,
also the `annonated_text` field do not support doc values and should be mentioned.
This PR makes the following two fixes around updating flattened fields:
* Make sure that the new value for ignore_above is immediately taken into
affect. Previously we recorded the new value but did not use it when parsing
documents.
* Allow depth_limit to be updated dynamically. It seems plausible that a user
might want to tweak this setting as they encounter more data.
This PR makes the following updates:
* Update the supported query types to include `prefix` and `wildcard`.
* Specify that queries accept index aliases.
* Clarify that when querying on a remote index name, the separator `:` must be
present.
We have not seen much adoption of this experimental field type, and don't see a
clear use case as it's currently designed. This PR deprecates the field type in
7.x. It will be removed from 8.0 in a follow-up PR.
The `ignore_malformed` setting only works on selected mapping types, otherwise
we throw an mapper_parsing_exception. We should add a list of all the mapping
types that support it, since the number of types not supporting it seems larger.
Closes#47166
Although they do not support eager_global_ordinals, ip fields use global
ordinals for certain aggregations like 'terms'.
This commit also corrects a reference to the sampler aggregation.
Currently we allow `_field_names` fields to be disabled explicitely, but since
the overhead is negligible now we decided to keep it turned on by default and
deprecate the `enable` option on the field type. This change adds a deprecation
warning whenever this setting is used, going forward we want to ignore and finally
remove it.
Closes#27239
This commit updates the eager_global_ordinals documentation to give more
background on what global ordinals are and when they are used. The docs also now
mention that global ordinal loading may be expensive, and describes the cases
where in which loading them can be avoided.
This PR merges the `vectors-optimize-brute-force` feature branch, which makes
the following changes to how vector functions are computed:
* Precompute the L2 norm of each vector at indexing time. (#45390)
* Switch to ByteBuffer for vector encoding. (#45936)
* Decode vectors and while computing the vector function. (#46103)
* Use an array instead of a List for the query vector. (#46155)
* Precompute the normalized query vector when using cosine similarity. (#46190)
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
* Introduce Spatial Plugin (#44389)
Introduce a skeleton Spatial plugin that holds new licensed features coming to
Geo/Spatial land!
* [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923)
Refactor DeprecatedParameters specific to legacy geo_shape out of
AbstractGeometryFieldMapper.TypeParser#parse.
* [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980)
Add a new ShapeFieldMapper to the xpack spatial module for
indexing arbitrary cartesian geometries using a new field type called shape.
The indexing approach leverages lucene's new XYShape field type which is
backed by BKD in the same manner as LatLonShape but without the WGS84
latitude longitude restrictions. The new field mapper builds on and
extends the refactoring effort in AbstractGeometryFieldMapper and accepts
shapes in either GeoJSON or WKT format (both of which support non geospatial
geometries).
Tests are provided in the ShapeFieldMapperTest class in the same manner
as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests.
Documentation for how to use the new field type and what parameters are
accepted is included. The QueryBuilder for searching indexed shapes is
provided in a separate commit.
* [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108)
Add a new ShapeQueryBuilder to the xpack spatial module for
querying arbitrary Cartesian geometries indexed using the new shape field
type.
The query builder extends AbstractGeometryQueryBuilder and leverages the
ShapeQueryProcessor added in the previous field mapper commit.
Tests are provided in ShapeQueryTests in the same manner as
GeoShapeQueryTests and docs are updated to explain how the query works.
Previously, the reindex examples did not include `_doc` as the destination type.
This would result in the reindex failing with the error "Rejecting mapping
update to [users] as the final mapping would have more than 1 type: [_doc,
user]".
Relates to #43100.
Some small clarifications about force-merging and global ordinals, particularly
that global ordinals are cheap on a single-segment index and how this relates
to frozen indices.
Fixes#41687
Typically, dense vectors of both documents and queries must have the same
number of dimensions. Different number of dimensions among documents
or query vector indicate an error. This PR enforces that all vectors
for the same field have the same number of dimensions. It also enforces
that query vectors have the same number of dimensions.
A few places in the documentation had mentioned 6.7 as the version to
upgrade from, when doing an upgrade to 7.0. While this is technically
possible, this commit will replace all those mentions to 6.8, as this is
the latest version with the latest bugfixes, deprecation checks and
ugprade assistant features - which should be the one used for upgrades.
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
This commit merges the `object-fields` feature branch. The new 'flattened
object' field type allows an entire JSON object to be indexed into a field, and
provides limited search functionality over the field's contents.
It is possible for internal ML indices like `.data-frame-notifications-1` to leak,
causing other docs tests to fail when they accidentally search over these
indices. This PR updates the ignore_above tests to only search a specific index.
Together with types removal, any mention of "fields with the same name in the same index" doesn't make sense anymore.
(cherry picked from commit c5190106cbd4c007945156249cce462956933326)
This PR updates the docs for `docvalue_fields` and `stored_fields` to clarify
that nested fields must be accessed through `inner_hits`. It also tweaks the
nested fields documentation to make this point more visible.
Addresses #23766.
* Previously, we mentioned multiple times that each nested object was indexed as its own document. This is repetitive, and is also a bit confusing in the context of `index.mapping.nested_fields.limit`, as that applies to the number of distinct `nested` types in the mappings, not the number of nested objects. We now just describe the issue once at the beginning of the section, to illustrate why `nested` types can be expensive.
* Reference the ongoing example to clarify the meaning of the two settings.
Addresses #28363.
The `path_match` and `path_unmatch` parameters in dynamic templates match on
object fields in addition to leaf fields. This is not obvious and can cause
surprising errors when a template is meant for a leaf field, but there are
object fields that match. This PR adds a note to the docs to describe the
current behavior.
We received some feedback that it is not completely clear why `_doc` is present
in the typeless document APIs:
> The new index APIs are PUT {index}/_doc/{id} in case of explicit ids and POST
{index}/_doc for auto-generated ids."_ Isn't this contradicting? Specifying
*types in requests is deprecated*, but we are supposed to still mention *_doc*
in write requests?
This PR updates the 'removal of types' documentation to try to clarify that
`_doc` now represents the endpoint name, as opposed to a type.