OpenSearch/docs/reference/mapping.asciidoc

175 lines
6.1 KiB
Plaintext
Raw Normal View History

[[mapping]]
= Mapping
[partintro]
--
Mapping is the process of defining how a document, and the fields it contains,
are stored and indexed. For instance, use mappings to define:
* which string fields should be treated as full text fields.
* which fields contain numbers, dates, or geolocations.
* the <<mapping-date-format,format>> of date values.
* custom rules to control the mapping for
<<dynamic-mapping,dynamically added fields>>.
[float]
[[mapping-type]]
== Mapping Type
Each index has one _mapping type_ which determines how the document will be
indexed.
deprecated[6.0.0,See <<removal-of-types>>].
A mapping type has:
<<mapping-fields,Meta-fields>>::
Meta-fields are used to customize how a document's metadata associated is
treated. Examples of meta-fields include the document's
<<mapping-index-field,`_index`>>, <<mapping-type-field,`_type`>>,
<<mapping-id-field,`_id`>>, and <<mapping-source-field,`_source`>> fields.
<<mapping-types,Fields>> or _properties_::
A mapping type contains a list of fields or `properties` pertinent to the
document.
[float]
== Field datatypes
Each field has a data `type` which can be:
2016-03-18 12:01:27 -04:00
* a simple type like <<text,`text`>>, <<keyword,`keyword`>>, <<date,`date`>>, <<number,`long`>>,
<<number,`double`>>, <<boolean,`boolean`>> or <<ip,`ip`>>.
* a type which supports the hierarchical nature of JSON such as
<<object,`object`>> or <<nested,`nested`>>.
* or a specialised type like <<geo-point,`geo_point`>>,
<<geo-shape,`geo_shape`>>, or <<search-suggesters-completion,`completion`>>.
2015-08-06 14:09:42 -04:00
It is often useful to index the same field in different ways for different
purposes. For instance, a `string` field could be <<mapping-index,indexed>> as
2016-03-18 12:01:27 -04:00
a `text` field for full-text search, and as a `keyword` field for
2015-08-06 14:09:42 -04:00
sorting or aggregations. Alternatively, you could index a string field with
the <<analysis-standard-analyzer,`standard` analyzer>>, the
<<english-analyzer,`english`>> analyzer, and the
<<french-analyzer,`french` analyzer>>.
This is the purpose of _multi-fields_. Most datatypes support multi-fields
via the <<multi-fields>> parameter.
[[mapping-limit-settings]]
[float]
=== Settings to prevent mappings explosion
Defining too many fields in an index is a condition that can lead to a
mapping explosion, which can cause out of memory errors and difficult
situations to recover from. This problem may be more common than expected.
As an example, consider a situation in which every new document inserted
introduces new fields. This is quite common with dynamic mappings.
Every time a document contains new fields, those will end up in the index's
mappings. This isn't worrying for a small amount of data, but it can become a
problem as the mapping grows.
The following settings allow you to limit the number of field mappings that
can be created manually or dynamically, in order to prevent bad documents from
causing a mapping explosion:
`index.mapping.total_fields.limit`::
The maximum number of fields in an index. Field and object mappings, as well as
field aliases count towards this limit. The default value is `1000`.
`index.mapping.depth.limit`::
The maximum depth for a field, which is measured as the number of inner
objects. For instance, if all fields are defined at the root object level,
then the depth is `1`. If there is one object mapping, then the depth is
`2`, etc. The default is `20`.
`index.mapping.nested_fields.limit`::
The maximum number of `nested` fields in an index, defaults to `50`.
Indexing 1 document with 100 nested fields actually indexes 101 documents
as each nested document is indexed as a separate hidden document.
`index.mapping.nested_objects.limit`::
The maximum number of `nested` json objects within a single document across
all nested fields, defaults to 10000. Indexing one document with an array of
100 objects within a nested field, will actually create 101 documents, as
each nested object will be indexed as a separate hidden document.
[float]
== Dynamic mapping
Fields and mapping types do not need to be defined before being used. Thanks
to _dynamic mapping_, new field names will be added automatically, just by
indexing a document. New fields can be added both to the top-level mapping
type, and to inner <<object,`object`>> and <<nested,`nested`>> fields.
The <<dynamic-mapping,dynamic mapping>> rules can be configured to customise
the mapping that is used for new fields.
[float]
== Explicit mappings
You know more about your data than Elasticsearch can guess, so while dynamic
mapping can be useful to get started, at some point you will want to specify
your own explicit mappings.
You can create field mappings when you
<<indices-create-index,create an index>>, and you can add
fields to an existing index with the <<indices-put-mapping,PUT mapping API>>.
[float]
== Updating existing field mappings
Other than where documented, *existing field mappings cannot be
updated*. Changing the mapping would mean invalidating already indexed
documents. Instead, you should create a new index with the correct mappings
and <<docs-reindex,reindex>> your data into that index. If you only wish
to rename a field and not change its mappings, it may make sense to introduce
an <<alias, `alias`>> field.
[float]
== Example mapping
A mapping could be specified when creating an index, as follows:
[source,js]
---------------------------------------
Update the default for include_type_name to false. (#37285) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same appraoch as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.
2019-01-14 16:08:01 -05:00
PUT my_index?include_type_name=true <1>
{
"mappings": {
"_doc": { <2>
"properties": { <3>
"title": { "type": "text" }, <4>
"name": { "type": "text" }, <4>
"age": { "type": "integer" }, <4>
"created": {
"type": "date", <4>
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
---------------------------------------
// CONSOLE
<1> Create an index called `my_index`.
<2> Add a mapping type called `doc`.
<3> Specify fields or _properties_.
<4> Specify the data `type` and mapping for each field.
--
include::mapping/removal_of_types.asciidoc[]
include::mapping/types.asciidoc[]
include::mapping/fields.asciidoc[]
include::mapping/params.asciidoc[]
include::mapping/dynamic-mapping.asciidoc[]