[[mapping-all-field]]
=== `_all` field

The `_all` field is a special _catch-all_ field which concatenates the values
of all of the other fields into one big string, which is then
<<analysis,analyzed>> and indexed, but not stored. This means that it can be
searched, but not retrieved.

The `_all` field allows you to search for values in documents without knowing
which field contains the value. This makes it a useful option when getting
started with a new dataset. For instance:

[source,js]
--------------------------------
PUT my_index/user/1 <1>
{
  "first_name":    "John",
  "last_name":     "Smith",
  "date_of_birth": "1970-10-24"
}

GET my_index/_search
{
  "query": {
    "match": {
      "_all": "john smith 1970"
    }
  }
}
--------------------------------
// AUTOSENSE
<1> The `_all` field will contain the terms: [ `"john"`, `"smith"`, `"1970"`, `"10"`, `"24"` ]

[NOTE]
.All values treated as strings
=============================================================================

The `date_of_birth` field in the above example is recognised as a `date` field
and so will index a single term representing `1970-10-24 00:00:00 UTC`. The
`_all` field, however, treats all values as strings, so the date value is
indexed as the three string terms: `"1970"`, `"10"`, `"24"`.

It is important to note that the `_all` field combines the original values
from each field as a string. It does not combine the _terms_ from each field.

=============================================================================

The `_all` field is just a <<string,`string`>> field, and accepts the same
parameters that other string fields accept, including `analyzer`,
`term_vector`, `index_options`, and `store`.

The `_all` field can be useful, especially when exploring new data using
simple filtering. However, by concatenating field values into one big string,
the `_all` field loses the distinction between short fields (more relevant)
and long fields (less relevant). For use cases where search relevance is
important, it is better to query individual fields specifically.

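As a rough sketch of what querying individual fields can look like, the
following `multi_match` query searches the `first_name` and `last_name` fields
from the first example directly instead of going through `_all`. The choice of
fields and query text is illustrative only:

[source,js]
--------------------------------
GET my_index/_search
{
  "query": {
    "multi_match": {
      "query":  "john smith",
      "fields": [ "first_name", "last_name" ]
    }
  }
}
--------------------------------
// AUTOSENSE
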
The `_all` field is not free: it requires extra CPU cycles and uses more disk
space. If not needed, it can be completely <<disabling-all-field,disabled>> or
customised on a <<include-in-all,per-field basis>>.

[[querying-all-field]]
==== Using the `_all` field in queries

The <<query-dsl-query-string-query,`query_string`>> and
<<query-dsl-simple-query-string-query,`simple_query_string`>> queries query
the `_all` field by default, unless another field is specified:

[source,js]
--------------------------------
GET _search
{
  "query": {
    "query_string": {
      "query": "john smith 1970"
    }
  }
}
--------------------------------
// AUTOSENSE

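For contrast, here is a minimal sketch of the "another field is specified"
case. The `default_field` parameter points the whole `query_string` query at
one field, and the `field:value` syntax targets a field for a single clause.
The field names are taken from the first example and the query text is
illustrative:

[source,js]
--------------------------------
GET my_index/_search
{
  "query": {
    "query_string": {
      "default_field": "last_name",
      "query":         "smith AND first_name:john"
    }
  }
}
--------------------------------
// AUTOSENSE
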
The same goes for the `?q=` parameter in <<search-uri-request, URI search
requests>> (which is rewritten to a `query_string` query internally):

[source,js]
--------------------------------
GET _search?q=john+smith+1970
--------------------------------

Other queries, such as the <<query-dsl-match-query,`match`>> and
<<query-dsl-term-query,`term`>> queries require you to specify
the `_all` field explicitly, as per the
<<mapping-all-field,first example>>.

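The first example already covers the `match` query. A sketch of the `term`
equivalent is given here. Because a `term` query is not analyzed, the value has
to match an indexed term exactly. Assuming the default analyzer is used for
`_all`, that means the lowercased form `smith`:

[source,js]
--------------------------------
GET my_index/_search
{
  "query": {
    "term": {
      "_all": "smith"
    }
  }
}
--------------------------------
// AUTOSENSE
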
[[disabling-all-field]]
==== Disabling the `_all` field

The `_all` field can be completely disabled per-type by setting `enabled` to
`false`:

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "type_1": { <1>
      "properties": {...}
    },
    "type_2": { <2>
      "_all": {
        "enabled": false
      },
      "properties": {...}
    }
  }
}
--------------------------------
// AUTOSENSE

<1> The `_all` field in `type_1` is enabled.
<2> The `_all` field in `type_2` is completely disabled.

If the `_all` field is disabled, then URI search requests and the
`query_string` and `simple_query_string` queries will not be able to use it
for queries (see <<querying-all-field>>). You can configure them to use a
different field with the `index.query.default_field` setting:

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "_all": {
        "enabled": false <1>
      },
      "properties": {
        "content": {
          "type": "string"
        }
      }
    }
  },
  "settings": {
    "index.query.default_field": "content" <2>
  }
}
--------------------------------
// AUTOSENSE

<1> The `_all` field is disabled for the `my_type` type.
<2> The `query_string` query will default to querying the `content` field in this index.

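With an index configured this way, a `query_string` query that names no field
would be expected to search `content` rather than `_all`. A minimal sketch
(the query text is illustrative):

[source,js]
--------------------------------
GET my_index/_search
{
  "query": {
    "query_string": {
      "query": "quick brown fox"
    }
  }
}
--------------------------------
// AUTOSENSE
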
[[include-in-all]]
==== Including specific fields in `_all`

Individual fields can be included in or excluded from the `_all` field with the
`include_in_all` setting, which defaults to `true`:

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": { <1>
          "type": "string"
        },
        "content": { <1>
          "type": "string"
        },
        "date": { <2>
          "type": "date",
          "include_in_all": false
        }
      }
    }
  }
}
--------------------------------
// AUTOSENSE

<1> The `title` and `content` fields will be included in the `_all` field.
<2> The `date` field will not be included in the `_all` field.

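To see the effect of excluding `date`, one could index a document and then
search `_all` for part of the date. This is a sketch only: the document values
are made up for illustration, and the search would be expected to return no
hits because the `date` value never reaches `_all`:

[source,js]
--------------------------------
PUT my_index/my_type/1
{
  "title":   "Quick brown fox",
  "content": "Jumped over the lazy dog",
  "date":    "2015-01-01"
}

GET my_index/_search
{
  "query": {
    "match": {
      "_all": "2015"
    }
  }
}
--------------------------------
// AUTOSENSE
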
The `include_in_all` parameter can also be set at the type level and on
<<mapping-object-type,`object`>> or <<mapping-nested-type,`nested`>> fields,
in which case all sub-fields inherit that setting. For instance:

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "include_in_all": false, <1>
      "properties": {
        "title":          { "type": "string" },
        "author": {
          "include_in_all": true, <2>
          "properties": {
            "first_name": { "type": "string" },
            "last_name":  { "type": "string" }
          }
        },
        "editor": {
          "properties": {
            "first_name": { "type": "string" }, <3>
            "last_name":  { "type": "string", "include_in_all": true } <3>
          }
        }
      }
    }
  }
}
--------------------------------
// AUTOSENSE

<1> All fields in `my_type` are excluded from `_all` unless a field or object overrides the setting.
<2> The `author.first_name` and `author.last_name` fields are included in `_all`.
<3> Only the `editor.last_name` field is included in `_all`.
    The `editor.first_name` inherits the type-level setting and is excluded.

[[all-field-and-boosting]]
==== Index boosting and the `_all` field

Individual fields can be _boosted_ at index time, with the `boost` parameter.
The `_all` field takes these boosts into account:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "properties": {
        "title": { <1>
          "type": "string",
          "boost": 2
        },
        "content": { <1>
          "type": "string"
        }
      }
    }
  }
}
--------------------------------
// AUTOSENSE

<1> When querying the `_all` field, words that originated in the
    `title` field are twice as relevant as words that originated in
    the `content` field.

WARNING: Using index-time boosting with the `_all` field has a significant
impact on query performance. Usually the better solution is to query fields
individually, with optional query time boosting.

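As a sketch of the query-time alternative mentioned in the warning, a
`multi_match` query can boost `title` with the `^` syntax instead of relying on
an index-time boost. The boost factor of `2` mirrors the mapping above and the
query text is illustrative:

[source,js]
--------------------------------
GET myindex/_search
{
  "query": {
    "multi_match": {
      "query":  "quick brown fox",
      "fields": [ "title^2", "content" ]
    }
  }
}
--------------------------------
// AUTOSENSE
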
[[custom-all-fields]]
==== Custom `_all` fields

While there is only a single `_all` field per index, the <<copy-to,`copy_to`>>
parameter allows the creation of multiple __custom `_all` fields__. For
instance, `first_name` and `last_name` fields can be combined together into
the `full_name` field:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "properties": {
        "first_name": {
          "type":    "string",
          "copy_to": "full_name" <1>
        },
        "last_name": {
          "type":    "string",
          "copy_to": "full_name" <1>
        },
        "full_name": {
          "type":    "string"
        }
      }
    }
  }
}

PUT myindex/mytype/1
{
  "first_name": "John",
  "last_name":  "Smith"
}

GET myindex/_search
{
  "query": {
    "match": {
      "full_name": "John Smith"
    }
  }
}
--------------------------------
// AUTOSENSE

<1> The `first_name` and `last_name` values are copied to the `full_name` field.

[[highlighting-all-field]]
==== Highlighting and the `_all` field

A field can only be used for <<search-request-highlighting,highlighting>> if
the original string value is available, either from the
<<mapping-source-field,`_source`>> field or as a stored field.

The `_all` field is not present in the `_source` field and it is not stored by
default, and so cannot be highlighted. There are two options: either
<<all-field-store,store the `_all` field>> or highlight the
<<all-highlight-fields,original fields>>.

[[all-field-store]]
===== Store the `_all` field

If `store` is set to `true`, then the original field value is retrievable and
can be highlighted:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {
        "store": true
      }
    }
  }
}

PUT myindex/mytype/1
{
  "first_name": "John",
  "last_name":  "Smith"
}

GET _search
{
  "query": {
    "match": {
      "_all": "John Smith"
    }
  },
  "highlight": {
    "fields": {
      "_all": {}
    }
  }
}
--------------------------------
// AUTOSENSE

Of course, storing the `_all` field will use significantly more disk space
and, because it is a combination of other fields, it may result in odd
highlighting results.

The `_all` field also accepts the `term_vector` and `index_options`
parameters, allowing the use of the fast vector highlighter and the postings
highlighter.

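A minimal sketch of such a mapping, assuming the fast vector highlighter is
the goal: it relies on term vectors with positions and offsets, so
`term_vector` is set to `with_positions_offsets` alongside `store`. For the
postings highlighter, `"index_options": "offsets"` would take the place of the
`term_vector` line:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {
        "store":       true,
        "term_vector": "with_positions_offsets"
      }
    }
  }
}
--------------------------------
// AUTOSENSE
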
[[all-highlight-fields]]
===== Highlight original fields

You can query the `_all` field, but use the original fields for highlighting as follows:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {}
    }
  }
}

PUT myindex/mytype/1
{
  "first_name": "John",
  "last_name":  "Smith"
}

GET _search
{
  "query": {
    "match": {
      "_all": "John Smith" <1>
    }
  },
  "highlight": {
    "fields": {
      "*_name": { <2>
        "require_field_match": "false" <3>
      }
    }
  }
}
--------------------------------
// AUTOSENSE

<1> The query inspects the `_all` field to find matching documents.
<2> Highlighting is performed on the two name fields, which are available from the `_source`.
<3> The query wasn't run against the name fields, so set `require_field_match` to `false`.