[[mapping-all-field]]
=== `_all` field

The `_all` field is a special _catch-all_ field which concatenates the values of
all of the other string fields into one big string, using space as a delimiter,
which is then <<analysis,analyzed>> and indexed, but not stored. This means that
it can be searched, but not retrieved.

The `_all` field allows you to search for string values in documents without
knowing which field contains the value. This makes it a useful option when
getting started with a new dataset. For instance:

[source,js]
--------------------------------
PUT my_index/user/1 <1>
{
  "first_name": "John",
  "last_name": "Smith",
  "date_of_birth": "Born 1970-10-24"
}

GET my_index/_search
{
  "query": {
    "match": {
      "_all": "john smith 1970"
    }
  }
}
--------------------------------
// CONSOLE
<1> The `_all` field will contain the terms: [ `"john"`, `"smith"`, `"born"`, `"1970"`, `"10"`, `"24"` ]

[NOTE]
.Only string values are added to `_all`
=============================================================================

The `date_of_birth` field in the above example is recognised as a `string` field
and thus will be analyzed as a string, resulting in the terms `"born"`,
`"1970"`, `"10"`, and `"24"`. If the `date_of_birth` field were an actual date
type field, it would not be included in the `_all` field, since `_all` only
contains content from string fields.

It is important to note that the `_all` field combines the original values
from each field as a string. It does not combine the _terms_ from each field.

=============================================================================

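As a rough way to see which terms end up in `_all` for the document above, the
concatenated values can be passed to the `_analyze` API. This is only a sketch:
it assumes that `_all` uses the default `standard` analyzer, and the `text`
value is the three field values joined by hand with spaces, which approximates
what `_all` does rather than reading the field itself.

[source,js]
--------------------------------
GET _analyze
{
  "analyzer": "standard",
  "text": "John Smith Born 1970-10-24"
}
--------------------------------

Under those assumptions, the response should list the same lowercased terms as
the callout on the first example.
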
The `_all` field is just a <<text,`text`>> field, and accepts the same
parameters that other string fields accept, including `analyzer`,
`term_vector`, `index_options`, and `store`.

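For instance, a different analyzer could be set directly on `_all` in the
mapping. The following is a minimal sketch: `my_index` and `my_type` are
placeholder names, and the built-in `whitespace` analyzer is only an
illustrative choice, not a recommendation.

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "_all": {
        "analyzer": "whitespace"
      }
    }
  }
}
--------------------------------
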
The `_all` field can be useful, especially when exploring new data using
simple filtering. However, by concatenating field values into one big string,
the `_all` field loses the distinction between short fields (more relevant)
and long fields (less relevant). For use cases where search relevance is
important, it is better to query individual fields specifically.

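As a sketch of what querying individual fields might look like, the request
below uses a `multi_match` query against the `first_name` and `last_name`
fields. It assumes the document from the first example has been indexed into
`my_index`; each field is then scored on its own rather than as part of one
big string.

[source,js]
--------------------------------
GET my_index/_search
{
  "query": {
    "multi_match": {
      "query": "john smith",
      "fields": [ "first_name", "last_name" ]
    }
  }
}
--------------------------------
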
The `_all` field is not free: it requires extra CPU cycles and uses more disk
space. If not needed, it can be completely <<disabling-all-field,disabled>> or
customised on a <<include-in-all,per-field basis>>.

[[querying-all-field]]
==== Using the `_all` field in queries

The <<query-dsl-query-string-query,`query_string`>> and
<<query-dsl-simple-query-string-query,`simple_query_string`>> queries search
the `_all` field by default, unless another field is specified:

[source,js]
--------------------------------
GET _search
{
  "query": {
    "query_string": {
      "query": "john smith 1970"
    }
  }
}
--------------------------------
// CONSOLE

The same goes for the `?q=` parameter in <<search-uri-request, URI search
requests>> (which is rewritten to a `query_string` query internally):

[source,js]
--------------------------------
GET _search?q=john+smith+1970
--------------------------------

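To illustrate the "unless another field is specified" case, the sketch below
points both forms at a single field instead of `_all`. It uses the
`default_field` parameter of the `query_string` query and the `df` parameter
of URI search; `last_name` is borrowed from the first example and is only an
illustrative choice.

[source,js]
--------------------------------
GET _search
{
  "query": {
    "query_string": {
      "default_field": "last_name",
      "query": "smith"
    }
  }
}

GET _search?q=smith&df=last_name
--------------------------------
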
Other queries, such as the <<query-dsl-match-query,`match`>> and
<<query-dsl-term-query,`term`>> queries, require you to specify
the `_all` field explicitly, as per the
<<mapping-all-field,first example>>.

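A `term` query against `_all` might look like the sketch below, assuming the
document from the first example. Because `term` queries are not analyzed, the
value is written in lowercase to match the term as it was indexed.

[source,js]
--------------------------------
GET my_index/_search
{
  "query": {
    "term": {
      "_all": "smith"
    }
  }
}
--------------------------------
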
[[disabling-all-field]]
==== Disabling the `_all` field

The `_all` field can be completely disabled per-type by setting `enabled` to
`false`:

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "type_1": { <1>
      "properties": {...}
    },
    "type_2": { <2>
      "_all": {
        "enabled": false
      },
      "properties": {...}
    }
  }
}
--------------------------------
// CONSOLE
// TEST[s/\.\.\.//]

<1> The `_all` field in `type_1` is enabled.
<2> The `_all` field in `type_2` is completely disabled.

If the `_all` field is disabled, then URI search requests and the
`query_string` and `simple_query_string` queries will not be able to use it
for queries (see <<querying-all-field>>). You can configure them to use a
different field with the `index.query.default_field` setting:

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "_all": {
        "enabled": false <1>
      },
      "properties": {
        "content": {
          "type": "text"
        }
      }
    }
  },
  "settings": {
    "index.query.default_field": "content" <2>
  }
}
--------------------------------
// CONSOLE

<1> The `_all` field is disabled for the `my_type` type.
<2> The `query_string` query will default to querying the `content` field in this index.

[[excluding-from-all]]
==== Excluding fields from `_all`

Individual fields can be included in or excluded from the `_all` field with the
<<include-in-all,`include_in_all`>> setting.

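A minimal sketch of such a mapping is shown below. The index, type, and field
names (`my_index`, `my_type`, `title`, `internal_notes`) are hypothetical; the
point is only that `internal_notes` opts out of `_all` while `title` keeps the
default behaviour.

[source,js]
--------------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "text"
        },
        "internal_notes": {
          "type": "text",
          "include_in_all": false
        }
      }
    }
  }
}
--------------------------------

With a mapping like this, a `match` query on `_all` would be expected to find
words from `title` but not words that appear only in `internal_notes`.
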
[[all-field-and-boosting]]
==== Index boosting and the `_all` field

Individual fields can be _boosted_ at index time, with the <<mapping-boost,`boost`>>
parameter. The `_all` field takes these boosts into account:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "properties": {
        "title": { <1>
          "type": "text",
          "boost": 2
        },
        "content": { <1>
          "type": "text"
        }
      }
    }
  }
}
--------------------------------
// CONSOLE

<1> When querying the `_all` field, words that originated in the
    `title` field are twice as relevant as words that originated in
    the `content` field.

WARNING: Using index-time boosting with the `_all` field has a significant
impact on query performance. Usually the better solution is to query fields
individually, with optional query-time boosting.

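As a sketch of the query-time alternative, the request below queries `title`
and `content` individually and boosts `title` with the `^2` suffix in a
`multi_match` query. It assumes a `myindex` mapping with those two `text`
fields and no index-time boost; the query text is made up for illustration.

[source,js]
--------------------------------
GET myindex/_search
{
  "query": {
    "multi_match": {
      "query": "quick brown fox",
      "fields": [ "title^2", "content" ]
    }
  }
}
--------------------------------

Because the boost lives in the query, it can be changed from one request to the
next without reindexing.
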
[[custom-all-fields]]
==== Custom `_all` fields

While there is only a single `_all` field per index, the <<copy-to,`copy_to`>>
parameter allows the creation of multiple __custom `_all` fields__. For
instance, `first_name` and `last_name` fields can be combined together into
the `full_name` field:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "properties": {
        "first_name": {
          "type": "text",
          "copy_to": "full_name" <1>
        },
        "last_name": {
          "type": "text",
          "copy_to": "full_name" <1>
        },
        "full_name": {
          "type": "text"
        }
      }
    }
  }
}

PUT myindex/mytype/1
{
  "first_name": "John",
  "last_name": "Smith"
}

GET myindex/_search
{
  "query": {
    "match": {
      "full_name": "John Smith"
    }
  }
}
--------------------------------
// CONSOLE

<1> The `first_name` and `last_name` values are copied to the `full_name` field.

[[highlighting-all-field]]
==== Highlighting and the `_all` field

A field can only be used for <<search-request-highlighting,highlighting>> if
the original string value is available, either from the
<<mapping-source-field,`_source`>> field or as a stored field.

The `_all` field is not present in the `_source` field and it is not stored by
default, and so cannot be highlighted. There are two options: either
<<all-field-store,store the `_all` field>> or highlight the
<<all-highlight-fields,original fields>>.

[[all-field-store]]
===== Store the `_all` field

If `store` is set to `true`, then the original field value is retrievable and
can be highlighted:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {
        "store": true
      }
    }
  }
}

PUT myindex/mytype/1
{
  "first_name": "John",
  "last_name": "Smith"
}

GET _search
{
  "query": {
    "match": {
      "_all": "John Smith"
    }
  },
  "highlight": {
    "fields": {
      "_all": {}
    }
  }
}
--------------------------------
// CONSOLE

Of course, storing the `_all` field will use significantly more disk space
and, because it is a combination of other fields, it may result in odd
highlighting results.

The `_all` field also accepts the `term_vector` and `index_options`
parameters, allowing the use of the fast vector highlighter and the postings
highlighter.

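For example, a mapping along the following lines could prepare `_all` for the
fast vector highlighter. This is a sketch under assumptions: it reuses the
`myindex` and `mytype` names from above, stores `_all` so that its text is
available, and sets `term_vector` to `with_positions_offsets`, which the fast
vector highlighter relies on. The search request then asks for that
highlighter explicitly with `"type": "fvh"`.

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {
        "store": true,
        "term_vector": "with_positions_offsets"
      }
    }
  }
}

GET myindex/_search
{
  "query": {
    "match": {
      "_all": "John Smith"
    }
  },
  "highlight": {
    "fields": {
      "_all": {
        "type": "fvh"
      }
    }
  }
}
--------------------------------

Keeping term vectors adds to the index size on top of the cost of storing
`_all`, so this trade-off is worth measuring on real data.
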
[[all-highlight-fields]]
===== Highlight original fields

You can query the `_all` field, but use the original fields for highlighting as follows:

[source,js]
--------------------------------
PUT myindex
{
  "mappings": {
    "mytype": {
      "_all": {}
    }
  }
}

PUT myindex/mytype/1
{
  "first_name": "John",
  "last_name": "Smith"
}

GET _search
{
  "query": {
    "match": {
      "_all": "John Smith" <1>
    }
  },
  "highlight": {
    "fields": {
      "*_name": { <2>
        "require_field_match": false <3>
      }
    }
  }
}
--------------------------------
// CONSOLE

<1> The query inspects the `_all` field to find matching documents.
<2> Highlighting is performed on the two name fields, which are available from the `_source`.
<3> The query wasn't run against the name fields, so set `require_field_match` to `false`.