[Docs] Added _source filtering to documentation

Relates to #3301
Boaz Leskes 2013-11-13 16:53:10 +01:00
parent 3c4fc119ab
commit c63d8c4fb5
7 changed files with 182 additions and 21 deletions


@ -65,22 +65,55 @@ are stored.
The get API allows for `_type` to be optional. Set it to `_all` in order
to fetch the first document matching the id across all types.
[float]
[[get-source-filtering]]
=== Source filtering
added[1.0.0.Beta1]
By default, the get operation returns the contents of the `_source` field unless
you have used the `fields` parameter or the `_source` field is disabled.
You can turn off `_source` retrieval by using the `_source` parameter:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/twitter/tweet/1?_source=false'
--------------------------------------------------
If you only need one or two fields from the complete `_source`, you can use the `_source_include`
and `_source_exclude` parameters to include or filter out the parts you need. This can be especially helpful
with large documents, where partial retrieval can save on network overhead. Both parameters take a comma-separated list
of fields or wildcard expressions. Example:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/twitter/tweet/1?_source_include=*.id&_source_exclude=entities'
--------------------------------------------------
If you only want to specify includes, you can use a shorter notation:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/twitter/tweet/1?_source=*.id,retweeted'
--------------------------------------------------
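As a rough illustration of how these query-string parameters combine, the Python sketch below builds the three request URLs shown above. The helper name `get_url` is hypothetical, and the host, index, type and id are simply the ones used in the curl examples; nothing is sent over the network.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9200"  # same host as the curl examples above


def get_url(index, doc_type, doc_id, source=None, include=None, exclude=None):
    """Build a get API URL with optional source filtering parameters.

    Hypothetical helper: `source` may be False (disable `_source`) or a list
    of include patterns (the shorter notation); `include` and `exclude` map
    to `_source_include` and `_source_exclude`.
    """
    params = {}
    if source is False:
        params["_source"] = "false"
    elif source:
        params["_source"] = ",".join(source)
    if include:
        params["_source_include"] = ",".join(include)
    if exclude:
        params["_source_exclude"] = ",".join(exclude)
    url = f"{BASE}/{index}/{doc_type}/{doc_id}"
    # Keep '*' and ',' readable instead of percent-encoding them.
    query = urlencode(params, safe="*,")
    return f"{url}?{query}" if query else url


print(get_url("twitter", "tweet", "1", source=False))
print(get_url("twitter", "tweet", "1", include=["*.id"], exclude=["entities"]))
print(get_url("twitter", "tweet", "1", source=["*.id", "retweeted"]))
```

Each printed URL can be passed to `curl -XGET` in place of the literal URLs in the examples.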
[float]
[[get-fields]]
=== Fields
The get operation allows specifying a set of fields that will be
returned (by default, the `_source` field) by passing the `fields`
parameter. For example:
The get operation allows specifying a set of stored fields that will be
returned by passing the `fields` parameter. For example:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/twitter/tweet/1?fields=title,content'
--------------------------------------------------
The returned fields will either be loaded if they are stored, or fetched
from the `_source` (parsed and extracted). It also supports sub objects
extraction from _source, like `obj1.obj2`.
For backward compatibility, if the requested fields are not stored, they will be fetched
from the `_source` (parsed and extracted). This functionality has been replaced by the
<<get-source-filtering,source filtering>> parameter.
[float]
[[_source]]
@ -95,8 +128,15 @@ without any additional content around it. For example:
curl -XGET 'http://localhost:9200/twitter/tweet/1/_source'
--------------------------------------------------
Note, there is also a HEAD variant for the _source endpoint. Curl
example:
You can also use the same source filtering parameters to control which parts of the `_source` will be returned:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities'
--------------------------------------------------
Note, there is also a HEAD variant for the _source endpoint to efficiently test for document existence.
Curl example:
[source,js]
--------------------------------------------------


@ -70,11 +70,55 @@ curl 'localhost:9200/test/type/_mget' -d '{
}'
--------------------------------------------------
[float]
[[mget-source-filtering]]
=== Source filtering
added[1.0.0.Beta1]
By default, the `_source` field will be returned for every document (if stored).
Similar to the <<get-source-filtering,get>> API, you can retrieve only parts of
the `_source` (or none of it) by using the `_source` parameter. You can also use
the URL parameters `_source`, `_source_include` & `_source_exclude` to specify defaults,
which will be used when there are no per-document instructions.
For example:
[source,js]
--------------------------------------------------
curl 'localhost:9200/_mget' -d '{
"docs" : [
{
"_index" : "test",
"_type" : "type",
"_id" : "1",
"_source" : false
},
{
"_index" : "test",
"_type" : "type",
"_id" : "2",
"_source" : ["field3", "field4"]
},
{
"_index" : "test",
"_type" : "type",
"_id" : "3",
"_source" : {
"include": ["user"],
"exclude": ["user.location"]
}
}
]
}'
--------------------------------------------------
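The precedence rule described above (a per-document `_source` instruction wins, otherwise the URL-level default applies) can be modelled with a few lines of Python. This is only a sketch of the documented behaviour, not Elasticsearch code; the function name `effective_source`, the third `docs` entry and the `field1` default are assumptions made for illustration.

```python
def effective_source(doc, url_default=None):
    """Return the `_source` instruction that applies to one `docs` entry.

    A per-document `_source` value wins; otherwise the default taken from
    the URL parameters is used. None means "return the full `_source`".
    """
    if "_source" in doc:
        return doc["_source"]
    return url_default


docs = [
    {"_index": "test", "_type": "type", "_id": "1", "_source": False},
    {"_index": "test", "_type": "type", "_id": "2", "_source": ["field3", "field4"]},
    # Hypothetical entry without a per-document instruction.
    {"_index": "test", "_type": "type", "_id": "4"},
]

# Assumed URL-level default, e.g. from `?_source=field1`.
url_default = ["field1"]

resolved = [effective_source(d, url_default) for d in docs]
print(resolved)
```

Only the last entry falls back to the URL-level default; the first two keep their own instructions.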
[float]
[[mget-fields]]
=== Fields
Specific fields can be specified to be retrieved per document to get.
Specific stored fields can be specified to be retrieved per document to get, similar to the <<get-fields,fields>> parameter of the Get API.
For example:
[source,js]


@ -62,9 +62,14 @@ This will yield the same result as the previous request.
=== All parameters:
[horizontal]
`_source`::
added[1.0.0.Beta1] Set to `true` to retrieve the `_source` of the document explained. You can also
retrieve part of the document by using `_source_include` & `_source_exclude` (see <<get-source-filtering,Get API>> for more details)
`fields`::
Allows to control which fields to return as part of the
document explained (support `_source` for the full document).
Allows to control which stored fields to return as part of the
document explained.
`routing`::
Controls the routing in the case the routing was used


@ -81,6 +81,8 @@ include::request/from-size.asciidoc[]
include::request/sort.asciidoc[]
include::request/source-filtering.asciidoc[]
include::request/fields.asciidoc[]
include::request/script-fields.asciidoc[]


@ -1,8 +1,8 @@
[[search-request-fields]]
=== Fields
Allows to selectively load specific fields for each document represented
by a search hit. Defaults to load the internal `_source` field.
Allows to selectively load specific stored fields for each document represented
by a search hit.
[source,js]
--------------------------------------------------
@ -14,10 +14,6 @@ by a search hit. Defaults to load the internal `_source` field.
}
--------------------------------------------------
The fields will automatically load stored fields (`store` mapping set to
`true`), or, if not stored, will load the `_source` and extract it from
it (allowing to return nested document object).
`*` can be used to load all stored fields from the document.
An empty array will cause only the `_id` and `_type` for each hit to be
@ -33,6 +29,11 @@ returned, for example:
}
--------------------------------------------------
For backwards compatibility, if the `fields` parameter specifies fields which are not stored (`store` mapping set to
`false`), the `_source` will be loaded and the fields extracted from it. This functionality has been replaced by the
<<search-request-source-filtering,source filtering>> parameter.
Script fields can also be automatically detected and used as fields, so
things like `_source.obj1.obj2` can be used, though not recommended, as
`obj1.obj2` will work as well.
@ -40,6 +41,9 @@ things like `_source.obj1.obj2` can be used, though not recommended, as
[[partial]]
==== Partial
deprecated[1.0.0.Beta1,Replaced by <<search-request-source-filtering>>]
When loading data from `_source`, partial fields can be used to use
wildcards to control what part of the `_source` will be loaded based on
`include` and `exclude` patterns. For example:


@ -0,0 +1,64 @@
[[search-request-source-filtering]]
=== Source filtering
added[1.0.0.Beta1]
Allows to control how the `_source` field is returned with every hit.
By default, operations return the contents of the `_source` field unless
you have used the `fields` parameter or the `_source` field is disabled.
You can turn off `_source` retrieval by setting the `_source` parameter to `false`:
[source,js]
--------------------------------------------------
{
"_source": false,
"query" : {
"term" : { "user" : "kimchy" }
}
}
--------------------------------------------------
The `_source` parameter also accepts one or more wildcard patterns to control what parts of the `_source` should be returned.
For example:
[source,js]
--------------------------------------------------
{
"_source": "obj.*",
"query" : {
"term" : { "user" : "kimchy" }
}
}
--------------------------------------------------
Or
[source,js]
--------------------------------------------------
{
"_source": [ "obj1.*", "obj2.*" ],
"query" : {
"term" : { "user" : "kimchy" }
}
}
--------------------------------------------------
Finally, for complete control, you can specify both include and exclude patterns:
[source,js]
--------------------------------------------------
{
"_source": {
"include": [ "obj1.*", "obj2.*" ],
"exclude": [ "*.description" ]
},
"query" : {
"term" : { "user" : "kimchy" }
}
}
--------------------------------------------------
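To build some intuition for how include and exclude patterns interact, here is a small Python model of wildcard filtering over a nested document. It is an approximation written for illustration, not Elasticsearch's implementation: the function names and the sample document are hypothetical, patterns are matched with Python's `fnmatch` rules, and arrays of objects are treated as plain leaf values.

```python
from fnmatch import fnmatchcase


def _matches(path, patterns):
    """True if any pattern matches the dotted path or one of its parent paths."""
    parts = path.split(".")
    prefixes = [".".join(parts[: i + 1]) for i in range(len(parts))]
    return any(fnmatchcase(p, pat) for p in prefixes for pat in patterns)


def filter_source(source, include=(), exclude=(), _prefix=""):
    """Return a filtered copy of a nested dict (approximation only).

    A leaf is kept when it matches an include pattern (or no includes are
    given) and does not match any exclude pattern.
    """
    result = {}
    for key, value in source.items():
        path = f"{_prefix}{key}"
        if isinstance(value, dict):
            child = filter_source(value, include, exclude, _prefix=path + ".")
            if child:  # drop objects that end up empty
                result[key] = child
        elif (not include or _matches(path, include)) and not _matches(path, exclude):
            result[key] = value
    return result


# Hypothetical document, filtered with the patterns from the last example.
doc = {
    "obj1": {"name": "a", "description": "b"},
    "obj2": {"id": 1},
    "user": "kimchy",
}
filtered = filter_source(doc, include=["obj1.*", "obj2.*"], exclude=["*.description"])
print(filtered)
```

In this model `user` is dropped because it matches no include pattern, and `obj1.description` is dropped because an exclude pattern matches it even though an include pattern does too.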


@ -62,10 +62,12 @@ query.
|`explain` |For each hit, contain an explanation of how scoring of the
hits was computed.
|`fields` |The selective fields of the document to return for each hit
(either retrieved from the index if stored, or from the `_source` if
not), comma delimited. Defaults to the internal `_source` field. Not
specifying any value will cause no fields to return.
|`_source` |added[1.0.0.Beta1] Set to `false` to disable retrieval of the `_source` field. You can also retrieve
part of the document by using `_source_include` & `_source_exclude` (see the <<search-request-source-filtering, request body>>
documentation for more details).
|`fields` |The selective stored fields of the document to return for each hit,
comma delimited. Not specifying any value will cause no fields to return.
|`sort` |Sorting to perform. Can either be in the form of `fieldName`, or
`fieldName:asc`/`fieldName:desc`. The fieldName can either be an actual