From c63d8c4fb566882bd4655ddc7dec13b4e8719244 Mon Sep 17 00:00:00 2001 From: Boaz Leskes Date: Wed, 13 Nov 2013 16:53:10 +0100 Subject: [PATCH] [Docs] Added _source filtering to documentation Relates to #3301 --- docs/reference/docs/get.asciidoc | 56 +++++++++++++--- docs/reference/docs/multi-get.asciidoc | 46 ++++++++++++- docs/reference/search/explain.asciidoc | 9 ++- docs/reference/search/request-body.asciidoc | 2 + docs/reference/search/request/fields.asciidoc | 16 +++-- .../search/request/source-filtering.asciidoc | 64 +++++++++++++++++++ docs/reference/search/uri-request.asciidoc | 10 +-- 7 files changed, 182 insertions(+), 21 deletions(-) create mode 100644 docs/reference/search/request/source-filtering.asciidoc diff --git a/docs/reference/docs/get.asciidoc b/docs/reference/docs/get.asciidoc index d9b5521bd74..da4b3a69e7a 100644 --- a/docs/reference/docs/get.asciidoc +++ b/docs/reference/docs/get.asciidoc @@ -65,22 +65,55 @@ are stored. The get API allows for `_type` to be optional. Set it to `_all` in order to fetch the first document matching the id across all types. + +[float] +[[get-source-filtering]] +=== Source filtering + +added[1.0.0.Beta1] + +By default, the get operation returns the contents of the `_source` field unless +you have used the `fields` parameter or if the `_source` field is disabled. +You can turn off `_source` retrieval by using the `_source` parameter: + +[source,js] +-------------------------------------------------- +curl -XGET 'http://localhost:9200/twitter/tweet/1?_source=false' +-------------------------------------------------- + +If you only need one or two fields from the complete `_source`, you can use the `_source_include` +& `_source_exclude` parameters to include or filter out the parts you need. This can be especially helpful +with large documents where partial retrieval can save on network overhead. Both parameters take a comma-separated list +of fields or wildcard expressions. 
Example: + +[source,js] +-------------------------------------------------- +curl -XGET 'http://localhost:9200/twitter/tweet/1?_source_include=*.id&_source_exclude=entities' +-------------------------------------------------- + +If you only want to specify includes, you can use a shorter notation: + +[source,js] +-------------------------------------------------- +curl -XGET 'http://localhost:9200/twitter/tweet/1?_source=*.id,retweeted' +-------------------------------------------------- + + [float] [[get-fields]] === Fields -The get operation allows specifying a set of fields that will be -returned (by default, the `_source` field) by passing the `fields` -parameter. For example: +The get operation allows specifying a set of stored fields that will be +returned by passing the `fields` parameter. For example: [source,js] -------------------------------------------------- curl -XGET 'http://localhost:9200/twitter/tweet/1?fields=title,content' -------------------------------------------------- -The returned fields will either be loaded if they are stored, or fetched -from the `_source` (parsed and extracted). It also supports sub objects -extraction from _source, like `obj1.obj2`. +For backward compatibility, if the requested fields are not stored, they will be fetched +from the `_source` (parsed and extracted). This functionality has been replaced by the +<> parameter. [float] [[_source]] @@ -95,8 +128,15 @@ without any additional content around it. For example: curl -XGET 'http://localhost:9200/twitter/tweet/1/_source' -------------------------------------------------- -Note, there is also a HEAD variant for the _source endpoint. 
Curl -example: +You can also use the same source filtering parameters to control which parts of the `_source` will be returned: + +[source,js] +-------------------------------------------------- +curl -XGET 'http://localhost:9200/twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities' +-------------------------------------------------- + +Note, there is also a HEAD variant for the _source endpoint to efficiently test for document existence. +Curl example: [source,js] -------------------------------------------------- diff --git a/docs/reference/docs/multi-get.asciidoc b/docs/reference/docs/multi-get.asciidoc index 01f972e41b2..0e98ba9c005 100644 --- a/docs/reference/docs/multi-get.asciidoc +++ b/docs/reference/docs/multi-get.asciidoc @@ -70,11 +70,55 @@ curl 'localhost:9200/test/type/_mget' -d '{ }' -------------------------------------------------- +[float] +[[mget-source-filtering]] +=== Source filtering + +added[1.0.0.Beta1] + +By default, the `_source` field will be returned for every document (if stored). +Similar to the <> API, you can retrieve only parts of +the `_source` (or none of it) by using the `_source` parameter. You can also use +the URL parameters `_source`, `_source_include` & `_source_exclude` to specify defaults, +which will be used when there are no per-document instructions. + +For example: + +[source,js] +-------------------------------------------------- +curl 'localhost:9200/_mget' -d '{ + "docs" : [ + { + "_index" : "test", + "_type" : "type", + "_id" : "1", + "_source" : false + }, + { + "_index" : "test", + "_type" : "type", + "_id" : "2", + "_source" : ["field3", "field4"] + }, + { + "_index" : "test", + "_type" : "type", + "_id" : "3", + "_source" : { + "include": ["user"], + "exclude": ["user.location"] + } + } + ] +}' +-------------------------------------------------- + + [float] [[mget-fields]] === Fields -Specific fields can be specified to be retrieved per document to get. 
+Specific stored fields can be specified to be retrieved per document to get, similar to the <> parameter of the Get API. For example: [source,js] diff --git a/docs/reference/search/explain.asciidoc b/docs/reference/search/explain.asciidoc index fb346d01962..9f8ce9436eb 100644 --- a/docs/reference/search/explain.asciidoc +++ b/docs/reference/search/explain.asciidoc @@ -62,9 +62,14 @@ This will yield the same result as the previous request. === All parameters: [horizontal] +`_source`:: + + added[1.0.0.Beta1] Set to `true` to retrieve the `_source` of the document explained. You can also + retrieve part of the document by using `_source_include` & `_source_exclude` (see <> for more details) + `fields`:: - Allows to control which fields to return as part of the - document explained (support `_source` for the full document). + Allows to control which stored fields to return as part of the + document explained. `routing`:: Controls the routing in the case the routing was used diff --git a/docs/reference/search/request-body.asciidoc b/docs/reference/search/request-body.asciidoc index 85a264ec662..0e93eeef303 100644 --- a/docs/reference/search/request-body.asciidoc +++ b/docs/reference/search/request-body.asciidoc @@ -81,6 +81,8 @@ include::request/from-size.asciidoc[] include::request/sort.asciidoc[] +include::request/source-filtering.asciidoc[] + include::request/fields.asciidoc[] include::request/script-fields.asciidoc[] diff --git a/docs/reference/search/request/fields.asciidoc b/docs/reference/search/request/fields.asciidoc index 4be98dbdb00..df4075431e9 100644 --- a/docs/reference/search/request/fields.asciidoc +++ b/docs/reference/search/request/fields.asciidoc @@ -1,8 +1,8 @@ [[search-request-fields]] === Fields -Allows to selectively load specific fields for each document represented -by a search hit. Defaults to load the internal `_source` field. +Allows to selectively load specific stored fields for each document represented +by a search hit. 
[source,js] -------------------------------------------------- @@ -14,10 +14,6 @@ by a search hit. Defaults to load the internal `_source` field. } -------------------------------------------------- -The fields will automatically load stored fields (`store` mapping set to -`true`), or, if not stored, will load the `_source` and extract it from -it (allowing to return nested document object). - `*` can be used to load all stored fields from the document. An empty array will cause only the `_id` and `_type` for each hit to be @@ -33,6 +29,11 @@ returned, for example: } -------------------------------------------------- + +For backwards compatibility, if the fields parameter specifies fields which are not stored (`store` mapping set to +`false`), it will load the `_source` and extract them from it. This functionality has been replaced by the +<> parameter. + Script fields can also be automatically detected and used as fields, so things like `_source.obj1.obj2` can be used, though not recommended, as `obj1.obj2` will work as well. @@ -40,6 +41,9 @@ things like `_source.obj1.obj2` can be used, though not recommended, as [[partial]] ==== Partial +deprecated[1.0.0.Beta1,Replaced by <>] + + When loading data from `_source`, partial fields can be used to use wildcards to control what part of the `_source` will be loaded based on `include` and `exclude` patterns. For example: diff --git a/docs/reference/search/request/source-filtering.asciidoc b/docs/reference/search/request/source-filtering.asciidoc new file mode 100644 index 00000000000..56b10913914 --- /dev/null +++ b/docs/reference/search/request/source-filtering.asciidoc @@ -0,0 +1,64 @@ +[[search-request-source-filtering]] +=== Source filtering + +added[1.0.0.Beta1] + + +Allows to control how the `_source` field is returned with every hit. + +By default, operations return the contents of the `_source` field unless +you have used the `fields` parameter or the `_source` field is disabled. 
+You can turn off `_source` retrieval by using the `_source` parameter. + +To disable `_source` retrieval, set it to `false`: + +[source,js] +-------------------------------------------------- +{ + "_source": false, + "query" : { + "term" : { "user" : "kimchy" } + } +} +-------------------------------------------------- + +The `_source` parameter also accepts one or more wildcard patterns to control what parts of the `_source` should be returned: + +For example: + +[source,js] +-------------------------------------------------- +{ + "_source": "obj.*", + "query" : { + "term" : { "user" : "kimchy" } + } +} +-------------------------------------------------- + +Or + +[source,js] +-------------------------------------------------- +{ + "_source": [ "obj1.*", "obj2.*" ], + "query" : { + "term" : { "user" : "kimchy" } + } +} +-------------------------------------------------- + +Finally, for complete control, you can specify both include and exclude patterns: + +[source,js] +-------------------------------------------------- +{ + "_source": { + "include": [ "obj1.*", "obj2.*" ], + "exclude": [ "*.description" ] + }, + "query" : { + "term" : { "user" : "kimchy" } + } +} +-------------------------------------------------- diff --git a/docs/reference/search/uri-request.asciidoc b/docs/reference/search/uri-request.asciidoc index 5940787be77..5f480e3c697 100644 --- a/docs/reference/search/uri-request.asciidoc +++ b/docs/reference/search/uri-request.asciidoc @@ -62,10 +62,12 @@ query. |`explain` |For each hit, contain an explanation of how scoring of the hits was computed. -|`fields` |The selective fields of the document to return for each hit -(either retrieved from the index if stored, or from the `_source` if -not), comma delimited. Defaults to the internal `_source` field. Not -specifying any value will cause no fields to return. +|`_source`| added[1.0.0.Beta1] Set to `false` to disable retrieval of the `_source` field. 
You can also retrieve +part of the document by using `_source_include` & `_source_exclude` (see the <> +documentation for more details) + +|`fields` |The selective stored fields of the document to return for each hit, +comma delimited. Not specifying any value will cause no fields to return. |`sort` |Sorting to perform. Can either be in the form of `fieldName`, or `fieldName:asc`/`fieldName:desc`. The fieldName can either be an actual