[[mapper-attachments]]
=== Mapper Attachments Plugin

The mapper attachments plugin lets Elasticsearch index file attachments in common formats (such as PPT, XLS, and PDF)
using the Apache text extraction library http://lucene.apache.org/tika/[Tika].

In practice, the plugin adds the `attachment` type when mapping properties so that documents can be populated with
file attachment contents (encoded as `base64`).

[[mapper-attachments-install]]
[float]
==== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install mapper-attachments
----------------------------------------------------------------

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.

[[mapper-attachments-remove]]
[float]
==== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin remove mapper-attachments
----------------------------------------------------------------

The node must be stopped before removing the plugin.

[[mapper-attachments-helloworld]]
==== Hello, world

Create a property mapping using the new type `attachment`:

[source,js]
--------------------------
POST /trying-out-mapper-attachments
{
  "mappings": {
    "person": {
      "properties": {
        "cv": { "type": "attachment" }
}}}}
--------------------------
// AUTOSENSE

Index a new document populated with a `base64`-encoded attachment:

[source,js]
--------------------------
POST /trying-out-mapper-attachments/person/1
{
  "cv": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}
--------------------------
// AUTOSENSE

Search for the document using words in the attachment:

[source,js]
--------------------------
POST /trying-out-mapper-attachments/person/_search
{
  "query": {
    "query_string": {
      "query": "ipsum"
}}}
--------------------------
// AUTOSENSE

If you get a hit for your indexed document, the plugin should be installed and working.

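
To see why the `ipsum` query matches, it can help to decode the sample `cv` value from the indexing step. The following Python sketch uses only the standard library; the variable names are illustrative and not part of the plugin. It shows that the payload is a very small RTF document:

```python
import base64

# The base64 string indexed into the `cv` field in the Hello, world example.
cv = "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="

# Decode back to the raw attachment bytes (a tiny RTF document).
raw = base64.b64decode(cv)

# The sample is plain ASCII, so it can be decoded for display.
text = raw.decode("ascii")
print(text)
```

The decoded document contains the words `Lorem ipsum dolor sit amet`, which is the text the search for `ipsum` is expected to match once the plugin has extracted it.
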
[[mapper-attachments-usage]]
==== Usage

Using the attachment type is simple: in your mapping JSON, set the type of a field to `attachment`. For example:

[source,js]
--------------------------
PUT /test
PUT /test/person/_mapping
{
    "person" : {
        "properties" : {
            "my_attachment" : { "type" : "attachment" }
        }
    }
}
--------------------------
// AUTOSENSE

In this case, the JSON to index can be:

[source,js]
--------------------------
PUT /test/person/1
{
    "my_attachment" : "... base64 encoded attachment ..."
}
--------------------------
// AUTOSENSE

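
The `... base64 encoded attachment ...` placeholder has to be produced by the client before indexing. A minimal Python sketch of that step follows, using only the standard library; the helper names `encode_attachment` and `build_document` are hypothetical and not part of the plugin or of any Elasticsearch client:

```python
import base64
from pathlib import Path


def encode_attachment(path):
    """Return the file's bytes as a base64 string suitable for a JSON body."""
    data = Path(path).read_bytes()
    # b64encode returns bytes; JSON needs text, and base64 output is pure ASCII.
    return base64.b64encode(data).decode("ascii")


def build_document(path, field="my_attachment"):
    """Build a JSON-serializable body for the index request.

    `field` must match the name of the `attachment` field in the mapping.
    """
    return {field: encode_attachment(path)}
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the `PUT /test/person/1` request by whatever HTTP client you already use.
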
Alternatively, use a more elaborate JSON structure if the content type, resource name, or language needs to be set explicitly:

[source,js]
--------------------------
PUT /test/person/1
{
    "my_attachment" : {
        "_content_type" : "application/pdf",
        "_name" : "resource/name/of/my.pdf",
        "_language" : "en",
        "_content" : "... base64 encoded attachment ..."
    }
}
--------------------------
// AUTOSENSE

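
The explicit form can be assembled in the same way on the client side. The sketch below is an illustration only: the function name `build_attachment_document` and its parameters are assumptions made for this example, while the `_content`, `_content_type`, `_name`, and `_language` keys are the ones shown in the request above. Optional keys are left out when no value is given:

```python
import base64
import json


def build_attachment_document(content, content_type=None, name=None,
                              language=None, field="my_attachment"):
    """Build an index body that uses the explicit `_content` form."""
    attachment = {"_content": base64.b64encode(content).decode("ascii")}
    # Only include the optional hints that were actually provided.
    if content_type is not None:
        attachment["_content_type"] = content_type
    if name is not None:
        attachment["_name"] = name
    if language is not None:
        attachment["_language"] = language
    return {field: attachment}


# Placeholder bytes stand in for a real file read from disk.
body = build_attachment_document(
    b"example attachment bytes",
    content_type="application/pdf",
    name="resource/name/of/my.pdf",
    language="en",
)
print(json.dumps(body, indent=2))
```
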
The `attachment` type not only indexes the content of the document in the `content` sub-field, but also automatically
adds metadata about the attachment (when available).

The supported metadata fields are:

* `date`
* `title`
* `name` (only available if you set `_name`, see above)
* `author`
* `keywords`
* `content_type`
* `content_length` (the original content length before text extraction, that is, the file size)
* `language`

They can be queried using the "dot notation", for example: `my_attachment.author`.

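
As a rough illustration of the dot notation, the following Python sketch builds a `match` query body for one metadata field. The helper names and the `METADATA_FIELDS` set are hypothetical conveniences for this example; only the field names themselves come from the list above:

```python
# Metadata sub-fields listed in this section.
METADATA_FIELDS = {
    "date", "title", "name", "author", "keywords",
    "content_type", "content_length", "language",
}


def metadata_field(attachment_field, metadata):
    """Return the dotted field path, e.g. `my_attachment.author`."""
    if metadata not in METADATA_FIELDS:
        raise ValueError("unknown attachment metadata field: " + metadata)
    return attachment_field + "." + metadata


def match_query(attachment_field, metadata, value):
    """Build a search body that matches `value` against one metadata field."""
    return {"query": {"match": {metadata_field(attachment_field, metadata): value}}}
```

For example, `match_query("my_attachment", "author", "kimchy")` produces a body that targets `my_attachment.author`; the author value here is only a placeholder.
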
Both the metadata and the actual content are simple core type mappers (string, date, …), so they can be controlled
in the mappings. For example:

[source,js]
--------------------------
PUT /test/person/_mapping
{
    "person" : {
        "properties" : {
            "file" : {
                "type" : "attachment",
                "fields" : {
                    "content" : {"index" : "no"},
                    "title" : {"store" : "yes"},
                    "date" : {"store" : "yes"},
                    "author" : {"analyzer" : "myAnalyzer"},
                    "keywords" : {"store" : "yes"},
                    "content_type" : {"store" : "yes"},
                    "content_length" : {"store" : "yes"},
                    "language" : {"store" : "yes"}
                }
            }
        }
    }
}
--------------------------
// AUTOSENSE

In the above example, the extracted content is mapped under the `content` entry of `fields`. Because it is set not to
be indexed, it will only be available in the `_all` field. The other entries map to their respective metadata names.
There is no need to specify the `type` (like `string` or `date`) since it is already known.

[[mapper-attachments-copy-to]]
==== Copy To feature

To use the http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#copy-to[copy_to]
feature, define it on each sub-field you want to copy to another field:

[source,js]
--------------------------
PUT /test/person/_mapping
{
  "person": {
    "properties": {
      "file": {
        "type": "attachment",
        "fields": {
          "content": {
            "type": "string",
            "copy_to": "copy"
          }
        }
      },
      "copy": {
        "type": "string"
      }
    }
  }
}
--------------------------
// AUTOSENSE

In this example, the extracted content is also copied to the `copy` field.

2015-11-09 09:35:06 -05:00
|
|
|
[[mapper-attachments-querying-metadata]]
|
|
|
|
==== Querying or accessing metadata
|
Add support for multi-fields
Now https://github.com/elasticsearch/elasticsearch/pull/6867 is merged in elasticsearch core code (branch 1.x - es 1.4),
we can support multi fields in mapper attachment plugin.
```
DELETE /test
PUT /test
{
"settings": {
"number_of_shards": 1
}
}
PUT /test/person/_mapping
{
"person": {
"properties": {
"file": {
"type": "attachment",
"path": "full",
"fields": {
"file": {
"type": "string",
"fields": {
"store": {
"type": "string",
"store": true
}
}
},
"content_type": {
"type": "string",
"fields": {
"store": {
"type": "string",
"store": true
},
"untouched": {
"type": "string",
"index": "not_analyzed",
"store": true
}
}
}
}
}
}
}
}
PUT /test/person/1?refresh=true
{
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
GET /test/person/_search
{
"fields": [
"file.store",
"file.content_type.store"
],
"aggs": {
"store": {
"terms": {
"field": "file.content_type.store"
}
},
"untouched": {
"terms": {
"field": "file.content_type.untouched"
}
}
}
}
```
It gives:
```js
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": 1,
"fields": {
"file.store": [
"\"God Save the Queen\" (alternatively \"God Save the King\"\n"
],
"file.content_type.store": [
"text/plain; charset=ISO-8859-1"
]
}
}
]
},
"aggregations": {
"store": {
"doc_count_error_upper_bound": 0,
"buckets": [
{
"key": "1",
"doc_count": 1
},
{
"key": "8859",
"doc_count": 1
},
{
"key": "charset",
"doc_count": 1
},
{
"key": "iso",
"doc_count": 1
},
{
"key": "plain",
"doc_count": 1
},
{
"key": "text",
"doc_count": 1
}
]
},
"untouched": {
"doc_count_error_upper_bound": 0,
"buckets": [
{
"key": "text/plain; charset=ISO-8859-1",
"doc_count": 1
}
]
}
}
}
```
Note that using shorter definition works as well:
```
DELETE /test
PUT /test
{
"settings": {
"number_of_shards": 1
}
}
PUT /test/person/_mapping
{
"person": {
"properties": {
"file": {
"type": "attachment"
}
}
}
}
PUT /test/person/1?refresh=true
{
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
GET /test/person/_search
{
"query": {
"match": {
"file": "king"
}
}
}
```
gives:
```js
{
"took": 53,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.095891505,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": 0.095891505,
"_source": {
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
}
]
}
}
```
Closes #57.
(cherry picked from commit 432d7c0)
2014-07-25 18:03:28 -04:00
|
|
|
|
|
|
|
If you need to query on metadata fields, use the attachment field name dot the metadata field. For example:
|
|
|
|
|
2015-11-09 09:35:06 -05:00
|
|
|
[source,js]
|
|
|
|
--------------------------
|
Add support for multi-fields
Now https://github.com/elasticsearch/elasticsearch/pull/6867 is merged in elasticsearch core code (branch 1.x - es 1.4),
we can support multi fields in mapper attachment plugin.
```
DELETE /test
PUT /test
{
"settings": {
"number_of_shards": 1
}
}
PUT /test/person/_mapping
{
"person": {
"properties": {
"file": {
"type": "attachment",
"path": "full",
"fields": {
"file": {
"type": "string",
"fields": {
"store": {
"type": "string",
"store": true
}
}
},
"content_type": {
"type": "string",
"fields": {
"store": {
"type": "string",
"store": true
},
"untouched": {
"type": "string",
"index": "not_analyzed",
"store": true
}
}
}
}
}
}
}
}
PUT /test/person/1?refresh=true
{
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
GET /test/person/_search
{
"fields": [
"file.store",
"file.content_type.store"
],
"aggs": {
"store": {
"terms": {
"field": "file.content_type.store"
}
},
"untouched": {
"terms": {
"field": "file.content_type.untouched"
}
}
}
}
```
It gives:
```js
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": 1,
"fields": {
"file.store": [
"\"God Save the Queen\" (alternatively \"God Save the King\"\n"
],
"file.content_type.store": [
"text/plain; charset=ISO-8859-1"
]
}
}
]
},
"aggregations": {
"store": {
"doc_count_error_upper_bound": 0,
"buckets": [
{
"key": "1",
"doc_count": 1
},
{
"key": "8859",
"doc_count": 1
},
{
"key": "charset",
"doc_count": 1
},
{
"key": "iso",
"doc_count": 1
},
{
"key": "plain",
"doc_count": 1
},
{
"key": "text",
"doc_count": 1
}
]
},
"untouched": {
"doc_count_error_upper_bound": 0,
"buckets": [
{
"key": "text/plain; charset=ISO-8859-1",
"doc_count": 1
}
]
}
}
}
```
Note that using shorter definition works as well:
```
DELETE /test
PUT /test
{
"settings": {
"number_of_shards": 1
}
}
PUT /test/person/_mapping
{
"person": {
"properties": {
"file": {
"type": "attachment"
}
}
}
}
PUT /test/person/1?refresh=true
{
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
GET /test/person/_search
{
"query": {
"match": {
"file": "king"
}
}
}
```
gives:
```js
{
"took": 53,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.095891505,
"hits": [
{
"_index": "test",
"_type": "person",
"_id": "1",
"_score": 0.095891505,
"_source": {
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
}
]
}
}
```
Closes #57.
(cherry picked from commit 432d7c0)
2014-07-25 18:03:28 -04:00
|
|
|
DELETE /test
|
|
|
|
PUT /test
|
|
|
|
PUT /test/person/_mapping
|
|
|
|
{
|
|
|
|
"person": {
|
|
|
|
"properties": {
|
|
|
|
"file": {
|
|
|
|
"type": "attachment",
|
|
|
|
"fields": {
|
|
|
|
"content_type": {
|
|
|
|
"type": "string",
|
|
|
|
"store": true
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
PUT /test/person/1?refresh=true
|
|
|
|
{
|
|
|
|
"file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
|
|
|
|
}
|
|
|
|
GET /test/person/_search
|
|
|
|
{
|
2015-07-17 11:36:57 -04:00
|
|
|
"fields": [ "file.content_type" ],
|
  "query": {
    "match": {
      "file.content_type": "text plain"
    }
  }
}
--------------------------
// AUTOSENSE

Will give you:

[source,js]
--------------------------
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.16273327,
    "hits": [
      {
        "_index": "test",
        "_type": "person",
        "_id": "1",
        "_score": 0.16273327,
        "fields": {
          "file.content_type": [
            "text/plain; charset=ISO-8859-1"
          ]
        }
      }
    ]
  }
}
--------------------------

[[mapper-attachments-indexed-characters]]
==== Indexed Characters

By default, `100000` characters are extracted when indexing the content. This default value can be changed by setting
the `index.mapping.attachment.indexed_chars` setting. It can also be provided on a per-document basis using the
`_indexed_chars` parameter. `-1` can be set to extract all text, but note that all of the text must then fit in
memory:

[source,js]
--------------------------
PUT /test/person/1
{
  "my_attachment" : {
    "_indexed_chars" : -1,
    "_content" : "... base64 encoded attachment ..."
  }
}
--------------------------
// AUTOSENSE

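The index-wide default can be changed in a similar way. The following is only a sketch: it assumes the setting is
supplied in the index settings when the index is created, and both the index name `test` and the value `200000` are
arbitrary illustrations rather than recommendations:

[source,js]
--------------------------
PUT /test
{
  "settings": {
    "index.mapping.attachment.indexed_chars": 200000
  }
}
--------------------------

A per-document `_indexed_chars` value, as shown above, would still be expected to take precedence over this default
for the documents that provide it.
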
[[mapper-attachments-error-handling]]
==== Metadata parsing error handling

While extracting metadata content, errors can occur, for example when parsing dates.
By default, parsing errors are ignored so that your document is still indexed.

You can disable this behavior by setting the `index.mapping.attachment.ignore_errors` setting to `false`.

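As a sketch of the stricter behavior, the setting could be supplied when the index is created. This assumes the
setting is accepted in the index settings at creation time; the index name `test` is only illustrative:

[source,js]
--------------------------
PUT /test
{
  "settings": {
    "index.mapping.attachment.ignore_errors": false
  }
}
--------------------------

With errors no longer ignored, a document whose metadata cannot be parsed should be rejected instead of being
indexed without that metadata.
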
[[mapper-attachments-language-detection]]
==== Language Detection

By default, language detection is disabled (`false`) because it can be costly.
This default value can be changed by setting the `index.mapping.attachment.detect_language` setting.
It can also be provided on a per-document basis using the `_detect_language` parameter.

Note that you can force the language by using the `_language` field when sending your actual document:

[source,js]
--------------------------
{
  "my_attachment" : {
    "_language" : "en",
    "_content" : "... base64 encoded attachment ..."
  }
}
--------------------------

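Enabling detection for a single document follows the same pattern as the other per-document parameters. A minimal
sketch, assuming an attachment field named `my_attachment` as in the example above and an illustrative
`test` index with a `person` type:

[source,js]
--------------------------
PUT /test/person/1
{
  "my_attachment" : {
    "_detect_language" : true,
    "_content" : "... base64 encoded attachment ..."
  }
}
--------------------------

This leaves detection off for every other document, so the extra cost is only paid where it is requested.
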
[[mapper-attachments-highlighting]]
==== Highlighting attachments

If you want to highlight your attachment content, you will need to set `"store": true` and
`"term_vector":"with_positions_offsets"` for your attachment field. Here is a full script which does it:

[source,js]
--------------------------
DELETE /test
PUT /test
PUT /test/person/_mapping
{
  "person": {
    "properties": {
      "file": {
        "type": "attachment",
        "fields": {
          "content": {
            "type": "string",
            "term_vector":"with_positions_offsets",
            "store": true
          }
        }
      }
    }
  }
}
PUT /test/person/1?refresh=true
{
  "file": "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="
}
GET /test/person/_search
{
  "fields": [],
  "query": {
    "match": {
      "file.content": "king queen"
    }
  },
  "highlight": {
    "fields": {
      "file.content": {
      }
    }
  }
}
--------------------------
// AUTOSENSE

It gives back:

[source,js]
--------------------------
{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.13561106,
    "hits": [
      {
        "_index": "test",
        "_type": "person",
        "_id": "1",
        "_score": 0.13561106,
        "highlight": {
          "file.content": [
            "\"God Save the <em>Queen</em>\" (alternatively \"God Save the <em>King</em>\"\n"
          ]
        }
      }
    ]
  }
}
--------------------------
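
As a rough illustration of how a client might consume the response above, the following Python sketch decodes the Base64 payload that was indexed and pulls the highlighted fragments out of a trimmed copy of the response. The helper names (`decode_attachment`, `highlight_fragments`, `strip_tags`) are hypothetical and not part of the plugin or of any Elasticsearch client; the sketch assumes the response has already been received as JSON text.

```python
import base64
import json
import re

# Base64 payload indexed in the example above.
ENCODED = "IkdvZCBTYXZlIHRoZSBRdWVlbiIgKGFsdGVybmF0aXZlbHkgIkdvZCBTYXZlIHRoZSBLaW5nIg=="

# Trimmed copy of the search response shown above (only the parts used here).
RESPONSE = json.loads(r"""
{
  "hits": {
    "total": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "person",
        "_id": "1",
        "highlight": {
          "file.content": [
            "\"God Save the <em>Queen</em>\" (alternatively \"God Save the <em>King</em>\"\n"
          ]
        }
      }
    ]
  }
}
""")


def decode_attachment(encoded):
    """Decode the Base64 attachment payload back into text (hypothetical helper)."""
    return base64.b64decode(encoded).decode("utf-8")


def highlight_fragments(response, field):
    """Collect every highlight fragment returned for `field` across all hits."""
    fragments = []
    for hit in response["hits"]["hits"]:
        fragments.extend(hit.get("highlight", {}).get(field, []))
    return fragments


def strip_tags(fragment):
    """Remove the <em> markers that wrap the matched terms."""
    return re.sub(r"</?em>", "", fragment)


original = decode_attachment(ENCODED)
fragments = highlight_fragments(RESPONSE, "file.content")

print(original)
print(fragments[0])
```

With the `<em>` markers stripped, the fragment matches the decoded attachment text plus the trailing newline that appears in the extracted content.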

[[mapper-attachments-standalone]]
==== Stand alone runner

If you want to run some tests within your IDE, you can use the `StandaloneRunner` class.
It accepts the following arguments:

* `-u file://URL/TO/YOUR/DOC`: URL of the document to process
* `--size`: sets the extracted size (defaults to the mapper attachment size)
* `BASE64`: the document as Base64-encoded binary

Example:

[source,sh]
--------------------------
StandaloneRunner BASE64Text
StandaloneRunner -u /tmp/mydoc.pdf
StandaloneRunner -u /tmp/mydoc.pdf --size 1000000
--------------------------
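
The `BASE64Text` argument in the example above stands for the Base64-encoded bytes of a document. One possible way to produce such a string is sketched below in Python; the `encode_file` helper and the temporary `mydoc.txt` file are assumptions made for illustration and are not shipped with the plugin.

```python
import base64
import tempfile
from pathlib import Path


def encode_file(path):
    """Return the file's bytes as a single-line Base64 string (hypothetical helper)."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


# Demonstration with a throwaway file holding the sample text used on this page.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "mydoc.txt"
    sample.write_bytes(b'"God Save the Queen" (alternatively "God Save the King"')
    encoded = encode_file(sample)

print(encoded)
```

The printed string can then be passed to `StandaloneRunner` in place of `BASE64Text`, or used as the value of an `attachment` field when indexing.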

It produces something like:

[source,text]
--------------------------
## Extracted text
--------------------- BEGIN -----------------------
This is the extracted text
---------------------- END ------------------------
## Metadata
- author: null
- content_length: null
- content_type: application/pdf
- date: null
- keywords: null
- language: null
- name: null
- title: null
--------------------------
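
If the runner output needs to be checked from a script, it could be split into the extracted text and a metadata dictionary. The Python sketch below assumes the output looks exactly like the sample above (the `BEGIN`/`END` marker lines and `- key: value` metadata lines); the real output may differ, and `parse_runner_output` is a hypothetical helper, not part of the plugin.

```python
import re

# Sample output copied from the example above.
SAMPLE_OUTPUT = """\
## Extracted text
--------------------- BEGIN -----------------------
This is the extracted text
---------------------- END ------------------------
## Metadata
- author: null
- content_length: null
- content_type: application/pdf
- date: null
- keywords: null
- language: null
- name: null
- title: null
"""


def parse_runner_output(output):
    """Split runner output into (extracted_text, metadata) under the assumed layout."""
    text_lines = []
    metadata = {}
    in_text = False
    for line in output.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"-+ BEGIN -+", stripped):
            in_text = True          # extracted text starts on the next line
        elif re.fullmatch(r"-+ END -+", stripped):
            in_text = False         # extracted text is finished
        elif in_text:
            text_lines.append(line)
        elif stripped.startswith("- ") and ": " in stripped:
            key, value = stripped[2:].split(": ", 1)
            # The sample prints missing values as the literal string "null".
            metadata[key] = None if value == "null" else value
    return "\n".join(text_lines), metadata


text, metadata = parse_runner_output(SAMPLE_OUTPUT)
print(text)
print(metadata["content_type"])
```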