2013-08-28 19:24:34 -04:00
|
|
|
[[indices-analyze]]
|
|
|
|
== Analyze
|
|
|
|
|
|
|
|
Performs the analysis process on a text and return the tokens breakdown
|
|
|
|
of the text.
|
|
|
|
|
|
|
|
Can be used without specifying an index against one of the many built in
|
|
|
|
analyzers:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
curl -XGET 'localhost:9200/_analyze?analyzer=standard' -d 'this is a test'
|
|
|
|
--------------------------------------------------
|
|
|
|
|
2014-02-17 23:25:12 -05:00
|
|
|
Or by building a custom transient analyzer out of tokenizers,
|
|
|
|
token filters and char filters. Token filters can use the shorter 'filters'
|
|
|
|
parameter name:
|
2013-08-28 19:24:34 -04:00
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filters=lowercase' -d 'this is a test'
|
2014-02-17 23:25:12 -05:00
|
|
|
|
|
|
|
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&token_filters=lowercase&char_filters=html_strip' -d 'this is a <b>test</b>'
|
|
|
|
|
2013-08-28 19:24:34 -04:00
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
It can also run against a specific index:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
curl -XGET 'localhost:9200/test/_analyze?text=this+is+a+test'
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
The above will run an analysis on the "this is a test" text, using the
|
|
|
|
default index analyzer associated with the `test` index. An `analyzer`
|
|
|
|
can also be provided to use a different analyzer:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
curl -XGET 'localhost:9200/test/_analyze?analyzer=whitespace' -d 'this is a test'
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
Also, the analyzer can be derived based on a field mapping, for example:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
curl -XGET 'localhost:9200/test/_analyze?field=obj1.field1' -d 'this is a test'
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
Will cause the analysis to happen based on the analyzer configure in the
|
|
|
|
mapping for `obj1.field1` (and if not, the default index analyzer).
|
|
|
|
|
|
|
|
Also, the text can be provided as part of the request body, and not as a
|
|
|
|
parameter.
|