commit b59ecde041 (parent b21e417181)
@@ -38,7 +38,7 @@ and adds the user-specified base_cost to the result:
Note that the values are extracted from the `params` map. In context, the aggregation looks like this:
[source,js]
[source,console]
--------------------------------------------------
GET /seats/_search
{

@@ -79,8 +79,8 @@ GET /seats/_search
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:seats]
<1> The `buckets_path` points to two aggregations (`min_cost`, `max_cost`) and adds `min`/`max` variables
to the `params` map
<2> The user-specified `base_cost` is also added to the script's `params` map

@@ -39,7 +39,7 @@ params.max + params.base_cost > 10
Note that the values are extracted from the `params` map. The script is in the form of an expression
that returns `true` or `false`. In context, the aggregation looks like this:
[source,js]
[source,console]
--------------------------------------------------
GET /seats/_search
{

@@ -74,8 +74,8 @@ GET /seats/_search
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:seats]
<1> The `buckets_path` points to the max aggregations (`max_cost`) and adds `max` variables
to the `params` map
<2> The user-specified `base_cost` is also added to the `params` map

@@ -41,7 +41,7 @@ the request URL.
. Create {ref}/mapping.html[mappings] for the sample data:
+
[source,js]
[source,console]
----
PUT /seats
{

@@ -62,7 +62,6 @@ PUT /seats
}
----
+
// CONSOLE
. Run the <<painless-ingest-processor-context, ingest processor context>>
example. This sets up a script ingest processor used on each document as the

@@ -59,7 +59,7 @@ params['_source']['actors'].length; <1>
Submit the following request:
[source,js]
[source,console]
----
GET seats/_search
{

@@ -80,5 +80,4 @@ GET seats/_search
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]

@@ -41,7 +41,7 @@ Defining cost as a script parameter enables the cost to be configured
in the script query request. For example, the following request finds
all available theatre seats for evening performances that are under $18.
[source,js]
[source,console]
----
GET seats/_search
{

@@ -61,5 +61,4 @@ GET seats/_search
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]

@@ -178,7 +178,7 @@ ctx.datetime = dt.getLong(ChronoField.INSTANT_SECONDS)*1000L; <15>
Submit the following request:
[source,js]
[source,console]
----
PUT /_ingest/pipeline/seats
{

@@ -192,4 +192,3 @@ PUT /_ingest/pipeline/seats
]
}
----
// CONSOLE

@@ -52,7 +52,7 @@ Math.min(params['num_terms'], params['min_actors_to_see'])
The following request finds seats to performances with at least
two of the three specified actors.
[source,js]
[source,console]
----
GET seats/_search
{

@@ -71,6 +71,5 @@ GET seats/_search
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]

@@ -36,7 +36,7 @@ To run this example, first follow the steps in
The following query finds all unsold seats, with lower 'row' values
scored higher.
[source,js]
[source,console]
--------------------------------------------------
GET /seats/_search
{

@@ -54,5 +54,4 @@ GET /seats/_search
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:seats]

@@ -34,7 +34,7 @@ To run this example, first follow the steps in
To sort results by the length of the `theatre` field, submit the following query:
[source,js]
[source,console]
----
GET /_search
{

@@ -57,5 +57,4 @@ GET /_search
}
----
// CONSOLE
// TEST[setup:seats]

@@ -61,7 +61,7 @@ To run this example, first follow the steps in
The following query finds all seats in a specific section that have not been
sold and lowers the price by 2:
[source,js]
[source,console]
--------------------------------------------------
POST /seats/_update_by_query
{

@@ -91,5 +91,4 @@ POST /seats/_update_by_query
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:seats]

@@ -62,7 +62,7 @@ To run this example, first follow the steps in
The following query updates a document to be sold, and sets the cost
to the actual price paid after discounts:
[source,js]
[source,console]
--------------------------------------------------
POST /seats/_update/3
{

@@ -75,5 +75,4 @@ POST /seats/_update/3
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:seats]
@@ -18,7 +18,7 @@ The standard <<painless-api-reference, Painless API>> is available.
*Example*
[source,js]
[source,console]
----
POST _watcher/watch/_execute
{

@@ -65,7 +65,6 @@ POST _watcher/watch/_execute
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]
<1> The Java Stream API is used in the condition. This API allows manipulation of

@@ -78,7 +77,7 @@ on the value of the seats sold for the plays in the data set. The script aggrega
the total sold seats for each play and returns true if there is at least one play
that has sold over $50,000.
[source,js]
[source,console]
----
POST _watcher/watch/_execute
{

@@ -123,7 +122,6 @@ POST _watcher/watch/_execute
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]
This example uses a nearly identical condition as the previous example. The

@@ -1,4 +1,4 @@
[source,js]
[source,console]
----
POST _watcher/watch/_execute
{

@@ -99,12 +99,11 @@ POST _watcher/watch/_execute
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]
The following example shows the use of metadata and transforming dates into a readable format.
[source,js]
[source,console]
----
POST _watcher/watch/_execute
{

@@ -157,5 +156,4 @@ POST _watcher/watch/_execute
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]

@@ -18,7 +18,7 @@ The standard <<painless-api-reference, Painless API>> is available.
*Example*
[source,js]
[source,console]
----
POST _watcher/watch/_execute
{

@@ -75,7 +75,6 @@ POST _watcher/watch/_execute
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]
<1> The Java Stream API is used in the transform. This API allows manipulation of

@@ -88,7 +87,7 @@ the elements of the list in a pipeline.
The following action transform changes each value in the mod_log action into a `String`.
This transform does not change the values in the unmod_log action.
[source,js]
[source,console]
----
POST _watcher/watch/_execute
{

@@ -142,7 +141,6 @@ POST _watcher/watch/_execute
}
}
----
// CONSOLE
// TEST[skip: requires setup from other pages]
This example uses the streaming API in a very similar manner. The differences below are
@@ -728,7 +728,7 @@ examples into an Elasticsearch cluster:
. Create {ref}/mapping.html[mappings] for the sample data.
+
[source,js]
[source,console]
----
PUT /messages
{

@@ -748,11 +748,9 @@ PUT /messages
}
----
+
// CONSOLE
+
. Load the sample data.
+
[source,js]
[source,console]
----
POST /_bulk
{ "index" : { "_index" : "messages", "_id" : "1" } }

@@ -776,8 +774,6 @@ POST /_bulk
{ "index" : { "_index" : "messages", "_id" : "10" } }
{ "priority": 2, "datetime": "2019-07-23T23:39:54Z", "message": "m10" }
----
+
// CONSOLE
// TEST[continued]
===== Day-of-the-Week Bucket Aggregation Example

@@ -788,7 +784,7 @@ as part of the
<<painless-bucket-script-agg-context, bucket script aggregation context>> to
display the number of messages from each day-of-the-week.
[source,js]
[source,console]
----
GET /messages/_search?pretty=true
{

@@ -801,7 +797,6 @@ GET /messages/_search?pretty=true
}
}
----
// CONSOLE
// TEST[continued]
===== Morning/Evening Bucket Aggregation Example

@@ -812,7 +807,7 @@ as part of the
<<painless-bucket-script-agg-context, bucket script aggregation context>> to
display the number of messages received in the morning versus the evening.
[source,js]
[source,console]
----
GET /messages/_search?pretty=true
{

@@ -825,7 +820,6 @@ GET /messages/_search?pretty=true
}
}
----
// CONSOLE
// TEST[continued]
===== Age of a Message Script Field Example

@@ -835,7 +829,7 @@ The following example uses a
<<painless-field-context, field context>> to display the elapsed time between
"now" and when a message was received.
[source,js]
[source,console]
----
GET /_search?pretty=true
{

@@ -854,7 +848,6 @@ GET /_search?pretty=true
}
}
----
// CONSOLE
// TEST[continued]
The following shows the script broken into multiple lines:
@@ -16,7 +16,7 @@ utility method, `Debug.explain` which throws the exception for you. For
example, you can use {ref}/search-explain.html[`_explain`] to explore the
context available to a {ref}/query-dsl-script-query.html[script query].
[source,js]
[source,console]
---------------------------------------------------------
PUT /hockey/_doc/1?refresh
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1]}

@@ -30,7 +30,6 @@ POST /hockey/_explain/1
}
}
---------------------------------------------------------
// CONSOLE
// TEST[s/_explain\/1/_explain\/1?error_trace=false/ catch:/painless_explain_error/]
// The test system sends error_trace=true by default for easier debugging so
// we have to override it to get a normal shaped response

@@ -56,14 +55,13 @@ Which shows that the class of `doc.first` is
You can use the same trick to see that `_source` is a `LinkedHashMap`
in the `_update` API:
[source,js]
[source,console]
---------------------------------------------------------
POST /hockey/_update/1
{
"script": "Debug.explain(ctx._source)"
}
---------------------------------------------------------
// CONSOLE
// TEST[continued s/_update\/1/_update\/1?error_trace=false/ catch:/painless_explain_error/]
The response looks like:
@@ -3,7 +3,7 @@
To illustrate how Painless works, let's load some hockey stats into an Elasticsearch index:
[source,js]
[source,console]
----------------------------------------------------------------
PUT hockey/_bulk?refresh
{"index":{"_id":1}}

@@ -29,7 +29,6 @@ PUT hockey/_bulk?refresh
{"index":{"_id":11}}
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}
----------------------------------------------------------------
// CONSOLE
// TESTSETUP
[float]

@@ -39,7 +38,7 @@ Document values can be accessed from a `Map` named `doc`.
For example, the following script calculates a player's total goals. This example uses a strongly typed `int` and a `for` loop.
[source,js]
[source,console]
----------------------------------------------------------------
GET hockey/_search
{

@@ -61,11 +60,10 @@ GET hockey/_search
}
}
----------------------------------------------------------------
// CONSOLE
Alternatively, you could do the same thing using a script field instead of a function score:
[source,js]
[source,console]
----------------------------------------------------------------
GET hockey/_search
{

@@ -88,12 +86,11 @@ GET hockey/_search
}
}
----------------------------------------------------------------
// CONSOLE
The following example uses a Painless script to sort the players by their combined first and last names. The names are accessed using
`doc['first'].value` and `doc['last'].value`.
[source,js]
[source,console]
----------------------------------------------------------------
GET hockey/_search
{

@@ -112,7 +109,6 @@ GET hockey/_search
}
}
----------------------------------------------------------------
// CONSOLE
[float]

@@ -132,7 +128,7 @@ You can also easily update fields. You access the original source for a field as
First, let's look at the source data for a player by submitting the following request:
[source,js]
[source,console]
----------------------------------------------------------------
GET hockey/_search
{

@@ -147,11 +143,10 @@ GET hockey/_search
}
}
----------------------------------------------------------------
// CONSOLE
To change player 1's last name to `hockey`, simply set `ctx._source.last` to the new value:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update/1
{

@@ -164,12 +159,11 @@ POST hockey/_update/1
}
}
----------------------------------------------------------------
// CONSOLE
You can also add fields to a document. For example, this script adds a new field that contains
the player's nickname, _hockey_.
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update/1
{

@@ -186,7 +180,6 @@ POST hockey/_update/1
}
}
----------------------------------------------------------------
// CONSOLE
[float]
[[modules-scripting-painless-dates]]
@@ -199,7 +192,7 @@ in a script, leave out the `get` prefix and continue with lowercasing the
rest of the method name. For example, the following returns every hockey
player's birth year:
[source,js]
[source,console]
----------------------------------------------------------------
GET hockey/_search
{

@@ -212,7 +205,6 @@ GET hockey/_search
}
}
----------------------------------------------------------------
// CONSOLE
[float]
[[modules-scripting-painless-regex]]

@@ -241,7 +233,7 @@ text matches, `false` otherwise.
Using the find operator (`=~`) you can update all hockey players with "b" in
their last name:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update_by_query
{

@@ -257,12 +249,11 @@ POST hockey/_update_by_query
}
}
----------------------------------------------------------------
// CONSOLE
Using the match operator (`==~`) you can update all the hockey players whose
names start with a consonant and end with a vowel:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update_by_query
{

@@ -278,12 +269,11 @@ POST hockey/_update_by_query
}
}
----------------------------------------------------------------
// CONSOLE
You can use the `Pattern.matcher` directly to get a `Matcher` instance and
remove all of the vowels in all of their last names:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update_by_query
{

@@ -293,13 +283,12 @@ POST hockey/_update_by_query
}
}
----------------------------------------------------------------
// CONSOLE
`Matcher.replaceAll` is just a call to Java's `Matcher`'s
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#replaceAll-java.lang.String-[replaceAll]
method so it supports `$1` and `\1` for replacements:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update_by_query
{

@@ -309,7 +298,6 @@ POST hockey/_update_by_query
}
}
----------------------------------------------------------------
// CONSOLE
If you need more control over replacements you can call `replaceAll` on a
`CharSequence` with a `Function<Matcher, String>` that builds the replacement.

@@ -321,7 +309,7 @@ replacement is rude and will likely break the replacement process.
This will make all of the vowels in the hockey player's last names upper case:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update_by_query
{

@@ -334,12 +322,11 @@ POST hockey/_update_by_query
}
}
----------------------------------------------------------------
// CONSOLE
Or you can use the `CharSequence.replaceFirst` to make the first vowel in their
last names upper case:
[source,js]
[source,console]
----------------------------------------------------------------
POST hockey/_update_by_query
{

@@ -352,8 +339,6 @@ POST hockey/_update_by_query
}
}
----------------------------------------------------------------
// CONSOLE
Note: all of the `_update_by_query` examples above could really do with a
`query` to limit the data that they pull back. While you *could* use a
@@ -62,7 +62,7 @@ http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html[UnicodeSet]
Here are two examples, the default usage and a customised character filter:
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -95,7 +95,6 @@ PUT icu_sample
}
}
--------------------------------------------------
// CONSOLE
<1> Uses the default `nfkc_cf` normalization.
<2> Uses the customized `nfd_normalizer` token filter, which is set to use `nfc` normalization with decomposition.

@@ -110,7 +109,7 @@ but adds better support for some Asian languages by using a dictionary-based
approach to identify words in Thai, Lao, Chinese, Japanese, and Korean, and
using custom rules to break Myanmar and Khmer text into syllables.
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -127,7 +126,6 @@ PUT icu_sample
}
}
--------------------------------------------------
// CONSOLE
===== Rules customization

@@ -151,7 +149,7 @@ As a demonstration of how the rule files can be used, save the following user fi
Then create an analyzer to use this rule file as follows:
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -181,7 +179,6 @@ GET icu_sample/_analyze
"text": "Elasticsearch. Wow!"
}
--------------------------------------------------
// CONSOLE
The above `analyze` request returns the following:

@@ -219,7 +216,7 @@ You should probably prefer the <<analysis-icu-normalization-charfilter,Normaliza
Here are two examples, the default usage and a customised token filter:
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -251,7 +248,6 @@ PUT icu_sample
}
}
--------------------------------------------------
// CONSOLE
<1> Uses the default `nfkc_cf` normalization.
<2> Uses the customized `nfc_normalizer` token filter, which is set to use `nfc` normalization.

@@ -265,7 +261,7 @@ Case folding of Unicode characters based on `UTR#30`, like the
on steroids. It registers itself as the `icu_folding` token filter and is
available to all indices:
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -285,7 +281,6 @@ PUT icu_sample
}
}
--------------------------------------------------
// CONSOLE
The ICU folding token filter already does Unicode normalization, so there is
no need to use Normalize character or token filter as well.

@@ -299,7 +294,7 @@ to note that both upper and lowercase forms should be specified, and that
these filtered character are not lowercased which is why we add the
`lowercase` filter as well:
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -326,7 +321,6 @@ PUT icu_sample
}
}
--------------------------------------------------
// CONSOLE
[[analysis-icu-collation]]

@@ -352,7 +346,7 @@ which is a best-effort attempt at language-neutral sorting.
Below is an example of how to set up a field for sorting German names in
``phonebook'' order:
[source,js]
[source,console]
--------------------------
PUT my_index
{

@@ -385,7 +379,6 @@ GET _search <3>
}
--------------------------
// CONSOLE
<1> The `name` field uses the `standard` analyzer, and so support full text queries.
<2> The `name.sort` field is an `icu_collation_keyword` field that will preserve the name as

@@ -507,7 +500,7 @@ rulesets are not yet supported.
For example:
[source,js]
[source,console]
--------------------------------------------------
PUT icu_sample
{

@@ -552,7 +545,6 @@ GET icu_sample/_analyze
}
--------------------------------------------------
// CONSOLE
<1> This transforms transliterates characters to Latin, and separates accents
from their base characters, removes the accents, and then puts the
@@ -103,7 +103,7 @@ dictionary to `$ES_HOME/config/userdict_ja.txt`:
You can also inline the rules directly in the tokenizer definition using
the `user_dictionary_rules` option:
[source,js]
[source,console]
--------------------------------------------------
PUT nori_sample
{

@@ -128,7 +128,6 @@ PUT nori_sample
}
}
--------------------------------------------------
// CONSOLE
--
`nbest_cost`/`nbest_examples`::

@@ -155,7 +154,7 @@ If both parameters are used, the largest number of both is applied.
Then create an analyzer as follows:
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -187,7 +186,6 @@ GET kuromoji_sample/_analyze
"text": "東京スカイツリー"
}
--------------------------------------------------
// CONSOLE
The above `analyze` request returns the following:

@@ -217,7 +215,7 @@ The above `analyze` request returns the following:
The `kuromoji_baseform` token filter replaces terms with their
BaseFormAttribute. This acts as a lemmatizer for verbs and adjectives. Example:
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -243,7 +241,6 @@ GET kuromoji_sample/_analyze
"text": "飲み"
}
--------------------------------------------------
// CONSOLE
which responds with:

@@ -274,7 +271,7 @@ part-of-speech tags. It accepts the following setting:
For example:
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -309,7 +306,6 @@ GET kuromoji_sample/_analyze
"text": "寿司がおいしいね"
}
--------------------------------------------------
// CONSOLE
Which responds with:

@@ -348,7 +344,7 @@ to `true`. The default when defining a custom `kuromoji_readingform`, however,
is `false`. The only reason to use the custom form is if you need the
katakana reading form:
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -392,7 +388,6 @@ GET kuromoji_sample/_analyze
"text": "寿司" <2>
}
--------------------------------------------------
// CONSOLE
<1> Returns `スシ`.
<2> Returns `sushi`.

@@ -412,7 +407,7 @@ This token filter accepts the following setting:
is `4`).
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -450,7 +445,6 @@ GET kuromoji_sample/_analyze
"text": "サーバー" <2>
}
--------------------------------------------------
// CONSOLE
<1> Returns `コピー`.
<2> Return `サーバ`.

@@ -465,7 +459,7 @@ the predefined `_japanese_` stopwords list. If you want to use a different
predefined list, then use the
{ref}/analysis-stop-tokenfilter.html[`stop` token filter] instead.
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -500,7 +494,6 @@ GET kuromoji_sample/_analyze
"text": "ストップは消える"
}
--------------------------------------------------
// CONSOLE
The above request returns:

@@ -524,7 +517,7 @@ The above request returns:
The `kuromoji_number` token filter normalizes Japanese numbers (kansūji)
to regular Arabic decimal numbers in half-width characters. For example:
[source,js]
[source,console]
--------------------------------------------------
PUT kuromoji_sample
{

@@ -550,7 +543,6 @@ GET kuromoji_sample/_analyze
"text": "一〇〇〇"
}
--------------------------------------------------
// CONSOLE
Which results in:
@@ -88,7 +88,7 @@ C샤프
Then create an analyzer as follows:
[source,js]
[source,console]
--------------------------------------------------
PUT nori_sample
{

@@ -119,7 +119,6 @@ GET nori_sample/_analyze
"text": "세종시" <1>
}
--------------------------------------------------
// CONSOLE
<1> Sejong city

@@ -161,7 +160,7 @@ The above `analyze` request returns the following:
You can also inline the rules directly in the tokenizer definition using
the `user_dictionary_rules` option:
[source,js]
[source,console]
--------------------------------------------------
PUT nori_sample
{

@@ -186,14 +185,13 @@ PUT nori_sample
}
}
--------------------------------------------------
// CONSOLE
--
The `nori_tokenizer` sets a number of additional attributes per token that are used by token filters
to modify the stream.
You can view all these additional attributes with the following request:
[source,js]
[source,console]
--------------------------------------------------
GET _analyze
{

@@ -203,7 +201,6 @@ GET _analyze
"explain": true
}
--------------------------------------------------
// CONSOLE
<1> A tree with deep roots

@@ -329,7 +326,7 @@ and defaults to:
For example:
[source,js]
[source,console]
--------------------------------------------------
PUT nori_sample
{

@@ -363,7 +360,6 @@ GET nori_sample/_analyze
"text": "여섯 용이" <2>
}
--------------------------------------------------
// CONSOLE
<1> Korean numerals should be removed (`NR`)
<2> Six dragons

@@ -395,7 +391,7 @@ Which responds with:
The `nori_readingform` token filter rewrites tokens written in Hanja to their Hangul form.
[source,js]
[source,console]
--------------------------------------------------
PUT nori_sample
{

@@ -419,7 +415,6 @@ GET nori_sample/_analyze
"text": "鄕歌" <1>
}
--------------------------------------------------
// CONSOLE
<1> A token written in Hanja: Hyangga
@@ -27,7 +27,7 @@ The `phonetic` token filter takes the following settings:
token. Accepts `true` (default) and `false`. Not supported by
`beider_morse` encoding.
[source,js]
[source,console]
--------------------------------------------------
PUT phonetic_sample
{

@@ -61,7 +61,6 @@ GET phonetic_sample/_analyze
"text": "Joe Bloggs" <1>
}
--------------------------------------------------
// CONSOLE
<1> Returns: `J`, `joe`, `BLKS`, `bloggs`

@@ -27,7 +27,7 @@ NOTE: The `smartcn_word` token filter and `smartcn_sentence` have been deprecate
The `smartcn` analyzer could be reimplemented as a `custom` analyzer that can
then be extended and configured as follows:
[source,js]
[source,console]
----------------------------------------------------
PUT smartcn_example
{

@@ -46,7 +46,6 @@ PUT smartcn_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: smartcn_example, first: smartcn, second: rebuilt_smartcn}\nendyaml\n/]
[[analysis-smartcn_stop]]

@@ -58,7 +57,7 @@ This filter only supports the predefined `_smartcn_` stopwords list.
If you want to use a different predefined list, then use the
{ref}/analysis-stop-tokenfilter.html[`stop` token filter] instead.
[source,js]
[source,console]
--------------------------------------------------
PUT smartcn_example
{

@@ -95,7 +94,6 @@ GET smartcn_example/_analyze
"text": "哈喽，我们是 Elastic 我们是 Elastic Stack（Elasticsearch、Kibana、Beats 和 Logstash）的开发公司。从股票行情到 Twitter 消息流，从 Apache 日志到 WordPress 博文，我们可以帮助人们体验搜索的强大力量，帮助他们以截然不同的方式探索和分析数据"
}
--------------------------------------------------
// CONSOLE
The above request returns:
@@ -22,7 +22,7 @@ which are not configurable.
The `polish` analyzer could be reimplemented as a `custom` analyzer that can
then be extended and configured differently as follows:
[source,js]
[source,console]
----------------------------------------------------
PUT /stempel_example
{

@@ -42,7 +42,6 @@ PUT /stempel_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: stempel_example, first: polish, second: rebuilt_stempel}\nendyaml\n/]
[[analysis-polish-stop]]

@@ -54,7 +53,7 @@ the predefined `_polish_` stopwords list. If you want to use a different
predefined list, then use the
{ref}/analysis-stop-tokenfilter.html[`stop` token filter] instead.
[source,js]
[source,console]
--------------------------------------------------
PUT /polish_stop_example
{

@@ -90,7 +89,6 @@ GET polish_stop_example/_analyze
"text": "Gdzie kucharek sześć, tam nie ma co jeść."
}
--------------------------------------------------
// CONSOLE
The above request returns:
@@ -355,11 +355,10 @@ sudo dpkg -i elasticsearch-{version}.deb
Check that Elasticsearch is running:
[source,js]
[source,console]
----
GET /
----
// CONSOLE
This command should give you a JSON result:
@@ -32,7 +32,7 @@ include::install_remove.asciidoc[]
For example, this:
[source,js]
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/attachment
{

@@ -51,7 +51,6 @@ PUT my_index/_doc/my_id?pipeline=attachment
}
GET my_index/_doc/my_id
--------------------------------------------------
// CONSOLE
Returns this:

@@ -81,7 +80,7 @@ Returns this:
To specify only some fields to be extracted:
[source,js]
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/attachment
{

@@ -96,7 +95,6 @@ PUT _ingest/pipeline/attachment
]
}
--------------------------------------------------
// CONSOLE
NOTE: Extracting contents from binary data is a resource intensive operation and
consumes a lot of resources. It is highly recommended to run pipelines

@@ -115,7 +113,7 @@ setting.
For example:
[source,js]
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/attachment
{

@@ -136,7 +134,6 @@ PUT my_index/_doc/my_id?pipeline=attachment
}
GET my_index/_doc/my_id
--------------------------------------------------
// CONSOLE
Returns this:

@@ -164,7 +161,7 @@ Returns this:
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
[source,js]
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/attachment
{

@@ -186,7 +183,6 @@ PUT my_index/_doc/my_id_2?pipeline=attachment
}
GET my_index/_doc/my_id_2
--------------------------------------------------
// CONSOLE
Returns this:

@@ -247,7 +243,7 @@ of the attachments field and insert
the properties into the document so the following `foreach`
processor is used:
[source,js]
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/attachment
{

@@ -281,7 +277,6 @@ PUT my_index/_doc/my_id?pipeline=attachment
}
GET my_index/_doc/my_id
--------------------------------------------------
// CONSOLE
Returns this:
@@ -22,7 +22,7 @@ The `annotated-text` tokenizes text content as per the more common `text` field
"limitations" below) but also injects any marked-up annotation tokens directly into
the search index:
[source,js]
[source,console]
--------------------------
PUT my_index
{

@@ -35,7 +35,6 @@ PUT my_index
}
}
--------------------------
// CONSOLE
Such a mapping would allow marked-up text eg wikipedia articles to be indexed as both text
and structured tokens. The annotations use a markdown-like syntax using URL encoding of

@@ -110,7 +109,7 @@ We can now perform searches for annotations using regular `term` queries that do
the provided search values. Annotations are a more precise way of matching as can be seen
in this example where a search for `Beck` will not match `Jeff Beck` :
[source,js]
[source,console]
--------------------------
# Example documents
PUT my_index/_doc/1

@@ -133,7 +132,6 @@ GET my_index/_search
}
}
--------------------------
// CONSOLE
<1> As well as tokenising the plain text into single words e.g. `beck`, here we
inject the single token value `Beck` at the same position as `beck` in the token stream.

@@ -164,7 +162,7 @@ entity IDs woven into text.
These IDs can be embedded as annotations in an annotated_text field but it often makes
sense to include them in dedicated structured fields to support discovery via aggregations:
[source,js]
[source,console]
--------------------------
PUT my_index
{

@@ -185,11 +183,10 @@ PUT my_index
}
}
--------------------------
// CONSOLE
Applications would then typically provide content and discover it as follows:
[source,js]
[source,console]
--------------------------
# Example documents
PUT my_index/_doc/1

@@ -215,7 +212,6 @@ GET my_index/_search
}
}
--------------------------
// CONSOLE
<1> Note the `my_twitter_handles` contains a list of the annotation values
also used in the unstructured text. (Note the annotated_text syntax requires escaping).

@@ -265,7 +261,7 @@ they don't name clash with text tokens e.g.
The `annotated-text` plugin includes a custom highlighter designed to mark up search hits
in a way which is respectful of the original markup:
[source,js]
[source,console]
--------------------------
# Example documents
PUT my_index/_doc/1

@@ -290,7 +286,7 @@ GET my_index/_search
}
}
--------------------------
// CONSOLE
<1> The `annotated` highlighter type is designed for use with annotated_text fields
The annotated highlighter is based on the `unified` highlighter and supports the same
@@ -14,7 +14,7 @@ include::install_remove.asciidoc[]
The `murmur3` is typically used within a multi-field, so that both the original
value and its hash are stored in the index:
[source,js]
[source,console]
--------------------------
PUT my_index
{

@@ -32,13 +32,12 @@ PUT my_index
}
}
--------------------------
// CONSOLE
Such a mapping would allow to refer to `my_field.hash` in order to get hashes
of the values of the `my_field` field. This is only useful in order to run
`cardinality` aggregations:
[source,js]
[source,console]
--------------------------
# Example documents
PUT my_index/_doc/1

@@ -62,7 +61,6 @@ GET my_index/_search
}
}
--------------------------
// CONSOLE
<1> Counting unique values on the `my_field.hash` field
@@ -13,7 +13,7 @@ include::install_remove.asciidoc[]
In order to enable the `_size` field, set the mapping as follows:
[source,js]
[source,console]
--------------------------
PUT my_index
{

@@ -24,12 +24,11 @@ PUT my_index
}
}
--------------------------
// CONSOLE
The value of the `_size` field is accessible in queries, aggregations, scripts,
and when sorting:
[source,js]
[source,console]
--------------------------
# Example documents
PUT my_index/_doc/1

@@ -73,7 +72,6 @@ GET my_index/_search
}
}
--------------------------
// CONSOLE
// TEST[continued]
<1> Querying on the `_size` field
@@ -183,7 +183,7 @@ include::repository-shared-settings.asciidoc[]
Some examples, using scripts:
[source,js]
[source,console]
----
# The simplest one
PUT _snapshot/my_backup1

@@ -221,7 +221,6 @@ PUT _snapshot/my_backup4
}
}
----
// CONSOLE
// TEST[skip:we don't have azure setup while testing this]
Example using Java:
@@ -101,7 +101,7 @@ For example, if you added a `gcs.client.my_alternate_client.credentials_file`
setting in the keystore, you can configure a repository to use those credentials
like this:
[source,js]
[source,console]
----
PUT _snapshot/my_gcs_repository
{

@@ -112,7 +112,6 @@ PUT _snapshot/my_gcs_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]
The `credentials_file` settings are {ref}/secure-settings.html#reloadable-secure-settings[reloadable].

@@ -133,7 +132,7 @@ called `default`, but can be customized with the repository setting `client`.
For example:
[source,js]
[source,console]
----
PUT _snapshot/my_gcs_repository
{

@@ -144,7 +143,6 @@ PUT _snapshot/my_gcs_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have gcs setup while testing this]
Some settings are sensitive and must be stored in the

@@ -199,7 +197,7 @@ is stored in Google Cloud Storage.
These can be specified when creating the repository. For example:
[source,js]
[source,console]
----
PUT _snapshot/my_gcs_repository
{

@@ -210,7 +208,6 @@ PUT _snapshot/my_gcs_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have gcs set up while testing this]
The following settings are supported:
@@ -25,7 +25,7 @@ plugin folder and point `HADOOP_HOME` variable to it; this should minimize the a
Once installed, define the configuration for the `hdfs` repository through the
{ref}/modules-snapshots.html[REST API]:
[source,js]
[source,console]
----
PUT _snapshot/my_hdfs_repository
{

@@ -37,7 +37,6 @@ PUT _snapshot/my_hdfs_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have hdfs set up while testing this]
The following settings are supported:

@@ -144,7 +143,7 @@ Once your keytab files are in place and your cluster is started, creating a secu
add the name of the principal that you will be authenticating as in the repository settings under the
`security.principal` option:
[source,js]
[source,console]
----
PUT _snapshot/my_hdfs_repository
{

@@ -156,13 +155,12 @@ PUT _snapshot/my_hdfs_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have hdfs set up while testing this]
If you are using different service principals for each node, you can use the `_HOST` pattern in your principal
name. Elasticsearch will automatically replace the pattern with the hostname of the node at runtime:
[source,js]
[source,console]
----
PUT _snapshot/my_hdfs_repository
{

@@ -174,7 +172,6 @@ PUT _snapshot/my_hdfs_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have hdfs set up while testing this]
[[repository-hdfs-security-authorization]]
@@ -21,7 +21,7 @@ http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
IAM Role] credentials for authentication. The only mandatory setting is the
bucket name:
[source,js]
[source,console]
----
PUT _snapshot/my_s3_repository
{

@@ -31,7 +31,6 @@ PUT _snapshot/my_s3_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have s3 setup while testing this]

@@ -43,7 +42,7 @@ The settings have the form `s3.client.CLIENT_NAME.SETTING_NAME`. By default,
`s3` repositories use a client named `default`, but this can be modified using
the <<repository-s3-repository,repository setting>> `client`. For example:
[source,js]
[source,console]
----
PUT _snapshot/my_s3_repository
{

@@ -54,7 +53,6 @@ PUT _snapshot/my_s3_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have S3 setup while testing this]
Most client settings can be added to the `elasticsearch.yml` configuration file

@@ -210,7 +208,7 @@ or supported.
The `s3` repository type supports a number of settings to customize how data is
stored in S3. These can be specified when creating the repository. For example:
[source,js]
[source,console]
----
PUT _snapshot/my_s3_repository
{

@@ -221,7 +219,6 @@ PUT _snapshot/my_s3_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have S3 set up while testing this]
The following settings are supported:

@@ -310,7 +307,7 @@ by the repository settings taking precedence over client settings.
For example:
[source,js]
[source,console]
----
PUT _snapshot/my_s3_repository
{

@@ -322,7 +319,6 @@ PUT _snapshot/my_s3_repository
}
}
----
// CONSOLE
// TEST[skip:we don't have s3 set up while testing this]
This sets up a repository that uses all client settings from the client
@@ -44,7 +44,7 @@ Note that setting will be applied for newly created indices.
It can also be set on a per-index basis at index creation time:
[source,js]
[source,console]
----
PUT my_index
{

@@ -53,4 +53,3 @@ PUT my_index
}
}
----
// CONSOLE
@@ -38,7 +38,7 @@ to the inverted index:
Each <<text,`text`>> field in a mapping can specify its own
<<analyzer,`analyzer`>>:
[source,js]
[source,console]
-------------------------
PUT my_index
{

@@ -52,7 +52,6 @@ PUT my_index
}
}
-------------------------
// CONSOLE
At index time, if no `analyzer` has been specified, it looks for an analyzer
in the index settings called `default`. Failing that, it defaults to using
@@ -6,7 +6,7 @@ of them, however, support configuration options to alter their behaviour. For
instance, the <<analysis-standard-analyzer,`standard` analyzer>> can be configured
to support a list of stop words:
[source,js]
[source,console]
--------------------------------
PUT my_index
{

@@ -49,7 +49,6 @@ POST my_index/_analyze
}
--------------------------------
// CONSOLE
<1> We define the `std_english` analyzer to be based on the `standard`
analyzer, but configured to remove the pre-defined list of English stopwords.
@@ -51,7 +51,7 @@ Token Filters::
* <<analysis-lowercase-tokenfilter,Lowercase Token Filter>>
* <<analysis-asciifolding-tokenfilter,ASCII-Folding Token Filter>>
[source,js]
[source,console]
--------------------------------
PUT my_index
{

@@ -80,7 +80,6 @@ POST my_index/_analyze
"text": "Is this <b>déjà vu</b>?"
}
--------------------------------
// CONSOLE
<1> Setting `type` to `custom` tells Elasticsearch that we are defining a custom analyzer.
Compare this to how <<configuring-analyzers,built-in analyzers can be configured>>:

@@ -154,7 +153,7 @@ Token Filters::
Here is an example:
[source,js]
[source,console]
--------------------------------------------------
PUT my_index
{

@@ -204,7 +203,6 @@ POST my_index/_analyze
"text": "I'm a :) person, and you?"
}
--------------------------------------------------
// CONSOLE
<1> Assigns the index a default custom analyzer, `my_custom_analyzer`. This
analyzer uses a custom tokenizer, character filter, and token filter that
@@ -12,7 +12,7 @@ configured, stop words will also be removed.
[float]
=== Example output
[source,js]
[source,console]
---------------------------
POST _analyze
{

@@ -20,7 +20,6 @@ POST _analyze
"text": "Yes yes, Gödel said this sentence is consistent and."
}
---------------------------
// CONSOLE
/////////////////////

@@ -83,7 +82,7 @@ about stop word configuration.
In this example, we configure the `fingerprint` analyzer to use the
pre-defined list of English stop words:
[source,js]
[source,console]
----------------------------
PUT my_index
{

@@ -105,7 +104,6 @@ POST my_index/_analyze
"text": "Yes yes, Gödel said this sentence is consistent and."
}
----------------------------
// CONSOLE
/////////////////////

@@ -154,7 +152,7 @@ it, usually by adding token filters. This would recreate the built-in
`fingerprint` analyzer and you can use it as a starting point for further
customization:
[source,js]
[source,console]
----------------------------------------------------
PUT /fingerprint_example
{

@@ -174,5 +172,4 @@ PUT /fingerprint_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: fingerprint_example, first: fingerprint, second: rebuilt_fingerprint}\nendyaml\n/]
@@ -7,7 +7,7 @@ string as a single token.
[float]
=== Example output
[source,js]
[source,console]
---------------------------
POST _analyze
{

@@ -15,7 +15,6 @@ POST _analyze
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE
/////////////////////

@@ -65,7 +64,7 @@ into tokens, but just in case you need it, this would recreate the
built-in `keyword` analyzer and you can use it as a starting point
for further customization:
[source,js]
[source,console]
----------------------------------------------------
PUT /keyword_example
{

@@ -82,6 +81,6 @@ PUT /keyword_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: keyword_example, first: keyword, second: rebuilt_keyword}\nendyaml\n/]
<1> You'd add any token filters here.
@ -76,7 +76,7 @@ the `keyword_marker` token filter from the custom analyzer configuration.
|
|||
|
||||
The `arabic` analyzer could be reimplemented as a `custom` analyzer as follows:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /arabic_example
|
||||
{
|
||||
|
@ -113,9 +113,9 @@ PUT /arabic_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/"arabic_keywords",//]
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: arabic_example, first: arabic, second: rebuilt_arabic}\nendyaml\n/]
|
||||
|
||||
<1> The default stopwords can be overridden with the `stopwords`
|
||||
or `stopwords_path` parameters.
|
||||
<2> This filter should be removed unless there are words which should
|
||||
|
@ -126,7 +126,7 @@ PUT /arabic_example
|
|||
|
||||
The `armenian` analyzer could be reimplemented as a `custom` analyzer as follows:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /armenian_example
|
||||
{
|
||||
|
@ -161,9 +161,9 @@ PUT /armenian_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/"armenian_keywords",//]
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: armenian_example, first: armenian, second: rebuilt_armenian}\nendyaml\n/]
|
||||
|
||||
<1> The default stopwords can be overridden with the `stopwords`
|
||||
or `stopwords_path` parameters.
|
||||
<2> This filter should be removed unless there are words which should
|
||||
|
@ -174,7 +174,7 @@ PUT /armenian_example
|
|||
|
||||
The `basque` analyzer could be reimplemented as a `custom` analyzer as follows:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /basque_example
|
||||
{
|
||||
|
@ -209,9 +209,9 @@ PUT /basque_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/"basque_keywords",//]
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: basque_example, first: basque, second: rebuilt_basque}\nendyaml\n/]
|
||||
|
||||
<1> The default stopwords can be overridden with the `stopwords`
|
||||
or `stopwords_path` parameters.
|
||||
<2> This filter should be removed unless there are words which should
|
||||
|
@ -222,7 +222,7 @@ PUT /basque_example
|
|||
|
||||
The `bengali` analyzer could be reimplemented as a `custom` analyzer as follows:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /bengali_example
|
||||
{
|
||||
|
@ -260,9 +260,9 @@ PUT /bengali_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/"bengali_keywords",//]
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: bengali_example, first: bengali, second: rebuilt_bengali}\nendyaml\n/]
|
||||
|
||||
<1> The default stopwords can be overridden with the `stopwords`
|
||||
or `stopwords_path` parameters.
|
||||
<2> This filter should be removed unless there are words which should
|
||||
|
@ -273,7 +273,7 @@ PUT /bengali_example

The `brazilian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /brazilian_example
{

@ -308,9 +308,9 @@ PUT /brazilian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"brazilian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: brazilian_example, first: brazilian, second: rebuilt_brazilian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -321,7 +321,7 @@ PUT /brazilian_example

The `bulgarian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /bulgarian_example
{

@ -356,9 +356,9 @@ PUT /bulgarian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"bulgarian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: bulgarian_example, first: bulgarian, second: rebuilt_bulgarian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -369,7 +369,7 @@ PUT /bulgarian_example

The `catalan` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /catalan_example
{

@ -410,9 +410,9 @@ PUT /catalan_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"catalan_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: catalan_example, first: catalan, second: rebuilt_catalan}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -426,7 +426,7 @@ for CJK text than the `cjk` analyzer. Experiment with your text and queries.

The `cjk` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /cjk_example
{

@ -459,9 +459,9 @@ PUT /cjk_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"cjk_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: cjk_example, first: cjk, second: rebuilt_cjk}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters. The default stop words are
*almost* the same as the `_english_` set, but not exactly

@ -472,7 +472,7 @@ PUT /cjk_example

The `czech` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /czech_example
{

@ -507,9 +507,9 @@ PUT /czech_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"czech_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: czech_example, first: czech, second: rebuilt_czech}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -520,7 +520,7 @@ PUT /czech_example

The `danish` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /danish_example
{

@ -555,9 +555,9 @@ PUT /danish_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"danish_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: danish_example, first: danish, second: rebuilt_danish}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -568,7 +568,7 @@ PUT /danish_example

The `dutch` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /dutch_example
{

@ -613,9 +613,9 @@ PUT /dutch_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"dutch_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: dutch_example, first: dutch, second: rebuilt_dutch}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -626,7 +626,7 @@ PUT /dutch_example

The `english` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /english_example
{

@ -666,9 +666,9 @@ PUT /english_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"english_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: english_example, first: english, second: rebuilt_english}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -679,7 +679,7 @@ PUT /english_example

The `finnish` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /finnish_example
{

@ -714,9 +714,9 @@ PUT /finnish_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"finnish_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: finnish_example, first: finnish, second: rebuilt_finnish}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -727,7 +727,7 @@ PUT /finnish_example

The `french` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /french_example
{

@ -772,9 +772,9 @@ PUT /french_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"french_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: french_example, first: french, second: rebuilt_french}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -785,7 +785,7 @@ PUT /french_example

The `galician` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /galician_example
{

@ -820,9 +820,9 @@ PUT /galician_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"galician_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: galician_example, first: galician, second: rebuilt_galician}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -833,7 +833,7 @@ PUT /galician_example

The `german` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /german_example
{

@ -869,9 +869,9 @@ PUT /german_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"german_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: german_example, first: german, second: rebuilt_german}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -882,7 +882,7 @@ PUT /german_example

The `greek` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /greek_example
{

@ -921,9 +921,9 @@ PUT /greek_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"greek_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: greek_example, first: greek, second: rebuilt_greek}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -934,7 +934,7 @@ PUT /greek_example

The `hindi` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /hindi_example
{

@ -972,9 +972,9 @@ PUT /hindi_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"hindi_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: hindi_example, first: hindi, second: rebuilt_hindi}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -985,7 +985,7 @@ PUT /hindi_example

The `hungarian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /hungarian_example
{

@ -1020,9 +1020,9 @@ PUT /hungarian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"hungarian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: hungarian_example, first: hungarian, second: rebuilt_hungarian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1034,7 +1034,7 @@ PUT /hungarian_example

The `indonesian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /indonesian_example
{

@ -1069,9 +1069,9 @@ PUT /indonesian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"indonesian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: indonesian_example, first: indonesian, second: rebuilt_indonesian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1082,7 +1082,7 @@ PUT /indonesian_example

The `irish` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /irish_example
{

@ -1133,9 +1133,9 @@ PUT /irish_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"irish_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: irish_example, first: irish, second: rebuilt_irish}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1146,7 +1146,7 @@ PUT /irish_example

The `italian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /italian_example
{

@ -1192,9 +1192,9 @@ PUT /italian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"italian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: italian_example, first: italian, second: rebuilt_italian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1205,7 +1205,7 @@ PUT /italian_example

The `latvian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /latvian_example
{

@ -1240,9 +1240,9 @@ PUT /latvian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"latvian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: latvian_example, first: latvian, second: rebuilt_latvian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1253,7 +1253,7 @@ PUT /latvian_example

The `lithuanian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /lithuanian_example
{

@ -1288,9 +1288,9 @@ PUT /lithuanian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"lithuanian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: lithuanian_example, first: lithuanian, second: rebuilt_lithuanian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1301,7 +1301,7 @@ PUT /lithuanian_example

The `norwegian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /norwegian_example
{

@ -1336,9 +1336,9 @@ PUT /norwegian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"norwegian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: norwegian_example, first: norwegian, second: rebuilt_norwegian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1349,7 +1349,7 @@ PUT /norwegian_example

The `persian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /persian_example
{

@ -1384,8 +1384,8 @@ PUT /persian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: persian_example, first: persian, second: rebuilt_persian}\nendyaml\n/]

<1> Replaces zero-width non-joiners with an ASCII space.
<2> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.

@ -1395,7 +1395,7 @@ PUT /persian_example

The `portuguese` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /portuguese_example
{

@ -1430,9 +1430,9 @@ PUT /portuguese_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"portuguese_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: portuguese_example, first: portuguese, second: rebuilt_portuguese}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1443,7 +1443,7 @@ PUT /portuguese_example

The `romanian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /romanian_example
{

@ -1478,9 +1478,9 @@ PUT /romanian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"romanian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: romanian_example, first: romanian, second: rebuilt_romanian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1492,7 +1492,7 @@ PUT /romanian_example

The `russian` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /russian_example
{

@ -1527,9 +1527,9 @@ PUT /russian_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"russian_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: russian_example, first: russian, second: rebuilt_russian}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1540,7 +1540,7 @@ PUT /russian_example

The `sorani` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /sorani_example
{

@ -1577,9 +1577,9 @@ PUT /sorani_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"sorani_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: sorani_example, first: sorani, second: rebuilt_sorani}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1590,7 +1590,7 @@ PUT /sorani_example

The `spanish` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /spanish_example
{

@ -1625,9 +1625,9 @@ PUT /spanish_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"spanish_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: spanish_example, first: spanish, second: rebuilt_spanish}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1638,7 +1638,7 @@ PUT /spanish_example

The `swedish` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /swedish_example
{

@ -1673,9 +1673,9 @@ PUT /swedish_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"swedish_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: swedish_example, first: swedish, second: rebuilt_swedish}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1686,7 +1686,7 @@ PUT /swedish_example

The `turkish` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /turkish_example
{

@ -1726,9 +1726,9 @@ PUT /turkish_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"turkish_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: turkish_example, first: turkish, second: rebuilt_turkish}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.
<2> This filter should be removed unless there are words which should

@ -1739,7 +1739,7 @@ PUT /turkish_example

The `thai` analyzer could be reimplemented as a `custom` analyzer as follows:

[source,js]
[source,console]
----------------------------------------------------
PUT /thai_example
{

@ -1765,8 +1765,8 @@ PUT /thai_example
}
}
----------------------------------------------------
// CONSOLE
// TEST[s/"thai_keywords",//]
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: thai_example, first: thai, second: rebuilt_thai}\nendyaml\n/]

<1> The default stopwords can be overridden with the `stopwords`
or `stopwords_path` parameters.

@ -22,7 +22,7 @@ Read more about http://www.regular-expressions.info/catastrophic.html[pathologic
|
|||
[float]
|
||||
=== Example output
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
---------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -30,7 +30,6 @@ POST _analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
---------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -174,7 +173,7 @@ about stop word configuration.
|
|||
In this example, we configure the `pattern` analyzer to split email addresses
|
||||
on non-word characters or on underscores (`\W|_`), and to lower-case the result:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -197,7 +196,6 @@ POST my_index/_analyze
|
|||
"text": "John_Smith@foo-bar.com"
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> The backslashes in the pattern need to be escaped when specifying the
|
||||
pattern as a JSON string.
|
||||
|
@ -262,7 +260,7 @@ The above example produces the following terms:
|
|||
|
||||
The following more complicated example splits CamelCase text into tokens:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -284,7 +282,6 @@ GET my_index/_analyze
|
|||
"text": "MooseX::FTPClass2_beta"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -381,7 +378,7 @@ it, usually by adding token filters. This would recreate the built-in
|
|||
`pattern` analyzer and you can use it as a starting point for further
|
||||
customization:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /pattern_example
|
||||
{
|
||||
|
@ -405,7 +402,6 @@ PUT /pattern_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: pattern_example, first: pattern, second: rebuilt_pattern}\nendyaml\n/]
|
||||
<1> The default pattern is `\W+` which splits on non-word characters
|
||||
and this is where you'd change it.
|
||||
|
|
|
@ -7,7 +7,7 @@ character which is not a letter. All terms are lower cased.
|
|||
[float]
|
||||
=== Example output
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
---------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -15,7 +15,6 @@ POST _analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
---------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -132,7 +131,7 @@ it as a `custom` analyzer and modify it, usually by adding token filters.
|
|||
This would recreate the built-in `simple` analyzer and you can use it as
|
||||
a starting point for further customization:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /simple_example
|
||||
{
|
||||
|
@ -149,6 +148,5 @@ PUT /simple_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: simple_example, first: simple, second: rebuilt_simple}\nendyaml\n/]
|
||||
<1> You'd add any token filters here.
|
||||
|
|
|
@ -10,7 +10,7 @@ for most languages.
|
|||
[float]
|
||||
=== Example output
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
---------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -18,7 +18,6 @@ POST _analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
---------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -148,7 +147,7 @@ In this example, we configure the `standard` analyzer to have a
|
|||
`max_token_length` of 5 (for demonstration purposes), and to use the
|
||||
pre-defined list of English stop words:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -171,7 +170,6 @@ POST my_index/_analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -279,7 +277,7 @@ parameters then you need to recreate it as a `custom` analyzer and modify
|
|||
it, usually by adding token filters. This would recreate the built-in
|
||||
`standard` analyzer and you can use it as a starting point:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /standard_example
|
||||
{
|
||||
|
@ -297,6 +295,5 @@ PUT /standard_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: standard_example, first: standard, second: rebuilt_standard}\nendyaml\n/]
|
||||
<1> You'd add any token filters after `lowercase`.
|
||||
|
|
|
@ -8,7 +8,7 @@ but adds support for removing stop words. It defaults to using the
|
|||
[float]
|
||||
=== Example output
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
---------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -16,7 +16,6 @@ POST _analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
---------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -127,7 +126,7 @@ about stop word configuration.
|
|||
In this example, we configure the `stop` analyzer to use a specified list of
|
||||
words as stop words:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -149,7 +148,6 @@ POST my_index/_analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -244,7 +242,7 @@ it, usually by adding token filters. This would recreate the built-in
|
|||
`stop` analyzer and you can use it as a starting point for further
|
||||
customization:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /stop_example
|
||||
{
|
||||
|
@ -268,8 +266,8 @@ PUT /stop_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: stop_example, first: stop, second: rebuilt_stop}\nendyaml\n/]
|
||||
|
||||
<1> The default stopwords can be overridden with the `stopwords`
|
||||
or `stopwords_path` parameters.
|
||||
<2> You'd add any token filters after `english_stop`.
|
||||
|
|
|
@ -7,7 +7,7 @@ whitespace character.
|
|||
[float]
|
||||
=== Example output
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
---------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -15,7 +15,6 @@ POST _analyze
|
|||
"text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
|
||||
}
|
||||
---------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -125,7 +124,7 @@ recreate it as a `custom` analyzer and modify it, usually by adding
|
|||
token filters. This would recreate the built-in `whitespace` analyzer
|
||||
and you can use it as a starting point for further customization:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------------------------------
|
||||
PUT /whitespace_example
|
||||
{
|
||||
|
@ -142,6 +141,6 @@ PUT /whitespace_example
|
|||
}
|
||||
}
|
||||
----------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: whitespace_example, first: whitespace, second: rebuilt_whitespace}\nendyaml\n/]
|
||||
|
||||
<1> You'd add any token filters here.
|
||||
|
|
|
@ -8,7 +8,7 @@ replaces HTML entities with their decoded value (e.g. replacing `&` with
|
|||
[float]
|
||||
=== Example output
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
---------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -17,7 +17,7 @@ POST _analyze
|
|||
"text": "<p>I'm so <b>happy</b>!</p>"
|
||||
}
|
||||
---------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> The <<analysis-keyword-tokenizer,`keyword` tokenizer>> returns a single term.
|
||||
|
||||
/////////////////////
|
||||
|
@ -70,7 +70,7 @@ The `html_strip` character filter accepts the following parameter:
|
|||
In this example, we configure the `html_strip` character filter to leave `<b>`
|
||||
tags in place:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -98,7 +98,6 @@ POST my_index/_analyze
|
|||
"text": "<p>I'm so <b>happy</b>!</p>"
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
|
|
@ -31,7 +31,7 @@ Either the `mappings` or `mappings_path` parameter must be provided.
|
|||
In this example, we configure the `mapping` character filter to replace Arabic
|
||||
numerals with their Latin equivalents:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -72,7 +72,6 @@ POST my_index/_analyze
|
|||
"text": "My license plate is ٢٥٠١٥"
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -104,7 +103,7 @@ The above example produces the following term:
|
|||
Keys and values can be strings with multiple characters. The following
|
||||
example replaces the `:)` and `:(` emoticons with a text equivalent:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -137,7 +136,6 @@ POST my_index/_analyze
|
|||
"text": "I'm delighted about it :("
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
|
||||
/////////////////////
|
||||
|
|
|
@ -47,7 +47,7 @@ In this example, we configure the `pattern_replace` character filter to
|
|||
replace any embedded dashes in numbers with underscores, i.e `123-456-789` ->
|
||||
`123_456_789`:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -78,7 +78,6 @@ POST my_index/_analyze
|
|||
"text": "My credit card is 123-456-789"
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
// TEST[s/\$1//]
|
||||
// the test framework doesn't like the $1 so we just throw it away rather than
|
||||
// try to get it to work properly. At least we are still testing the charfilter.
|
||||
|
@ -98,7 +97,7 @@ This example inserts a space whenever it encounters a lower-case letter
|
|||
followed by an upper-case letter (i.e. `fooBarBaz` -> `foo Bar Baz`), allowing
|
||||
camelCase words to be queried individually:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -140,7 +139,6 @@ POST my_index/_analyze
|
|||
"text": "The fooBarBaz method"
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
|
||||
/////////////////////
|
||||
|
||||
|
@ -200,7 +198,7 @@ Querying for `bar` will find the document correctly, but highlighting on the
|
|||
result will produce incorrect highlights, because our character filter changed
|
||||
the length of the original text:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
----------------------------
|
||||
PUT my_index/_doc/1?refresh
|
||||
{
|
||||
|
@ -221,7 +219,6 @@ GET my_index/_search
|
|||
}
|
||||
}
|
||||
----------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
The output from the above is:
|
||||
|
|
|
@ -21,7 +21,7 @@ to get one is by building a custom one. Custom normalizers take a list of char
|
|||
<<analysis-charfilters, character filters>> and a list of
|
||||
<<analysis-tokenfilters,token filters>>.
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------
|
||||
PUT index
|
||||
{
|
||||
|
@ -55,4 +55,3 @@ PUT index
|
|||
}
|
||||
}
|
||||
--------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -5,7 +5,7 @@ terms produced by an analyzer. A built-in analyzer (or combination of built-in
|
|||
tokenizer, token filters, and character filters) can be specified inline in
|
||||
the request:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
-------------------------------------
|
||||
POST _analyze
|
||||
{
|
||||
|
@ -20,7 +20,6 @@ POST _analyze
|
|||
"text": "Is this déja vu?"
|
||||
}
|
||||
-------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
|
||||
|
||||
|
@ -39,7 +38,7 @@ highlighting search snippets).
|
|||
Alternatively, a <<analysis-custom-analyzer,`custom` analyzer>> can be
|
||||
referred to when running the `analyze` API on a specific index:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
-------------------------------------
|
||||
PUT my_index
|
||||
{
|
||||
|
@ -79,7 +78,6 @@ GET my_index/_analyze <3>
|
|||
"text": "Is this déjà vu?"
|
||||
}
|
||||
-------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> Define a `custom` analyzer called `std_folded`.
|
||||
<2> The field `my_text` uses the `std_folded` analyzer.
|
||||
|
|
|
@ -6,7 +6,7 @@ and symbolic Unicode characters which are not in the first 127 ASCII
|
|||
characters (the "Basic Latin" Unicode block) into their ASCII
|
||||
equivalents, if one exists. Example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /asciifold_example
|
||||
{
|
||||
|
@ -22,13 +22,12 @@ PUT /asciifold_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
Accepts `preserve_original` setting which defaults to false but if true
|
||||
will keep the original token as well as emit the folded token. For
|
||||
example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /asciifold_example
|
||||
{
|
||||
|
@ -50,4 +49,3 @@ PUT /asciifold_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -14,7 +14,7 @@ Bigrams are generated for characters in `han`, `hiragana`, `katakana` and
|
|||
`hangul`, but bigrams can be disabled for particular scripts with the
|
||||
`ignored_scripts` parameter. All non-CJK input is passed through unmodified.
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /cjk_bigram_example
|
||||
{
|
||||
|
@ -41,4 +41,3 @@ PUT /cjk_bigram_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -39,7 +39,7 @@ Note, `common_words` or `common_words_path` field is required.
|
|||
|
||||
Here is an example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /common_grams_example
|
||||
{
|
||||
|
@ -70,11 +70,10 @@ PUT /common_grams_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
You can see the output by using e.g. the `_analyze` endpoint:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /common_grams_example/_analyze
|
||||
{
|
||||
|
@ -82,7 +81,6 @@ POST /common_grams_example/_analyze
|
|||
"text" : "the quick brown is a fox"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
And the response will be:
|
||||
|
|
|
@ -82,7 +82,7 @@ Whether to include only the longest matching subword or not. Defaults to `false
|
|||
|
||||
Here is an example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /compound_word_example
|
||||
{
|
||||
|
@ -113,4 +113,3 @@ PUT /compound_word_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -18,7 +18,7 @@ script:: a predicate script that determines whether or not the filters will be a
|
|||
|
||||
You can set it up like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /condition_example
|
||||
{
|
||||
|
@ -43,14 +43,13 @@ PUT /condition_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> This will only apply the lowercase filter to terms that are less than 5
|
||||
characters in length
|
||||
|
||||
And test it like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /condition_example/_analyze
|
||||
{
|
||||
|
@ -58,7 +57,6 @@ POST /condition_example/_analyze
|
|||
"text" : "What Flapdoodle"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
And it'd respond:
|
||||
|
|
|
@ -11,7 +11,7 @@ case sensitive.
|
|||
|
||||
For example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /elision_example
|
||||
{
|
||||
|
@ -34,4 +34,3 @@ PUT /elision_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -40,7 +40,7 @@ settings defined in the `elasticsearch.yml`).
|
|||
One can use the hunspell stem filter by configuring it the analysis
|
||||
settings:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /hunspell_example
|
||||
{
|
||||
|
@ -63,7 +63,6 @@ PUT /hunspell_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
The hunspell token filter accepts four options:
|
||||
|
||||
|
|
|
@ -17,7 +17,7 @@ if set to `exclude` the specified token types will be removed from the stream
|
|||
|
||||
You can set it up like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /keep_types_example
|
||||
{
|
||||
|
@ -39,11 +39,10 @@ PUT /keep_types_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
And test it like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /keep_types_example/_analyze
|
||||
{
|
||||
|
@ -51,7 +50,6 @@ POST /keep_types_example/_analyze
|
|||
"text" : "this is just 1 a test"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
The response will be:
|
||||
|
@ -77,7 +75,7 @@ Note how only the `<NUM>` token is in the output.
|
|||
|
||||
If the `mode` parameter is set to `exclude` like in the following example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /keep_types_exclude_example
|
||||
{
|
||||
|
@ -100,11 +98,10 @@ PUT /keep_types_exclude_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
And we test it like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /keep_types_exclude_example/_analyze
|
||||
{
|
||||
|
@ -112,7 +109,6 @@ POST /keep_types_exclude_example/_analyze
|
|||
"text" : "hello 101 world"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
The response will be:
|
||||
|
|
|
@ -18,7 +18,7 @@ keep_words_case:: a boolean indicating whether to lower case the words (defaults
|
|||
[float]
|
||||
=== Settings example
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /keep_words_example
|
||||
{
|
||||
|
@ -48,4 +48,3 @@ PUT /keep_words_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -21,7 +21,7 @@ in the text.
|
|||
|
||||
You can configure it like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /keyword_marker_example
|
||||
{
|
||||
|
@ -49,11 +49,10 @@ PUT /keyword_marker_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
And test it with:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /keyword_marker_example/_analyze
|
||||
{
|
||||
|
@ -61,7 +60,6 @@ POST /keyword_marker_example/_analyze
|
|||
"text" : "I like cats"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
And it'd respond:
|
||||
|
@ -97,7 +95,7 @@ And it'd respond:
|
|||
|
||||
As compared to the `normal` analyzer which has `cats` stemmed to `cat`:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /keyword_marker_example/_analyze
|
||||
{
|
||||
|
@ -105,7 +103,6 @@ POST /keyword_marker_example/_analyze
|
|||
"text" : "I like cats"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
Response:
|
||||
|
|
|
@ -12,7 +12,7 @@ unnecessary duplicates.
|
|||
Here is an example of using the `keyword_repeat` token filter to
|
||||
preserve both the stemmed and unstemmed version of tokens:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /keyword_repeat_example
|
||||
{
|
||||
|
@ -35,11 +35,10 @@ PUT /keyword_repeat_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
And you can test it with:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /keyword_repeat_example/_analyze
|
||||
{
|
||||
|
@ -47,7 +46,6 @@ POST /keyword_repeat_example/_analyze
|
|||
"text" : "I like cats"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
And it'd respond:
|
||||
|
|
|
@ -16,7 +16,7 @@ is `false`.
|
|||
|
||||
Here is an example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /limit_example
|
||||
{
|
||||
|
@ -39,4 +39,3 @@ PUT /limit_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -8,7 +8,7 @@ Lowercase token filter supports Greek, Irish, and Turkish lowercase token
|
|||
filters through the `language` parameter. Below is a usage example in a
|
||||
custom analyzer
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /lowercase_example
|
||||
{
|
||||
|
@ -36,4 +36,3 @@ PUT /lowercase_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -29,7 +29,7 @@ preserve_original:: if `true` (the default) then emit the original token in
|
|||
|
||||
You can set it up like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /multiplexer_example
|
||||
{
|
||||
|
@ -51,11 +51,10 @@ PUT /multiplexer_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
And test it like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /multiplexer_example/_analyze
|
||||
{
|
||||
|
@ -63,7 +62,6 @@ POST /multiplexer_example/_analyze
|
|||
"text" : "Going HOME"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
And it'd respond:
|
||||
|
|
|
@ -44,7 +44,7 @@ emit the original token: `abc123def456`.
|
|||
This is particularly useful for indexing text like camel-case code, eg
|
||||
`stripHTML` where a user may search for `"strip html"` or `"striphtml"`:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT test
|
||||
{
|
||||
|
@ -70,7 +70,6 @@ PUT test
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
When used to analyze the text
|
||||
|
||||
|
@ -85,7 +84,7 @@ this emits the tokens: [ `import`, `static`, `org`, `apache`, `commons`,
|
|||
|
||||
Another example is analyzing email addresses:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT test
|
||||
{
|
||||
|
@ -113,7 +112,6 @@ PUT test
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
When the above analyzer is used on an email address like:
|
||||
|
||||
|
|
|
@ -15,7 +15,7 @@ be emitted. Note that only inline scripts are supported.
|
|||
|
||||
You can set it up like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /condition_example
|
||||
{
|
||||
|
@ -39,13 +39,12 @@ PUT /condition_example
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> This will emit tokens that are more than 5 characters long
|
||||
|
||||
And test it like:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
POST /condition_example/_analyze
|
||||
{
|
||||
|
@ -53,7 +52,6 @@ POST /condition_example/_analyze
|
|||
"text" : "What Flapdoodle"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
And it'd respond:
|
||||
|
|
|
@ -10,7 +10,7 @@ values: `Armenian`, `Basque`, `Catalan`, `Danish`, `Dutch`, `English`,
|
|||
|
||||
For example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
|
@ -32,4 +32,3 @@ PUT /my_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -18,7 +18,7 @@ absolute) to a list of mappings.
|
|||
|
||||
Here is an example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
|
@ -40,7 +40,6 @@ PUT /my_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
Where the file looks like:
|
||||
|
||||
|
@ -51,7 +50,7 @@ include::{es-test-dir}/cluster/config/analysis/stemmer_override.txt[]
|
|||
|
||||
You can also define the overrides rules inline:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
|
@ -76,4 +75,3 @@ PUT /my_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
|
|
@ -10,7 +10,7 @@
|
|||
A filter that provides access to (almost) all of the available stemming token
|
||||
filters through a single unified interface. For example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
|
@ -32,7 +32,6 @@ PUT /my_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
The `language`/`name` parameter controls the stemmer with the following
|
||||
available values (the preferred filters are marked in *bold*):
|
||||
|
|
|
@ -31,7 +31,7 @@ type:
|
|||
|
||||
The `stopwords` parameter accepts either an array of stopwords:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
|
@ -47,11 +47,10 @@ PUT /my_index
|
|||
}
|
||||
}
|
||||
------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
or a predefined language-specific list:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
------------------------------------
|
||||
PUT /my_index
|
||||
{
|
||||
|
@ -67,7 +66,6 @@ PUT /my_index
|
|||
}
|
||||
}
|
||||
------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
Elasticsearch provides the following predefined list of languages:
|
||||
|
||||
|
|
|
@ -19,7 +19,7 @@ standard <<analysis-synonym-tokenfilter,synonym token filter>>.
|
|||
Synonyms are configured using a configuration file.
|
||||
Here is an example:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /test_index
|
||||
{
|
||||
|
@ -43,7 +43,6 @@ PUT /test_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
The above configures a `search_synonyms` filter, with a path of
|
||||
`analysis/synonym.txt` (relative to the `config` location). The
|
||||
|
@ -55,7 +54,7 @@ Additional settings are:
|
|||
* `lenient` (defaults to `false`). If `true` ignores exceptions while parsing the synonym configuration. It is important
|
||||
to note that only those synonym rules which cannot get parsed are ignored. For instance consider the following request:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /test_index
|
||||
{
|
||||
|
@ -84,7 +83,7 @@ PUT /test_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
With the above request the word `bar` gets skipped but a mapping `foo => baz` is still added. However, if the mapping
|
||||
being added was "foo, baz => bar" nothing would get added to the synonym list. This is because the target word for the
|
||||
mapping is itself eliminated because it was a stop word. Similarly, if the mapping was "bar, foo, baz" and `expand` was
|
||||
|
@ -115,7 +114,7 @@ include::{es-test-dir}/cluster/config/analysis/synonym.txt[]
|
|||
You can also define synonyms for the filter directly in the
|
||||
configuration file (note use of `synonyms` instead of `synonyms_path`):
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /test_index
|
||||
{
|
||||
|
@ -136,7 +135,6 @@ PUT /test_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
However, it is recommended to define large synonyms set in a file using
|
||||
`synonyms_path`, because specifying them inline increases cluster size unnecessarily.
|
||||
|
@ -147,7 +145,7 @@ However, it is recommended to define large synonyms set in a file using
|
|||
Synonyms based on http://wordnet.princeton.edu/[WordNet] format can be
|
||||
declared using `format`:
|
||||
|
||||
[source,js]
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
PUT /test_index
|
||||
{
|
||||
|
@ -170,7 +168,6 @@ PUT /test_index
|
|||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
Using `synonyms_path` to define WordNet synonyms in a file is supported
|
||||
as well.
|
||||
|
|
|
@@ -5,7 +5,7 @@ The `synonym` token filter allows to easily handle synonyms during the
analysis process. Synonyms are configured using a configuration file.
Here is an example:

[source,js]
[source,console]
--------------------------------------------------
PUT /test_index
{
@@ -29,7 +29,6 @@ PUT /test_index
  }
}
--------------------------------------------------
// CONSOLE

The above configures a `synonym` filter, with a path of
`analysis/synonym.txt` (relative to the `config` location). The

@@ -45,8 +44,7 @@ Additional settings are:
to note that only those synonym rules which cannot get parsed are ignored. For instance consider the following request:

[source,js]
[source,console]
--------------------------------------------------
PUT /test_index
{
@@ -75,7 +73,7 @@ PUT /test_index
  }
}
--------------------------------------------------
// CONSOLE

With the above request the word `bar` gets skipped but a mapping `foo => baz` is still added. However, if the mapping
being added was "foo, baz => bar" nothing would get added to the synonym list. This is because the target word for the
mapping is itself eliminated because it was a stop word. Similarly, if the mapping was "bar, foo, baz" and `expand` was

@@ -107,7 +105,7 @@ include::{es-test-dir}/cluster/config/analysis/synonym.txt[]
You can also define synonyms for the filter directly in the
configuration file (note use of `synonyms` instead of `synonyms_path`):

[source,js]
[source,console]
--------------------------------------------------
PUT /test_index
{
@@ -128,7 +126,6 @@ PUT /test_index
  }
}
--------------------------------------------------
// CONSOLE

However, it is recommended to define large synonyms set in a file using
`synonyms_path`, because specifying them inline increases cluster size unnecessarily.

@@ -139,7 +136,7 @@ However, it is recommended to define large synonyms set in a file using
Synonyms based on http://wordnet.princeton.edu/[WordNet] format can be
declared using `format`:

[source,js]
[source,console]
--------------------------------------------------
PUT /test_index
{
@@ -162,7 +159,6 @@ PUT /test_index
  }
}
--------------------------------------------------
// CONSOLE

Using `synonyms_path` to define WordNet synonyms in a file is supported
as well.
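Because `lenient` silently drops only the rules that fail to parse, it can be worth confirming which synonyms actually made it into the analyzer. A minimal sketch using the `_analyze` API, reusing `test_index` from the snippets above; the analyzer name `synonym` is an assumption, since the full analyzer definition is elided from this diff:

[source,console]
--------------------------------------------------
GET /test_index/_analyze
{
  "analyzer": "synonym",
  "text": "foo"
}
--------------------------------------------------

If the `foo => baz` rule survived parsing, `baz` shows up in the returned tokens.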
@@ -22,7 +22,7 @@ The `char_group` tokenizer accepts one parameter:
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -37,7 +37,6 @@ POST _analyze
  "text": "The QUICK brown-fox"
}
---------------------------
// CONSOLE

returns
@@ -18,7 +18,7 @@ languages other than English:
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -26,7 +26,6 @@ POST _analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE

/////////////////////

@@ -142,7 +141,7 @@ The `classic` tokenizer accepts the following parameters:
In this example, we configure the `classic` tokenizer to have a
`max_token_length` of 5 (for demonstration purposes):

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -169,7 +168,6 @@ POST my_index/_analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
----------------------------
// CONSOLE

/////////////////////
@@ -21,7 +21,7 @@ With the default settings, the `edge_ngram` tokenizer treats the initial text as
single token and produces N-grams with minimum length `1` and maximum length
`2`:

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -29,7 +29,6 @@ POST _analyze
  "text": "Quick Fox"
}
---------------------------
// CONSOLE

/////////////////////

@@ -101,7 +100,7 @@ In this example, we configure the `edge_ngram` tokenizer to treat letters and
digits as tokens, and to produce grams with minimum length `2` and maximum
length `10`:

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -133,7 +132,6 @@ POST my_index/_analyze
  "text": "2 Quick Foxes."
}
----------------------------
// CONSOLE

/////////////////////

@@ -218,7 +216,7 @@ just search for the terms the user has typed in, for instance: `Quick Fo`.

Below is an example of how to set up a field for _search-as-you-type_:

[source,js]
[source,console]
-----------------------------------
PUT my_index
{
@@ -277,7 +275,6 @@ GET my_index/_search
  }
}
-----------------------------------
// CONSOLE

<1> The `autocomplete` analyzer indexes the terms `[qu, qui, quic, quick, fo, fox, foxe, foxes]`.
<2> The `autocomplete_search` analyzer searches for the terms `[quick, fo]`, both of which appear in the index.
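To see the terms the `autocomplete` analyzer emits at index time, the `_analyze` API can be pointed at it directly. A minimal sketch, assuming the `my_index` and `autocomplete` names from the (partially elided) example above:

[source,console]
-----------------------------------
GET my_index/_analyze
{
  "analyzer": "autocomplete",
  "text": "Quick Foxes"
}
-----------------------------------

The response should list the same `qu, qui, quic, quick, fo, ...` edge n-grams described in callout <1>.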
@@ -8,7 +8,7 @@ with token filters to normalise output, e.g. lower-casing email addresses.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -16,7 +16,6 @@ POST _analyze
  "text": "New York"
}
---------------------------
// CONSOLE

/////////////////////
@@ -9,7 +9,7 @@ not separated by spaces.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -17,7 +17,6 @@ POST _analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE

/////////////////////
@@ -14,7 +14,7 @@ efficient as it performs both steps in a single pass.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -22,7 +22,6 @@ POST _analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE

/////////////////////
@@ -17,7 +17,7 @@ With the default settings, the `ngram` tokenizer treats the initial text as a
single token and produces N-grams with minimum length `1` and maximum length
`2`:

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -25,7 +25,6 @@ POST _analyze
  "text": "Quick Fox"
}
---------------------------
// CONSOLE

/////////////////////

@@ -206,7 +205,7 @@ difference between `max_gram` and `min_gram`.
In this example, we configure the `ngram` tokenizer to treat letters and
digits as tokens, and to produce tri-grams (grams of length `3`):

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -238,7 +237,6 @@ POST my_index/_analyze
  "text": "2 Quick Foxes."
}
----------------------------
// CONSOLE

/////////////////////
@@ -14,7 +14,7 @@ Some sample documents are then indexed to represent some file paths
for photos inside photo folders of two different users.

[source,js]
[source,console]
--------------------------------------------------
PUT file-path-test
{
@@ -85,7 +85,6 @@ POST file-path-test/_doc/5
  "file_path": "/User/bob/photos/2017/05/16/my_photo1.jpg"
}
--------------------------------------------------
// CONSOLE
// TESTSETUP

@@ -94,7 +93,7 @@ the example documents, with Bob's documents ranking highest due to `bob` also
being one of the terms created by the standard analyzer boosting relevance for
Bob's documents.

[source,js]
[source,console]
--------------------------------------------------
GET file-path-test/_search
{
@@ -105,13 +104,11 @@ GET file-path-test/_search
  }
}
--------------------------------------------------
// CONSOLE

It's simple to match or filter documents with file paths that exist within a
particular directory using the `file_path.tree` field.

[source,js]
[source,console]
--------------------------------------------------
GET file-path-test/_search
{
@@ -122,7 +119,6 @@ GET file-path-test/_search
  }
}
--------------------------------------------------
// CONSOLE

With the reverse parameter for this tokenizer, it's also possible to match
from the other end of the file path, such as individual file names or a deep

@@ -131,7 +127,7 @@ level subdirectory. The following example shows a search for all files named
configured to use the reverse parameter in the mapping.

[source,js]
[source,console]
--------------------------------------------------
GET file-path-test/_search
{
@@ -144,14 +140,12 @@ GET file-path-test/_search
  }
}
--------------------------------------------------
// CONSOLE

Viewing the tokens generated with both forward and reverse is instructive
in showing the tokens created for the same file path value.

[source,js]
[source,console]
--------------------------------------------------
POST file-path-test/_analyze
{
@@ -165,14 +159,13 @@ POST file-path-test/_analyze
  "text": "/User/alice/photos/2017/05/16/my_photo1.jpg"
}
--------------------------------------------------
// CONSOLE

It's also useful to be able to filter with file paths when combined with other
types of searches, such as this example looking for any files paths with `16`
that also must be in Alice's photo directory.

[source,js]
[source,console]
--------------------------------------------------
GET file-path-test/_search
{
@@ -188,4 +181,3 @@ GET file-path-test/_search
  }
}
--------------------------------------------------
// CONSOLE
@@ -8,7 +8,7 @@ tree.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -16,7 +16,6 @@ POST _analyze
  "text": "/one/two/three"
}
---------------------------
// CONSOLE

/////////////////////

@@ -90,7 +89,7 @@ The `path_hierarchy` tokenizer accepts the following parameters:
In this example, we configure the `path_hierarchy` tokenizer to split on `-`
characters, and to replace them with `/`. The first two tokens are skipped:

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -119,7 +118,6 @@ POST my_index/_analyze
  "text": "one-two-three-four-five"
}
----------------------------
// CONSOLE

/////////////////////
@@ -25,7 +25,7 @@ Read more about http://www.regular-expressions.info/catastrophic.html[pathologic
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -33,7 +33,6 @@ POST _analyze
  "text": "The foo_bar_size's default is 5."
}
---------------------------
// CONSOLE

/////////////////////

@@ -122,7 +121,7 @@ The `pattern` tokenizer accepts the following parameters:
In this example, we configure the `pattern` tokenizer to break text into
tokens when it encounters commas:

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -149,7 +148,6 @@ POST my_index/_analyze
  "text": "comma,separated,values"
}
----------------------------
// CONSOLE

/////////////////////

@@ -211,7 +209,7 @@ escaped, so the pattern ends up looking like:

\"((?:\\\\\"|[^\"]|\\\\\")+)\"

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -239,7 +237,6 @@ POST my_index/_analyze
  "text": "\"value\", \"value with embedded \\\" quote\""
}
----------------------------
// CONSOLE

/////////////////////
@@ -34,7 +34,7 @@ The `simple_pattern` tokenizer accepts the following parameters:
This example configures the `simple_pattern` tokenizer to produce terms that are
three-digit numbers

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -61,7 +61,6 @@ POST my_index/_analyze
  "text": "fd-786-335-514-x"
}
----------------------------
// CONSOLE

/////////////////////
@@ -35,7 +35,7 @@ The `simple_pattern_split` tokenizer accepts the following parameters:
This example configures the `simple_pattern_split` tokenizer to split the input
text on underscores.

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -62,7 +62,6 @@ POST my_index/_analyze
  "text": "an_underscored_phrase"
}
----------------------------
// CONSOLE

/////////////////////
@@ -9,7 +9,7 @@ for most languages.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -17,7 +17,6 @@ POST _analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE

/////////////////////

@@ -133,7 +132,7 @@ The `standard` tokenizer accepts the following parameters:
In this example, we configure the `standard` tokenizer to have a
`max_token_length` of 5 (for demonstration purposes):

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -160,7 +159,6 @@ POST my_index/_analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
----------------------------
// CONSOLE

/////////////////////
@@ -13,7 +13,7 @@ consider using the {plugins}/analysis-icu-tokenizer.html[ICU Tokenizer] instead.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -21,7 +21,6 @@ POST _analyze
  "text": "การที่ได้ต้องแสดงว่างานดี"
}
---------------------------
// CONSOLE

/////////////////////
@@ -7,7 +7,7 @@ recognises URLs and email addresses as single tokens.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -15,7 +15,6 @@ POST _analyze
  "text": "Email me at john.smith@global-international.com"
}
---------------------------
// CONSOLE

/////////////////////

@@ -89,7 +88,7 @@ The `uax_url_email` tokenizer accepts the following parameters:
In this example, we configure the `uax_url_email` tokenizer to have a
`max_token_length` of 5 (for demonstration purposes):

[source,js]
[source,console]
----------------------------
PUT my_index
{
@@ -116,7 +115,6 @@ POST my_index/_analyze
  "text": "john.smith@global-international.com"
}
----------------------------
// CONSOLE

/////////////////////
@@ -7,7 +7,7 @@ whitespace character.
[float]
=== Example output

[source,js]
[source,console]
---------------------------
POST _analyze
{
@@ -15,7 +15,6 @@ POST _analyze
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE

/////////////////////
@@ -70,7 +70,7 @@ calendars than the Gregorian calendar.
You must enclose date math index name expressions within angle brackets, and
all special characters should be URI encoded. For example:

[source,js]
[source,console]
----------------------------------------------------------------------
# GET /<logstash-{now/d}>/_search
GET /%3Clogstash-%7Bnow%2Fd%7D%3E/_search
@@ -82,7 +82,6 @@ GET /%3Clogstash-%7Bnow%2Fd%7D%3E/_search
  }
}
----------------------------------------------------------------------
// CONSOLE
// TEST[s/^/PUT logstash-2016.09.20\n/]
// TEST[s/now/2016.09.20||/]

@@ -125,7 +124,7 @@ The following example shows a search request that searches the Logstash indices
three days, assuming the indices use the default Logstash index name format,
`logstash-yyyy.MM.dd`.

[source,js]
[source,console]
----------------------------------------------------------------------
# GET /<logstash-{now/d-2d}>,<logstash-{now/d-1d}>,<logstash-{now/d}>/_search
GET /%3Clogstash-%7Bnow%2Fd-2d%7D%3E%2C%3Clogstash-%7Bnow%2Fd-1d%7D%3E%2C%3Clogstash-%7Bnow%2Fd%7D%3E/_search
@@ -137,7 +136,6 @@ GET /%3Clogstash-%7Bnow%2Fd-2d%7D%3E%2C%3Clogstash-%7Bnow%2Fd-1d%7D%3E%2C%3Clogs
  }
}
----------------------------------------------------------------------
// CONSOLE
// TEST[s/^/PUT logstash-2016.09.20\nPUT logstash-2016.09.19\nPUT logstash-2016.09.18\n/]
// TEST[s/now/2016.09.20||/]
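The same URI encoding applies to any date rounding unit. As an illustration only (this request is not part of the change above), rounding to the month instead of the day would look like:

[source,console]
----------------------------------------------------------------------
# GET /<logstash-{now/M}>/_search
GET /%3Clogstash-%7Bnow%2FM%7D%3E/_search
{
  "query": {
    "match_all": {}
  }
}
----------------------------------------------------------------------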
@@ -213,11 +211,10 @@ All REST APIs accept a `filter_path` parameter that can be used to reduce
the response returned by Elasticsearch. This parameter takes a comma
separated list of filters expressed with the dot notation:

[source,js]
[source,console]
--------------------------------------------------
GET /_search?q=elasticsearch&filter_path=took,hits.hits._id,hits.hits._score
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

Responds:

@@ -242,11 +239,10 @@ Responds:
It also supports the `*` wildcard character to match any field or part
of a field's name:

[source,sh]
[source,console]
--------------------------------------------------
GET /_cluster/state?filter_path=metadata.indices.*.stat*
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT twitter\n/]

Responds:

@@ -266,11 +262,10 @@ And the `**` wildcard can be used to include fields without knowing the
exact path of the field. For example, we can return the Lucene version
of every segment with this request:

[source,js]
[source,console]
--------------------------------------------------
GET /_cluster/state?filter_path=routing_table.indices.**.state
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT twitter\n/]

Responds:

@@ -292,11 +287,10 @@ Responds:

It is also possible to exclude one or more fields by prefixing the filter with the char `-`:

[source,js]
[source,console]
--------------------------------------------------
GET /_count?filter_path=-_shards
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

Responds:

@@ -312,11 +306,10 @@ And for more control, both inclusive and exclusive filters can be combined in th
this case, the exclusive filters will be applied first and the result will be filtered again using the
inclusive filters:

[source,js]
[source,console]
--------------------------------------------------
GET /_cluster/state?filter_path=metadata.indices.*.state,-metadata.indices.logstash-*
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT index-1\nPUT index-2\nPUT index-3\nPUT logstash-2016.01\n/]

Responds:
@@ -340,7 +333,7 @@ consider combining the already existing `_source` parameter (see
<<get-source-filtering,Get API>> for more details) with the `filter_path`
parameter like this:

[source,js]
[source,console]
--------------------------------------------------
POST /library/book?refresh
{"title": "Book #1", "rating": 200.1}
@@ -350,7 +343,6 @@ POST /library/book?refresh
{"title": "Book #3", "rating": 0.1}
GET /_search?filter_path=hits.hits._source&_source=title&sort=rating:desc
--------------------------------------------------
// CONSOLE

[source,console-result]
--------------------------------------------------

@@ -374,11 +366,10 @@ GET /_search?filter_path=hits.hits._source&_source=title&sort=rating:desc
The `flat_settings` flag affects rendering of the lists of settings. When the
`flat_settings` flag is `true`, settings are returned in a flat format:

[source,js]
[source,console]
--------------------------------------------------
GET twitter/_settings?flat_settings=true
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

Returns:

@@ -405,11 +396,10 @@ Returns:
When the `flat_settings` flag is `false`, settings are returned in a more
human readable structured format:

[source,js]
[source,console]
--------------------------------------------------
GET twitter/_settings?flat_settings=false
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

Returns:

@@ -570,11 +560,10 @@ stack trace of the error. You can enable that behavior by setting the
`error_trace` url parameter to `true`. For example, by default when you send an
invalid `size` parameter to the `_search` API:

[source,js]
[source,console]
----------------------------------------------------------------------
POST /twitter/_search?size=surprise_me
----------------------------------------------------------------------
// CONSOLE
// TEST[s/surprise_me/surprise_me&error_trace=false/ catch:bad_request]
// Since the test system sends error_trace=true by default we have to override

@@ -603,11 +592,10 @@ The response looks like:

But if you set `error_trace=true`:

[source,js]
[source,console]
----------------------------------------------------------------------
POST /twitter/_search?size=surprise_me&error_trace=true
----------------------------------------------------------------------
// CONSOLE
// TEST[catch:bad_request]

The response looks like:
@@ -24,11 +24,10 @@ the available commands.
Each of the commands accepts a query string parameter `v` to turn on
verbose output. For example:

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/master?v
--------------------------------------------------
// CONSOLE

Might respond with:

@@ -46,11 +45,10 @@ u_n93zwxThWHi1PDBJAGAg 127.0.0.1 127.0.0.1 u_n93zw
Each of the commands accepts a query string parameter `help` which will
output its available columns. For example:

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/master?help
--------------------------------------------------
// CONSOLE

Might respond with:

@@ -75,11 +73,10 @@ instead.
Each of the commands accepts a query string parameter `h` which forces
only those columns to appear. For example:

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/nodes?h=ip,port,heapPercent,name
--------------------------------------------------
// CONSOLE

Responds with:
@@ -42,7 +42,7 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]

////
Hidden setup for example:
[source,js]
[source,console]
--------------------------------------------------
PUT test1
{
@@ -65,14 +65,12 @@ PUT test1
  }
}
--------------------------------------------------
// CONSOLE
////

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/aliases?v
--------------------------------------------------
// CONSOLE
// TEST[continued]

The API returns the following response:
@@ -41,11 +41,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
[[cat-allocation-api-example]]
==== {api-examples-title}

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/allocation?v
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT test\n{"settings": {"number_of_replicas": 0}}\n/]

The API returns the following response:
@@ -50,11 +50,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
The following `count` API request retrieves the document count of a single
index, `twitter`.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/count/twitter?v
--------------------------------------------------
// CONSOLE
// TEST[setup:big_twitter]
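The path portion also accepts a comma-separated list of targets, so several specific indices can be counted in one request. A sketch only; `twitter2` is a hypothetical second index, not one set up by the hunk above:

[source,console]
--------------------------------------------------
GET /_cat/count/twitter,twitter2?v
--------------------------------------------------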
@@ -73,11 +72,10 @@ epoch timestamp count
The following `count` API request retrieves the document count of all indices in
the cluster.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/count?v
--------------------------------------------------
// CONSOLE
// TEST[setup:big_twitter]
// TEST[s/^/POST test\/test\?refresh\n{"test": "test"}\n/]
@@ -47,7 +47,8 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]

////
Hidden setup snippet to build an index with fielddata so our results are real:
[source,js]

[source,console]
--------------------------------------------------
PUT test
{
@@ -78,7 +79,6 @@ POST test/_doc?refresh
# Perform a search to load the field data
POST test/_search?sort=body,soul,mind
--------------------------------------------------
// CONSOLE
////

[[cat-fielddata-api-example-ind]]

@@ -88,11 +88,10 @@ You can specify an individual field in the request body or URL path. The
following `fieldata` API request retrieves heap memory size information for the
`body` field.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/fielddata?v&fields=body
--------------------------------------------------
// CONSOLE
// TEST[continued]

The API returns the following response:

@@ -113,11 +112,10 @@ path. The following `fieldata` API request retrieves heap memory size
information for the `body` and `soul` fields.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/fielddata/body,soul?v
--------------------------------------------------
// CONSOLE
// TEST[continued]

The API returns the following response:

@@ -140,11 +138,10 @@ one row per field per node.
The following `fieldata` API request retrieves heap memory size
information all fields.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/fielddata?v
--------------------------------------------------
// CONSOLE
// TEST[continued]

The API returns the following response:
@@ -67,11 +67,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
By default, the cat health API returns `HH:MM:SS` and
https://en.wikipedia.org/wiki/Unix_time[Unix `epoch`] timestamps. For example:

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/health?v
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT twitter\n{"settings":{"number_of_replicas": 0}}\n/]

The API returns the following response:

@@ -88,11 +87,10 @@ epoch timestamp cluster status node.total node.data shards pri relo i
===== Example without a timestamp
You can use the `ts` (timestamps) parameter to disable timestamps. For example:

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/health?v&ts=false
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT twitter\n{"settings":{"number_of_replicas": 0}}\n/]

The API returns the following response:
@@ -84,11 +84,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
==== {api-examples-title}

[[examples]]
[source,js]
[source,console]
--------------------------------------------------
GET /_cat/indices/twi*?v&s=index
--------------------------------------------------
// CONSOLE
// TEST[setup:huge_twitter]
// TEST[s/^/PUT twitter2\n{"settings": {"number_of_replicas": 0}}\n/]
@@ -35,11 +35,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
[[cat-master-api-example]]
==== {api-examples-title}

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/master?v
--------------------------------------------------
// CONSOLE

The API returns the following response:
@@ -65,11 +65,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
[[cat-nodeattrs-api-ex-default]]
===== Example with default columns

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/nodeattrs?v
--------------------------------------------------
// CONSOLE
// TEST[s/\?v/\?v&s=node,attr/]
// Sort the resulting attributes so we can assert on them more easily

@@ -97,11 +96,10 @@ The `attr` and `value` columns return custom node attributes, one per line.
The following API request returns the `name`, `pid`, `attr`, and `value`
columns.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/nodeattrs?v&h=name,pid,attr,value
--------------------------------------------------
// CONSOLE
// TEST[s/,value/,value&s=node,attr/]
// Sort the resulting attributes so we can assert on them more easily
@@ -296,11 +296,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
[[cat-nodes-api-ex-default]]
===== Example with default columns

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/nodes?v
--------------------------------------------------
// CONSOLE

The API returns the following response:

@@ -325,11 +324,10 @@ monitoring an entire cluster, particularly large ones.
The following API request returns the `id`, `ip`, `port`, `v` (version), and `m`
(master) columns.

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/nodes?v&h=id,ip,port,v,m
--------------------------------------------------
// CONSOLE

The API returns the following response:
@@ -33,11 +33,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
[[cat-pending-tasks-api-example]]
==== {api-examples-title}

[source,js]
[source,console]
--------------------------------------------------
GET /_cat/pending_tasks?v
--------------------------------------------------
// CONSOLE

The API returns the following response:
@@ -34,11 +34,10 @@ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
[[cat-plugins-api-example]]
==== {api-examples-title}

[source,js]
[source,console]
------------------------------------------------------------------------------
GET /_cat/plugins?v&s=component&h=name,component,version,description
------------------------------------------------------------------------------
// CONSOLE

The API returns the following response:
Some files were not shown because too many files have changed in this diff.