Docs: Fix grammar and typos in percolate
Added commas, capitalized "JSON" and "API", capitalized titles, etc. Closes #11023
parent
c53bde5c7b
commit
7f064c592f
|
@ -1,27 +1,27 @@
|
|||
[[search-percolate]]
|
||||
== Percolator
|
||||
|
||||
Traditionally you design documents based on your data and store them into an index and then define queries via the search api
|
||||
in order to retrieve these documents. The percolator works in the opposite direction, first you store queries into an
|
||||
index and then via the percolate api you define documents in order to retrieve these queries.
|
||||
Traditionally you design documents based on your data, store them into an index, and then define queries via the search API
|
||||
in order to retrieve these documents. The percolator works in the opposite direction. First you store queries into an
|
||||
index and then, via the percolate API, you define documents in order to retrieve these queries.
|
||||
|
||||
The reason that queries can be stored comes from the fact that in Elasticsearch both documents and queries are defined in
|
||||
JSON. This allows you to embed queries into documents via the index api. Elasticsearch can extract the query from a
|
||||
document and make it available to the percolate api. Since documents are also defined as json, you can define a document
|
||||
in a request to the percolate api.
|
||||
JSON. This allows you to embed queries into documents via the index API. Elasticsearch can extract the query from a
|
||||
document and make it available to the percolate API. Since documents are also defined as JSON, you can define a document
|
||||
in a request to the percolate API.
|
||||
|
||||
The percolator and most of its features work in realtime, so once a percolate query is indexed it can immediately be used
|
||||
in the percolate api.
|
||||
in the percolate API.
|
||||
|
||||
[IMPORTANT]
|
||||
=====================================
|
||||
|
||||
Field referred to in a percolator query must *already* exist in the mapping
|
||||
Fields referred to in a percolator query must *already* exist in the mapping
|
||||
associated with the index used for percolation.
|
||||
There are two ways to make sure that a field mapping exist:
|
||||
|
||||
* Add or update a mapping via the <<indices-create-index,create index>> or
|
||||
<<indices-put-mapping,put mapping>> apis.
|
||||
<<indices-put-mapping,put mapping>> APIs.
|
||||
* Percolate a document before registering a query. Percolating a document can
|
||||
add field mappings dynamically, in the same way as happens when indexing a
|
||||
document.
|
||||
|
@ -29,7 +29,7 @@ There are two ways to make sure that a field mapping exist:
|
|||
=====================================
|
||||
|
||||
[float]
|
||||
=== Sample usage
|
||||
=== Sample Usage
|
||||
|
||||
Create an index with a mapping for the field `message`:
|
||||
|
||||
|
@ -96,10 +96,10 @@ The above request will yield the following response:
|
|||
<1> The percolate query with id `1` matches our document.
|
||||
|
||||
[float]
|
||||
=== Indexing percolator queries
|
||||
=== Indexing Percolator Queries
|
||||
|
||||
Percolate queries are stored as documents in a specific format and in an arbitrary index under a reserved type with the
|
||||
name `.percolator`. The query itself is placed as is in a json object under the top level field `query`.
|
||||
name `.percolator`. The query itself is placed as is in a JSON object under the top level field `query`.
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
@ -127,11 +127,11 @@ percolate documents by specific queries.
|
|||
}
|
||||
--------------------------------------------------
|
||||
|
||||
On top of this also a mapping type can be associated with this query. This allows to control how certain queries
|
||||
like range queries, shape filters and other query & filters that rely on mapping settings get constructed. This is
|
||||
On top of this, also a mapping type can be associated with this query. This allows to control how certain queries
|
||||
like range queries, shape filters, and other query & filters that rely on mapping settings get constructed. This is
|
||||
important since the percolate queries are indexed into the `.percolator` type, and the queries / filters that rely on
|
||||
mapping settings would yield unexpected behaviour. Note by default field names do get resolved in a smart manner,
|
||||
but in certain cases with multiple types this can lead to unexpected behaviour, so being explicit about it will help.
|
||||
mapping settings would yield unexpected behaviour. Note: By default, field names do get resolved in a smart manner,
|
||||
but in certain cases with multiple types this can lead to unexpected behavior, so being explicit about it will help.
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
@ -149,11 +149,11 @@ but in certain cases with multiple types this can lead to unexpected behaviour,
|
|||
}
|
||||
--------------------------------------------------
|
||||
|
||||
In the above example the range query gets really parsed into a Lucene numeric range query, based on the settings for
|
||||
In the above example the range query really gets parsed into a Lucene numeric range query, based on the settings for
|
||||
the field `created_at` in the type `tweet`.
|
||||
|
||||
Just as with any other type, the `.percolator` type has a mapping, which you can configure via the mappings apis.
|
||||
The default percolate mapping doesn't index the query field and only stores it.
|
||||
Just as with any other type, the `.percolator` type has a mapping, which you can configure via the mappings APIs.
|
||||
The default percolate mapping doesn't index the query field, only stores it.
|
||||
|
||||
Because `.percolate` is a type it also has a mapping. By default the following mapping is active:
|
||||
|
||||
|
@ -171,9 +171,9 @@ Because `.percolate` is a type it also has a mapping. By default the following m
|
|||
}
|
||||
--------------------------------------------------
|
||||
|
||||
If needed this mapping can be modified with the update mapping api.
|
||||
If needed, this mapping can be modified with the update mapping API.
|
||||
|
||||
In order to un-register a percolate query the delete api can be used. So if the previous added query needs to be deleted
|
||||
In order to un-register a percolate query the delete API can be used. So if the previous added query needs to be deleted
|
||||
the following delete request needs to be executed:
|
||||
|
||||
[source,js]
|
||||
|
@ -182,14 +182,14 @@ curl -XDELETE localhost:9200/my-index/.percolator/1
|
|||
--------------------------------------------------
|
||||
|
||||
[float]
|
||||
=== Percolate api
|
||||
=== Percolate API
|
||||
|
||||
The percolate api executes in a distributed manner, meaning it executes on all shards an index points to.
|
||||
The percolate API executes in a distributed manner, meaning it executes on all shards an index points to.
|
||||
|
||||
.Required options
|
||||
* `index` - The index that contains the `.percolator` type. This can also be an alias.
|
||||
* `type` - The type of the document to be percolated. The mapping of that type is used to parse the document.
|
||||
* `doc` - The actual document to percolate. Unlike the other two options this needs to be specified in the request body. Note this isn't required when percolating an existing document.
|
||||
* `doc` - The actual document to percolate. Unlike the other two options this needs to be specified in the request body. Note: This isn't required when percolating an existing document.
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
@ -206,8 +206,8 @@ curl -XGET 'localhost:9200/twitter/tweet/_percolate' -d '{
|
|||
that the percolate request only gets executed on the shard where the routing value is partitioned to. This means that
|
||||
the percolate request only gets executed on one shard instead of all shards. Multiple values can be specified as a
|
||||
comma separated string; in that case the request can be executed on more than one shard.
|
||||
* `preference` - Controls which shard replicas are preferred to execute the request on. Works the same as in the search api.
|
||||
* `ignore_unavailable` - Controls if missing concrete indices should silently be ignored. Same as is in the search api.
|
||||
* `preference` - Controls which shard replicas are preferred to execute the request on. Works the same as in the search API.
|
||||
* `ignore_unavailable` - Controls if missing concrete indices should silently be ignored. Same as is in the search API.
|
||||
* `percolate_format` - If `ids` is specified then the matches array in the percolate response will contain a string
|
||||
array of the matching ids instead of an array of objects. This can be useful to reduce the amount of data being sent
|
||||
back to the client. Obviously, if there are two percolator queries with the same id from different indices, there is no way
|
||||
|
@ -223,7 +223,7 @@ occurred for the filter to included the latest percolate queries.
|
|||
* `track_scores` - Whether the `_score` is included for each match. The `_score` is based on the query and represents
|
||||
how the query matched the *percolate query's metadata*, *not* how the document (that is being percolated) matched
|
||||
the query. The `query` option is required for this option. Defaults to `false`.
|
||||
* `sort` - Define a sort specification like in the search api. Currently only sorting `_score` reverse (default relevancy)
|
||||
* `sort` - Define a sort specification like in the search API. Currently only sorting `_score` reverse (default relevancy)
|
||||
is supported. Other sort fields will throw an exception. The `size` and `query` option are required for this setting. Like
|
||||
`track_scores`, the score is based on the query and represents how the query matched to the percolate query's metadata
|
||||
and *not* how the document being percolated matched to the query.
|
||||
|
@ -232,23 +232,23 @@ look at the aggregation documentation on how to define aggregations.
|
|||
* `highlight` - Allows highlight definitions to be included. The document being percolated is highlighted for each
|
||||
matching query. This allows you to see how each match is highlighting the document being percolated. See highlight
|
||||
documentation on how to define highlights. The `size` option is required for highlighting, the performance of highlighting
|
||||
in the percolate api depends of how many matches are being highlighted.
|
||||
in the percolate API depends of how many matches are being highlighted.
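
The scoring options described above can be combined in one request body. The following is a minimal sketch, not taken from the original docs: it assumes a node at `localhost:9200`, reuses the `twitter` index and `tweet` type from the other examples, and filters on a hypothetical metadata field `priority` stored alongside the registered queries. The body keys simply mirror the option names listed above.

```shell
# Assumed node address; adjust to your cluster.
ES_HOST="localhost:9200"

# Request body: the document to percolate plus the options that enable
# scoring. "priority" is a hypothetical metadata field on the stored queries.
PERCOLATE_BODY='{
  "doc": {
    "message": "some text"
  },
  "query": {
    "term": { "priority": "high" }
  },
  "size": 10,
  "track_scores": true
}'

# Defined as a function so nothing is sent until it is called
# against a running node.
percolate_scored() {
  curl -XGET "$ES_HOST/twitter/tweet/_percolate" -d "$PERCOLATE_BODY"
}
```

Calling `percolate_scored` sends the request. As noted above, the `_score` of each match reflects how `query` matched the percolate query's metadata, not how the document matched.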
|
||||
|
||||
[float]
|
||||
=== Dedicated percolator index
|
||||
=== Dedicated Percolator Index
|
||||
|
||||
Percolate queries can be added to any index. Instead of adding percolate queries to the index the data resides in,
|
||||
these queries can also be added to a dedicated index. The advantage of this is that this dedicated percolator index
|
||||
can have its own index settings (For example the number of primary and replicas shards). If you choose to have a dedicated
|
||||
can have its own index settings (For example the number of primary and replica shards). If you choose to have a dedicated
|
||||
percolate index, you need to make sure that the mappings from the normal index are also available on the percolate index.
|
||||
Otherwise percolate queries can be parsed incorrectly.
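
As an illustration of the paragraph above, a dedicated percolator index could be created with its own shard settings and a copy of the data index's mapping. This is a sketch under assumptions: the index name `my-percolator-index`, the type `my-type`, the shard counts, and the node address are all hypothetical, and the `message` mapping stands in for whatever mappings the normal index defines.

```shell
# Assumed node address and hypothetical index name.
ES_HOST="localhost:9200"
PERCOLATOR_INDEX="my-percolator-index"

# Own index settings, plus the same field mapping as the data index so
# that percolate queries are parsed the same way.
INDEX_BODY='{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 2
  },
  "mappings": {
    "my-type": {
      "properties": {
        "message": { "type": "string" }
      }
    }
  }
}'

# Defined as a function so nothing is sent until it is called
# against a running node.
create_percolator_index() {
  curl -XPUT "$ES_HOST/$PERCOLATOR_INDEX" -d "$INDEX_BODY"
}
```

Queries are then registered under the `.percolator` type of this index, and percolate requests target it instead of the data index.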
|
||||
|
||||
[float]
|
||||
=== Filtering Executed Queries
|
||||
|
||||
Filtering allows to reduce the number of queries, any filter that the search api supports, (expect the ones mentioned in important notes)
|
||||
can also be used in the percolate api. The filter only works on the metadata fields. The `query` field isn't indexed by
|
||||
default. Based on the query we indexed before the following filter can be defined:
|
||||
Filtering allows to reduce the number of queries, any filter that the search API supports, (except the ones mentioned in important notes)
|
||||
can also be used in the percolate API. The filter only works on the metadata fields. The `query` field isn't indexed by
|
||||
default. Based on the query we indexed before, the following filter can be defined:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
@ -265,9 +265,9 @@ curl -XGET localhost:9200/test/type1/_percolate -d '{
|
|||
--------------------------------------------------
|
||||
|
||||
[float]
|
||||
=== Percolator count api
|
||||
=== Percolator Count API
|
||||
|
||||
The count percolate api, only keeps track of the number of matches and doesn't keep track of the actual matches
|
||||
The count percolate API, only keeps track of the number of matches and doesn't keep track of the actual matches
|
||||
Example:
|
||||
|
||||
[source,js]
|
||||
|
@ -291,10 +291,10 @@ Response:
|
|||
|
||||
|
||||
[float]
|
||||
=== Percolating an existing document
|
||||
=== Percolating an Existing Document
|
||||
|
||||
In order to percolate in newly indexed document, the percolate existing document can be used. Based on the response
|
||||
from an index request the `_id` and other meta information can be used to immediately percolate the newly added
|
||||
In order to percolate a newly indexed document, the percolate existing document can be used. Based on the response
|
||||
from an index request, the `_id` and other meta information can be used to immediately percolate the newly added
|
||||
document.
|
||||
|
||||
.Supported options for percolating an existing document on top of existing percolator options:
|
||||
|
@ -307,8 +307,8 @@ document.
|
|||
* `percolate_preference` - Which shard to prefer when executing the percolate request.
|
||||
* `version` - Enables a version check. If the fetched document's version isn't equal to the specified version then the request fails with a version conflict and the percolation request is aborted.
|
||||
|
||||
Internally the percolate api will issue a get request for fetching the`_source` of the document to percolate.
|
||||
For this feature to work the `_source` for documents to be percolated need to be stored.
|
||||
Internally the percolate API will issue a GET request for fetching the `_source` of the document to percolate.
|
||||
For this feature to work, the `_source` for documents to be percolated needs to be stored.
|
||||
|
||||
[float]
|
||||
==== Example
|
||||
|
@ -326,20 +326,20 @@ Index response:
|
|||
}
|
||||
--------------------------------------------------
|
||||
|
||||
Percolating an existing document:
|
||||
Percolating an Existing Document:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XGET 'localhost:9200/my-index1/message/1/_percolate'
|
||||
--------------------------------------------------
|
||||
|
||||
The response is the same as with the regular percolate api.
|
||||
The response is the same as with the regular percolate API.
|
||||
|
||||
[float]
|
||||
=== Multi percolate api
|
||||
=== Multi Percolate API
|
||||
|
||||
The multi percolate api allows to bundle multiple percolate requests into a single request, similar to what the multi
|
||||
search api does to search requests. The request body format is line based. Each percolate request item takes two lines,
|
||||
The multi percolate API allows to bundle multiple percolate requests into a single request, similar to what the multi
|
||||
search API does to search requests. The request body format is line based. Each percolate request item takes two lines,
|
||||
the first line is the header and the second line is the body.
|
||||
|
||||
The header can contain any parameter that normally would be set via the request path or query string parameters.
|
||||
|
@ -369,7 +369,7 @@ Request:
|
|||
curl -XGET 'localhost:9200/twitter/tweet/_mpercolate' --data-binary @requests.txt; echo
|
||||
--------------------------------------------------
|
||||
|
||||
The index twitter is the default index and the type tweet is the default type and will be used in the case a header
|
||||
The index `twitter` is the default index, and the type `tweet` is the default type and will be used in the case a header
|
||||
doesn't specify an index or type.
|
||||
|
||||
requests.txt:
|
||||
|
@ -388,7 +388,7 @@ requests.txt:
|
|||
{}
|
||||
--------------------------------------------------
|
||||
|
||||
For a percolate existing document item (headers with the `id` field), the response can be an empty json object.
|
||||
For a percolate existing document item (headers with the `id` field), the response can be an empty JSON object.
|
||||
All the required options are set in the header.
|
||||
|
||||
Response:
|
||||
|
@ -473,22 +473,22 @@ Each item represents a percolate response, the order of the items maps to the or
|
|||
were specified. In case a percolate request failed, the item response is substituted with an error message.
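
To make the two-lines-per-item format concrete, here is a small sketch that writes a request file with two items and defines the call that would send it. The document text and the id `1` are placeholder values. The second item percolates an existing document, so its body is an empty JSON object.

```shell
# Each item is two lines: a header line, then a body line.
cat > requests.txt <<'EOF'
{"percolate": {"index": "twitter", "type": "tweet"}}
{"doc": {"message": "some text"}}
{"percolate": {"index": "twitter", "type": "tweet", "id": "1"}}
{}
EOF

# Defined as a function so nothing is sent until it is called
# against a running node (assumed at localhost:9200).
send_mpercolate() {
  curl -XGET 'localhost:9200/twitter/tweet/_mpercolate' --data-binary @requests.txt; echo
}
```

The response contains one item per header/body pair, in the same order as the pairs in `requests.txt`.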
|
||||
|
||||
[float]
|
||||
=== How it works under the hood
|
||||
=== How it Works Under the Hood
|
||||
|
||||
When indexing a document that contains a query in an index and the `.percolator` type the query part of the documents gets
|
||||
When indexing a document that contains a query in an index and the `.percolator` type, the query part of the documents gets
|
||||
parsed into a Lucene query and is kept in memory until that percolator document is removed or the index containing the
|
||||
`.percolator` type get removed. So all the active percolator queries are kept in memory.
|
||||
`.percolator` type gets removed. So, all the active percolator queries are kept in memory.
|
||||
|
||||
At percolate time the document specified in the request gets parsed into a Lucene document and is stored in a in-memory
|
||||
At percolate time, the document specified in the request gets parsed into a Lucene document and is stored in a in-memory
|
||||
Lucene index. This in-memory index can just hold this one document and it is optimized for that. Then all the queries
|
||||
that are registered to the index that the percolate request is targeted for are going to be executed on this single document
|
||||
that are registered to the index that the percolate request is targeted for, are going to be executed on this single document
|
||||
in-memory index. This happens on each shard the percolate request needs to execute.
|
||||
|
||||
By using `routing`, `filter` or `query` features the amount of queries that need to be executed can be reduced and thus
|
||||
the time the percolate api needs to run can be decreased.
|
||||
the time the percolate API needs to run can be decreased.
|
||||
|
||||
[float]
|
||||
=== Important notes
|
||||
=== Important Notes
|
||||
|
||||
Because the percolator API is processing one document at a time, it doesn't support queries and filters that run
|
||||
against child documents such as `has_child`, `has_parent` and `top_children`.
|
||||
|
@ -497,16 +497,16 @@ The `wildcard` and `regexp` query natively use a lot of memory and because the p
|
|||
this can easily take up the available memory in the heap space. If possible try to use a `prefix` query or ngramming to
|
||||
achieve the same result (with way less memory being used).
|
||||
|
||||
The delete-by-query api doesn't work to unregister a query, it only deletes the percolate documents from disk. In order
|
||||
The delete-by-query API doesn't work to unregister a query, it only deletes the percolate documents from disk. In order
|
||||
to update the registered queries in memory, the index needs to be closed and reopened.
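
The close and open step can be sketched as follows, assuming a node at `localhost:9200` and an index named `my-index` (both placeholders). It uses the open/close index API endpoints `_close` and `_open`.

```shell
# Assumed node address and index name.
ES_HOST="localhost:9200"
INDEX="my-index"

CLOSE_URL="$ES_HOST/$INDEX/_close"
OPEN_URL="$ES_HOST/$INDEX/_open"

# Close and then reopen the index so the in-memory percolator queries
# are reloaded. Nothing is sent until the function is called.
reload_percolator_queries() {
  curl -XPOST "$CLOSE_URL" && curl -XPOST "$OPEN_URL"
}
```

Note that the index is unavailable for reads and writes while it is closed.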
|
||||
|
||||
[float]
|
||||
=== Forcing unmapped fields to be handled as string
|
||||
=== Forcing Unmapped Fields to be Handled as Strings
|
||||
|
||||
In certain cases it is unknown what kind of percolator queries do get registered and if no field mapping exist for fields
|
||||
In certain cases it is unknown what kind of percolator queries do get registered, and if no field mapping exists for fields
|
||||
that are referred to by percolator queries, then adding a percolator query fails. This means the mapping needs to be updated
|
||||
to have the field with the appropriate settings and then the percolator query can be added. But sometimes it is sufficient
|
||||
to have the field with the appropriate settings, and then the percolator query can be added. But sometimes it is sufficient
|
||||
if all unmapped fields are handled as if these were default string fields. In those cases one can configure the
|
||||
`index.percolator.map_unmapped_fields_as_string` setting to `true` (defaults to `false`), and then if a field referred to in
|
||||
a percolator query does not exist, it will be handled as a default string field, so adding the percolator query doesn't
|
||||
a percolator query does not exist, it will be handled as a default string field so that adding the percolator query doesn't
|
||||
fail.
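
One way to apply this setting is at index creation time. The sketch below assumes a node at `localhost:9200` and a new index named `my-index`; both are placeholders, and only the setting name comes from the text above.

```shell
# Assumed node address and index name.
ES_HOST="localhost:9200"
INDEX="my-index"

# Treat unmapped fields in percolator queries as default string fields.
SETTINGS_BODY='{
  "settings": {
    "index.percolator.map_unmapped_fields_as_string": true
  }
}'

# Defined as a function so nothing is sent until it is called
# against a running node.
create_index() {
  curl -XPUT "$ES_HOST/$INDEX" -d "$SETTINGS_BODY"
}
```

With the setting enabled, registering a percolator query that refers to an unmapped field should no longer fail, although that field is then handled as a plain string rather than with a type-specific mapping.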
|
||||
|
|