[[modules-cross-cluster-search]]
== Search across clusters

*{ccs-cap}* lets you run a single search request against one or more
<<modules-remote-clusters,remote clusters>>. For example, you can use a {ccs} to
filter and analyze log data stored on clusters in different data centers.

IMPORTANT: {ccs-cap} requires <<modules-remote-clusters, remote clusters>>.

[discrete]
[[ccs-supported-apis]]
=== Supported APIs

The following APIs support {ccs}:

* <<search-search,Search>>
* <<search-multi-search,Multi search>>
* <<search-template,Search template>>
* <<multi-search-template,Multi search template>>

[discrete]
[[ccs-example]]
=== {ccs-cap} examples

[discrete]
[[ccs-remote-cluster-setup]]
==== Remote cluster setup

To perform a {ccs}, you must have at least one remote cluster configured.

The following <<cluster-update-settings,cluster update settings>> API request
adds three remote clusters: `cluster_one`, `cluster_two`, and `cluster_three`.

[source,console]
--------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": {
          "seeds": [
            "127.0.0.1:9300"
          ]
        },
        "cluster_two": {
          "seeds": [
            "127.0.0.1:9301"
          ]
        },
        "cluster_three": {
          "seeds": [
            "127.0.0.1:9302"
          ]
        }
      }
    }
  }
}
--------------------------------
// TEST[setup:host]
// TEST[s/127.0.0.1:930\d+/\${transport_host}/]
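
If you register many remote clusters, you may prefer to generate the request
body rather than write it by hand. The following Python sketch is an
illustration only: it assembles the same `persistent.cluster.remote` structure
from a mapping of cluster aliases to seed addresses and sends nothing to the
cluster. The helper name `remote_cluster_settings` is hypothetical and is not
part of any Elasticsearch client.

```python
import json


def remote_cluster_settings(seeds_by_alias):
    """Build a cluster update settings body that registers remote clusters.

    `seeds_by_alias` maps a cluster alias to a list of "host:port" seeds.
    Hypothetical helper: it only assembles a dict and performs no request.
    """
    remote = {
        alias: {"seeds": list(seeds)}
        for alias, seeds in seeds_by_alias.items()
    }
    return {"persistent": {"cluster": {"remote": remote}}}


# Same aliases and seed addresses as the request above.
body = remote_cluster_settings({
    "cluster_one": ["127.0.0.1:9300"],
    "cluster_two": ["127.0.0.1:9301"],
    "cluster_three": ["127.0.0.1:9302"],
})
print(json.dumps(body, indent=2))
```
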

[discrete]
[[ccs-search-remote-cluster]]
==== Search a single remote cluster

The following <<search-search,search>> API request searches the
`my-index-000001` index on a single remote cluster, `cluster_one`.

[source,console]
--------------------------------------------------
GET /cluster_one:my-index-000001/_search
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  },
  "_source": ["user.id", "message", "http.response.status_code"]
}
--------------------------------------------------
// TEST[continued]
// TEST[setup:my_index]

The API returns the following response:

[source,console-result]
--------------------------------------------------
{
  "took": 150,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 1,
    "successful": 1,
    "skipped": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster_one:my-index-000001", <1>
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": {
            "id": "kimchy"
          },
          "message": "GET /search HTTP/1.1 200 1070000",
          "http": {
            "response": {
              "status_code": 200
            }
          }
        }
      }
    ]
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
// TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
// TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]

<1> The search response body includes the name of the remote cluster in the
`_index` field.
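
Client code that processes hits may need to separate the cluster alias from the
index name. The following Python sketch shows one way to do that. It assumes
the `cluster_alias:index` form shown in the response above, and it assumes that
a value without a colon refers to an index on the local cluster. The helper
name `split_remote_index` is hypothetical and is not part of any Elasticsearch
client.

```python
def split_remote_index(index_value):
    """Split a hit's `_index` value into (cluster_alias, index_name).

    Assumes the alias and the index name are separated by the first colon.
    Returns None as the alias when the value contains no colon, which this
    sketch treats as a local index.
    """
    alias, sep, name = index_value.partition(":")
    if not sep:
        return None, index_value
    return alias, name


remote_hit = split_remote_index("cluster_one:my-index-000001")
local_hit = split_remote_index("my-index-000001")
print(remote_hit)
print(local_hit)
```
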

[discrete]
[[ccs-search-multi-remote-cluster]]
==== Search multiple remote clusters

The following <<search,search>> API request searches the `my-index-000001` index on
three clusters:

* Your local cluster
* Two remote clusters, `cluster_one` and `cluster_two`

[source,console]
--------------------------------------------------
GET /my-index-000001,cluster_one:my-index-000001,cluster_two:my-index-000001/_search
{
  "query": {
    "match": {
      "user.id": "kimchy"
    }
  },
  "_source": ["user.id", "message", "http.response.status_code"]
}
--------------------------------------------------
// TEST[continued]

The API returns the following response:

[source,console-result]
--------------------------------------------------
{
  "took": 150,
  "timed_out": false,
  "num_reduce_phases": 4,
  "_shards": {
    "total": 3,
    "successful": 3,
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"failed": 0,
|
|
|
|
"skipped": 0
|
2017-01-05 10:10:34 -05:00
|
|
|
},
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"_clusters": {
|
2019-08-15 13:23:25 -04:00
|
|
|
"total": 3,
|
|
|
|
"successful": 3,
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"skipped": 0
|
|
|
|
},
|
|
|
|
"hits": {
|
2018-12-05 13:49:06 -05:00
|
|
|
"total" : {
|
2019-08-15 13:23:25 -04:00
|
|
|
"value": 3,
|
2018-12-05 13:49:06 -05:00
|
|
|
"relation": "eq"
|
|
|
|
},
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"max_score": 1,
|
|
|
|
"hits": [
|
2017-01-05 10:10:34 -05:00
|
|
|
{
|
2020-08-03 13:31:19 -04:00
|
|
|
"_index": "my-index-000001", <1>
|
2017-12-14 11:47:53 -05:00
|
|
|
"_type": "_doc",
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"_id": "0",
|
2019-02-15 07:44:55 -05:00
|
|
|
"_score": 2,
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"_source": {
|
2020-08-03 13:31:19 -04:00
|
|
|
"user": {
|
|
|
|
"id": "kimchy"
|
|
|
|
},
|
|
|
|
"message": "GET /search HTTP/1.1 200 1070000",
|
|
|
|
"http": {
|
|
|
|
"response":
|
|
|
|
{
|
|
|
|
"status_code": 200
|
|
|
|
}
|
|
|
|
}
|
2017-01-05 10:10:34 -05:00
|
|
|
}
|
|
|
|
},
|
|
|
|
{
|
2020-08-03 13:31:19 -04:00
|
|
|
"_index": "cluster_one:my-index-000001", <2>
|
2019-08-15 13:23:25 -04:00
|
|
|
"_type": "_doc",
|
|
|
|
"_id": "0",
|
|
|
|
"_score": 1,
|
|
|
|
"_source": {
|
2020-08-03 13:31:19 -04:00
|
|
|
"user": {
|
|
|
|
"id": "kimchy"
|
|
|
|
},
|
|
|
|
"message": "GET /search HTTP/1.1 200 1070000",
|
|
|
|
"http": {
|
|
|
|
"response":
|
|
|
|
{
|
|
|
|
"status_code": 200
|
|
|
|
}
|
|
|
|
}
|
2019-08-15 13:23:25 -04:00
|
|
|
}
|
|
|
|
},
|
|
|
|
{
|
2020-08-03 13:31:19 -04:00
|
|
|
"_index": "cluster_two:my-index-000001", <3>
|
2017-12-14 11:47:53 -05:00
|
|
|
"_type": "_doc",
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"_id": "0",
|
2019-02-15 07:44:55 -05:00
|
|
|
"_score": 1,
|
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up once the cross cluster search is run. Otherwise the whole search request fails despite some of the data (either local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, in case e.g. remote shards are missing, which is not going to fail the whole request but rather yield partial results, and the _shards section in the response will indicate that.
This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows to skip certain clusters if they are down when trying to reach them through a cross cluster search requests. By default all clusters are mandatory.
Scroll requests support such setting too when they are first initiated (first search request with scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters went down meanwhile.
The search API response contains now a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:
"_clusters" : {
"total" : 3,
"successful" : 2,
"skipped" : 1
}
Such section won't be part of the response if no clusters have been skipped.
The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 05:41:47 -05:00
|
|
|
"_source": {
|
2020-08-03 13:31:19 -04:00
|
|
|
"user": {
|
|
|
|
"id": "kimchy"
|
|
|
|
},
|
|
|
|
"message": "GET /search HTTP/1.1 200 1070000",
|
|
|
|
"http": {
|
|
|
|
"response":
|
|
|
|
{
|
|
|
|
"status_code": 200
|
|
|
|
}
|
|
|
|
}
|
2017-01-05 10:10:34 -05:00
|
|
|
}
|
|
|
|
}
|
|
|
|
]
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
// TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
// TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
// TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]
// TESTRESPONSE[s/"_score": 2/"_score": "$body.hits.hits.1._score"/]

<1> This document's `_index` parameter doesn't include a cluster name. This
means the document came from the local cluster.
<2> This document came from `cluster_one`.
<3> This document came from `cluster_two`.
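
A client can apply the same convention to work out where each hit came from.
The following Python sketch is illustrative only: `split_index` is a
hypothetical helper, not part of any {es} client, and it assumes that a remote
hit's `_index` value has the form `<cluster alias>:<index name>`, as in the
response above.

```python
from typing import Optional, Tuple


def split_index(qualified_index: str) -> Tuple[Optional[str], str]:
    """Split a hit's `_index` value into (cluster alias, index name).

    Assumption: the alias and index name are separated by the first colon.
    The alias is None when the hit came from the local cluster.
    """
    alias, separator, index = qualified_index.partition(":")
    if not separator:
        # No colon, so there is no cluster prefix: a local hit.
        return None, qualified_index
    return alias, index


# `_index` values copied from the example response.
hits = [
    {"_index": "my-index-000001"},
    {"_index": "cluster_one:my-index-000001"},
    {"_index": "cluster_two:my-index-000001"},
]
for hit in hits:
    cluster, index = split_index(hit["_index"])
    print(cluster or "(local)", index)
```

Returning `None` for local hits keeps the local cluster distinguishable from a
remote cluster alias without reserving a special alias name.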

[discrete]
[[skip-unavailable-clusters]]
=== Skip unavailable clusters

By default, a {ccs} returns an error if *any* cluster in the request is
unavailable.

To skip an unavailable cluster during a {ccs}, set the
<<skip-unavailable,`skip_unavailable`>> cluster setting to `true`.

The following <<cluster-update-settings,cluster update settings>> API request
changes `cluster_two`'s `skip_unavailable` setting to `true`.

[source,console]
--------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.cluster_two.skip_unavailable": true
  }
}
--------------------------------
// TEST[continued]

If `cluster_two` is disconnected or unavailable during a {ccs}, {es} won't
include matching documents from that cluster in the final results.
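
Because skipped clusters don't cause an error, a client may want to check
whether the results are complete. The sketch below is a hypothetical Python
helper, not an {es} client API. It assumes the search response has already been
parsed into a dictionary and that its `_clusters` section carries the `total`,
`successful`, and `skipped` counts shown in the earlier example response.

```python
from typing import Any, Dict


def skipped_cluster_count(response: Dict[str, Any]) -> int:
    """Return the number of skipped clusters, or 0 if `_clusters` is absent."""
    return response.get("_clusters", {}).get("skipped", 0)


# Hypothetical response fragment for a search in which one cluster was skipped.
response = {"_clusters": {"total": 3, "successful": 2, "skipped": 1}}

if skipped_cluster_count(response) > 0:
    print("Results may be incomplete: at least one cluster was skipped.")
```

Treating a missing `_clusters` section as zero skipped clusters is a choice made
for this sketch so that the helper also accepts responses that omit the section.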

[discrete]
[[ccs-gateway-seed-nodes]]
=== Selecting gateway and seed nodes in sniff mode

For remote clusters using the <<sniff-mode,sniff connection>> mode, gateway and
seed nodes need to be accessible from the local cluster via your network.

By default, any non-<<master-node,master-eligible>> node can act as a
gateway node. Optionally, you can define the gateway nodes for a cluster by
setting `cluster.remote.node.attr.gateway` to `true`.

For {ccs}, we recommend you use gateway nodes that are capable of serving as
<<coordinating-node,coordinating nodes>> for search requests. Optionally, the
seed nodes for a cluster can be a subset of these gateway nodes.

[discrete]
[[ccs-proxy-mode]]
=== {ccs-cap} in proxy mode

<<proxy-mode,Proxy mode>> remote cluster connections support {ccs}. All remote
connections connect to the configured `proxy_address`. Any desired connection
routing to gateway or <<coordinating-node,coordinating nodes>> must
be implemented by the intermediate proxy at this configured address.

[discrete]
[[ccs-network-delays]]
=== How {ccs} handles network delays

Because {ccs} involves sending requests to remote clusters, any network delays
can impact search speed. To avoid slow searches, {ccs} offers two options for
handling network delays:

<<ccs-min-roundtrips,Minimize network roundtrips>>::
By default, {es} reduces the number of network roundtrips between remote
clusters. This reduces the impact of network delays on search speed. However,
{es} can't reduce network roundtrips for large search requests, such as those
including a <<scroll-search-results, scroll>> or
<<request-body-search-inner-hits,inner hits>>.
+
See <<ccs-min-roundtrips>> to learn how this option works.

<<ccs-unmin-roundtrips, Don't minimize network roundtrips>>:: For search
requests that include a scroll or inner hits, {es} sends multiple outgoing and
incoming requests to each remote cluster. You can also choose this option by
setting the <<ccs-minimize-roundtrips,`ccs_minimize_roundtrips`>> parameter to
`false`. While typically slower, this approach may work well for networks with
low latency.
+
See <<ccs-unmin-roundtrips>> to learn how this option works.
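
As a rough illustration of choosing between the two options per request, the
following Python sketch builds a search path that sets the
`ccs_minimize_roundtrips` query parameter. `ccs_search_path` is a hypothetical
helper rather than part of any {es} client, and the index names are the ones
used in the examples on this page.

```python
from typing import List
from urllib.parse import urlencode


def ccs_search_path(targets: List[str], minimize_roundtrips: bool = True) -> str:
    """Build a search API path for the given local and remote index targets."""
    # Remote targets use the `<cluster alias>:<index name>` form.
    index_expression = ",".join(targets)
    query = urlencode(
        {"ccs_minimize_roundtrips": "true" if minimize_roundtrips else "false"}
    )
    return f"/{index_expression}/_search?{query}"


path = ccs_search_path(
    ["my-index-000001", "cluster_one:my-index-000001"],
    minimize_roundtrips=False,
)
print(path)
```

Keeping the flag as an explicit argument makes it easy to compare both modes
against the same targets when measuring search speed on your own network.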

[discrete]
[[ccs-min-roundtrips]]
==== Minimize network roundtrips

Here's how {ccs} works when you minimize network roundtrips.

. You send a {ccs} request to your local cluster. A coordinating node in that
cluster receives and parses the request.
+
image:images/ccs/ccs-min-roundtrip-client-request.svg[]

. The coordinating node sends a single search request to each cluster, including
the local cluster. Each cluster performs the search request independently,
applying its own cluster-level settings to the request.
+
image:images/ccs/ccs-min-roundtrip-cluster-search.svg[]

. Each remote cluster sends its search results back to the coordinating node.
+
image:images/ccs/ccs-min-roundtrip-cluster-results.svg[]

. After collecting results from each cluster, the coordinating node returns the
final results in the {ccs} response.
+
image:images/ccs/ccs-min-roundtrip-client-response.svg[]
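
The steps above can be pictured with a toy Python model. It is a deliberately
simplified sketch, not how {es} is implemented: every function name is
hypothetical, each cluster is just an in-memory list of scored hits, the
`local` key is an assumption standing in for the local cluster, and the final
merge simply sorts by `_score`.

```python
from typing import Dict, List


def search_cluster(docs: List[dict], size: int) -> List[dict]:
    """Stand-in for one cluster running the search independently (step 2)."""
    return sorted(docs, key=lambda doc: doc["_score"], reverse=True)[:size]


def minimized_ccs(clusters: Dict[str, List[dict]], size: int) -> List[dict]:
    """Send one request per cluster, then merge on the coordinating node."""
    collected = []
    for alias, docs in clusters.items():  # one roundtrip per cluster
        for hit in search_cluster(docs, size):  # step 3: results come back
            index = hit["_index"]
            if alias != "local":
                # Remote hits are reported with a cluster alias prefix.
                index = f"{alias}:{index}"
            collected.append({**hit, "_index": index})
    # Step 4: the coordinating node builds the final result list.
    return sorted(collected, key=lambda hit: hit["_score"], reverse=True)[:size]


clusters = {
    "local": [{"_index": "my-index-000001", "_id": "0", "_score": 2.0}],
    "cluster_one": [{"_index": "my-index-000001", "_id": "0", "_score": 1.5}],
    "cluster_two": [{"_index": "my-index-000001", "_id": "0", "_score": 1.0}],
}
for hit in minimized_ccs(clusters, size=2):
    print(hit["_index"], hit["_score"])
```

The point of the model is the shape of the traffic: the number of requests
grows with the number of clusters, not with the number of shards.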

[discrete]
[[ccs-unmin-roundtrips]]
==== Don't minimize network roundtrips

Here's how {ccs} works when you don't minimize network roundtrips.

. You send a {ccs} request to your local cluster. A coordinating node in that
cluster receives and parses the request.
+
image:images/ccs/ccs-min-roundtrip-client-request.svg[]

. The coordinating node sends a <<search-shards,search shards>> API request to
each remote cluster.
+
image:images/ccs/ccs-min-roundtrip-cluster-search.svg[]

. Each remote cluster sends its response back to the coordinating node.
This response contains information about the indices and shards the {ccs}
request will be executed on.
+
image:images/ccs/ccs-min-roundtrip-cluster-results.svg[]

. The coordinating node sends a search request to each shard, including those in
its own cluster. Each shard performs the search request independently.
+
[WARNING]
====
When network roundtrips aren't minimized, the search is executed as if all data
were in the coordinating node's cluster. We recommend updating cluster-level
settings that limit searches, such as `action.search.shard_count.limit`,
`pre_filter_shard_size`, and `max_concurrent_shard_requests`, to account for
this. If these limits are too low, the search may be rejected.
====
+
image:images/ccs/ccs-dont-min-roundtrip-shard-search.svg[]

. Each shard sends its search results back to the coordinating node.
+
image:images/ccs/ccs-dont-min-roundtrip-shard-results.svg[]

. After collecting results from each cluster, the coordinating node returns the
final results in the {ccs} response.
+
image:images/ccs/ccs-min-roundtrip-client-response.svg[]
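
The limits named in the warning above could be adjusted along the following
lines. This is a sketch, not a recommendation: the values `2000`, `256`, and
`10` are arbitrary, and the index name `my-index-000001` is hypothetical. The
first request updates `action.search.shard_count.limit` on the coordinating
node's cluster. The second passes `pre_filter_shard_size` and
`max_concurrent_shard_requests` as search request parameters.

[source,console]
--------------------------------
PUT _cluster/settings
{
  "persistent": {
    "action.search.shard_count.limit": 2000
  }
}

GET /cluster_one:my-index-000001/_search?ccs_minimize_roundtrips=false&pre_filter_shard_size=256&max_concurrent_shard_requests=10
{
  "query": {
    "match_all": {}
  }
}
--------------------------------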

[discrete]
[[ccs-supported-configurations]]
=== Supported configurations

Generally, <<gateway-nodes-selection, cross cluster search>> can search remote
clusters that are one major version ahead of or behind the coordinating node's
version. Cross cluster search can also search remote clusters that are being
<<rolling-upgrades, upgraded>> so long as both the "upgrade from" and
"upgrade to" versions are compatible with the gateway node.

For example, a coordinating node running {es} 5.6 can search a remote cluster
running {es} 6.8, but that cluster cannot be upgraded to 7.1. In this case,
you should first upgrade the coordinating node to 7.1 and then upgrade the
remote cluster.
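
Before planning an upgrade, it can help to confirm which version each cluster
runs. As a sketch, send the first request below to each cluster. The
`version.number` field of the response reports the {es} version of the node
that answered. The second request, sent to the local cluster, lists the remote
clusters it is configured to connect to.

[source,console]
--------------------------------
GET /

GET /_remote/info
--------------------------------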

WARNING: Running multiple versions of {es} in the same cluster beyond the
duration of an upgrade is not supported.