[[modules-cross-cluster-search]]
== {ccs-cap}

The _{ccs}_ feature allows any node to act as a federated client across
multiple clusters. A {ccs} node does not join the remote cluster; instead, it
connects to the remote cluster in a lightweight fashion in order to execute
federated search requests. For details on communication and compatibility
between different clusters, see <>.

[float]
=== Using {ccs}

{ccs-cap} requires <>.

[source,js]
--------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": {
          "seeds": [
            "127.0.0.1:9300"
          ]
        },
        "cluster_two": {
          "seeds": [
            "127.0.0.1:9301"
          ]
        },
        "cluster_three": {
          "seeds": [
            "127.0.0.1:9302"
          ]
        }
      }
    }
  }
}
--------------------------------
// CONSOLE
// TEST[setup:host]
// TEST[s/127.0.0.1:9300/\${transport_host}/]

To search the `twitter` index on remote cluster `cluster_one`, the index name
must be prefixed with the alias of the remote cluster followed by the `:`
character:

[source,js]
--------------------------------------------------
GET /cluster_one:twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
// TEST[setup:twitter]

[source,js]
--------------------------------------------------
{
  "took": 150,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 1,
    "successful": 1,
    "skipped": 0
  },
  "hits": {
    "total" : {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
// TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
// TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]

Indices with the same name on different
clusters can also be searched:

[source,js]
--------------------------------------------------
GET /cluster_one:twitter,twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

Search results are disambiguated the same way as the indices are disambiguated
in the request. Indices with the same name are treated as different indices
when results are merged. The `_index` of every hit retrieved from a remote
cluster is prefixed with the corresponding cluster alias:

[source,js]
--------------------------------------------------
{
  "took": 150,
  "timed_out": false,
  "num_reduce_phases": 3,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 2,
    "successful": 2,
    "skipped": 0
  },
  "hits": {
    "total" : {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 2,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      },
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
// TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
// TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]
// TESTRESPONSE[s/"_score": 2/"_score": "$body.hits.hits.1._score"/]

[float]
=== Skipping disconnected clusters

By default, all remote clusters that are searched via {ccs} need to be
available when the search request is executed. Otherwise, the whole request
fails; even if some of the clusters are available, no search results are
returned. You can use the boolean `skip_unavailable` setting to make remote
clusters optional. By default, it is set to `false`.


[source,js]
--------------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.cluster_two.skip_unavailable": true <1>
  }
}
--------------------------------
// CONSOLE
// TEST[continued]
<1> `cluster_two` is made optional

[source,js]
--------------------------------------------------
GET /cluster_one:twitter,cluster_two:twitter,twitter/_search <1>
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
<1> Search against the `twitter` index in `cluster_one`, `cluster_two` and also locally

[source,js]
--------------------------------------------------
{
  "took": 150,
  "timed_out": false,
  "num_reduce_phases": 3,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": { <1>
    "total": 3,
    "successful": 2,
    "skipped": 1
  },
  "hits": {
    "total" : {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 2,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      },
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 150/"took": "$body.took"/]
// TESTRESPONSE[s/"max_score": 1/"max_score": "$body.hits.max_score"/]
// TESTRESPONSE[s/"_score": 1/"_score": "$body.hits.hits.0._score"/]
// TESTRESPONSE[s/"_score": 2/"_score": "$body.hits.hits.1._score"/]
<1> The `_clusters` section indicates that one cluster was unavailable and got skipped

[float]
[[ccs-reduction]]
=== {ccs-cap} reduction phase

{ccs-cap} (CCS) requests can be executed in two ways:

- the CCS coordinating node minimizes network round-trips by sending one search
request to each cluster. Each cluster performs the search independently,
reducing and fetching results.
Once the CCS node has received all the responses, it performs another
reduction and returns the relevant results back to the user. This strategy is
beneficial when there is network latency between the CCS coordinating node and
the remote clusters involved, which is typically the case. A single request is
sent to each remote cluster, at the cost of retrieving `from` + `size` already
fetched results. This is the default strategy, used whenever possible. It is
not supported when a scroll is provided or when inner hits are requested as
part of field collapsing; in those cases network round-trips cannot be
minimized and the following strategy is used instead.

- the CCS coordinating node sends a <> request to each remote cluster, in
order to collect information about the remote indices involved in the search
request and the shards where their data is located. Once each cluster has
responded to that request, the search executes as if all shards were part of
the same cluster. The coordinating node sends one request to each shard
involved; each shard executes the query and returns its own results, which are
then reduced (and fetched, depending on the <>) by the CCS coordinating node.
This strategy may be beneficial whenever there is very low network latency
between the CCS coordinating node and the remote clusters involved, as it
treats all shards the same, at the cost of sending many requests to each remote
cluster, which is problematic in the presence of network latency.

The <> supports the `ccs_minimize_roundtrips` parameter, which defaults to
`true` and can be set to `false` in case minimizing network round-trips is not
desirable.
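
As an illustration, the request below is a minimal sketch of opting out of the
default strategy. It reuses the `cluster_one` alias and the `twitter` index
from the earlier examples, and it assumes that `ccs_minimize_roundtrips` is
passed as a query-string parameter of the search request, like other search
options.

[source,js]
--------------------------------------------------
GET /cluster_one:twitter,twitter/_search?ccs_minimize_roundtrips=false
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

With the parameter set to `false`, the coordinating node would be expected to
query each shard directly, as described in the second strategy above, rather
than send one search request per cluster.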