diff --git a/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt b/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt
index caca9264..9db400f1 100644
--- a/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt
+++ b/.github/vale/styles/Vocab/OpenSearch/Words/accept.txt
@@ -75,6 +75,7 @@ Levenshtein
[Mm]ultivalued
[Mm]ultiword
[Nn]amespace
+[Oo]versamples?
pebibyte
[Pp]luggable
[Pp]reconfigure
diff --git a/_search-plugins/search-pipelines/collapse-processor.md b/_search-plugins/search-pipelines/collapse-processor.md
new file mode 100644
index 00000000..cea0a153
--- /dev/null
+++ b/_search-plugins/search-pipelines/collapse-processor.md
@@ -0,0 +1,144 @@
+---
+layout: default
+title: Collapse
+nav_order: 7
+has_children: false
+parent: Search processors
+grand_parent: Search pipelines
+---
+
+# Collapse processor
+
+The `collapse` response processor discards hits that have the same value for a particular field as a previous document in the result set.
+This is similar to passing the `collapse` parameter in a search request, but the response processor is applied to the
+response after fetching from all shards. The `collapse` response processor may be used in conjunction with the `rescore` search
+request parameter or may be applied after a reranking response processor.
+
+Using the `collapse` response processor will likely result in fewer than `size` results being returned because hits are discarded
+from a set whose size is already less than or equal to `size`. To increase the likelihood of returning `size` hits, use the
+`oversample` request processor and `truncate_hits` response processor, as shown in [this example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/truncate-hits-processor/#oversample-collapse-and-truncate-hits).
+
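Conceptually, the processor walks the returned hits in order and keeps only the first hit for each distinct value of the collapse field. The following Python sketch illustrates that logic (the `hits` structure is a simplified stand-in for an actual search response, not OpenSearch code):

```python
def collapse_hits(hits, field):
    """Keep only the first hit for each distinct value of `field`."""
    seen = set()
    collapsed = []
    for hit in hits:
        value = hit["_source"].get(field)
        if value in seen:
            continue  # a higher-ranked hit already used this field value
        seen.add(value)
        collapsed.append(hit)
    return collapsed

hits = [
    {"_id": "1", "_source": {"color": "blue"}},
    {"_id": "2", "_source": {"color": "blue"}},
    {"_id": "3", "_source": {"color": "red"}},
]
print([h["_id"] for h in collapse_hits(hits, "color")])  # ['1', '3']
```

Because later duplicates are simply dropped, the collapsed result can contain fewer hits than were fetched, which is why oversampling is useful.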
+## Request fields
+
+The following table lists all request fields.
+
+Field | Data type | Description
+:--- | :--- | :---
+`field` | String | The field whose value will be read from each returned search hit. Only the first hit for each given field value will be returned in the search response. Required.
+`context_prefix` | String | May be used to read the `original_size` variable from a specific scope in order to avoid collisions. Optional.
+`tag` | String | The processor's identifier. Optional.
+`description` | String | A description of the processor. Optional.
+`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
+
+## Example
+
+The following example demonstrates using a search pipeline with a `collapse` processor.
+
+### Setup
+
+Create many documents containing a field to use for collapsing:
+
+```json
+POST /_bulk
+{ "create":{"_index":"my_index","_id":1}}
+{ "title" : "document 1", "color":"blue" }
+{ "create":{"_index":"my_index","_id":2}}
+{ "title" : "document 2", "color":"blue" }
+{ "create":{"_index":"my_index","_id":3}}
+{ "title" : "document 3", "color":"red" }
+{ "create":{"_index":"my_index","_id":4}}
+{ "title" : "document 4", "color":"red" }
+{ "create":{"_index":"my_index","_id":5}}
+{ "title" : "document 5", "color":"yellow" }
+{ "create":{"_index":"my_index","_id":6}}
+{ "title" : "document 6", "color":"yellow" }
+{ "create":{"_index":"my_index","_id":7}}
+{ "title" : "document 7", "color":"orange" }
+{ "create":{"_index":"my_index","_id":8}}
+{ "title" : "document 8", "color":"orange" }
+{ "create":{"_index":"my_index","_id":9}}
+{ "title" : "document 9", "color":"green" }
+{ "create":{"_index":"my_index","_id":10}}
+{ "title" : "document 10", "color":"green" }
+```
+{% include copy-curl.html %}
+
+Create a pipeline that collapses results on the `color` field:
+
+```json
+PUT /_search/pipeline/collapse_pipeline
+{
+ "response_processors": [
+ {
+ "collapse" : {
+ "field": "color"
+ }
+ }
+ ]
+}
+```
+{% include copy-curl.html %}
+
+### Using a search pipeline
+
+In this example, you request the top three documents before collapsing on the `color` field. Because the first two documents have the same `color`, the second one is discarded,
+and the request returns the first and third documents:
+
+```json
+POST /my_index/_search?search_pipeline=collapse_pipeline
+{
+ "size": 3
+}
+```
+{% include copy-curl.html %}
+
+The response contains two hits:
+
+```json
+{
+ "took" : 2,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 1",
+ "color" : "blue"
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 3",
+ "color" : "red"
+ }
+ }
+ ]
+ },
+ "profile" : {
+ "shards" : [ ]
+ }
+}
+```
+
diff --git a/_search-plugins/search-pipelines/oversample-processor.md b/_search-plugins/search-pipelines/oversample-processor.md
new file mode 100644
index 00000000..698d9572
--- /dev/null
+++ b/_search-plugins/search-pipelines/oversample-processor.md
@@ -0,0 +1,292 @@
+---
+layout: default
+title: Oversample
+nav_order: 17
+has_children: false
+parent: Search processors
+grand_parent: Search pipelines
+---
+
+# Oversample processor
+
+The `oversample` request processor multiplies the `size` parameter of the search request by a specified `sample_factor` (>= 1.0), saving the original value in the `original_size` pipeline variable. The `oversample` processor is designed to work with the [`truncate_hits` response processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/truncate-hits-processor/) but may be used on its own.
+
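The size transformation itself is simple arithmetic: the requested `size` is multiplied by `sample_factor` and rounded up to the next integer, while the original value is saved for later processors. A minimal sketch of that computation (an illustrative helper, not the actual implementation):

```python
import math

def oversample(size, sample_factor):
    """Return (new_size, original_size) as the oversample processor computes them."""
    if sample_factor < 1.0:
        raise ValueError("sample_factor must be >= 1.0")
    return math.ceil(size * sample_factor), size

print(oversample(5, 1.5))  # (8, 5) -- 5 * 1.5 = 7.5, rounded up to 8
```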
+## Request fields
+
+The following table lists all request fields.
+
+Field | Data type | Description
+:--- | :--- | :---
+`sample_factor` | Float | The multiplicative factor (>= 1.0) that will be applied to the `size` parameter before processing the search request. Required.
+`context_prefix` | String | May be used to scope the `original_size` variable in order to avoid collisions. Optional.
+`tag` | String | The processor's identifier. Optional.
+`description` | String | A description of the processor. Optional.
+`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
+
+
+## Example
+
+The following example demonstrates using a search pipeline with an `oversample` processor.
+
+### Setup
+
+Create an index named `my_index` containing many documents:
+
+```json
+POST /_bulk
+{ "create":{"_index":"my_index","_id":1}}
+{ "doc": { "title" : "document 1" }}
+{ "create":{"_index":"my_index","_id":2}}
+{ "doc": { "title" : "document 2" }}
+{ "create":{"_index":"my_index","_id":3}}
+{ "doc": { "title" : "document 3" }}
+{ "create":{"_index":"my_index","_id":4}}
+{ "doc": { "title" : "document 4" }}
+{ "create":{"_index":"my_index","_id":5}}
+{ "doc": { "title" : "document 5" }}
+{ "create":{"_index":"my_index","_id":6}}
+{ "doc": { "title" : "document 6" }}
+{ "create":{"_index":"my_index","_id":7}}
+{ "doc": { "title" : "document 7" }}
+{ "create":{"_index":"my_index","_id":8}}
+{ "doc": { "title" : "document 8" }}
+{ "create":{"_index":"my_index","_id":9}}
+{ "doc": { "title" : "document 9" }}
+{ "create":{"_index":"my_index","_id":10}}
+{ "doc": { "title" : "document 10" }}
+```
+{% include copy-curl.html %}
+
+### Creating a search pipeline
+
+The following request creates a search pipeline named `my_pipeline` with an `oversample` request processor that requests 50% more hits than specified in `size`:
+
+```json
+PUT /_search/pipeline/my_pipeline
+{
+ "request_processors": [
+ {
+ "oversample" : {
+ "tag" : "oversample_1",
+ "description" : "This processor will multiply `size` by 1.5.",
+ "sample_factor" : 1.5
+ }
+ }
+ ]
+}
+```
+{% include copy-curl.html %}
+
+### Using a search pipeline
+
+Search for documents in `my_index` without a search pipeline:
+
+```json
+POST /my_index/_search
+{
+ "size": 5
+}
+```
+{% include copy-curl.html %}
+
+The response contains five hits:
+
+
+```json
+{
+ "took" : 3,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 1"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "2",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 2"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 3"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "4",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 4"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "5",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 5"
+ }
+ }
+ }
+ ]
+ }
+}
+```
+
+
+To search with a pipeline, specify the pipeline name in the `search_pipeline` query parameter:
+
+```json
+POST /my_index/_search?search_pipeline=my_pipeline
+{
+ "size": 5
+}
+```
+{% include copy-curl.html %}
+
+The response contains eight documents because the requested size of 5 was multiplied by 1.5 (5 * 1.5 = 7.5, rounded up to 8):
+
+
+```json
+{
+ "took" : 13,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 1"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "2",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 2"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 3"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "4",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 4"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "5",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 5"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "6",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 6"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "7",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 7"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "8",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 8"
+ }
+ }
+ }
+ ]
+ }
+}
+```
+
diff --git a/_search-plugins/search-pipelines/personalize-search-ranking.md b/_search-plugins/search-pipelines/personalize-search-ranking.md
index b73ebb74..c7a7dd8d 100644
--- a/_search-plugins/search-pipelines/personalize-search-ranking.md
+++ b/_search-plugins/search-pipelines/personalize-search-ranking.md
@@ -1,7 +1,7 @@
---
layout: default
title: Personalize search ranking
-nav_order: 40
+nav_order: 18
has_children: false
parent: Search processors
grand_parent: Search pipelines
diff --git a/_search-plugins/search-pipelines/search-processors.md b/_search-plugins/search-pipelines/search-processors.md
index a8f55b73..e82dabc6 100644
--- a/_search-plugins/search-pipelines/search-processors.md
+++ b/_search-plugins/search-pipelines/search-processors.md
@@ -26,6 +26,8 @@ Processor | Description | Earliest available version
[`filter_query`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/filter-query-processor/) | Adds a filtering query that is used to filter requests. | 2.8
[`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) | Sets a default model for neural search at the index or field level. | 2.11
[`script`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/script-processor/) | Adds a script that is run on newly indexed documents. | 2.8
+[`oversample`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/oversample-processor/) | Increases the search request `size` parameter, storing the original value in the pipeline state. | 2.12
+
## Search response processors
@@ -37,6 +39,8 @@ Processor | Description | Earliest available version
:--- | :--- | :---
[`personalize_search_ranking`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/personalize-search-ranking/) | Uses [Amazon Personalize](https://aws.amazon.com/personalize/) to rerank search results (requires setting up the Amazon Personalize service). | 2.9
[`rename_field`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rename-field-processor/)| Renames an existing field. | 2.8
+[`collapse`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/collapse-processor/)| Deduplicates search hits based on a field value, similar to the `collapse` parameter in a search request. | 2.12
+[`truncate_hits`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/truncate-hits-processor/)| Discards search hits after a specified target count is reached. Can undo the effect of the `oversample` request processor. | 2.12
## Search phase results processors
diff --git a/_search-plugins/search-pipelines/truncate-hits-processor.md b/_search-plugins/search-pipelines/truncate-hits-processor.md
new file mode 100644
index 00000000..871879ef
--- /dev/null
+++ b/_search-plugins/search-pipelines/truncate-hits-processor.md
@@ -0,0 +1,516 @@
+---
+layout: default
+title: Truncate hits
+nav_order: 35
+has_children: false
+parent: Search processors
+grand_parent: Search pipelines
+---
+
+# Truncate hits processor
+
+The `truncate_hits` response processor discards returned search hits after a given hit count is reached. The `truncate_hits` processor is designed to work with the [`oversample` request processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/oversample-processor/) but may be used on its own.
+
+The `target_size` parameter (which specifies where to truncate) is optional. If it is not specified, then OpenSearch uses the `original_size` variable set by the
+`oversample` processor (if available).
+
+The following is a common usage pattern:
+
+1. Add the `oversample` request processor to a search pipeline to fetch a larger set of results.
+1. In the same pipeline, apply a reranking response processor (which may promote results from beyond the originally requested top N) or the `collapse` processor (which may discard results during deduplication).
+1. Apply the `truncate_hits` processor to return (at most) the originally requested number of hits.
+
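The truncation step, including the fallback to the saved `original_size` when `target_size` is not configured, can be sketched as follows (variable names are illustrative, not the actual implementation):

```python
def truncate_hits(hits, target_size=None, pipeline_state=None):
    """Truncate hits to target_size, falling back to the saved original_size."""
    if target_size is None:
        # No explicit target: fall back to the value saved by the oversample processor.
        if not pipeline_state or "original_size" not in pipeline_state:
            raise ValueError("no target_size and no original_size to fall back to")
        target_size = pipeline_state["original_size"]
    return hits[:target_size]

# oversample saved original_size=5; 8 hits came back; truncation restores the top 5
hits = [{"_id": str(i)} for i in range(1, 9)]
print(len(truncate_hits(hits, pipeline_state={"original_size": 5})))  # 5
```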
+## Request fields
+
+The following table lists all request fields.
+
+Field | Data type | Description
+:--- | :--- | :---
+`target_size` | Integer | The maximum number of search hits to return (>=0). If not specified, the processor will try to read the `original_size` variable and will fail if it is not available. Optional.
+`context_prefix` | String | May be used to read the `original_size` variable from a specific scope in order to avoid collisions. Optional.
+`tag` | String | The processor's identifier. Optional.
+`description` | String | A description of the processor. Optional.
+`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
+
+## Example
+
+The following example demonstrates using a search pipeline with a `truncate_hits` processor.
+
+### Setup
+
+Create an index named `my_index` containing many documents:
+
+```json
+POST /_bulk
+{ "create":{"_index":"my_index","_id":1}}
+{ "doc": { "title" : "document 1" }}
+{ "create":{"_index":"my_index","_id":2}}
+{ "doc": { "title" : "document 2" }}
+{ "create":{"_index":"my_index","_id":3}}
+{ "doc": { "title" : "document 3" }}
+{ "create":{"_index":"my_index","_id":4}}
+{ "doc": { "title" : "document 4" }}
+{ "create":{"_index":"my_index","_id":5}}
+{ "doc": { "title" : "document 5" }}
+{ "create":{"_index":"my_index","_id":6}}
+{ "doc": { "title" : "document 6" }}
+{ "create":{"_index":"my_index","_id":7}}
+{ "doc": { "title" : "document 7" }}
+{ "create":{"_index":"my_index","_id":8}}
+{ "doc": { "title" : "document 8" }}
+{ "create":{"_index":"my_index","_id":9}}
+{ "doc": { "title" : "document 9" }}
+{ "create":{"_index":"my_index","_id":10}}
+{ "doc": { "title" : "document 10" }}
+```
+{% include copy-curl.html %}
+
+### Creating a search pipeline
+
+The following request creates a search pipeline named `my_pipeline` with a `truncate_hits` response processor that discards hits after the first five:
+
+```json
+PUT /_search/pipeline/my_pipeline
+{
+ "response_processors": [
+ {
+ "truncate_hits" : {
+ "tag" : "truncate_1",
+ "description" : "This processor will discard results after the first 5.",
+ "target_size" : 5
+ }
+ }
+ ]
+}
+```
+{% include copy-curl.html %}
+
+### Using a search pipeline
+
+Search for documents in `my_index` without a search pipeline:
+
+```json
+POST /my_index/_search
+{
+ "size": 8
+}
+```
+{% include copy-curl.html %}
+
+The response contains eight hits:
+
+
+```json
+{
+ "took" : 13,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 1"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "2",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 2"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 3"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "4",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 4"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "5",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 5"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "6",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 6"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "7",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 7"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "8",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 8"
+ }
+ }
+ }
+ ]
+ }
+}
+```
+
+
+To search with a pipeline, specify the pipeline name in the `search_pipeline` query parameter:
+
+```json
+POST /my_index/_search?search_pipeline=my_pipeline
+{
+ "size": 8
+}
+```
+{% include copy-curl.html %}
+
+The response contains only five hits, even though eight were requested and ten were available:
+
+
+```json
+{
+ "took" : 3,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 1"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "2",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 2"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 3"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "4",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 4"
+ }
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "5",
+ "_score" : 1.0,
+ "_source" : {
+ "doc" : {
+ "title" : "document 5"
+ }
+ }
+ }
+ ]
+ }
+}
+```
+
+
+## Oversample, collapse, and truncate hits
+
+The following is a more realistic example in which you use `oversample` to request many candidate documents, use `collapse` to discard documents that duplicate a particular field value (in order to return more diverse results), and then use `truncate_hits` to return the originally requested number of documents (in order to avoid returning a large result payload from the cluster).
+
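End to end, the three processors compose as in the following simplified simulation: oversampling fetches extra candidates so that collapsing still leaves enough distinct values to fill the originally requested page (this is a sketch of the behavior, not OpenSearch code):

```python
import math

def run_pipeline(docs, size, sample_factor, field):
    original_size = size
    fetched = docs[: math.ceil(size * sample_factor)]  # oversample request processor
    seen, collapsed = set(), []
    for doc in fetched:                                 # collapse response processor
        if doc[field] not in seen:
            seen.add(doc[field])
            collapsed.append(doc)
    return collapsed[:original_size]                    # truncate_hits response processor

docs = [
    {"id": 1, "color": "blue"},   {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},    {"id": 4, "color": "red"},
    {"id": 5, "color": "yellow"}, {"id": 6, "color": "yellow"},
]
print([d["id"] for d in run_pipeline(docs, size=3, sample_factor=3, field="color")])
# [1, 3, 5]
```

Without oversampling, only the top three documents would be fetched, and collapsing would leave just two distinct colors.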
+
+### Setup
+
+Create many documents containing a field that you'll use for collapsing:
+
+```json
+POST /_bulk
+{ "create":{"_index":"my_index","_id":1}}
+{ "title" : "document 1", "color":"blue" }
+{ "create":{"_index":"my_index","_id":2}}
+{ "title" : "document 2", "color":"blue" }
+{ "create":{"_index":"my_index","_id":3}}
+{ "title" : "document 3", "color":"red" }
+{ "create":{"_index":"my_index","_id":4}}
+{ "title" : "document 4", "color":"red" }
+{ "create":{"_index":"my_index","_id":5}}
+{ "title" : "document 5", "color":"yellow" }
+{ "create":{"_index":"my_index","_id":6}}
+{ "title" : "document 6", "color":"yellow" }
+{ "create":{"_index":"my_index","_id":7}}
+{ "title" : "document 7", "color":"orange" }
+{ "create":{"_index":"my_index","_id":8}}
+{ "title" : "document 8", "color":"orange" }
+{ "create":{"_index":"my_index","_id":9}}
+{ "title" : "document 9", "color":"green" }
+{ "create":{"_index":"my_index","_id":10}}
+{ "title" : "document 10", "color":"green" }
+```
+{% include copy-curl.html %}
+
+Create a pipeline that collapses only on the `color` field:
+
+```json
+PUT /_search/pipeline/collapse_pipeline
+{
+ "response_processors": [
+ {
+ "collapse" : {
+ "field": "color"
+ }
+ }
+ ]
+}
+```
+{% include copy-curl.html %}
+
+Create another pipeline that oversamples, collapses, and then truncates results:
+
+```json
+PUT /_search/pipeline/oversampling_collapse_pipeline
+{
+ "request_processors": [
+ {
+ "oversample": {
+ "sample_factor": 3
+ }
+ }
+ ],
+ "response_processors": [
+ {
+ "collapse" : {
+ "field": "color"
+ }
+ },
+ {
+ "truncate_hits": {
+ "description": "Truncates back to the original size before oversample increased it."
+ }
+ }
+ ]
+}
+```
+{% include copy-curl.html %}
+
+### Collapse without oversample
+
+In this example, you request the top three documents before collapsing on the `color` field. Because the first two documents have the same `color`, the second one is discarded, and the request returns the first and third documents:
+
+```json
+POST /my_index/_search?search_pipeline=collapse_pipeline
+{
+ "size": 3
+}
+```
+{% include copy-curl.html %}
+
+The response contains the following two hits:
+
+```json
+{
+ "took" : 2,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 1",
+ "color" : "blue"
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 3",
+ "color" : "red"
+ }
+ }
+ ]
+ },
+ "profile" : {
+ "shards" : [ ]
+ }
+}
+```
+
+
+
+### Oversample, collapse, and truncate
+
+Now you will use the `oversampling_collapse_pipeline`, which requests the top nine documents (multiplying `size` by 3), deduplicates them by `color`, and then returns the top three hits:
+
+```json
+POST /my_index/_search?search_pipeline=oversampling_collapse_pipeline
+{
+ "size": 3
+}
+```
+{% include copy-curl.html %}
+
+The response contains three hits, each with a unique `color` value:
+
+```json
+{
+ "took" : 2,
+ "timed_out" : false,
+ "_shards" : {
+ "total" : 1,
+ "successful" : 1,
+ "skipped" : 0,
+ "failed" : 0
+ },
+ "hits" : {
+ "total" : {
+ "value" : 10,
+ "relation" : "eq"
+ },
+ "max_score" : 1.0,
+ "hits" : [
+ {
+ "_index" : "my_index",
+ "_id" : "1",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 1",
+ "color" : "blue"
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "3",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 3",
+ "color" : "red"
+ }
+ },
+ {
+ "_index" : "my_index",
+ "_id" : "5",
+ "_score" : 1.0,
+ "_source" : {
+ "title" : "document 5",
+ "color" : "yellow"
+ }
+ }
+ ]
+ },
+ "profile" : {
+ "shards" : [ ]
+ }
+}
+```
+
+
+