mirror of https://github.com/iSharkFly-Docs/opensearch-docs-cn synced 2025-03-08 06:39:29 +00:00

Add score normalization and combination documentation (#4985)

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>

2023-09-22 17:29:58 -04:00


layout	default
title	Boolean
parent	Compound queries
grand_parent	Query DSL
nav_order
redirect_from	/opensearch/query-dsl/compound/bool/, /opensearch/query-dsl/bool/, /query-dsl/query-dsl/compound/bool/

Boolean queries

A Boolean (`bool`) query combines several query clauses into one advanced query. The clauses are combined using Boolean logic to determine which documents are returned in the results.

Use the following query clauses within a bool query:

Clause	Behavior
`must`	Logical `and` operator. The results must match all queries in this clause.
`must_not`	Logical `not` operator. All matches are excluded from the results.
`should`	Logical `or` operator. Matching more `should` clauses increases the document's relevance score. You can set the minimum number of `should` queries that must match using the `minimum_should_match` parameter. If a query contains a `must` or `filter` clause, the default `minimum_should_match` value is 0, so the `should` clauses affect only the relevance score. Otherwise, the default `minimum_should_match` value is 1, and the results must match at least one of the `should` queries.
`filter`	Logical `and` operator that is applied first to reduce your dataset before applying the queries. A query within a filter clause is a yes or no option. If a document matches the query, it is returned in the results; otherwise, it is not. The results of a filter query are generally cached to allow for a faster return. Use the filter query to filter the results based on exact matches, ranges, dates, or numbers.
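The default `minimum_should_match` rule in the table can be written out as a small function. The following Python sketch only restates that rule for illustration; the function name `effective_minimum_should_match` is hypothetical, and the sketch assumes the parameter is given as an integer (other value formats are not handled).

```python
def effective_minimum_should_match(bool_body: dict) -> int:
    """Return how many should clauses must match for a bool query body.

    Illustrative only: restates the default rule from the table above and
    assumes an explicit minimum_should_match, if present, is an integer.
    """
    explicit = bool_body.get("minimum_should_match")
    if explicit is not None:
        return explicit
    # With a must or filter clause present, should clauses are optional.
    has_must_or_filter = bool(bool_body.get("must")) or bool(bool_body.get("filter"))
    return 0 if has_must_or_filter else 1


only_should = {"should": [{"match": {"text_entry": "life"}}]}
with_must = {
    "must": [{"match": {"text_entry": "love"}}],
    "should": [{"match": {"text_entry": "life"}}],
}
explicit = dict(with_must, minimum_should_match=1)

print(effective_minimum_should_match(only_should))
print(effective_minimum_should_match(with_must))
print(effective_minimum_should_match(explicit))
```

Setting `minimum_should_match` explicitly, as in the last case, makes at least one `should` clause required even when a `must` clause is present.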

A Boolean query has the following structure:

GET _search
{
  "query": {
    "bool": {
      "must": [
        {}
      ],
      "must_not": [
        {}
      ],
      "should": [
        {}
      ],
      "filter": {}
    }
  }
}
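If you build requests in application code, the same structure can be assembled from whichever clauses you have. The following Python sketch is one possible approach, not part of OpenSearch: the helper `build_bool_query` and the `title` and `status` fields are hypothetical, and the resulting dictionary would still need to be sent as the body of a `_search` request.

```python
import json


def build_bool_query(must=None, should=None, must_not=None, filter_=None,
                     minimum_should_match=None):
    """Assemble a bool query body, leaving out clauses that were not supplied."""
    bool_body = {}
    for key, clauses in (("must", must), ("should", should),
                         ("must_not", must_not), ("filter", filter_)):
        if clauses:
            bool_body[key] = clauses
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


# Hypothetical fields, used only to show the resulting shape.
body = build_bool_query(
    must=[{"match": {"title": "search"}}],
    filter_={"term": {"status": "published"}},
)
print(json.dumps(body, indent=2))
```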

For example, assume you have the complete works of Shakespeare indexed in an OpenSearch cluster. You want to construct a single query that meets the following requirements:

1. The `text_entry` field must contain the word `love` and should contain either `life` or `grace`.
2. The `speaker` field must not contain `ROMEO`.
3. Filter these results to the play `Romeo and Juliet` without affecting the relevance score.

These requirements can be combined in the following query:

GET shakespeare/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "text_entry": "love"
          }
        }
      ],
      "should": [
        {
          "match": {
            "text_entry": "life"
          }
        },
        {
          "match": {
            "text_entry": "grace"
          }
        }
      ],
      "minimum_should_match": 1,
      "must_not": [
        {
          "match": {
            "speaker": "ROMEO"
          }
        }
      ],
      "filter": {
        "term": {
          "play_name": "Romeo and Juliet"
        }
      }
    }
  }
}

The response contains matching documents:

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 4,
    "successful": 4,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 11.356054,
    "hits": [
      {
        "_index": "shakespeare",
        "_id": "88020",
        "_score": 11.356054,
        "_source": {
          "type": "line",
          "line_id": 88021,
          "play_name": "Romeo and Juliet",
          "speech_number": 19,
          "line_number": "4.5.61",
          "speaker": "PARIS",
          "text_entry": "O love! O life! not life, but love in death!"
        }
      }
    ]
  }
}

If you want to identify which of these clauses actually caused the matching results, name each query with the `_name` parameter. To add the `_name` parameter, replace the field's value in the `match` query with an object that contains the `query` text and the `_name`:

GET shakespeare/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "text_entry": {
              "query": "love",
              "_name": "love-must"
            }
          }
        }
      ],
      "should": [
        {
          "match": {
            "text_entry": {
              "query": "life",
              "_name": "life-should"
            }
          }
        },
        {
          "match": {
            "text_entry": {
              "query": "grace",
              "_name": "grace-should"
            }
          }
        }
      ],
      "minimum_should_match": 1,
      "must_not": [
        {
          "match": {
            "speaker": {
              "query": "ROMEO",
              "_name": "ROMEO-must-not"
            }
          }
        }
      ],
      "filter": {
        "term": {
          "play_name": "Romeo and Juliet"
        }
      }
    }
  }
}

For each hit, OpenSearch returns a `matched_queries` array that lists the named queries that the document matched:

"matched_queries": [
  "love-must",
  "life-should"
]

If you remove the queries that are not in this list, this document still matches because those queries did not contribute to the match. By examining which `should` clause matched, you can better understand the relevance score of the results.
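To see why only two names appear, you can check each named query against the returned line by hand. The following Python sketch is a simplified model, not how OpenSearch analyzes or scores text: it treats a `match` query as "the field contains this word" after lowercasing, and the helper names `tokens` and `matched_queries` are hypothetical.

```python
import re


def tokens(text):
    """Lowercase the text and split it into a set of words (simplified analysis)."""
    return set(re.findall(r"[a-z0-9]+", str(text).lower()))


def matched_queries(doc, named_clauses):
    """Return the names of the clauses whose word occurs in the given field."""
    names = []
    for name, (field, word) in named_clauses.items():
        if word.lower() in tokens(doc.get(field, "")):
            names.append(name)
    return names


# The line returned in the response above.
line = {
    "speaker": "PARIS",
    "text_entry": "O love! O life! not life, but love in death!",
}

# The named queries from the request, as name -> (field, word).
named = {
    "love-must": ("text_entry", "love"),
    "life-should": ("text_entry", "life"),
    "grace-should": ("text_entry", "grace"),
    "ROMEO-must-not": ("speaker", "ROMEO"),
}

print(matched_queries(line, named))
```

In this model, `grace-should` is absent because the line does not contain the word, and `ROMEO-must-not` is absent because the speaker is PARIS.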

You can also construct complex Boolean expressions by nesting bool queries. For example, use the following query to find a text_entry field that matches (love OR hate) AND (life OR grace) in the play Romeo and Juliet:

GET shakespeare/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "text_entry": "love"
                }
              },
              {
                "match": {
                  "text_entry": "hate"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "text_entry": "life"
                }
              },
              {
                "match": {
                  "text_entry": "grace"
                }
              }
            ]
          }
        }
      ],
      "filter": {
        "term": {
          "play_name": "Romeo and Juliet"
        }
      }
    }
  }
}

The response contains matching documents:

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 11.37006,
    "hits": [
      {
        "_index": "shakespeare",
        "_id": "88020",
        "_score": 11.37006,
        "_source": {
          "type": "line",
          "line_id": 88021,
          "play_name": "Romeo and Juliet",
          "speech_number": 19,
          "line_number": "4.5.61",
          "speaker": "PARIS",
          "text_entry": "O love! O life! not life, but love in death!"
        }
      }
    ]
  }
}
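The nested expression can be checked against the returned line with a small evaluator. The following Python sketch is a simplified model under stated assumptions, not OpenSearch's execution logic: a `match` query is treated as "the field shares at least one lowercase word with the query text", a `term` query as exact equality, and scoring is ignored. The function names are hypothetical, and all four `match` clauses are assumed to target `text_entry`.

```python
import re


def tokens(text):
    """Lowercase the text and split it into a set of words (simplified analysis)."""
    return set(re.findall(r"[a-z0-9]+", str(text).lower()))


def as_list(clauses):
    """Accept a single clause, a list of clauses, or None."""
    if clauses is None:
        return []
    return clauses if isinstance(clauses, list) else [clauses]


def matches(doc, query):
    """Return True if the document satisfies the (simplified) query."""
    (kind, body), = query.items()
    if kind == "match":
        (field, value), = body.items()
        if isinstance(value, dict):  # object form, for example with _name
            value = value["query"]
        return bool(tokens(value) & tokens(doc.get(field, "")))
    if kind == "term":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "bool":
        required = as_list(body.get("must")) + as_list(body.get("filter"))
        should = as_list(body.get("should"))
        must_not = as_list(body.get("must_not"))
        if not all(matches(doc, q) for q in required):
            return False
        if any(matches(doc, q) for q in must_not):
            return False
        # Default: 0 if a must or filter clause is present, otherwise 1.
        minimum = body.get("minimum_should_match", 0 if required else 1)
        hits = sum(matches(doc, q) for q in should)
        return hits >= minimum if should else True
    raise ValueError(f"unsupported query type: {kind}")


nested_query = {
    "bool": {
        "must": [
            {"bool": {"should": [{"match": {"text_entry": "love"}},
                                 {"match": {"text_entry": "hate"}}]}},
            {"bool": {"should": [{"match": {"text_entry": "life"}},
                                 {"match": {"text_entry": "grace"}}]}},
        ],
        "filter": {"term": {"play_name": "Romeo and Juliet"}},
    }
}

line = {
    "play_name": "Romeo and Juliet",
    "speaker": "PARIS",
    "text_entry": "O love! O life! not life, but love in death!",
}

print(matches(line, nested_query))
```

Each inner `bool` query contains only `should` clauses, so its default `minimum_should_match` value is 1. That default is what turns each inner query into an OR, and the outer `must` clause combines the two with AND.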
