---
layout: default
title: Boolean queries
parent: Compound queries
grand_parent: Query DSL
nav_order: 10
redirect_from:
- /opensearch/query-dsl/compound/bool/
- /opensearch/query-dsl/bool/
- /query-dsl/query-dsl/compound/bool/
---
# Boolean queries

A Boolean (`bool`) query combines several query clauses into one advanced query. The clauses are combined using Boolean logic to determine which documents are returned in the results.

Use the following query clauses within a `bool` query:

Clause | Behavior
:--- | :---
`must` | Logical `and` operator. The results must match all queries in this clause.
`must_not` | Logical `not` operator. All matches are excluded from the results.
`should` | Logical `or` operator. Matching more `should` clauses increases the document's relevance score. You can set the minimum number of `should` queries that must match using the [`minimum_should_match`]({{site.url}}{{site.baseurl}}/query-dsl/query-dsl/minimum-should-match/) parameter. If a query contains a `must` or `filter` clause, the default `minimum_should_match` value is 0, so the `should` clauses are optional and only affect scoring. Otherwise, the default `minimum_should_match` value is 1, so the results must match at least one of the `should` queries.
`filter` | Logical `and` operator that is applied first to reduce your dataset before applying the queries. A query within a filter clause is a yes-or-no option: if a document matches the query, it is returned in the results; otherwise, it is not. Filter clauses do not affect the relevance score. The results of a filter query are generally cached to allow for a faster return. Use the filter query to filter the results based on exact matches, ranges, dates, or numbers.

A Boolean query has the following structure:
```json
GET _search
{
  "query": {
    "bool": {
      "must": [
        {}
      ],
      "must_not": [
        {}
      ],
      "should": [
        {}
      ],
      "filter": {}
    }
  }
}
```
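The default `minimum_should_match` rule described in the table can be sketched in a few lines of Python. This is only an illustration of the documented rule, not OpenSearch code: the function name `default_minimum_should_match` is hypothetical, and the input is assumed to be a plain dictionary holding the body of a `bool` query.

```python
def default_minimum_should_match(bool_body):
    """Return the default minimum_should_match for the body of a bool query.

    Hypothetical helper that mirrors the documented rule: the default is 0
    when a must or filter clause is present and 1 otherwise.
    """
    has_must_or_filter = bool(bool_body.get("must")) or bool(bool_body.get("filter"))
    return 0 if has_must_or_filter else 1


only_should = {"should": [{"match": {"text_entry": "life"}}]}
with_must = {
    "must": [{"match": {"text_entry": "love"}}],
    "should": [{"match": {"text_entry": "life"}}],
}

print(default_minimum_should_match(only_should))  # 1
print(default_minimum_should_match(with_must))    # 0
```

In the second case the `should` clause no longer restricts the result set; it only raises the score of documents that also match it.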
For example, assume you have the complete works of Shakespeare indexed in an OpenSearch cluster. You want to construct a single query that meets the following requirements:

1. The `text_entry` field must contain the word `love` and should contain either `life` or `grace`.
2. The `speaker` field must not contain `ROMEO`.
3. Filter these results to the play `Romeo and Juliet` without affecting the relevance score.

These requirements can be combined in the following query:
```json
GET shakespeare/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "text_entry": "love"
          }
        }
      ],
      "should": [
        {
          "match": {
            "text_entry": "life"
          }
        },
        {
          "match": {
            "text_entry": "grace"
          }
        }
      ],
      "minimum_should_match": 1,
      "must_not": [
        {
          "match": {
            "speaker": "ROMEO"
          }
        }
      ],
      "filter": {
        "term": {
          "play_name": "Romeo and Juliet"
        }
      }
    }
  }
}
```
The response contains matching documents:
```json
{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 4,
    "successful": 4,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 11.356054,
    "hits": [
      {
        "_index": "shakespeare",
        "_id": "88020",
        "_score": 11.356054,
        "_source": {
          "type": "line",
          "line_id": 88021,
          "play_name": "Romeo and Juliet",
          "speech_number": 19,
          "line_number": "4.5.61",
          "speaker": "PARIS",
          "text_entry": "O love! O life! not life, but love in death!"
        }
      }
    ]
  }
}
```
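To see why this document matches, it can help to evaluate the four clause types by hand. The following Python sketch is a simplified in-memory model of the matching logic only, not how OpenSearch executes queries: it ignores scoring, approximates text analysis with lowercase word splitting, and all function names (`tokens`, `match`, `term`, `bool_matches`) are hypothetical. The `romeo_line` document is made up for illustration.

```python
import re


def tokens(text):
    """Lowercase word tokens, a rough stand-in for a standard text analyzer."""
    return set(re.findall(r"\w+", text.lower()))


def match(field, word):
    """Simplified match query: true if the field contains the word as a token."""
    return lambda doc: word.lower() in tokens(doc[field])


def term(field, value):
    """Simplified term query: true only for an exact, unanalyzed value."""
    return lambda doc: doc[field] == value


def bool_matches(doc, must=(), should=(), must_not=(), filter_=(),
                 minimum_should_match=None):
    """Decide whether one document satisfies a bool query (scoring is ignored)."""
    if minimum_should_match is None:
        # Default: 0 if a must or filter clause is present (or there are no
        # should clauses to require), otherwise 1.
        minimum_should_match = 0 if (must or filter_ or not should) else 1
    should_hits = sum(1 for clause in should if clause(doc))
    return (
        all(clause(doc) for clause in must)
        and all(clause(doc) for clause in filter_)
        and not any(clause(doc) for clause in must_not)
        and should_hits >= minimum_should_match
    )


paris_line = {
    "play_name": "Romeo and Juliet",
    "speaker": "PARIS",
    "text_entry": "O love! O life! not life, but love in death!",
}
# Hypothetical document, made up to show the must_not exclusion.
romeo_line = dict(paris_line, speaker="ROMEO")

query = dict(
    must=[match("text_entry", "love")],
    should=[match("text_entry", "life"), match("text_entry", "grace")],
    minimum_should_match=1,
    must_not=[match("speaker", "ROMEO")],
    filter_=[term("play_name", "Romeo and Juliet")],
)

print(bool_matches(paris_line, **query))  # True
print(bool_matches(romeo_line, **query))  # False
```

The second document fails only because of the `must_not` clause; every other clause is satisfied.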
To identify which of these clauses caused a document to match, name each query using the `_name` parameter.
To add the `_name` parameter, replace the field's value in the `match` query with an object that contains the search text in `query` and the name in `_name`:
```json
GET shakespeare/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "text_entry": {
              "query": "love",
              "_name": "love-must"
            }
          }
        }
      ],
      "should": [
        {
          "match": {
            "text_entry": {
              "query": "life",
              "_name": "life-should"
            }
          }
        },
        {
          "match": {
            "text_entry": {
              "query": "grace",
              "_name": "grace-should"
            }
          }
        }
      ],
      "minimum_should_match": 1,
      "must_not": [
        {
          "match": {
            "speaker": {
              "query": "ROMEO",
              "_name": "ROMEO-must-not"
            }
          }
        }
      ],
      "filter": {
        "term": {
          "play_name": "Romeo and Juliet"
        }
      }
    }
  }
}
```
For each hit, OpenSearch returns a `matched_queries` array that lists the named queries that the document matched:
```json
"matched_queries": [
"love-must",
"life-should"
]
```
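If you build requests in client code, the change from the short `match` form to the named form, and the reading of `matched_queries` from a response, might look like the following Python sketch. The helper names `named_match` and `matched_names` are hypothetical, and `sample_response` is a trimmed-down stand-in assembled from the values shown above; a real response contains more fields.

```python
def named_match(field, text, name):
    """Build a match query in its expanded form so that it can carry a _name."""
    return {"match": {field: {"query": text, "_name": name}}}


def matched_names(response):
    """Map each hit's _id to its matched_queries list (empty if absent)."""
    return {
        hit["_id"]: hit.get("matched_queries", [])
        for hit in response["hits"]["hits"]
    }


print(named_match("text_entry", "love", "love-must"))
# {'match': {'text_entry': {'query': 'love', '_name': 'love-must'}}}

# Trimmed-down response for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "88020", "matched_queries": ["love-must", "life-should"]}
        ]
    }
}
print(matched_names(sample_response))
# {'88020': ['love-must', 'life-should']}
```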
If you remove the queries not in this list, you will still see the same result.
By examining which `should` clause matched, you can better understand the relevance score of the results.

You can also construct complex Boolean expressions by nesting `bool` queries.
For example, use the following query to find a `text_entry` field that matches (`love` OR `hate`) AND (`life` OR `grace`) in the play `Romeo and Juliet`:
```json
GET shakespeare/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "text_entry": "love"
                }
              },
              {
                "match": {
                  "text_entry": "hate"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "match": {
                  "text_entry": "life"
                }
              },
              {
                "match": {
                  "text_entry": "grace"
                }
              }
            ]
          }
        }
      ],
      "filter": {
        "term": {
          "play_name": "Romeo and Juliet"
        }
      }
    }
  }
}
```
The response contains matching documents:
```json
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 11.37006,
    "hits": [
      {
        "_index": "shakespeare",
        "_type": "doc",
        "_id": "88020",
        "_score": 11.37006,
        "_source": {
          "type": "line",
          "line_id": 88021,
          "play_name": "Romeo and Juliet",
          "speech_number": 19,
          "line_number": "4.5.61",
          "speaker": "PARIS",
          "text_entry": "O love! O life! not life, but love in death!"
        }
      }
    ]
  }
}
```
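Deeply nested `bool` queries are easier to read when they are assembled from small building blocks. The following Python sketch builds the request body for the (`love` OR `hate`) AND (`life` OR `grace`) expression. The helpers `any_of`, `all_of`, and `match` are hypothetical names and not part of any OpenSearch client; they only produce plain dictionaries.

```python
import json


def match(field, text):
    """Short form of a match query on a single field."""
    return {"match": {field: text}}


def any_of(*queries):
    """OR: a bool query with only should clauses, so at least one must match."""
    return {"bool": {"should": list(queries)}}


def all_of(*queries, filter_query=None):
    """AND: a bool query whose must clauses all need to match."""
    body = {"must": list(queries)}
    if filter_query is not None:
        body["filter"] = filter_query
    return {"bool": body}


query = all_of(
    any_of(match("text_entry", "love"), match("text_entry", "hate")),
    any_of(match("text_entry", "life"), match("text_entry", "grace")),
    filter_query={"term": {"play_name": "Romeo and Juliet"}},
)

# The request body to send to shakespeare/_search.
print(json.dumps({"query": query}, indent=2))
```

The `any_of` helper relies on the default `minimum_should_match` value of 1 that applies when a `bool` query has no `must` or `filter` clause, which is what makes each inner `bool` query behave as an OR.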