[[regexp-syntax]]
== Regular expression syntax

A {wikipedia}/Regular_expression[regular expression] is a way to match patterns
in data using placeholder characters, called operators.

{es} supports regular expressions in the following queries:

* <>
* <>

{es} uses https://lucene.apache.org/core/[Apache Lucene]'s regular expression
engine to parse these queries.

[discrete]
[[regexp-reserved-characters]]
=== Reserved characters

Lucene's regular expression engine supports all Unicode characters. However, the
following characters are reserved as operators:

....
. ? + * | { } [ ] ( ) " \
....

Depending on the <<regexp-optional-operators,optional operators>> enabled, the
following characters may also be reserved:

....
# @ & < > ~
....

To use one of these characters literally, escape it with a preceding
backslash or surround it with double quotes. For example:

....
\@                  # renders as a literal '@'
\\                  # renders as a literal '\'
"john@smith.com"    # renders as 'john@smith.com'
....

[discrete]
[[regexp-standard-operators]]
=== Standard operators

Lucene's regular expression engine does not use the
{wikipedia}/Perl_Compatible_Regular_Expressions[Perl Compatible Regular Expressions (PCRE)]
library, but it does support the following standard operators.

`.`::
+
--
Matches any character. For example:

....
ab.     # matches 'aba', 'abb', 'abz', etc.
....
--

`?`::
+
--
Repeat the preceding character zero or one times. Often used to make the
preceding character optional. For example:

....
abc?     # matches 'ab' and 'abc'
....
--

`+`::
+
--
Repeat the preceding character one or more times. For example:

....
ab+     # matches 'ab', 'abb', 'abbb', etc.
....
--

`*`::
+
--
Repeat the preceding character zero or more times. For example:

....
ab*     # matches 'a', 'ab', 'abb', 'abbb', etc.
....
--

`{}`::
+
--
Minimum and maximum number of times the preceding character can repeat. For
example:

....
a{2}    # matches 'aa'
a{2,4}  # matches 'aa', 'aaa', and 'aaaa'
a{2,}   # matches 'a' repeated two or more times
....
--

`|`::
+
--
OR operator.
The match will succeed if the longest pattern on either the left
side OR the right side matches. For example:

....
abc|xyz  # matches 'abc' and 'xyz'
....
--

`( … )`::
+
--
Forms a group. You can use a group to treat part of the expression as a single
character. For example:

....
abc(def)?  # matches 'abc' and 'abcdef' but not 'abcd'
....
--

`[ … ]`::
+
--
Match one of the characters in the brackets. For example:

....
[abc]   # matches 'a', 'b', 'c'
....

Inside the brackets, `-` indicates a range unless `-` is the first character or
escaped. For example:

....
[a-c]   # matches 'a', 'b', or 'c'
[-abc]  # '-' is first character. Matches '-', 'a', 'b', or 'c'
[abc\-] # Escapes '-'. Matches 'a', 'b', 'c', or '-'
....

A `^` before a character in the brackets negates the character or range. For
example:

....
[^abc]      # matches any character except 'a', 'b', or 'c'
[^a-c]      # matches any character except 'a', 'b', or 'c'
[^-abc]     # matches any character except '-', 'a', 'b', or 'c'
[^abc\-]    # matches any character except 'a', 'b', 'c', or '-'
....
--

[discrete]
[[regexp-optional-operators]]
=== Optional operators

You can use the `flags` parameter to enable more optional operators for
Lucene's regular expression engine.

To enable multiple operators, use a `|` separator. For example, a `flags` value
of `COMPLEMENT|INTERVAL` enables the `COMPLEMENT` and `INTERVAL` operators.

[discrete]
==== Valid values

`ALL` (Default)::
Enables all optional operators.

`COMPLEMENT`::
+
--
Enables the `~` operator. You can use `~` to negate the shortest following
pattern. For example:

....
a~bc   # matches 'adc' and 'aec' but not 'abc'
....
--

`INTERVAL`::
+
--
Enables the `<>` operators. You can use `<>` to match a numeric range. For
example:

....
foo<1-100>      # matches 'foo1', 'foo2' ... 'foo99', 'foo100'
foo<01-100>     # matches 'foo01', 'foo02' ... 'foo99', 'foo100'
....
--

`INTERSECTION`::
+
--
Enables the `&` operator, which acts as an AND operator.
The match will succeed if the patterns on both the left side AND the right
side match. For example:

....
aaa.+&.+bbb  # matches 'aaabbb'
....
--

`ANYSTRING`::
+
--
Enables the `@` operator. You can use `@` to match any entire string.

You can combine the `@` operator with the `&` and `~` operators to create
"everything except" logic. For example:

....
@&~(abc.+)  # matches everything except terms beginning with 'abc'
....
--

[discrete]
[[regexp-unsupported-operators]]
=== Unsupported operators

Lucene's regular expression engine does not support anchor operators, such as
`^` (beginning of line) or `$` (end of line). To match a term, the regular
expression must match the entire string.
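
The whole-string requirement can be illustrated with a minimal sketch in Python. This is only an analogy: Python's `re` module is not Lucene's engine, and `matches_term` is a hypothetical helper introduced for this example. It uses `re.fullmatch`, which succeeds only when the pattern covers the entire input, and restricts itself to operators (`.` and `*`) that both engines share.

```python
import re


def matches_term(pattern: str, term: str) -> bool:
    """Return True only if `pattern` matches the entire `term`.

    Hypothetical helper: re.fullmatch approximates Lucene's behavior of
    implicitly anchoring a regular expression at both ends of the term.
    """
    return re.fullmatch(pattern, term) is not None


# 'ab.' covers exactly three characters, so a longer term does not match.
print(matches_term("ab.", "abc"))       # True
print(matches_term("ab.", "abcd"))      # False

# To match terms that merely begin with 'abc', add an explicit '.*'.
print(matches_term("abc.*", "abcdef"))  # True
```

Because the pattern is implicitly anchored at both ends, a "starts with" or "ends with" match is written by adding `.*` to the open side of the pattern rather than by using `^` or `$`.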