diff --git a/docs/reference/analysis/tokenfilters/pattern-capture-tokenfilter.asciidoc b/docs/reference/analysis/tokenfilters/pattern-capture-tokenfilter.asciidoc
index 4091296a76e..7c919b56b98 100644
--- a/docs/reference/analysis/tokenfilters/pattern-capture-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/pattern-capture-tokenfilter.asciidoc
@@ -82,7 +82,7 @@ curl -XPUT localhost:9200/test/ -d '
                "type" : "pattern_capture",
                "preserve_original" : 1,
                "patterns" : [
-                  "(\\w+)",
+                  "([^@]+)",
                   "(\\p{L}+)",
                   "(\\d+)",
                   "@(.+)"
@@ -108,9 +108,10 @@ When the above analyzer is used on an email address like:
 john-smith_123@foo-bar.com
 --------------------------------------------------
 
-it would produce the following tokens: [ `john-smith_123`,
-`foo-bar.com`, `john`, `smith_123`, `smith`, `123`, `foo`,
-`foo-bar.com`, `bar`, `com` ]
+it would produce the following tokens:
+
+    john-smith_123@foo-bar.com, john-smith_123,
+    john, smith, 123, foo-bar.com, foo, bar, com
 
 Multiple patterns are required to allow overlapping captures, but also
 means that patterns are less dense and easier to understand.
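
Review note: the token list in this change can be sanity-checked with a rough Python sketch. This is not the Lucene filter, only an approximation built on Python's `re` module, and it rests on several assumptions: `capture_tokens` is a hypothetical helper, `[^\W\d_]` stands in for `\p{L}` (which `re` does not support), captures are ordered by start offset and then by pattern position, and repeated captures are dropped so the result lines up with the list in the new text. Whether the real filter emits `foo-bar.com` once or twice is not something this sketch settles.

```python
import re

EMAIL = "john-smith_123@foo-bar.com"

# Assumption: [^\W\d_] approximates \p{L}, since Python's re lacks \p{L}.
OLD_PATTERNS = [r"(\w+)", r"([^\W\d_]+)", r"(\d+)", r"@(.+)"]
NEW_PATTERNS = [r"([^@]+)", r"([^\W\d_]+)", r"(\d+)", r"@(.+)"]


def capture_tokens(text, patterns, preserve_original=True):
    """Hypothetical helper: collect the group-1 capture of every match."""
    hits = []
    for index, pattern in enumerate(patterns):
        for match in re.finditer(pattern, text):
            hits.append((match.start(1), index, match.group(1)))

    # Assumed ordering: by start offset, then by position in the pattern list.
    hits.sort(key=lambda hit: (hit[0], hit[1]))

    tokens = [text] if preserve_original else []
    for _, _, token in hits:
        # Assumed simplification: skip captures that were already emitted.
        if token not in tokens:
            tokens.append(token)
    return tokens


if __name__ == "__main__":
    print("old:", capture_tokens(EMAIL, OLD_PATTERNS))
    print("new:", capture_tokens(EMAIL, NEW_PATTERNS))
```

Under these assumptions, the old first pattern `(\w+)` never yields `john-smith_123`, because `\w` does not match a hyphen, while `([^@]+)` captures the whole local part and the whole domain. That is consistent with the first-pattern change and the corrected token list going into the same commit.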