Update numbers to reflect 4-byte UTF-8-encoded characters (#27083)

Characters outside the BMP need 4 bytes in UTF-8. That covers many emoji as well as a number of less-common scripts.
2025-03-09 14:34:43 +00:00 · 2017-10-24 09:50:47 +01:00 · 2017-10-24 09:50:47 +01:00 · 559fc5a4de
commit 559fc5a4de
parent bf557fd886
1 changed file with 2 additions and 2 deletions
--- a/docs/reference/mapping/params/ignore-above.asciidoc
+++ b/docs/reference/mapping/params/ignore-above.asciidoc
@@ -56,5 +56,5 @@ limit of `32766`.

 NOTE: The value for `ignore_above` is the _character count_, but Lucene counts
 bytes. If you use UTF-8 text with many non-ASCII characters, you may want to
-set the limit to `32766 / 3 = 10922` since UTF-8 characters may occupy at most
-3 bytes.
+set the limit to `32766 / 4 = 8191` since UTF-8 characters may occupy at most
+4 bytes.
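
The arithmetic in this change can be checked with a minimal Python sketch. It is not part of the commit; the helper names `utf8_len` and `safe_ignore_above` are hypothetical, and the only inputs taken from the diff are the `32766` byte limit and the 3-byte versus 4-byte worst cases.

```python
# Byte limit quoted in the patched docs ("limit of `32766`").
MAX_TERM_BYTES = 32766


def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def safe_ignore_above(max_bytes_per_char: int = 4) -> int:
    """Hypothetical helper: largest character count that cannot exceed
    MAX_TERM_BYTES when every character takes max_bytes_per_char bytes."""
    return MAX_TERM_BYTES // max_bytes_per_char


if __name__ == "__main__":
    # ASCII, Latin-1 supplement, a BMP symbol, and an emoji outside the BMP.
    for ch in ["a", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")

    print("old limit (3 bytes/char):", safe_ignore_above(3))
    print("new limit (4 bytes/char):", safe_ignore_above(4))
```

Integer division rounds `32766 / 4 = 8191.5` down to `8191`, which matches the value in the updated note.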