Régis Hanol 23aa88d203
FIX: Allow all caps within CJK text (#28018)
This improves the `TextSentinel` so that we don't consider CJK text as being uppercase and thus failing the validator.

It also optimizes the entropy computation by using native ruby `.bytes` to get all the bytes from the text.

It also tweaks the `seems_pronounceable?` and `seems_unpretentious?` check to use the `\p{Alnum}` unicode regexp group to account for non-latin languages.

Reference - https://meta.discourse.org/t/body-seems-unclear-error-when-users-are-typing-in-chinese/88715

Inspired by https://github.com/discourse/discourse/pull/27900

Co-authored-by: Paulo Magalhaes <mentalstring@gmail.com>
2024-07-22 17:35:52 +02:00
..
2024-07-04 10:58:21 +02:00
2023-05-24 08:59:37 +08:00