Commit Graph

38 Commits

Author SHA1 Message Date
Alan Guo Xiang Tan 322a3be2db
DEV: Remove logical OR assignment of constants (#29201)
Constants should always be only assigned once. The logical OR assignment
of a constant is a relic of the past before we used zeitwerk for
autoloading and had bugs where a file could be loaded twice resulting in
constant redefinition warnings.
2024-10-16 10:09:07 +08:00
Régis Hanol 73ce3589ad
PERF: improves TextSentinel's seems_unpretentious check (#28044)
by scanning the text for the first word that is bigger than `max_word_length` instead of extracting (segmenting) all the words, computing their size, and comparing the maximum with `max_word_length`.

Idea from @mentalstring in https://meta.discourse.org/t/body-seems-unclear-error-when-users-are-typing-in-chinese/88715/14
2024-07-23 17:12:29 +02:00
Régis Hanol 23aa88d203
FIX: Allow all caps within CJK text (#28018)
This improves the `TextSentinel` so that we don't consider CJK text as being uppercase and thus failing the validator.

It also optimizes the entropy computation by using native ruby `.bytes` to get all the bytes from the text.

It also tweaks the `seems_pronounceable?` and `seems_unpretentious?` check to use the `\p{Alnum}` unicode regexp group to account for non-latin languages.

Reference - https://meta.discourse.org/t/body-seems-unclear-error-when-users-are-typing-in-chinese/88715

Inspired by https://github.com/discourse/discourse/pull/27900

Co-authored-by: Paulo Magalhaes <mentalstring@gmail.com>
2024-07-22 17:35:52 +02:00
David Taylor 6417173082
DEV: Apply syntax_tree formatting to `lib/*` 2023-01-09 12:10:19 +00:00
Josh Soref 59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Joffrey JAFFEUX d16a39dc53
FIX: prevents exception when text input is nil (#12922)
nil was converted to "" and the matching regex would return [] and then be converted to nil with max usage.

Example exception:

```
NoMethodError (undefined method `<=' for nil:NilClass)

lib/text_sentinel.rb:71:in `seems_unpretentious?'
lib/validators/quality_title_validator.rb:13:in `validate_each'
lib/topic_creator.rb:25:in `valid?'
```
2021-05-03 09:21:35 +02:00
Bianca Nenciu a48f7ba61c
FEATURE: Improve errors when title is invalid (#11149)
It used to simply say "title is invalid" without giving any hint what
the problem could be. This commit adds different errors messages for
all caps titles, low entropy titles or titles with very long words.
2020-11-11 15:11:36 +02:00
Sam Saffron 30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Erick Guan 9dd325f805 FIX: skip some checks for CJK locale in TextSentinel (#7322) 2019-04-05 15:07:49 +02:00
Arpit Jalan 25ec077eca rename 'min_private_message_{post/title}_length' to 'min_personal_message_{post/title}_length' 2018-02-01 13:25:29 +05:30
Guo Xiang Tan 5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Arpit Jalan e46204d195 FIX: allow long words if they contain periods 2016-09-13 09:15:05 +05:30
Rafael dos Santos Silva 5915929166 FIX: Unicode aware text sentinel (#4301)
* FIX: Handle unicode text on Text Sentinel

Uses active_support to properly handle unicode text

* Adds test cases to unicode Text Sentinel
2016-07-12 11:08:55 -04:00
Jeff Wong cf86f27415 FEATURE: New setting to allow all caps posts
Adds a setting to ignore text_sentinel's check on all caps content.
2015-11-18 09:50:50 -08:00
Robin Ward 4ab9ef3497 FIX: Allow long words if they contain periods 2015-05-19 13:10:25 -04:00
Sam 1f59375c82 rename max_word_length to title_max_word_length 2015-04-02 16:46:53 +11:00
Nathan Nontell d95172cb5d Allow TextSentinel#seems_unpretentious? to accept words joined with dashes or forward slashes. (Issue 1133) 2013-09-16 09:45:57 -04:00
Régis Hanol 1a38f66c7e removed warning about already existing constants 2013-08-30 17:28:18 +02:00
Neil Lalonde f611a5d898 If min entropy setting is greater than min post/body length setting, then use a sensible min entropy value instead 2013-08-28 11:04:28 -04:00
Robin Ward e6cd835928 Merge pull request #1040 from vipulnsward/remove_and
Remove unnecessary anding with true
2013-06-18 06:37:33 -07:00
Vipul A M 3167a152b2 Remove unnecessary anding with true 2013-06-18 11:19:10 +05:30
Neil Lalonde 876a570e3a Fix to prevent check for all upper case for non-ascii messages 2013-06-17 12:22:50 -04:00
Sam f7de9f17d5 refactor validators
add a new setting for min pm body length
use that setting for flags
scale entropy check down for pms
2013-06-13 18:18:43 +10:00
Michael Brown bb77d2c38b More entropy for foreign titles
* Treat strings with non-ASCII characters as having more entropy
2013-06-07 14:47:07 -04:00
Robin Ward 7d763a6f0c Bad merge. Oddly not caught by autospec. 2013-05-27 10:56:55 -04:00
Robin Ward e1781240a6 Merge branch 'refactoring' of git://github.com/mattvanhorn/discourse
Conflicts:
	lib/text_sentinel.rb
2013-05-27 10:42:20 -04:00
Chris Hunt 13c4266c74 Allow Chinese characters in Topic titles 2013-05-26 13:56:42 -07:00
Matt Van Horn c9fcee8490 simplify, clarify TextSentinel
codeclimate pointed this out. I agree it is better
to simplify and reveal intentions.
2013-05-24 13:36:33 -07:00
Matt Van Horn 806255b3c4 refactor Topic validation
introduce a couple of custom validators
fix minor discrepancies in tests
copy I18n error message keys to default location
clean up validation invocation
move some responsibilities out of validator into class
2013-05-22 22:31:52 -07:00
Régis Hanol c5cf8be864 auto replace rules in titles 2013-04-10 11:00:50 +02:00
Régis Hanol 239cbd2d58 enforce coding convention
replaced every `and` by `&&` and every `or` by `||`
2013-03-05 01:42:44 +01:00
Gosha Arinich cafc75b238 remove trailing whitespaces ❤️ 2013-02-26 07:31:35 +03:00
Jeremy Banks eb2a5e4654 Merge branch 'origin/master'
Conflicts:
	lib/text_sentinel.rb
2013-02-18 21:41:20 -05:00
Jeremy Banks 6af69f7e77 Do not strip leading and trailing whitespace from raw posts. 2013-02-15 20:58:33 -05:00
Alexander 6c4ae05454 Removes iconv dependency
Fixes #100
2013-02-15 13:36:19 -08:00
Robin Ward d740d7b25f Fix for foreign language titles: Only enforce upper case rule on english alphabet. 2013-02-14 16:09:57 -05:00
Robin Ward 12d3c3b66b Enforce entropy on flag text 2013-02-08 17:01:43 -05:00
Robin Ward 40da901e5d Introduction of TextSentinel to enforce title and body quality. 2013-02-06 20:53:34 -05:00