discourse

Commit Graph

Author	SHA1	Message	Date
Bianca Nenciu	d6534bdb11	DEV: Fix test (#21283 ) Apostrophe-like characters (for example, ’ and ') are transformed to the ASCII apostrophe (') regardless of search_ignore_accents.	2023-05-04 17:04:26 +03:00
Sam	c63551d227	FEATURE: search_rank_sort_priorities modifier (#21329 ) This new modifier can be used by plugins to modify search ordering. Specifically plugins such as discourse_solved can amend search ordering so solved topics bump to the top. Also correct edge case where low and high sort priority categories did not order correctly when it came to closed/archived	2023-05-02 16:36:36 +10:00
Ted Johansson	25a226279a	DEV: Replace #pluck_first freedom patch with AR #pick in core (#19893 ) The #pluck_first freedom patch, first introduced by @danielwaterworth, has served us well, and is used widely throughout both core and plugins. It seems to have been a common enough use case that Rails 6 introduced its own method #pick with the exact same implementation. This allows us to retire the freedom patch and switch over to the built-in ActiveRecord method. There is no replacement for #pluck_first!, but a quick search shows we are using this in a very limited capacity, and in some cases incorrectly (by assuming a nil return rather than an exception), which can quite easily be replaced with #pick plus some extra handling.	2023-02-13 12:39:45 +08:00
Sam	5d28cb709a	FIX: de-prioritize archived topics (#20161 ) Previously due to an error archived topics were more prominent in search than closed topics. This amends our internal logic to ensure archived topics are bumped down the list.	2023-02-03 13:23:27 +11:00
Sam	1dba1aca27	FIX: add support for PG 14 and up (#20137 ) Previously to_tsquery would split terms and join them with &. In PG 14 terms are split and joined with <->, which means "followed directly by". In PG 13: discourse_test=# SELECT to_tsquery('english', '''hello world'''); to_tsquery --------------------- 'hello' & 'world' (1 row) In PG 14: discourse_test=# SELECT to_tsquery('english', '''hello world'''); to_tsquery --------------------- 'hello' <-> 'world' (1 row) The change is very unobtrusive: we simply amend our to_tsquery to behave like it used to and make no use of the `<->` operator. More detail at: https://akorotkov.github.io/blog/2021/05/22/pg-14-query-parsing/ Note that plainto_tsquery, used elsewhere in Discourse, keeps working exactly as before. This also corrects a faulty test that was passing by a fluke on older versions of PG	2023-02-03 08:11:25 +11:00
Sam	c5345d0e54	FEATURE: prioritize_exact_search_title_match hidden setting (#20089 ) The new `prioritize_exact_search_title_match` setting can be used to force the search algorithm to prioritize exact term matches in title when ranking results. This is scoped narrowly to titles for cases such as a topic titled: "organisation chart" and a search of "org chart". If we scoped this wider, all discussion about "org chart" would float to the top and leave a very common title de-prioritized. This is a hidden site setting and it has some performance impact due to double ranking. That said, the performance impact is somewhat mitigated because ranking on title alone is a very cheap operation.	2023-01-31 16:34:01 +11:00
Alan Guo Xiang Tan	6934edd97c	DEV: Add hidden site setting to configure search ranking weights (#20086 ) This site setting is mostly experimental at this point.	2023-01-31 08:57:13 +08:00
Sam	5d669d8aa2	Revert "FEATURE: hidden site setting to disable search prefix matching (#20058 )" (#20073 ) This reverts commit `64f7b97d08`. Too many side effects for this setting, we have decided to remove it	2023-01-31 07:39:23 +08:00
Sam	64f7b97d08	FEATURE: hidden site setting to disable search prefix matching (#20058 ) Many users seem surprised by prefix matching in search, which leads to unexpected results. Over the years we have always returned results starting with a search term rather than requiring exact matches, meaning a search for `abra` would find `abracadabra`. This introduces the site setting `enable_search_prefix_matching`, which defaults to true (behavior unchanged). We plan to experiment on select sites with exact matches to see if the results are less surprising	2023-01-30 12:44:40 +08:00
Daniel Waterworth	666536cbd1	DEV: Prefer \A and \z over ^ and $ in regexes (#19936 )	2023-01-20 12:52:49 -06:00
Sérgio Saquetim	0feb9ad341	DEV: Added callback to change the query used to filter groups in search (#19884 ) Added plugin registry that will allow adding callbacks that can change the query that is used to filter groups while running a search.	2023-01-16 15:48:00 -03:00
Bianca Nenciu	fb780c50fd	FIX: Replace all quote-like unicodes with quotes (#19714 ) If unaccent is called with quote-like Unicode characters then it can generate invalid queries because some of the quotes produced by unaccent are not escaped, and to_tsquery fails because of bad input. This commit replaces more quote-like Unicode characters before unaccent is called.	2023-01-09 19:19:51 +02:00
David Taylor	6417173082	DEV: Apply syntax_tree formatting to `lib/*`	2023-01-09 12:10:19 +00:00
Bianca Nenciu	17b7ab0d7b	FIX: Make sure generated tsqueries are valid (#19368 ) The tsquery used for searching is generated using both functions from Ruby and Postgresql (for example, unaccent function). Depending on the term used, it generated an invalid tsquery. For example "can’t" generated "''can''t''" instead of "''can''''t''".	2022-12-12 17:57:20 +02:00
Du Jiajun	41e6b516e5	FIX: Support unicode in search filter @username (#18804 )	2022-11-16 10:42:37 +01:00
Daniel Waterworth	167181f4b7	DEV: Quote values when constructing SQL (#18827 ) All of these cases should already be safe, but still good to quote for "defense in depth".	2022-11-01 14:05:13 -05:00
Bianca Nenciu	9db8f00b3d	FEATURE: Create upload_references table (#16146 ) This table holds associations between uploads and other models. This can be used to prevent removing uploads that are still in use. * DEV: Create upload_references * DEV: Use UploadReference instead of PostUpload * DEV: Use UploadReference for SiteSetting * DEV: Use UploadReference for Badge * DEV: Use UploadReference for Category * DEV: Use UploadReference for CustomEmoji * DEV: Use UploadReference for Group * DEV: Use UploadReference for ThemeField * DEV: Use UploadReference for ThemeSetting * DEV: Use UploadReference for User * DEV: Use UploadReference for UserAvatar * DEV: Use UploadReference for UserExport * DEV: Use UploadReference for UserProfile * DEV: Add method to extract uploads from raw text * DEV: Use UploadReference for Draft * DEV: Use UploadReference for ReviewableQueuedPost * DEV: Use UploadReference for UserProfile's bio_raw * DEV: Do not copy user uploads to upload references * DEV: Copy post uploads again after deploy * DEV: Use created_at and updated_at from uploads table * FIX: Check if upload site setting is empty * DEV: Copy user uploads to upload references * DEV: Make upload extraction less strict	2022-06-09 09:24:30 +10:00
Penar Musaraj	8222810099	FIX: Limits for PM and group header search (#16887 ) When searching for PMs or PMs in a group inbox, results in the header search were not being limited to 5 with a "More" link to the full page search. This PR fixes that. It also simplifies the logic and updates the search API docs to include recently added `in:messages` and `group_messages:groupname` options.	2022-05-24 11:31:24 -04:00
Martin Brennan	fcc2e7ebbf	FEATURE: Promote polymorphic bookmarks to default and migrate (#16729 ) This commit migrates all bookmarks to be polymorphic (using the bookmarkable_id and bookmarkable_type) columns. It also deletes all the old code guarded behind the use_polymorphic_bookmarks setting and changes that setting to true for all sites and by default for the sake of plugins. No data is deleted in the migrations, the old post_id and for_topic columns for bookmarks will be dropped later on.	2022-05-23 10:07:15 +10:00
Martin Brennan	955d47bbd0	FIX: Use polymorphic bookmarks for in:bookmarks search (#16684 ) This commit makes sure the in:bookmarks post advanced search filter works with polymorphic bookmarks.	2022-05-10 09:08:01 +10:00
Martin Brennan	222c8d9b6a	FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591 ) A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are: * ExportUserArchive compatibility * SyncTopicUserBookmarked job compatibility * Sending different notifications for the bookmark reminders based on the bookmarkable type * Import scripts compatibility * BookmarkReminderNotificationHandler compatibility This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well. Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.	2022-05-09 09:37:23 +10:00
Penar Musaraj	b266a36967	FEATURE: Add `group_messages:` keyword to advanced search (#16584 )	2022-04-28 10:47:40 -04:00
Penar Musaraj	eebce8f80a	FEATURE: Add in:messages search modifier (#16567 ) This adds `in:messages` as a synonym for `in:personal` and sets it up as our default nomenclature (`in:personal` will still work).	2022-04-26 16:47:01 -04:00
Bianca Nenciu	6eb3d658ca	FIX: Do not wrap unaccent around tsqueries (#16284 ) tsqueries use quotes and having other characters that when unaccented become quotes results in invalid tsqueries.	2022-03-25 19:10:05 +02:00
Bianca Nenciu	34b4b53bac	FEATURE: Use Postgres unaccent to ignore accents (#16100 ) The search_ignore_accents site setting can be used to make the search indexer remove the accents before indexing the content. The unaccent function from PostgreSQL is better than Ruby's unicode_normalize(:nfkd).	2022-03-07 23:03:10 +02:00
Alan Guo Xiang Tan	930f51e175	FEATURE: Split up text segmentation for Chinese and Japanese. * Chinese segmentation will continue to rely on cppjieba * Japanese segmentation will use our port of TinySegmenter * Korean currently does not rely on segmentation which was dropped in `c677877e4f` * SiteSetting.search_tokenize_chinese_japanese_korean has been split into SiteSetting.search_tokenize_chinese and SiteSetting.search_tokenize_japanese respectively	2022-02-07 09:21:14 +08:00
Alan Guo Xiang Tan	fff8b98485	SECURITY: Advanced group search did not respect visibility of groups.	2022-01-10 13:49:26 +08:00
Penar Musaraj	d99deaf1ab	FEATURE: show recent searches in quick search panel (#15024 )	2021-11-25 15:44:15 -05:00
Penar Musaraj	20f5474be9	FEATURE: Log only topic/post search queries in search log (#14994 )	2021-11-18 09:21:12 +08:00
Alan Guo Xiang Tan	a03c48b720	FIX: Use the same mode for chinese search when indexing and querying. (#14780 ) The `白名单` term becomes `名单白名单` after it is processed by cppjieba in :query mode. However, `白名单` is not tokenized as such by cppjieba when it appears in a string of text. Therefore, this may lead to failed matches as the search data generated while indexing may not contain all of the terms generated by :query mode. We've decided to maintain parity for now such that both indexing and querying use the same :mix mode. This may lead to less accurate search but our plan is to properly support CJK search in the future.	2021-11-01 10:14:47 +08:00
Dan Ungureanu	f003e31e2f	PERF: Optimize search in private messages query (#14660 ) * PERF: Remove JOIN on categories for PM search JOIN on categories is not needed when searching in private messages as PMs are not categorized. * DEV: Use == for string comparison * PERF: Optimize query for allowed topic groups There was a query that checked for all topics a user or their groups were allowed to see. This used UNION between topic_allowed_users and topic_allowed_groups which was very inefficient. That was replaced with an OR condition that checks in either table more efficiently.	2021-10-26 10:16:38 +03:00
Alan Guo Xiang Tan	6544e3b02a	DEV: Remove useless ordering when searching within a topic. (#14676 ) Searching within a topic currently does not make use of PG search and we're simply doing an `ilike` against the post raw. Furthermore, `Post#post_number` is already unique within a topic so the other ordering will never ever be used. This change simply makes the query cleaner to read.	2021-10-22 10:38:21 +08:00
Penar Musaraj	e9b1d29d8b	UX: Revamp quick search (#14499 ) Co-authored-by: Robin Ward <robin.ward@gmail.com> Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>	2021-10-06 11:42:52 -04:00
Jean	34ff7bfeeb	FEATURE: Hide suspended users from site-wide search to regular users (#14245 )	2021-09-06 09:59:35 -04:00
Bianca Nenciu	fbf7627c8e	FIX: Make search work with sub-sub-categories (#13901 ) Searching in a category looked only one level down, ignoring the site setting max_category_nesting. The user interface did not support the third level of categories and did not display them in the "Categorized" input of the advanced search options.	2021-08-02 14:04:13 +03:00
Penar Musaraj	8a470e508e	UX: Improve quick search suggestions (#13813 )	2021-07-21 14:00:27 -04:00
Josh Soref	59097b207f	DEV: Correct typos and spelling mistakes (#12812 ) Over the years we accrued many spelling mistakes in the code base. This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change - comments - test descriptions - other low risk areas	2021-05-21 11:43:47 +10:00
Krzysztof Kotlarek	e29605b79f	FEATURE: the ability to search users by custom fields (#12762 ) When the admin creates a new custom field they can specify if that field should be searchable or not. That setting is taken into consideration for quick search results.	2021-04-27 15:52:45 +10:00
Sam	5b342ae505	FIX: remove superfluous spaces from CJK blurbs (#12629 ) Previously we used the raw data indexed to generate blurbs even for cases when Chinese/Korean/Japanese text was used. This caused superfluous spaces to show up in excerpts.	2021-04-12 12:46:42 +10:00
Alan Guo Xiang Tan	ebe4896e48	FEATURE: Change very high/low search priority to rank at absolute ends. Prior to this change, we had weights for very_high, high, low and very_low. This means there were 4 weights to tweak and what weights to use for `very_high/high` and `very_low/low` pair was hard to explain. This change makes it such that `very_high` search priority will always ensure that the posts are ranked at the top while `very_low` search priority will ensure that the posts are ranked at the very bottom.	2021-03-09 09:20:37 +08:00
Alan Guo Xiang Tan	4b3f65bb26	FIX: Select earliest post when aggregating posts in a topic for search. This is a revert of `d8c796bc44` and `5bf0a0893b`. Linking to the post within a topic that has the highest rank was confusing users and hard to explain because ranking is determined via the PG ranking function. See the following meta topics for the complaints after we switch to the new ordering: 1. https://meta.discourse.org/t/title-search-not-working-as-expected/157737 2. https://meta.discourse.org/t/search-results-should-prioritize-first-post-in-topic-when-title-matches-search-term/175154	2021-02-05 09:52:53 +08:00
Guo Xiang Tan	d10d296e92	FIX: Search topic title headline being truncated. Need to apply the `HighlightAll` option to prevent topic titles from being truncated in headlines when displaying search results.	2020-12-22 09:09:47 +08:00
Sam	293b243aeb	FEATURE: special shortcut for searching for own posts (#11541 ) You can now use `@me` to search for posts created by yourself, this is particularly handy if you have a long username. `@me rainbow` will find all posts you created with the word rainbow. Also cleans up test suite so it has no warnings.	2020-12-22 10:46:42 +11:00
Arpit Jalan	29c7655221	FEATURE: allow plugins to preload custom data on search (#11518 ) This commit allows discourse-assign plugin to show assigned users on search result topic list.	2020-12-17 21:59:10 +05:30
Dan Ungureanu	9b7525bb03	FIX: Handle uncaught exception (#11263 ) After the search term is parsed for advanced search filters, the term may become empty. Later, the same term will be passed to Discourse.route_for which will raise an ArgumentError. > URI(nil) ArgumentError: bad argument (expected URI object or URI string)	2020-11-20 11:28:14 +02:00
Roman Rizzi	d815b95935	FEATURE: Search filter for searching all PMs on a site for admin. (#11280 ) Admins can search all PMs on a site by using the `in:all-pms` advanced filter.	2020-11-19 13:56:19 -03:00
Guo Xiang Tan	68fc2a18b1	FIX: Properly handle quotes and backslash in `Search.set_tsquery_weight_filter`	2020-10-23 08:43:34 +08:00
Guo Xiang Tan	2607bb602e	Fix broken spec. Follow-up to `3c678df942`	2020-10-08 10:52:46 +08:00
Sam	3c678df942	PERF: avoid lookbehinds when indexing search (#10862 ) * PERF: avoid lookbehinds when indexing search Previously we used `EmailCook.url_regexp`, a regex that used lookbehinds. Unfortunately, certain strings could lead to pathological behavior, causing CPU to skyrocket and regex replace to take a very long time. EmailCook still needs a fix, but it is less urgent because it already splits to single lines. That said, we will correct that as well in a separate PR. The new implementation is far more naive and relies on the extra spaces the search indexer inserts.	2020-10-08 11:40:13 +11:00
Arpit Jalan	f7940b1d20	FEATURE: advanced search option for max posts count (#10761 ) This commit adds an option to search for max posts count and updates the UI for posts count search to show a min/max range in single line.	2020-09-28 21:34:16 +05:30
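
Note on 1dba1aca27 (PG 14 support): the message describes amending the generated query so that it makes no use of the `<->` operator. The snippet below is only a string-level sketch of that idea, using the two tsquery texts quoted in the commit message. The helper name `downgrade_phrase_operator` is hypothetical and is not a Discourse method, and how Discourse actually applies the change is not shown in this log.

```ruby
# Text forms of the tsquery, copied from the psql output quoted in 1dba1aca27.
PG13_TSQUERY = "'hello' & 'world'"
PG14_TSQUERY = "'hello' <-> 'world'"

# Hypothetical helper (an assumption for illustration, not Discourse code):
# rewrite the "followed directly by" operator back to a plain AND.
# This naive replace would also touch a literal "<->" inside a quoted
# lexeme, so a real implementation would need to be more careful.
def downgrade_phrase_operator(tsquery_text)
  tsquery_text.gsub("<->", "&")
end

NORMALIZED = downgrade_phrase_operator(PG14_TSQUERY)
puts NORMALIZED
```

With this rewrite the PG 14 text form becomes identical to the PG 13 one, which is the behavior the commit says it restores.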


327 Commits
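
Note on 666536cbd1 (prefer \A and \z over ^ and $): in Ruby, `^` and `$` anchor at line boundaries, while `\A` and `\z` anchor at the start and end of the whole string. The following minimal sketch shows the difference on a multi-line input. The constant names and the pattern are illustrative assumptions and are not taken from the Discourse code base.

```ruby
# Illustrative patterns only; these names do not come from Discourse.
LINE_ANCHORED   = /^[a-z0-9_]+$/    # ^ and $ match at each line boundary
STRING_ANCHORED = /\A[a-z0-9_]+\z/  # \A and \z match only at the string ends

# A value whose first line looks valid but which carries a second line.
INPUT = "valid_name\n<script>"

LINE_RESULT   = LINE_ANCHORED.match?(INPUT)    # the first line alone satisfies it
STRING_RESULT = STRING_ANCHORED.match?(INPUT)  # the whole string must satisfy it

puts "line anchors accept input:   #{LINE_RESULT}"
puts "string anchors accept input: #{STRING_RESULT}"
```

The line-anchored pattern accepts the input because its first line matches on its own; the string-anchored pattern rejects it because the newline and the second line are part of the string being checked.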