discourse

Commit Graph

Author	SHA1	Message	Date
Bianca Nenciu	fb780c50fd	FIX: Replace all quote-like unicodes with quotes (#19714) If unaccent is called with quote-like Unicode characters, it can generate invalid queries: some of the quotes produced by unaccent are not escaped, and to_tsquery fails because of bad input. This commit replaces more quote-like Unicode characters before unaccent is called.	2023-01-09 19:19:51 +02:00
David Taylor	6417173082	DEV: Apply syntax_tree formatting to `lib/*`	2023-01-09 12:10:19 +00:00
Bianca Nenciu	17b7ab0d7b	FIX: Make sure generated tsqueries are valid (#19368) The tsquery used for searching is generated using functions from both Ruby and PostgreSQL (for example, the unaccent function). Depending on the term used, it generated an invalid tsquery. For example "can’t" generated "''can''t''" instead of "''can''''t''".	2022-12-12 17:57:20 +02:00
Du Jiajun	41e6b516e5	FIX: Support unicode in search filter @username (#18804)	2022-11-16 10:42:37 +01:00
Daniel Waterworth	167181f4b7	DEV: Quote values when constructing SQL (#18827) All of these cases should already be safe, but still good to quote for "defense in depth".	2022-11-01 14:05:13 -05:00
Bianca Nenciu	9db8f00b3d	FEATURE: Create upload_references table (#16146) This table holds associations between uploads and other models. This can be used to prevent removing uploads that are still in use. * DEV: Create upload_references * DEV: Use UploadReference instead of PostUpload * DEV: Use UploadReference for SiteSetting * DEV: Use UploadReference for Badge * DEV: Use UploadReference for Category * DEV: Use UploadReference for CustomEmoji * DEV: Use UploadReference for Group * DEV: Use UploadReference for ThemeField * DEV: Use UploadReference for ThemeSetting * DEV: Use UploadReference for User * DEV: Use UploadReference for UserAvatar * DEV: Use UploadReference for UserExport * DEV: Use UploadReference for UserProfile * DEV: Add method to extract uploads from raw text * DEV: Use UploadReference for Draft * DEV: Use UploadReference for ReviewableQueuedPost * DEV: Use UploadReference for UserProfile's bio_raw * DEV: Do not copy user uploads to upload references * DEV: Copy post uploads again after deploy * DEV: Use created_at and updated_at from uploads table * FIX: Check if upload site setting is empty * DEV: Copy user uploads to upload references * DEV: Make upload extraction less strict	2022-06-09 09:24:30 +10:00
Penar Musaraj	8222810099	FIX: Limits for PM and group header search (#16887) When searching for PMs or PMs in a group inbox, results in the header search were not being limited to 5 with a "More" link to the full page search. This PR fixes that. It also simplifies the logic and updates the search API docs to include recently added `in:messages` and `group_messages:groupname` options.	2022-05-24 11:31:24 -04:00
Martin Brennan	fcc2e7ebbf	FEATURE: Promote polymorphic bookmarks to default and migrate (#16729) This commit migrates all bookmarks to be polymorphic (using the bookmarkable_id and bookmarkable_type) columns. It also deletes all the old code guarded behind the use_polymorphic_bookmarks setting and changes that setting to true for all sites and by default for the sake of plugins. No data is deleted in the migrations, the old post_id and for_topic columns for bookmarks will be dropped later on.	2022-05-23 10:07:15 +10:00
Martin Brennan	955d47bbd0	FIX: Use polymorphic bookmarks for in:bookmarks search (#16684) This commit makes sure the in:bookmarks post advanced search filter works with polymorphic bookmarks.	2022-05-10 09:08:01 +10:00
Martin Brennan	222c8d9b6a	FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591) A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are: * ExportUserArchive compatibility * SyncTopicUserBookmarked job compatibility * Sending different notifications for the bookmark reminders based on the bookmarkable type * Import scripts compatibility * BookmarkReminderNotificationHandler compatibility This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well. Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.	2022-05-09 09:37:23 +10:00
Penar Musaraj	b266a36967	FEATURE: Add `group_messages:` keyword to advanced search (#16584)	2022-04-28 10:47:40 -04:00
Penar Musaraj	eebce8f80a	FEATURE: Add in:messages search modifier (#16567) This adds `in:messages` as a synonym for `in:personal` and sets it up as our default nomenclature (`in:personal` will still work).	2022-04-26 16:47:01 -04:00
Bianca Nenciu	6eb3d658ca	FIX: Do not wrap unaccent around tsqueries (#16284) tsqueries use quotes and having other characters that when unaccented become quotes results in invalid tsqueries.	2022-03-25 19:10:05 +02:00
Bianca Nenciu	34b4b53bac	FEATURE: Use Postgres unaccent to ignore accents (#16100) The search_ignore_accents site setting can be used to make the search indexer remove the accents before indexing the content. The unaccent function from PostgreSQL is better than Ruby's unicode_normalize(:nfkd).	2022-03-07 23:03:10 +02:00
Alan Guo Xiang Tan	930f51e175	FEATURE: Split up text segmentation for Chinese and Japanese. * Chinese segmentation will continue to rely on cppjieba * Japanese segmentation will use our port of TinySegmenter * Korean currently does not rely on segmentation which was dropped in `c677877e4f` * SiteSetting.search_tokenize_chinese_japanese_korean has been split into SiteSetting.search_tokenize_chinese and SiteSetting.search_tokenize_japanese respectively	2022-02-07 09:21:14 +08:00
Alan Guo Xiang Tan	fff8b98485	SECURITY: Advanced group search did not respect visibility of groups.	2022-01-10 13:49:26 +08:00
Penar Musaraj	d99deaf1ab	FEATURE: show recent searches in quick search panel (#15024)	2021-11-25 15:44:15 -05:00
Penar Musaraj	20f5474be9	FEATURE: Log only topic/post search queries in search log (#14994)	2021-11-18 09:21:12 +08:00
Alan Guo Xiang Tan	a03c48b720	FIX: Use the same mode for chinese search when indexing and querying. (#14780) The `白名单` term becomes `名单白名单` after it is processed by cppjieba in :query mode. However, `白名单` is not tokenized as such by cppjieba when it appears in a string of text. Therefore, this may lead to failed matches as the search data generated while indexing may not contain all of the terms generated by :query mode. We've decided to maintain parity for now such that both indexing and querying use the same :mix mode. This may lead to less accurate search but our plan is to properly support CJK search in the future.	2021-11-01 10:14:47 +08:00
Dan Ungureanu	f003e31e2f	PERF: Optimize search in private messages query (#14660) * PERF: Remove JOIN on categories for PM search JOIN on categories is not needed when searching in private messages as PMs are not categorized. * DEV: Use == for string comparison * PERF: Optimize query for allowed topic groups There was a query that checked for all topics a user or their groups were allowed to see. This used UNION between topic_allowed_users and topic_allowed_groups which was very inefficient. That was replaced with an OR condition that checks in either table more efficiently.	2021-10-26 10:16:38 +03:00
Alan Guo Xiang Tan	6544e3b02a	DEV: Remove useless ordering when searching within a topic. (#14676) Searching within a topic currently does not make use of PG search and we're simply doing an `ilike` against the post raw. Furthermore, `Post#post_number` is already unique within a topic so the other ordering will never ever be used. This change simply makes the query cleaner to read.	2021-10-22 10:38:21 +08:00
Penar Musaraj	e9b1d29d8b	UX: Revamp quick search (#14499) Co-authored-by: Robin Ward <robin.ward@gmail.com> Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>	2021-10-06 11:42:52 -04:00
Jean	34ff7bfeeb	FEATURE: Hide suspended users from site-wide search to regular users (#14245)	2021-09-06 09:59:35 -04:00
Bianca Nenciu	fbf7627c8e	FIX: Make search work with sub-sub-categories (#13901) Searching in a category looked only one level down, ignoring the site setting max_category_nesting. The user interface did not support the third level of categories and did not display them in the "Categorized" input of the advanced search options.	2021-08-02 14:04:13 +03:00
Penar Musaraj	8a470e508e	UX: Improve quick search suggestions (#13813)	2021-07-21 14:00:27 -04:00
Josh Soref	59097b207f	DEV: Correct typos and spelling mistakes (#12812) Over the years we accrued many spelling mistakes in the code base. This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change - comments - test descriptions - other low risk areas	2021-05-21 11:43:47 +10:00
Krzysztof Kotlarek	e29605b79f	FEATURE: the ability to search users by custom fields (#12762) When the admin creates a new custom field they can specify if that field should be searchable or not. That setting is taken into consideration for quick search results.	2021-04-27 15:52:45 +10:00
Sam	5b342ae505	FIX: remove superfluous spaces from CJK blurbs (#12629) Previously we used the raw data indexed to generate blurbs even for cases when Chinese/Korean/Japanese text was used. This caused superfluous spaces to show up in excerpts.	2021-04-12 12:46:42 +10:00
Alan Guo Xiang Tan	ebe4896e48	FEATURE: Change very high/low search priority to rank at absolute ends. Prior to this change, we had weights for very_high, high, low and very_low. This means there were 4 weights to tweak and what weights to use for `very_high/high` and `very_low/low` pair was hard to explain. This change makes it such that `very_high` search priority will always ensure that the posts are ranked at the top while `very_low` search priority will ensure that the posts are ranked at the very bottom.	2021-03-09 09:20:37 +08:00
Alan Guo Xiang Tan	4b3f65bb26	FIX: Select earliest post when aggregating posts in a topic for search. This is a revert of `d8c796bc44` and `5bf0a0893b`. Linking to the post within a topic that has the highest rank was confusing users and hard to explain because ranking is determined via the PG ranking function. See the following meta topics for the complaints after we switched to the new ordering: 1. https://meta.discourse.org/t/title-search-not-working-as-expected/157737 2. https://meta.discourse.org/t/search-results-should-prioritize-first-post-in-topic-when-title-matches-search-term/175154	2021-02-05 09:52:53 +08:00
Guo Xiang Tan	d10d296e92	FIX: Search topic title headline being truncated. Need to apply the `HighlightAll` option in order to prevent topic titles from being truncated in headlines when displaying search results.	2020-12-22 09:09:47 +08:00
Sam	293b243aeb	FEATURE: special shortcut for searching for own posts (#11541) You can now use `@me` to search for posts created by yourself, this is particularly handy if you have a long username. `@me rainbow` will find all posts you created with the word rainbow. Also cleans up test suite so it has no warnings.	2020-12-22 10:46:42 +11:00
Arpit Jalan	29c7655221	FEATURE: allow plugins to preload custom data on search (#11518) This commit allows discourse-assign plugin to show assigned users on search result topic list.	2020-12-17 21:59:10 +05:30
Dan Ungureanu	9b7525bb03	FIX: Handle uncaught exception (#11263) After the search term is parsed for advanced search filters, the term may become empty. Later, the same term will be passed to Discourse.route_for which will raise an ArgumentError. > URI(nil) ArgumentError: bad argument (expected URI object or URI string)	2020-11-20 11:28:14 +02:00
Roman Rizzi	d815b95935	FEATURE: Search filter for searching all PMs on a site for admin. (#11280) Admins can search all PMs on a site by using the `in:all-pms` advanced filter.	2020-11-19 13:56:19 -03:00
Guo Xiang Tan	68fc2a18b1	FIX: Properly handle quotes and backslash in `Search.set_tsquery_weight_filter`	2020-10-23 08:43:34 +08:00
Guo Xiang Tan	2607bb602e	Fix broken spec. Follow-up to `3c678df942`	2020-10-08 10:52:46 +08:00
Sam	3c678df942	PERF: avoid lookbehinds when indexing search (#10862) * PERF: avoid lookbehinds when indexing search Previously we used `EmailCook.url_regexp`, a regex that uses lookbehinds. Unfortunately, certain strings could lead to pathological behavior, causing CPU to skyrocket and regex replace to take a very long time. EmailCook still needs a fix, but it is less urgent because it already splits to single lines. That said, we will correct that as well in a separate PR. The new implementation is far more naive and relies on the extra spaces the search indexer inserts.	2020-10-08 11:40:13 +11:00
Arpit Jalan	f7940b1d20	FEATURE: advanced search option for max posts count (#10761) This commit adds an option to search for max posts count and updates the UI for posts count search to show a min/max range in single line.	2020-09-28 21:34:16 +05:30
Arpit Jalan	4498c59085	FEATURE: add alias for min_post_count search filter	2020-09-28 16:07:44 +05:30
Arpit Jalan	cdf45f4fe6	Update regex for views search filter.	2020-09-24 17:05:55 +05:30
Arpit Jalan	0c5cd0d1ef	FEATURE: advanced search filters for view count	2020-09-24 15:22:18 +05:30
Bianca Nenciu	4abbe3d361	FEATURE: Make search filters case insensitive (#10715)	2020-09-23 11:59:42 +03:00
Krzysztof Kotlarek	cb58cbbc2c	FEATURE: allow to extend topic_eager_loads in Search (#10625) This additional interface is required by encrypt plugin	2020-09-14 11:58:28 +10:00
Guo Xiang Tan	e6ca1b4326	FIX: Admin search for PMs should only search own PMs. In `c6ceda8c`, a bug was introduced where an admin searching for his own private messages will actually end up searching through all private messages on the site. Follow-up to `c6ceda8c4e`	2020-09-10 11:37:18 +08:00
Dan Ungureanu	38c9c87128	FIX: Add to tags result set only visible tags (#10580)	2020-09-02 13:24:40 +03:00
Guo Xiang Tan	40c6d90df3	PERF: Create a partial regular post_search_data index on large sites. With the addition of `PostSearchData#private_message`, a partial index consisting of only search data from regular posts can be created. The partial index helps to speed up searches on large sites since PG will not have to do an index scan on the entire search data index, which has been shown to be a bottleneck.	2020-08-27 13:42:00 +08:00
siriwatknp	1a2800ad07	fix: 🐛 category & tag search regex to support thai character	2020-08-25 16:12:26 +08:00
Guo Xiang Tan	05174df5c0	FIX: Restrict `personal_messages:` advanced search filter to admin. The filter noops if an incorrect username is passed. This filter is not exposed as part of the UI but is only used when an admin transitions from a search within a user's personal messages to the full page search. Follow-up to `4b30799054`.	2020-08-24 13:53:48 +08:00
Guo Xiang Tan	c6ceda8c4e	PERF: Avoid extra subquery when searching within PMs for normal user. Note the following query being generated where the filter for a user's private messages is executed twice. ```sql SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id", (TS_RANK_CD( post_search_data.search_data, TO_TSQUERY('english', '''test'':*ABCD'), 0|32 ) * ( CASE categories.search_priority WHEN 2 THEN 0.6 WHEN 3 THEN 0.8 WHEN 4 THEN 1.2 WHEN 5 THEN 1.4 ELSE CASE WHEN topics.closed THEN 0.9 ELSE 1 END END ) ) rank, topics.bumped_at topic_bumped_at FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3) AND (topics.visible) AND (topics.archetype = 'private_message' AND post_search_data.private_message) AND (posts.topic_id IN (SELECT topic_id FROM topic_allowed_users WHERE user_id = 99999 UNION ALL SELECT tg.topic_id FROM topic_allowed_groups tg JOIN group_users gu ON gu.user_id = 99999 AND gu.group_id = tg.group_id )) AND (post_search_data.search_data @@ TO_TSQUERY('english', '''test'':*ABCD')) AND (posts.topic_id IN (SELECT topic_id FROM topic_allowed_users WHERE user_id = 99999 UNION ALL SELECT tg.topic_id FROM topic_allowed_groups tg JOIN group_users gu ON gu.user_id = 99999 AND gu.group_id = tg.group_id )) AND ((categories.id IS NULL) OR (NOT categories.read_restricted) OR (categories.id IN (999999))) ORDER BY rank DESC, topic_bumped_at DESC ```	2020-08-24 13:49:43 +08:00
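
Note on the tsquery quoting fixes (17b7ab0d7b, 6eb3d658ca, fb780c50fd): the "can’t" example in 17b7ab0d7b involves two layers of quote doubling, one for the tsquery lexeme and one for the surrounding SQL string literal. The Ruby sketch below only illustrates that idea. `tsquery_lexeme` and `sql_escape` are hypothetical helper names made up for this note, not Discourse's actual methods, and the real code paths (which also involve PostgreSQL's unaccent) may differ.

```ruby
# Hypothetical helpers for illustration only; not Discourse's implementation.

# Quote one search term as a tsquery lexeme: fold curly apostrophes to an
# ASCII quote, double every single quote, then wrap in single quotes.
def tsquery_lexeme(term)
  ascii = term.tr("’‘", "'")
  "'" + ascii.gsub("'", "''") + "'"
end

# Escape a string for use inside an SQL string literal by doubling every
# single quote (the surrounding quotes of the literal are not added here).
def sql_escape(str)
  str.gsub("'", "''")
end

lexeme = tsquery_lexeme("can’t")
puts lexeme              # first layer: valid tsquery lexeme
puts sql_escape(lexeme)  # second layer: body of the SQL string literal
```

Applying only one of the two layers to the inner apostrophe would give the invalid form quoted in the commit message, which is why both steps have to see the already-normalized quote character.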


316 Commits