discourse-ai

Commit Graph

Author	SHA1	Message	Date
Rafael dos Santos Silva	818b20fb6f	FEATURE: Make embeddings turn-key (#261) To ease the administrative burden of enabling the embeddings model, this change introduces automatic backfill when the setting is enabled. It also moves the topic visit embedding creation to a lower priority queue in Sidekiq and adds an option to skip embedding computation and persistence when we match on the digest.	2023-10-26 12:07:37 -03:00
Rafael dos Santos Silva	453928e7bb	FIX: Improvement to embeddings index task (#238)	2023-10-02 16:37:13 -03:00
Sam	615eb8b440	FEATURE: add semantic search with hyde bot (#210) In specific scenarios (no special filters or limits) we will also always include at least 5 semantic results with every query. This effectively means that all very wide queries will always return 20 results, regardless of how complex they are. Also: FIX: embedding backfill rake task not working. We renamed internals; this corrects the implementation.	2023-09-07 13:25:26 +10:00
Rafael dos Santos Silva	2c0f535bab	FEATURE: HyDE-powered semantic search. (#136) * FEATURE: HyDE-powered semantic search. It relies on the new outlet added in discourse/discourse#23390 to display semantic search results in an unobtrusive way. We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords, which is then transformed into a vector and used in an asymmetric similarity topic search. This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying. Completions and vectors created by HyDE will remain cached in Redis for now, but we could later use Postgres instead. * Missing translation and rate limiting --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2023-09-05 11:08:23 -03:00
Rafael dos Santos Silva	703762a7a9	PERF: Use .find_each instead of .find to save us from memory allocation peaks. Also fix embeddings rake task for new DB structure	2023-07-13 18:59:25 -03:00
Rafael dos Santos Silva	739b314312	Fixes for embeddings and truncate (#67)	2023-05-18 09:21:28 +10:00
Rafael dos Santos Silva	f1133f66a6	Updates to embedding rake tasks (#54) - Creates embeddings in topic ID order, so it's easier to stop and restart from where we stopped - Updates index parameters with current best practices	2023-05-09 13:45:16 -03:00
Roman Rizzi	4e05763a99	FEATURE: Semantic asymmetric full-page search (#34) Depends on discourse/discourse#20915. Hooks to the full-page-search component using an experimental API and performs an asymmetric similarity search using our embeddings database.	2023-03-31 15:29:56 -03:00
Rafael dos Santos Silva	6bdbc0e32d	FIX: Proper flow when a topic doesn't have embeddings (#20)	2023-03-20 16:44:55 -03:00
Rafael dos Santos Silva	80d662e9e8	FEATURE: Semantic Suggested Topics (#10)	2023-03-15 17:21:45 -03:00
Roman Rizzi	aa2fca6086	DEV: DiscourseAI -> DiscourseAi rename to have consistent folders and files (#9)	2023-03-14 16:03:50 -03:00
Rafael dos Santos Silva	510c6487e3	DEV: Preparation work for multiple inference providers (#5)	2023-03-07 16:14:39 -03:00
Rafael dos Santos Silva	6cf411ec90	add toxicity and sentiment modules	2023-02-22 20:46:53 -03:00

13 Commits