discourse-ai

mirror of https://github.com/discourse/discourse-ai.git synced 2025-02-19 18:04:51 +00:00

Author	SHA1	Message	Date
Rafael dos Santos Silva	2c87bb0d99	FEATURE: Respect search filters in semantic search (#220 ) * FEATURE: Respect search filters in semantic search * lint	2023-09-12 16:16:33 -03:00
Sam	d75e3ca82b	FEATURE: include tag and category context in search (#217 ) Previous to this we just included title/body.. tags and category structure can be very critical for decision making.	2023-09-12 16:09:28 +10:00
Sam	b0310f90d3	FEATURE: add tags and categories to read context (#215 ) Note, we perform permission checks on tag list against anon to ensure we do not disclose information about private tags to the llm which could get extracted.	2023-09-12 11:06:55 +10:00
Roman Rizzi	0828254d61	FIX: Generate embeddings job was broken (#211 ) * FIX: Use correct methods to generate embeddings * FIX: Generate embeddings job was broken	2023-09-07 11:54:43 -03:00
Sam	615eb8b440	FEATURE: add semantic search with hyde bot (#210 ) In specific scenarios (no special filters or limits) we will also always include 5 semantic results (at least) with every query. This effectively means that all very wide queries will always return 20 results, regardless of how complex they are. Also: FIX: embedding backfill rake task not working We renamed internals, this corrects the implementation	2023-09-07 13:25:26 +10:00
Rafael dos Santos Silva	5c50d2aa09	FEATURE: Use stop_sequences for faster HyDE searches with Claude (#203 )	2023-09-06 10:06:31 -03:00
Roman Rizzi	13d63f1f30	FIX: filter allowed categories from semantic search results (#206 )	2023-09-06 10:00:20 -03:00
Roman Rizzi	4d854e9232	FIX: Invalidate semantic search cache entries when hyde or embedding model changes (#202 )	2023-09-05 18:39:39 -03:00
Rafael dos Santos Silva	4b42c09814	FEATURE: Tweak HyDE prompts for better grounding in forum subject and limit response size (#200 ) * FEATURE: Tweak HyDE prompts for better grounding in forum subject and limit response size * fix test * lint	2023-09-05 16:11:07 -03:00
Rafael dos Santos Silva	ee734a340a	FIX: Tag/category suggestion broke in 2c0f535 (#198 )	2023-09-05 14:15:01 -03:00
Rafael dos Santos Silva	2c0f535bab	FEATURE: HyDE-powered semantic search. (#136 ) * FEATURE: HyDE-powered semantic search. It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way. We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords, which is then transformed into a vector and used in an asymmetric similarity topic search. This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying. Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead. * Missing translation and rate limiting --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2023-09-05 11:08:23 -03:00
Sam	38af2ca63e	FIX: cut completion short after function call is found (#182 ) Previous to this change we would keep completing and throw away result	2023-09-05 10:37:58 +10:00
Rafael dos Santos Silva	4864978495	FEATURE: Return only applicable suggestions in AiHelper category/tags suggestions (#184 )	2023-09-04 14:30:33 -03:00
Rafael dos Santos Silva	3c4a53b2cb	FEATURE: Better link in Claude summaries (#183 ) * FEATURE: Better link in Claude summaries * lint	2023-09-04 12:04:47 -03:00
Sam	e3abbd9f46	FEATURE: add researcher persona (#181 ) The researcher persona has access to Google and can perform various internet research tasks. At the moment it can not read web pages, but that is under consideration	2023-09-04 12:05:27 +10:00
Rafael dos Santos Silva	43e485cbd9	FEATURE: Additional AI suggestion options (#176 )	2023-09-01 17:10:58 -07:00
Sam	181113159b	FIX: setting explorer was exceeding token budget. This refactor changes it so we only include minimal data in the system prompt, which leaves us lots of tokens for specific searches. The new search command allows us to pull in settings on demand. Descriptions are included in short search results, and names only in longer results. Also: * In dev it is important to tell when calls are made to open ai, this adds a console log to increase awareness around token usage * PERF: stop counting tokens so often. This changes it so we only count tokens once per response. Previously each time we heard back from open ai we would count tokens, leading to unneeded delays * bug fix, commands may reach in for tokenizer * add logging to console for anthropic calls as well * Update lib/shared/inference/openai_completions.rb Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>	2023-09-01 11:48:51 +10:00
Sam	00d69b463e	FEATURE: new site setting explorer persona (#178 ) Also adds ai_bot_enabled_personas so admins can tweak which stock personas are enabled. The new persona has a full listing of all site settings and is able to get context for each setting. This means you can ask it to search through settings for something relevant. Security wise there is no access to actual configuration of settings just to the names / description and implementation. Previously this was part of the forum helper persona however it just clashes too much with other behaviors, isolating it makes it far more powerful. * sneaking this one in, user_emails is a non obvious table in our structure. usually one would assume users has emails so the clarifies a bit better. plus it is a very common table to hit.	2023-08-31 17:02:03 +10:00
Sam	db19e37748	FEATURE: add initial support for personas (#172 ) This splits out a bunch of code that used to live inside bots into a dedicated concept called a Persona. This allows us to start playing with multiple personas for the bot Ships with: artist - for making images sql helper - for helping with data explorer general - for everything and anything Also includes a few fixes that make the generic LLM function implementation more robust	2023-08-30 16:15:03 +10:00
Sam	8fdb88604f	FIX: trim first space when getting a reply from anthropic (#164 ) Anthropic loves sending a pointless leading space with completions this throws off the command framework.	2023-08-29 10:57:36 +10:00
Sam	b14cb864dc	FEATURE: add setting_context experimental command (#160 ) This command can be used to extract information about a discourse site setting directly from source. To operate it needs the rg binary in the container.	2023-08-29 10:43:58 +10:00
Rafael dos Santos Silva	e673b568d9	FEATURE: StableBeluga2 support for AiHelper (#162 ) * FEATURE: StableBeluga2 support for AiHelper * lint	2023-08-25 15:54:51 -03:00
Sam	7d943be7b2	FIX: automatic bot titles missing sometime (#151 ) This fixes 2 big issues: 1. No matter how hard you try, grounding anthropic title prompt is just too hard. This works around by only looking at the last sentence it returns and treating as title 2. Non English locales would be stuck with "generic" title, this ensures every bot message gets a title, using a custom field to track Also, slightly tunes some anthropic prompts.	2023-08-24 07:20:24 +10:00
Sam	f0e1c72aa7	FEATURE: implement command framework for non Open AI (#147 ) Open AI supports function calling, this has a very specific shape that other LLMs have not quite adopted. This simulates a command framework using system prompts on LLMs that are not open AI. Features include: - Smart system prompt to steer the LLM - Parameter validation (we ensure all the params are specified correctly) This is being tested on Anthropic at the moment and initial results are promising.	2023-08-23 07:49:36 +10:00
Sam	78f61914c8	FIX: improve token counting (#145 ) Previously we were not counting functions correctly and not accounting for minimum token count per message This corrects both issues and improves documentation internally	2023-08-22 08:36:41 +10:00
Rafael dos Santos Silva	ea5a443588	FEATURE: Try to generate OpenAI Summaries in current language (#146 ) * FEATURE: Try to generate OpenAI Summaries in current language * lint	2023-08-21 15:40:32 -03:00
Sam	b4477ecdcd	FEATURE: support 16k and 32k variants for Azure GPT (#140 ) Azure requires a single HTTP endpoint per type of completion. The settings: `ai_openai_gpt35_16k_url` and `ai_openai_gpt4_32k_url` can be used now to configure the extra endpoints This amends token limit which was off a bit due to function calls and fixes a minor JS issue where we were not testing for a property	2023-08-17 11:00:11 +10:00
Sam	01f833f86e	FEATURE: optional warning attached to all AI bot conversations (#137 ) * FEATURE: optional warning attached to all AI bot conversations This commit introduces `ai_bot_enable_chat_warning` which can be used to warn people prior to starting a chat with the bot. In particular this is useful if moderators are regularly reading chat transcripts as it sets expectations early. By default this is disabled. Also: - Stops making ajax call prior to opening composer - Hides PM title when starting a bot PM Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>	2023-08-17 06:29:58 +10:00
Rafael dos Santos Silva	49f2453c2d	FEATURE: Tweaks to Anthropic Summarization (#138 ) * FEATURE: Tweaks to Anthropic Summarization * fix specs	2023-08-16 15:09:52 -03:00
Rafael dos Santos Silva	0738f67fa4	FIX: Fix embeddings truncation strategy (#139 )	2023-08-16 15:09:41 -03:00
Sam	20c1f2d788	FEATURE: basic progress for image generation (#133 ) previously you would have to wait quite a while to see the prompt this implements a very basic implementation of progress so you can see the API is working. Also: - Fix google progress. - Handle the incredibly rare, zero results from google. - Simplify command so it is less error prone - replace invoke and attache results with a invoke - ensure invoke can only ever be run once - pass in all the information a command needs in constructor - use new pattern throughout - test invocation in isolation	2023-08-14 16:30:12 +10:00
Roman Rizzi	b076e43d67	FEATURE: streaming mode for the FoldContent strategy. (#134 )	2023-08-11 15:08:54 -03:00
Sam	7eedbf29e0	FIX: refine image and read command (#131 ) - Attempt to hint reading is done by sending complete:true - Do not include post_number in result unless it was sent in - Rush visual feedback when a command is run (ensure we always revise) - Include hyperlink in read command description - Stop round tripping to GPT after image generation (speeds up images by a lot) - Add a test for image command	2023-08-09 16:01:48 +10:00
Sam	958dfc360e	FEATURE: experimental read command for bot (#129 ) This command is useful for reading a topic's content. It allows us to perform critical analysis or suggest answers. Given the 8k token limit in GPT-4 I hardcoded reading to 1500 tokens, but we can follow up and allow larger windows on models that support more tokens. In local testing, even in this limited form, this can be very useful.	2023-08-09 07:19:56 +10:00
Rafael dos Santos Silva	8318c4374c	FIX: Remove muted from Similar list (#127 ) * FIX: Remove muted from Similar list	2023-08-08 15:44:10 -03:00
Sam	03e689deb7	FIX: Google command was including full payload (#128 ) * FIX: Google command was including full payload Additionally there was no truncating happening meaning you could blow token budget easily on a single search. This made Google search mostly useless and it would mean that after using Google we would revert to a clean slate which is very confusing. * no need for nil there	2023-08-08 15:41:57 +10:00
Sam	7edb57c005	DEV: simplify command framework (#125 ) The command framework had some confusing dispatching where it would dispatch JSON blobs, this meant there was lots of parsing required in every command The refactor handles transforming the args prior to dispatch which makes consuming far simpler This is also general prep to supporting some basic command framework in other llms.	2023-08-04 09:37:58 +10:00
Rafael dos Santos Silva	eb7fff3a55	FEATURE: Add support for StableBeluga and Upstage Llama2 instruct (#126 ) * FEATURE: Add support for StableBeluga and Upstage Llama2 instruct This means we support all models in the top3 of the Open LLM Leaderboard Since some of those models have RoPE, we now have a setting so you can customize the token limit depending which model you use.	2023-08-03 15:29:30 -03:00
Rafael dos Santos Silva	8b157feea5	FEATURE: Compatibility with protected Hugging Face Endpoints (#123 ) * FEATURE: Compatibility with protected Hugging Face Endpoints	2023-08-02 17:00:00 -03:00
Roman Rizzi	58b96eda6c	REFACTOR: Build related topics using TopicQuery. (#124 ) TopicQuery already provides a lot of safeguards and options for filtering topic, and enforcing permissions. It makes sense to rely on it as other plugins like discourse-assign do. As a bonus, we now have access to the current_user while serializing these topics, so users will see things like unread posts count just like we do for the lists.	2023-08-02 16:58:09 -03:00
Sam	602bb843ea	FEATURE: add support for final stable diffusion xl model (#122 )	2023-08-02 16:53:28 -03:00
Roman Rizzi	c8de9495c8	UX: Update related-topics to follow <MoreTopics/> conventions (#118 )	2023-07-31 18:33:37 -03:00
Rafael dos Santos Silva	3e7c99de89	FEATURE: Support for locally inferred embeddings in 100 languages (#115 ) * FEATURE: Support for locally inferred embeddings in 100 languages * add table	2023-07-27 15:50:03 -03:00
Rafael dos Santos Silva	b25daed60b	FEATURE: Llama2 for summarization (#116 )	2023-07-27 13:55:32 -03:00
Sam	4b0c077ce5	FEATURE: port to use claude-2 for chat bot (#114 ) Claude 1 costs the same and is less good than Claude 2. Make use of Claude 2 in all spots ... This also fixes streaming so it uses the far more efficient streaming protocol.	2023-07-27 11:24:44 +10:00
Rafael dos Santos Silva	e3b4a73267	FEATURE: Cache Related Topics for longer (#110 )	2023-07-18 11:27:06 -03:00
Roman Rizzi	473732c18a	FIX: Return base prompt instead of nil (#106 )	2023-07-13 21:48:25 -03:00
Rafael dos Santos Silva	703762a7a9	PERF: .find_each instead of .find to save us from memory allocation peaks also Fix embeddings rake task for new db structure	2023-07-13 18:59:25 -03:00
Roman Rizzi	5f0c617880	REFACTOR: Cohesive narrative for single-chunk summaries. (#103 ) Single and multi-chunk summaries end using different prompts for the last summary. This change detects when the summarized content fits in a single chunk and uses a slightly different prompt, which leads to more consistent summary formats. This PR also moves the chunk-splitting step to the `FoldContent` strategy as preparation for implementing streamed summaries.	2023-07-13 17:05:41 -03:00
Rafael dos Santos Silva	5e3f4e1b78	FEATURE: Embeddings to main db (#99 ) * FEATURE: Embeddings to main db This commit moves our embeddings store from an external configurable PostgreSQL instance back into the main database. This is done to simplify the setup. There is a migration that will try to import the external embeddings into the main DB if it is configured and there are rows. It removes support for embeddings models that aren't all_mpnet_base_v2 or OpenAI text_embedding_ada_002. However it will now be easier to add new models. It also now takes into account: - topic title - topic category - topic tags - replies (as much as the model allows) We introduce an interface so we can eventually support multiple strategies for handling long topics. This PR severely damages the semantic search performance, but this is temporary until we can adapt HyDE to make semantic search use the same embeddings we have for semantic related with good performance. Here we also have some groundwork to add post level embeddings, but this will be added in a future PR. Please note that this PR will also block Discourse from booting / updating if this plugin is installed and the pgvector extension isn't available on the PostgreSQL instance Discourse uses.	2023-07-13 12:41:36 -03:00

281 Commits