discourse-ai

Commit Graph

Author	SHA1	Message	Date
Sam	a3c827efcc	FEATURE: allow personas to supply top_p and temperature params (#459) * FEATURE: allow personas to supply top_p and temperature params Code assistance is generally more focused at a lower temperature. This amends it so SQL Helper runs at 0.2 temperature vs the more common default across LLMs of 1.0. Reduced temperature leads to more focused, concise and predictable answers for the SQL Helper * fix tests * This is not perfect, but far better than what we do today Instead of fishing for 1. Draft sequence 2. Draft body We skip (2), this means the composer "only" needs 1 http request to open, we also want to eliminate (1) but it is a bit of a trickier core change, may figure out how to pull it off (defer it to first draft save) Value of bot drafts < value of opening bot conversations really fast	2024-02-03 07:09:34 +11:00
Roman Rizzi	392e2e8aef	Revert "UX: Validate embeddings settings (#455)" (#456) This reverts commit `85fca89e01`.	2024-02-01 14:06:51 -03:00
Roman Rizzi	85fca89e01	UX: Validate embeddings settings (#455)	2024-02-01 13:05:38 -03:00
Roman Rizzi	0634b85a81	UX: Validations to LLM-backed features (except AI Bot) (#436) * UX: Validations to Llm-backed features (except AI Bot) This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to be explicit about which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs. Validations are: * You must choose a model before enabling the feature. * You must turn off the feature before setting the model to blank. * You must configure each model's settings before being able to select it. * Add provider name to summarization options * vLLM can technically support same models as HF * Check we can talk to the selected model * Check for Bedrock instead of anthropic as a site could have both creds setup	2024-01-29 16:04:25 -03:00
Sam	825f01cfb2	FEATURE: even smoother streaming (#420 ) Account properly for function calls, don't stream through <details> blocks - Rush cooked content back to client - Wait longer (up to 60 seconds) before giving up on streaming - Clean up message bus channels so we don't have leftover data - Make ai streamer much more reusable and much easier to read - If buffer grows quickly, rush update so you are not artificially waiting - Refine prompt interface - Fix lost system message when prompt gets long	2024-01-15 18:51:14 +11:00
Jarek Radosz	6b8a57d957	DEV: Update linting (#423) Co-authored-by: Keegan George <kgeorge13@gmail.com>	2024-01-13 00:28:06 +01:00
Roman Rizzi	04eae76f68	REFACTOR: Represent generic prompts with an Object. (#416 ) * REFACTOR: Represent generic prompts with an Object. * Adds a bit more validation for clarity * Rewrite bot title prompt and fix quirk handling --------- Co-authored-by: Sam Saffron <sam.saffron@gmail.com>	2024-01-12 14:36:44 -03:00
Rafael dos Santos Silva	3be76ebd7a	FEATURE: Move the default embeddings model to bge-large-en (#417)	2024-01-11 14:16:25 -03:00
Sam	05f7808057	FEATURE: more elegant progress (#409 ) Previous to this change it was very hard to tell if completion was stuck or not. This introduces a "dot" that follows the completion and starts flashing after 5 seconds.	2024-01-09 09:20:28 -03:00
Sam	17cc09ec9c	FIX: don't include <details> in context (#406 ) * FIX: don't include <details> in context We need to be careful adding <details> into context of conversations it can cause LLMs to hallucinate results * Fix Gemini multi-turn ctx flattening --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-01-05 15:21:14 -03:00
Sam	dd42a4e47b	FIX: array arguments not parsed correctly (#405 ) DALL E command accepts an Array as a tool argument, this was not parsed correctly by the invoker leading to errors generating images with DALL E Side quest ... don't use update! it calls validations and will now fail due to email validation	2024-01-05 14:39:32 +11:00
Roman Rizzi	971e03bdf2	FEATURE: AI Bot Gemini support. (#402 ) It also corrects the syntax around tool support, which was wrong. Gemini doesn't want us to include messages about previous tool invocations, so I had to shuffle around some code to send the response it generated from those invocations instead. For this, I created the "multi_turn" context, which bundles all the context involved in the interaction.	2024-01-04 18:15:34 -03:00
Roman Rizzi	f9d7d7f5f0	DEV: AI bot migration to the Llm pattern. (#343 ) * DEV: AI bot migration to the Llm pattern. We added tool and conversation context support to the Llm service in discourse-ai#366, meaning we met all the conditions to migrate this module. This PR migrates to the new pattern, meaning adding a new bot now requires minimal effort as long as the service supports it. On top of this, we introduce the concept of a "Playground" to separate the PM-specific bits from the completion, allowing us to use the bot in other contexts like chat in the future. Commands are called tools, and we simplified all the placeholder logic to perform updates in a single place, making the flow more one-wayish. * Followup fixes based on testing * Cleanup unused inference code * FIX: text-based tools could be in the middle of a sentence * GPT-4-turbo support * Use new LLM API	2024-01-04 10:44:07 -03:00
Sam	605445831f	FEATURE: try including views/username/likes in search results (#349 ) This is somewhat experimental, but the context of likes/view/username can help the llm find out what content is more important or even common users that produce great content This inflates the amount of tokens somewhat, but given it is all numbers and search columns titles are only included once this is not severe	2023-12-12 12:22:28 +11:00
Sam	a66b1042cc	FEATURE: scale up result count for search depending on model (#346 ) We were limiting to 20 results unconditionally cause we had to make sure search always fit in an 8k context window. Models such as GPT 3.5 Turbo (16k) and GPT 4 Turbo / Claude 2.1 (over 150k) allow us to return a lot more results. This means we have a much richer understanding cause context is far larger. This also allows a persona to tweak this number, in some cases admin may want to be conservative and save on tokens by limiting results This also tweaks the `limit` param which GPT-4 liked to set to tell model only to use it when it needs to (and describes default behavior)	2023-12-11 16:54:16 +11:00
Sam	6380ebd829	FEATURE: allow personas to provide command options (#331 ) Personas now support providing options for commands. This PR introduces a single option "base_query" for the SearchCommand. When supplied all searches the persona will perform will also include the pre-supplied filter. This can allow personas to search a subset of the forum (such as documentation) This system is extensible we can add options to any command trivially.	2023-12-08 08:42:56 +11:00
Sam	6ddc17fd61	DEV: port directory structure to Zeitwerk (#319) Previous to this change we relied on explicit loading for files in Discourse AI. This had a few downsides: - Busywork whenever you add a file (an extra require_relative) - We were not keeping to conventions internally ... some places were OpenAI others are OpenAi - Autoloader did not work which led to lots of full application broken reloads when developing. This moves all of DiscourseAI into a Zeitwerk compatible structure. It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there) To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.	2023-11-29 15:17:46 +11:00
Sam	5a4598a7b4	FEATURE: Azure OpenAI support for DALLE 3 (#313 ) FEATURE: Azure OpenAI support for DALLE 3 Previous to this there was no way to add an inference endpoint for DALLE on Azure cause it requires custom URLs Also: - On save, when editing a persona it would revert priority and enabled - More forgiving parsing in command framework for array function calls - By default generate HD images - they tend to be a bit better - Improve DALL*E prompt which was getting very annoying and always echoing what it is about to do - Add a bit of a sleep between retries on image generation - Fix error handling in image_command	2023-11-27 13:01:05 +11:00
Sam	dff9f33a97	FEATURE: DALL-E-3 persona for image generation (#311 ) * FIX: no selected persona should pick first prioritized one Previously we were looking at `.personaId` but there is only an id attribute so it failed * FEATURE: new DALL-E-3 persona This persona generates images using DALL-E-3 API and is enabled by default Keep in mind that we are still waiting on seeds/gen_id so we can not retain style consistently between turns. This will change as soon as a new Open AI API provides the missing parameters Co-authored-by: Martin Brennan <martin@discourse.org>	2023-11-24 18:08:08 +11:00
Sam	6282b6d21f	FIX: implement tools framework for Anthropic (#307) Previous to this changeset we used a custom system for tools/command support for Anthropic. We defined commands by using !command as a signal to execute it Following Anthropic Claude 2.1, there is an official supported syntax (beta) for tools execution. eg: ``` + <function_calls> + <invoke> + <tool_name>image</tool_name> + <parameters> + <prompts> + [ + "an oil painting", + "a cute fluffy orange", + "3 apple's", + "a cat" + ] + </prompts> + </parameters> + </invoke> + </function_calls> ``` This implements the spec per Anthropic, it should be stable enough to also work on other LLMs. Keep in mind that OpenAI is not impacted here at all, as it has its own custom system for function calls. Additionally: - Fixes the title system prompt so it works with latest Anthropic - Uses new spec for "system" messages by Anthropic - Tweak forum helper persona to guide Anthropic a tiny bit better Overall results are pretty awesome and Anthropic Claude performs really well now on Discourse	2023-11-24 06:39:56 +11:00
Roman Rizzi	3064d4c288	REFACTOR: Summarization and HyDE now use an LLM abstraction. (#297 ) * DEV: One LLM abstraction to rule them all * REFACTOR: HyDE search uses new LLM abstraction * REFACTOR: Summarization uses the LLM abstraction * Updated documentation and made small fixes. Remove Bedrock claude-2 restriction	2023-11-23 12:58:54 -03:00
Sam	5b5edb22c6	FEATURE: UI to update ai personas on admin page (#290 ) Introduces a UI to manage customizable personas (admin only feature) Part of the change was some extensive internal refactoring: - AIBot now has a persona set in the constructor, once set it never changes - Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly - Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work - Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure - name uniqueness, and only allow certain properties to be touched for system personas. - (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona. - (JS note) data is sideloaded into the ai-persona model the meta property supplied from the controller, resultSetMeta - This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis - Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length - Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things - Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up. - Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer - Migrates the persona selector to gjs --------- Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com> Co-authored-by: Martin Brennan <martin@discourse.org>	2023-11-21 16:56:43 +11:00
Sam	a4f419f54f	FEATURE: basic infrastructure for custom personas (#288 ) - New AiPersona model which can store custom personas - Persona are restricted via group security - They can contain custom system messages - They can support a list of commands optionally To avoid expensive DB calls in the serializer a Multisite friendly Hash was introduced (which can be expired on transaction commit)	2023-11-10 11:39:49 +11:00
Sam	fc65404896	FEATURE: support topic_id and post_id logging in ai audit log (#274 ) This makes it easier to track who is responsible for a completion in logs Note: ai helper and summarization are not yet implemented	2023-11-01 08:41:31 +11:00
Sam	0b62c0fa02	FIX: keep parity of shape for image command (#275 ) Function calling will start hallucinating if you reshape results. Previously we were morphing from: `{ prompts: ["prompt 1", "prompt 2"] }` to `{ prompts: { prompt: "prompt 1", seed: 222}, { ... ` This meant that over a few call sequences function_call starts hallucinating an incorrect shape. This change grounds us even on GPT-3.5	2023-10-31 19:12:25 +11:00
Sam	b06380d9fa	FIX: avoid semicolons at the end of queries for SQL Helper (#268 ) This makes it easier to cut and paste snippets it is producing Also fine tune the prompt in an attempt to hone gpt 3.5 which is very finicky	2023-10-27 16:21:09 +11:00
Sam	6add06af8f	FEATURE: Make artist more creative (#266 ) This allows for 2 big features: 1. Artist can ship up to 4 prompts for image generation 2. Artist can regenerate images cause it is aware of seed This allows for iteration on images maintaining visual style	2023-10-27 14:48:12 +11:00
Sam	1500308437	FEATURE: defer creation of bot users (#258) Also fixes it so users without bot in header can send it messages. Previous to this change we would seed all bots with database seeds. This led to lots of confusion for people who do not enable ai bot. Instead: 1. We do not seed any bots until user enables the ai_bot_enabled setting 2. If it is disabled we will a. If no messages were created by bot - delete it b. Otherwise we will deactivate account	2023-10-23 17:00:58 +11:00
Sam	f65e50bd9e	FIX: allow for blank fields in Google results (#255 ) Under certain cases, for example: ``` there is this japanese band called kirimi, tell me more about them, try searching 3 times and at least 2 times in japanese before answering. ``` Results come back with blank snippets. This adds protection so this is allowed and code does not simply blow up.	2023-10-19 14:44:59 +11:00
Sam	aa463d64f1	FEATURE: Add creative persona (#231 ) This adds a new creative persona that has access to the underlying model and no external integrations. It allows people to use Claude/GPT models in a Discourse agnostic way.	2023-09-27 10:48:38 +10:00
Sam	316ea9624e	FIX: properly truncate !command prompts (#227 ) * FIX: properly truncate !command prompts ### What is going on here? Previous to this change where a command was issued by the LLM it could hallucinate a continuation eg: ``` This is what tags are !tags some nonsense here ``` This change introduces safeguards so `some nonsense here` does not creep in to the prompt history, poisoning the llm results This in effect grounds the llm a lot better and results in the llm forgetting less about results. The change only impacts Claude at the moment, but will also improve stuff for llama 2 in future. Also, this makes it significantly easier to test the bot framework without an llm cause we avoid a whole bunch of complex stubbing * blank is not a valid bot response, do not inject into prompt	2023-09-15 07:02:37 +10:00
Roman Rizzi	f57c1bb0f6	FEATURE: AI Helper endpoint to generate a thumbnail from text. (#224) We pass the text to the current LLM and ask it to generate a Stable Diffusion prompt. We'll use that to generate 4 samples, temporarily creating uploads and returning their short URLs.	2023-09-14 12:53:44 -03:00
Jarek Radosz	1eb70c4f0a	DEV: Fix rspec-expectations warnings (#228)	2023-09-14 17:50:13 +02:00
Sam	9e94457154	FIX: Made bot more robust (#226 ) * FIX: Made bot more robust This is a collection of small fixes - Display "Searching for: ..." while searching instead of showing found 0 results. - Only allow 5 commands in lang chain - 6 feels like too much - On the 5th command stop informing the engine about functions, so it is forced to complete - Add another 30 tokens of buffer and explain why - Typo in command prompt Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>	2023-09-14 16:46:56 +10:00
Sam	cdd6faa648	FEATURE: add filter support to ai bot semantic search (#222 ) Previously we would bypass semantic search if any filters were present Also shows progress now.	2023-09-13 14:59:45 +10:00
Sam	d75e3ca82b	FEATURE: include tag and category context in search (#217 ) Previous to this we just included title/body.. tags and category structure can be very critical for decision making.	2023-09-12 16:09:28 +10:00
Sam	b0310f90d3	FEATURE: add tags and categories to read context (#215 ) Note, we perform permission checks on tag list against anon to ensure we do not disclose information about private tags to the llm which could get extracted.	2023-09-12 11:06:55 +10:00
Sam	615eb8b440	FEATURE: add semantic search with hyde bot (#210 ) In specific scenarios (no special filters or limits) we will also always include 5 semantic results (at least) with every query. This effectively means that all very wide queries will always return 20 results, regardless of how complex they are. Also: FIX: embedding backfill rake task not working We renamed internals, this corrects the implementation	2023-09-07 13:25:26 +10:00
Sam	38af2ca63e	FIX: cut completion short after function call is found (#182 ) Previous to this change we would keep completing and throw away result	2023-09-05 10:37:58 +10:00
Sam	e3abbd9f46	FEATURE: add researcher persona (#181 ) The researcher persona has access to Google and can perform various internet research tasks. At the moment it can not read web pages, but that is under consideration	2023-09-04 12:05:27 +10:00
Sam	181113159b	FIX: setting explorer was exceeding token budget This refactor changes it so we only include minimal data in the system prompt which leaves us lots of tokens for specific searches The new search command allows us to pull in settings on demand Descriptions are included in short search results, and names only in longer results Also: * In dev it is important to tell when calls are made to open ai this adds a console log to increase awareness around token usage * PERF: stop counting tokens so often This changes it so we only count tokens once per response Previously each time we heard back from open ai we would count tokens, leading to unneeded delays * bug fix, commands may reach in for tokenizer * add logging to console for anthropic calls as well * Update lib/shared/inference/openai_completions.rb Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>	2023-09-01 11:48:51 +10:00
Sam	00d69b463e	FEATURE: new site setting explorer persona (#178) Also adds ai_bot_enabled_personas so admins can tweak which stock personas are enabled. The new persona has a full listing of all site settings and is able to get context for each setting. This means you can ask it to search through settings for something relevant. Security wise there is no access to actual configuration of settings just to the names / description and implementation. Previously this was part of the forum helper persona however it just clashes too much with other behaviors, isolating it makes it far more powerful. * sneaking this one in, user_emails is a non obvious table in our structure. usually one would assume users has emails so this clarifies it a bit better. plus it is a very common table to hit.	2023-08-31 17:02:03 +10:00
Sam	db19e37748	FEATURE: add initial support for personas (#172 ) This splits out a bunch of code that used to live inside bots into a dedicated concept called a Persona. This allows us to start playing with multiple personas for the bot Ships with: artist - for making images sql helper - for helping with data explorer general - for everything and anything Also includes a few fixes that make the generic LLM function implementation more robust	2023-08-30 16:15:03 +10:00
Sam	8fdb88604f	FIX: trim first space when getting a reply from anthropic (#164 ) Anthropic loves sending a pointless leading space with completions this throws off the command framework.	2023-08-29 10:57:36 +10:00
Sam	b14cb864dc	FEATURE: add setting_context experimental command (#160 ) This command can be used to extract information about a discourse site setting directly from source. To operate it needs the rg binary in the container.	2023-08-29 10:43:58 +10:00
Sam	7d943be7b2	FIX: automatic bot titles missing sometime (#151 ) This fixes 2 big issues: 1. No matter how hard you try, grounding anthropic title prompt is just too hard. This works around by only looking at the last sentence it returns and treating as title 2. Non English locales would be stuck with "generic" title, this ensures every bot message gets a title, using a custom field to track Also, slightly tunes some anthropic prompts.	2023-08-24 07:20:24 +10:00
Sam	f0e1c72aa7	FEATURE: implement command framework for non Open AI (#147) Open AI supports function calling, this has a very specific shape that other LLMs have not quite adopted. This simulates a command framework using system prompts on LLMs that are not open AI. Features include: - Smart system prompt to steer the LLM - Parameter validation (we ensure all the params are specified correctly) This is being tested on Anthropic at the moment and initial results are promising.	2023-08-23 07:49:36 +10:00
Sam	20c1f2d788	FEATURE: basic progress for image generation (#133) previously you would have to wait quite a while to see the prompt this implements a very basic implementation of progress so you can see the API is working. Also: - Fix google progress. - Handle the incredibly rare, zero results from google. - Simplify command so it is less error prone - replace invoke and attach results with an invoke - ensure invoke can only ever be run once - pass in all the information a command needs in constructor - use new pattern throughout - test invocation in isolation	2023-08-14 16:30:12 +10:00
Sam	7eedbf29e0	FIX: refine image and read command (#131 ) - Attempt to hint reading is done by sending complete:true - Do not include post_number in result unless it was sent in - Rush visual feedback when a command is run (ensure we always revise) - Include hyperlink in read command description - Stop round tripping to GPT after image generation (speeds up images by a lot) - Add a test for image command	2023-08-09 16:01:48 +10:00
Sam	958dfc360e	FEATURE: experimental read command for bot (#129) This command is useful for reading a topic's content. It allows us to perform critical analysis or suggest answers. Given 8k token limit in GPT-4 I hardcoded reading to 1500 tokens, but we can follow up and allow larger windows on models that support more tokens. On local testing even in this limited form this can be very useful.	2023-08-09 07:19:56 +10:00

68 Commits