discourse-ai

Commit Graph

Author	SHA1	Message	Date
Sam	fc65404896	FEATURE: support topic_id and post_id logging in ai audit log (#274 ) This makes it easier to track who is responsible for a completion in logs Note: ai helper and summarization are not yet implemented	2023-11-01 08:41:31 +11:00
Rafael dos Santos Silva	cbfd8507b1	FIX: Update bedrock endpoint (#272 ) * FIX: Update bedrock endpoint AWS updated their endpoints per https://docs.aws.amazon.com/general/latest/gr/bedrock.html * lint	2023-10-30 19:27:50 -03:00
Sam	6add06af8f	FEATURE: Make artist more creative (#266 ) This allows for 2 big features: 1. Artist can ship up to 4 prompts for image generation 2. Artist can regenerate images cause it is aware of seed This allows for iteration on images maintaining visual style	2023-10-27 14:48:12 +11:00
Sam	426e348c8a	FIX: make stable diffusion multi site friendly (#265 ) Previous to this change image generation did not work on multisite There was a background thread generating the images and it was getting site settings from the default site in the cluster This also removes referer header which is not needed	2023-10-25 11:04:16 +11:00
Sam	9242da545e	FEATURE: support OpenAI-Organization header (#245 ) Per: https://platform.openai.com/docs/api-reference/authentication There is an organization option which is useful for large orgs > For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota.	2023-10-06 10:23:18 +11:00
Sam	d87adcebea	FEATURE: Claude based scanning and OpenAI retries (#243 ) llm_triage supported claude 2 in triage, this implements it OpenAI rate limits frequently, this introduces some exponential backoff (3 attempts - 3 seconds, 9 and 27) Also reduces temp of classifiers so they have consistent behavior	2023-10-05 09:00:45 +11:00
Rafael dos Santos Silva	84cc369552	FEATURE: Bge-large-en embeddings via Cloudflare Workers AI API (#241 ) * FEATURE: Bge-large-en embeddings via Cloudflare Workers AI API * forgot a file * lint	2023-10-04 13:47:51 -03:00
Rafael dos Santos Silva	102f47c1c4	FEATURE: Allow Anthropic inference via AWS Bedrock (#235 ) If a module LLM model is set to claude-2 and the ai_bedrock variables are all present we will use AWS Bedrock instead of Antrhopic own APIs. This is quite hacky, but will allow us to test the waters with AWS Bedrock early access with every module. This situation of "same module, completely different API" is quite a bit far from what we had in the OpenAI/Azure separation, so it's more food for thought for when we start working on the LLM abstraction layer soon this year.	2023-10-02 12:58:36 -03:00
Sam	316ea9624e	FIX: properly truncate !command prompts (#227 ) * FIX: properly truncate !command prompts ### What is going on here? Previous to this change where a command was issued by the LLM it could hallucinate a continuation eg: ``` This is what tags are !tags some nonsense here ``` This change introduces safeguards so `some nonsense here` does not creep in to the prompt history, poisoning the llm results This in effect grounds the llm a lot better and results in the llm forgetting less about results. The change only impacts Claude at the moment, but will also improve stuff for llama 2 in future. Also, this makes it significantly easier to test the bot framework without an llm cause we avoid a whole bunch of complex stubbing * blank is not a valid bot response, do not inject into prompt	2023-09-15 07:02:37 +10:00
Sam	9e94457154	FIX: Made bot more robust (#226 ) * FIX: Made bot more robust This is a collection of small fixes - Display "Searching for: ..." while searching instead of showing found 0 results. - Only allow 5 commands in lang chain - 6 feels like too much - On the 5th command stop informing the engine about functions, so it is forced to complete - Add another 30 tokens of buffer and explain why - Typo in command prompt Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>	2023-09-14 16:46:56 +10:00
Rafael dos Santos Silva	5c50d2aa09	FEATURE: Use stop_sequences for faster HyDE searches with Claude (#203 )	2023-09-06 10:06:31 -03:00
Rafael dos Santos Silva	2c0f535bab	FEATURE: HyDE-powered semantic search. (#136 ) * FEATURE: HyDE-powered semantic search. It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way. We'll use a HyDE-backed approach for semantic search, which consists on generating an hypothetical document from a given keywords, which gets transformed into a vector and used in a asymmetric similarity topic search. This PR also reorganizes the internals to have less moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying. Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead. * Missing translation and rate limiting --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2023-09-05 11:08:23 -03:00
Sam	181113159b	FIX: setting explorer was exceeding token budget This refactor changes it so we only include minimal data in the system prompt which leaves us lots of tokens for specific searches The new search command allows us to pull in settings on demand Descriptions are include in short search results, and names only in longer results Also: * In dev it is important to tell when calls are made to open ai this adds a console log to increase awareness around token usage * PERF: stop counting tokens so often This changes it so we only count tokens once per response Previously each time we heard back from open ai we would count tokens, leading to uneeded delays * bug fix, commands may reach in for tokenizer * add logging to console for anthropic calls as well * Update lib/shared/inference/openai_completions.rb Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>	2023-09-01 11:48:51 +10:00
Sam	db19e37748	FEATURE: add initial support for personas (#172 ) This splits out a bunch of code that used to live inside bots into a dedicated concept called a Persona. This allows us to start playing with multiple personas for the bot Ships with: artist - for making images sql helper - for helping with data explorer general - for everything and anything Also includes a few fixes that make the generic LLM function implementation more robust	2023-08-30 16:15:03 +10:00
Sam	7d943be7b2	FIX: automatic bot titles missing sometime (#151 ) This fixes 2 big issues: 1. No matter how hard you try, grounding anthropic title prompt is just too hard. This works around by only looking at the last sentence it returns and treating as title 2. Non English locales would be stuck with "generic" title, this ensures every bot message gets a title, using a custom field to track Also, slightly tunes some anthropic prompts.	2023-08-24 07:20:24 +10:00
Sam	f0e1c72aa7	FEATURE: implement command framework for non Open AI (#147 ) Open AI support function calling, this has a very specific shape that other LLMs have not quite adopted. This simulates a command framework using system prompts on LLMs that are not open AI. Features include: - Smart system prompt to steer the LLM - Parameter validation (we ensure all the params are specified correctly) This is being tested on Anthropic at the moment and intial results are promising.	2023-08-23 07:49:36 +10:00
Sam	b4477ecdcd	FEATURE: support 16k and 32k variants for Azure GPT (#140 ) Azure requires a single HTTP endpoint per type of completion. The settings: `ai_openai_gpt35_16k_url` and `ai_openai_gpt4_32k_url` can be used now to configure the extra endpoints This amends token limit which was off a bit due to function calls and fixes a minor JS issue where we were not testing for a property	2023-08-17 11:00:11 +10:00
Roman Rizzi	b076e43d67	FEATURE: streaming mode for the FoldContent strategy. (#134 )	2023-08-11 15:08:54 -03:00
Rafael dos Santos Silva	eb7fff3a55	FEATURE: Add support for StableBeluga and Upstage Llama2 instruct (#126 ) * FEATURE: Add support for StableBeluga and Upstage Llama2 instruct This means we support all models in the top3 of the Open LLM Leaderboard Since some of those models have RoPE, we now have a setting so you can customize the token limit depending which model you use.	2023-08-03 15:29:30 -03:00
Rafael dos Santos Silva	8b157feea5	FEATURE: Compatibility with protected Hugging Face Endpoints (#123 ) * FEATURE: Compatibility with protected Hugging Face Endpoints	2023-08-02 17:00:00 -03:00
Sam	602bb843ea	FEATURE: add support for final stable diffusion xl model (#122 )	2023-08-02 16:53:28 -03:00
Rafael dos Santos Silva	3e7c99de89	FEATURE: Support for locally infered embeddings in 100 languages (#115 ) * FEATURE: Support for locally infered embeddings in 100 languages * add table	2023-07-27 15:50:03 -03:00
Rafael dos Santos Silva	b25daed60b	FEATURE: Llama2 for summarization (#116 )	2023-07-27 13:55:32 -03:00
Sam	4b0c077ce5	FEATURE: port to use claude-2 for chat bot (#114 ) Claude 1 costs the same and is less good than Claude 2. Make use of Claude 2 in all spots ... This also fixes streaming so it uses the far more efficient streaming protocol.	2023-07-27 11:24:44 +10:00
Rafael dos Santos Silva	5e3f4e1b78	FEATURE: Embeddings to main db (#99 ) * FEATURE: Embeddings to main db This commit moves our embeddings store from an external configurable PostgreSQL instance back into the main database. This is done to simplify the setup. There is a migration that will try to import the external embeddings into the main DB if it is configured and there are rows. It removes support from embeddings models that aren't all_mpnet_base_v2 or OpenAI text_embedding_ada_002. However it will now be easier to add new models. It also now takes into account: - topic title - topic category - topic tags - replies (as much as the model allows) We introduce an interface so we can eventually support multiple strategies for handling long topics. This PR severely damages the semantic search performance, but this is a temporary until we can get adapt HyDE to make semantic search use the same embeddings we have for semantic related with good performance. Here we also have some ground work to add post level embeddings, but this will be added in a future PR. Please note that this PR will also block Discourse from booting / updating if this plugin is installed and the pgvector extension isn't available on the PostgreSQL instance Discourse uses.	2023-07-13 12:41:36 -03:00
Rafael dos Santos Silva	9d10a152b9	FEATURE: Claude 2 for summarization and AIHelper (#101 )	2023-07-13 12:32:08 -03:00
Roman Rizzi	1b568f2391	FIX: Claude's max_tookens_to_sample is a required field (#97 )	2023-06-27 14:42:33 -03:00
Roman Rizzi	9a79afcdbf	DEV: Better strategies for summarization (#88 ) * DEV: Better strategies for summarization The strategy responsibility needs to be "Given a collection of texts, I know how to summarize them most efficiently, using the minimum amount of requests and maximizing token usage". There are different token limits for each model, so it all boils down to two different strategies: Fold all these texts into a single one, doing the summarization in chunks, and then build a summary from those. Build it by combining texts in a single prompt, and truncate it according to your token limits. While the latter is less than ideal, we need it for "bart-large-cnn-samsum" and "flan-t5-base-samsum", both with low limits. The rest will rely on folding. * Expose summarized chunks to users	2023-06-27 12:26:33 -03:00
Sam	d1ab79e82f	FEATURE: Add Azure cognitive service support (#93 ) The new site settings: ai_openai_gpt35_url : distribution for GPT 16k ai_openai_gpt4_url: distribution for GPT 4 ai_openai_embeddings_url: distribution for ada2 If untouched we will simply use OpenAI endpoints. Azure requires 1 URL per model, OpenAI allows a single URL to serve multiple models. Hence the new settings.	2023-06-21 10:39:51 +10:00
Sam	70c158cae1	FEATURE: add full bot support for GPT 3.5 (#87 ) Given latest GPT 3.5 16k which is both better steered and supports functions we can now support rich bot integration. Clunky system message based steering is removed and instead we use the function framework provided by Open AI	2023-06-20 08:45:31 +10:00
Rafael dos Santos Silva	e457c687ca	FIX: OpenAI Tokenizer was failing to truncate mid emojis (#91 ) * FIX: OpenAI Tokenizer was failing to truncate mid emojis * Update spec/shared/tokenizer.rb Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com> --------- Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>	2023-06-16 15:15:36 -03:00
Sam	840968630e	FEATURE: disable smart commands on Claude and GPT 3.5 (#84 ) For the time being smart commands only work consistently on GPT 4. Avoid using any smart commands on the earlier models. Additionally adds better error handling to Claude which sometimes streams partial json and slightly tunes the search command.	2023-06-01 09:10:33 +10:00
Rafael dos Santos Silva	b213fe7f94	FIX: Give up trying to reuse the DB connection and rely on pgbouncer (#79 )	2023-05-23 15:12:59 -03:00
Rafael dos Santos Silva	262ed4753e	FEATURE: Basic StableDiffusion text2img support (#72 )	2023-05-20 09:38:08 +10:00
Rafael dos Santos Silva	739b314312	Fixes for embeddings and truncate (#67 )	2023-05-18 09:21:28 +10:00
Rafael dos Santos Silva	3c9513e754	Refinements to embeddings and tokenizers (#61 ) * Refinements to embeddings and tokenizers * lint * Truncate with tokenizers for summary * fix	2023-05-15 15:10:42 -03:00
Rafael dos Santos Silva	66bf4c74c6	FEATURE: Handle invalid media in NSFW module (#57 ) * FEATURE: Handle invalid media in NSFW module * fix lint	2023-05-11 15:35:39 -03:00
Roman Rizzi	7e3cb0ea16	FEATURE: Multi-model support for the AI Bot module. (#56 ) We'll create one bot user for each available model. When listed in the `ai_bot_enabled_chat_bots` setting, they will reply. This PR lets us use Claude-v1 in stream mode.	2023-05-11 10:03:03 -03:00
Sam	e76fc77189	fixes (#53 ) * Minor... use username suggester in case username already exists * FIX: ensure we truncate long prompts Previously we 1. Used raw length instead of token counts for counting length 2. We totally dropped a prompt if it was too long New implementation will truncate "raw" if it gets too long maintaining meaning.	2023-05-06 07:31:53 -03:00
Roman Rizzi	71b105a1bb	FEATURE: Introduce the ai-bot module (#52 ) This module lets you chat with our GPT bot inside a PM. The bot only replies to members of the groups listed on the ai_bot_allowed_groups setting and only if you invite it to participate in the PM.	2023-05-05 15:28:31 -03:00
Sam	2cd60a4b3b	FEATURE: add a table to audit OpenAI usage (#45 ) Still need to build a job to purge logs	2023-04-26 11:44:29 +10:00
Sam	057fbe1ce6	FEATURE: add internal support for streaming mode (#42 ) Also adds some tests around completions and supports additional params such as top_p, temperature and max_tokens This also migrates off Faraday to using Net::HTTP directly	2023-04-21 16:54:25 +10:00
Rafael dos Santos Silva	9783e3b025	FEATURE: Add a basic tokenizer API (#37 ) * FEATURE: Add a basic tokenizer API * Add tests * lint	2023-04-19 11:55:59 -03:00
Rafael dos Santos Silva	bb0b829634	FEATURE: Anthropic Claude for AIHelper and Summarization modules (#39 )	2023-04-10 11:04:42 -03:00
Rafael dos Santos Silva	5549e4d5b3	FEATURE: Chat channel summarization. (#32 ) * start summary module * chat channel summarization * FEATURE: modal for channel summarization --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2023-04-04 11:24:09 -03:00
Rafael dos Santos Silva	b942a18298	FEATURE: Support for GPT-4 in AI Helper module (#29 )	2023-03-28 23:22:34 -03:00
Roman Rizzi	4c960970fa	DEV: Log information about errors from the completions OpenAI API (#26 )	2023-03-22 16:00:28 -03:00
Sam	1d14f7ffaf	FEATURE: Add a markdown table AI helper (#25 )	2023-03-22 13:16:29 -03:00
Roman Rizzi	39f7f1f29e	FEATURE: Prompts can consist of multiple messages. (#21 ) A prompt with multiple messages leads to better results, as the AI can learn for given examples. Alongside this change, we provide a better default proofreading prompt.	2023-03-21 12:04:59 -03:00
Roman Rizzi	fea9041ee1	DEV: Use 10s timeout when using the completions API (#19 )	2023-03-20 16:43:51 -03:00

1 2

63 Commits