discourse-ai

mirror of https://github.com/discourse/discourse-ai.git synced 2025-02-19 18:04:51 +00:00

Author	SHA1	Message	Date
Sam	e817b7dc11	FEATURE: improve tool support (#904 ) This re-implements tool support in DiscourseAi::Completions::Llm #generate. Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML. New implementation has the endpoints return ToolCall objects. Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement decode, decode_chunk (for streaming). It is the implementer's responsibility to figure out how to decode chunks, the base class no longer does this. To make this easy we ship a flexible json decoder which is easy to wire up. Also (new): Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM. Token accounting is fixed for vllm (we were not correctly counting tokens)	2024-11-12 08:14:30 +11:00
Sam	4923837165	FIX: Llm selector / forced tools / search tool (#862 ) * FIX: Llm selector / forced tools / search tool This fixes a few issues: 1. When search was not finding any semantic results we would break the tool. 2. Gemini / Anthropic models did not implement forced tools previously despite it being an API option. 3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly. 4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate because this feature is really rare to need now, people who had it set probably did not need it. 5. Updates anthropic model names to latest release * linting * fix a couple of tests I missed * clean up conditional	2024-10-25 06:24:53 +11:00
Roman Rizzi	20efc9285e	FIX: Correctly save provider-specific params for new models. (#744 ) Creating a new model, either manually or from presets, doesn't initialize the `provider_params` object, meaning their custom params won't persist. Additionally, this change adds some validations for Bedrock params, which are mandatory, and a clear message when a completion fails because we cannot build the URL.	2024-08-07 16:08:56 -03:00
Roman Rizzi	bed044448c	DEV: Remove old code now that features rely on LlmModels. (#729 ) * DEV: Remove old code now that features rely on LlmModels. * Hide old settings and migrate persona llm overrides * Remove shadowing special URL + seeding code. Use srv:// prefix instead.	2024-07-30 13:44:57 -03:00
Roman Rizzi	f622e2644f	FEATURE: Store provider-specific parameters. (#686 ) Previously, we stored request parameters like the OpenAI organization and Bedrock's access key and region as site settings. This change stores them in the `llm_models` table instead, letting us drop more settings while also becoming more flexible.	2024-06-25 08:26:30 +10:00
Sam	e04a7be122	FEATURE: LLM presets for model creation (#681 ) * FEATURE: LLM presets for model creation Previous to this users needed to look up complicated settings when setting up models. This introduces an extensible preset system with Google/OpenAI/Anthropic presets. This will cover all the most common LLMs, we can always add more as we go. Additionally: - Proper support for Anthropic Claude Sonnet 3.5 - Stop blurring api keys when navigating away - this made it very complex to reuse keys	2024-06-21 17:32:15 +10:00
Rafael dos Santos Silva	714caf34fe	FEATURE: Support for Claude 3.5 Sonnet via AWS Bedrock (#680 )	2024-06-20 17:51:46 -03:00
Sam	8b81ff45b8	FIX: switch off native tools on Anthropic Claude Opus (#659 ) Native tools do not work well on Opus. Chain of Thought prompting means it consumes enormous amounts of tokens and has poor latency. This commit introduces an XML stripper to remove various chain of thought XML islands from anthropic prompts when tools are involved. This means Opus native tools are now functional (albeit slowly). From local testing XML just works better now. Also fixes enum support in Anthropic native tools	2024-06-07 10:52:01 -03:00
Sam	3993c685e1	FEATURE: anthropic function calling (#654 ) Adds support for native tool calling (both streaming and non streaming) for Anthropic. This improves general tool support on the Anthropic models.	2024-06-06 08:34:23 +10:00
Roman Rizzi	1d786fbaaf	FEATURE: Set endpoint credentials directly from LlmModel. (#625 ) * FEATURE: Set endpoint credentials directly from LlmModel. Drop Llama2Tokenizer since we no longer use it. * Allow http for custom LLMs --------- Co-authored-by: Rafael Silva <xfalcox@gmail.com>	2024-05-16 09:50:22 -03:00
Roman Rizzi	e22194f321	HACK: Llama3 support for summarization/AI helper. (#616 ) There are still some limitations to which models we can support with the `LlmModel` class. This will enable support for Llama3 while we sort those out.	2024-05-13 15:54:42 -03:00
Sam	0069256efd	FIX: improve function call parsing (#613 ) - support " / ' wrapped values - coerce integer to integer - enforce enum at boundary	2024-05-13 19:40:11 +10:00
Sam	514823daca	FIX: streaming broken in bedrock when chunks are not aligned (#609 ) Also - Stop caching llm list - this caused the llm list in persona to be incorrect - Add more UI to debug screen so you can properly see raw response	2024-05-09 12:11:50 +10:00
Roman Rizzi	4f1a3effe0	REFACTOR: Migrate Vllm/TGI-served models to the OpenAI format. (#588 ) Both endpoints provide OpenAI-compatible servers. The only difference is that Vllm doesn't support passing tools as a separate parameter. Even if the tool param is supported, it ultimately relies on the model's ability to handle native functions, which is not the case with the models we have today. As a part of this change, we are dropping support for StableBeluga/Llama2 models. They don't have a chat_template, meaning the new API can't translate them. These changes let us remove some of our existing dialects and are a first step in our plan to support any LLM by defining them as data-driven concepts. I rewrote the "translate" method to use a template method and extracted the tool support strategies into their own classes to simplify the code. Finally, these changes bring support for Ollama when running in dev mode. It only works with Mistral for now, but it will change soon.	2024-05-07 10:02:16 -03:00
Sam	50be66ee63	FEATURE: Gemini 1.5 pro support and Claude Opus bedrock support (#580 ) - Updated AI Bot to only support Gemini 1.5 (used to support 1.0) - 1.0 was removed because it is not appropriate for Bot usage - Summaries and automation can now lean on Gemini 1.5 pro - Amazon added support for Claude 3 Opus, added internal support for it on bedrock	2024-04-17 15:37:19 +10:00
Sam	f62703760f	FEATURE: add Claude 3 sonnet/haiku support for Amazon Bedrock (#534 ) This PR implements the new Anthropic Messages interface for Bedrock Claude endpoints and adds support for the new Claude 3 models (haiku, opus, sonnet). Key changes: - Merged `AnthropicMessages` and `Anthropic` endpoint classes into a single `Anthropic` class (ditto for ClaudeMessages -> Claude) - Updated `AwsBedrock` endpoints to use the new `/messages` API format for all Claude models - Added `claude-3-haiku`, `claude-3-opus` and `claude-3-sonnet` model support in both Anthropic and AWS Bedrock endpoints - Updated specs for the new consolidated endpoints and Claude 3 model support This refactor removes support for the old non-messages API, which has been deprecated by anthropic	2024-03-19 06:48:46 +11:00
Sam	cec4251b00	DEV: improve error bedrock error messages (#454 ) When bedrock rate limits it returns a 200 BUT also returns a JSON document with the error. Previously we had no special case here so we complained about nil. New code properly logs the problem	2024-02-01 08:01:07 -03:00
Roman Rizzi	0634b85a81	UX: Validations to LLM-backed features (except AI Bot) (#436 ) * UX: Validations to Llm-backed features (except AI Bot) This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to be explicit about which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs. Validations are: * You must choose a model before enabling the feature. * You must turn off the feature before setting the model to blank. * You must configure each model's settings before being able to select it. * Add provider name to summarization options * vLLM can technically support same models as HF * Check we can talk to the selected model * Check for Bedrock instead of anthropic as a site could have both creds setup	2024-01-29 16:04:25 -03:00
Roman Rizzi	abde82c1f3	FIX: Use claude-2.1 to enable system prompts (#411 )	2024-01-09 14:10:20 -03:00
Roman Rizzi	f9d7d7f5f0	DEV: AI bot migration to the Llm pattern. (#343 ) * DEV: AI bot migration to the Llm pattern. We added tool and conversation context support to the Llm service in discourse-ai#366, meaning we met all the conditions to migrate this module. This PR migrates to the new pattern, meaning adding a new bot now requires minimal effort as long as the service supports it. On top of this, we introduce the concept of a "Playground" to separate the PM-specific bits from the completion, allowing us to use the bot in other contexts like chat in the future. Commands are called tools, and we simplified all the placeholder logic to perform updates in a single place, making the flow more one-wayish. * Followup fixes based on testing * Cleanup unused inference code * FIX: text-based tools could be in the middle of a sentence * GPT-4-turbo support * Use new LLM API	2024-01-04 10:44:07 -03:00
Sam	03fc94684b	FIX: AI helper not working correctly with mixtral (#399 ) * FIX: AI helper not working correctly with mixtral This PR introduces a new function on the generic llm called #generate. This will replace the implementation of completion!; #generate introduces a new way to pass temperature, max_tokens and stop_sequences. LLM implementers then need to implement #normalize_model_params to ensure the generic names match the LLM specific endpoint. This also adds temperature and stop_sequences to completion_prompts, which allows for much more robust completion prompts * port everything over to #generate * Fix translation - On anthropic this no longer throws random "This is your translation:" - On mixtral this actually works * fix markdown table generation as well	2024-01-04 09:53:47 -03:00
Roman Rizzi	4182af230a	FIX: Correctly translate and read tools for Claude and Chat GPT. (#393 ) I tested against the live models for the AI bot migration. It ensures Open AI's tool syntax is correct and we can correctly read the replies.	2024-01-02 11:21:13 -03:00
Roman Rizzi	e0bf6adb5b	DEV: Tool support for the LLM service. (#366 ) This PR adds tool support to available LLMs. We'll buffer tool invocations and return them instead of making users of this service parse the response. It also adds support for conversation context in the generic prompt. It includes bot messages, user messages, and tool invocations, which we'll trim to make sure it doesn't exceed the prompt limit, then translate them to the correct dialect. Finally, it adds some buffering when reading chunks to handle cases when streaming is extremely slow.	2023-12-18 18:06:01 -03:00
Roman Rizzi	203906be65	FIX: Bedrock was complaining input was too long (#365 )	2023-12-18 16:06:06 -03:00
Roman Rizzi	031c2a6b46	Revert "FIX: Recover from Bedrock returning invalid base64 payloads during streaming (#352 )" (#353 ) This reverts commit ef7d4cc5090e54491ab00d5bdd0ef3ad85c499de.	2023-12-12 17:22:44 -03:00
Roman Rizzi	ef7d4cc509	FIX: Recover from Bedrock returning invalid base64 payloads during streaming (#352 )	2023-12-12 17:06:53 -03:00
Roman Rizzi	419c43592a	FIX: Make summaries more cohesive by tweaking prompt. (#310 ) Other changes: - Don't use Bedrock for non claude models if credentials are set. - Remove extra sentence from HyDE prompt.	2023-11-23 16:33:37 -03:00
Roman Rizzi	02efca162e	FIX: Bedrock uses slightly different model names * Revert "FIX: We don't need to prepend anthropic. to bedrock models (#308)" This reverts commit 8a01751991178f7636030eb99e7f75c035707ffd. * FIX: Bedrock uses slightly different model names	2023-11-23 15:49:24 -03:00
Roman Rizzi	8a01751991	FIX: We don't need to prepend anthropic. to bedrock models (#308 )	2023-11-23 14:39:21 -03:00
Roman Rizzi	3064d4c288	REFACTOR: Summarization and HyDE now use an LLM abstraction. (#297 ) * DEV: One LLM abstraction to rule them all * REFACTOR: HyDE search uses new LLM abstraction * REFACTOR: Summarization uses the LLM abstraction * Updated documentation and made small fixes. Remove Bedrock claude-2 restriction	2023-11-23 12:58:54 -03:00

30 Commits