discourse-ai

Commit Graph

Author	SHA1	Message	Date
Sam	8b81ff45b8	FIX: switch off native tools on Anthropic Claude Opus (#659) Native tools do not work well on Opus. Chain of Thought prompting means it consumes enormous amounts of tokens and has poor latency. This commit introduces an XML stripper to remove various chain of thought XML islands from Anthropic prompts when tools are involved. This means Opus native tools now function (albeit slowly). From local testing, XML just works better now. Also fixes enum support in Anthropic native tools.	2024-06-07 10:52:01 -03:00
Sam	3993c685e1	FEATURE: anthropic function calling (#654) Adds support for native tool calling (both streaming and non-streaming) for Anthropic. This improves general tool support on the Anthropic models.	2024-06-06 08:34:23 +10:00
Sam	f62703760f	FEATURE: add Claude 3 sonnet/haiku support for Amazon Bedrock (#534) This PR consolidates the Anthropic endpoints, implements the new Anthropic Messages interface for Bedrock Claude endpoints, and adds support for the new Claude 3 models (haiku, opus, sonnet). Key changes: - Consolidated the `AnthropicMessages` and `Anthropic` endpoint classes into a single `Anthropic` class (ditto for ClaudeMessages -> Claude) - Updated `AwsBedrock` endpoints to use the new `/messages` API format for all Claude models - Added `claude-3-haiku`, `claude-3-opus` and `claude-3-sonnet` model support in both Anthropic and AWS Bedrock endpoints - Updated specs for the new consolidated endpoints and Claude 3 model support. This refactor removes support for the old non-Messages API, which has been deprecated by Anthropic.	2024-03-19 06:48:46 +11:00
Sam	79638c2f50	FIX: Tune function calling (#519) Adds support for "name" on functions, which can be used for tool calls. For function calls we need to keep track of both id and name; previously we supported only one of them. Also attempts to improve the SQL helper.	2024-03-09 08:46:40 +11:00
Sam	05d8b021f1	FIX: scrub invalid prompts when truncating (#426) When you trim a prompt we never want to end up in a state where there is a "tool" reply without a corresponding tool call; it makes no sense. Also: - GPT-4-Turbo is 128k, fix that - Claude was not preserving username in prompt - We were throwing away Unicode usernames instead of adding them to the message	2024-01-16 13:48:00 +11:00
Roman Rizzi	04eae76f68	REFACTOR: Represent generic prompts with an Object. (#416) * REFACTOR: Represent generic prompts with an Object. * Adds a bit more validation for clarity * Rewrite bot title prompt and fix quirk handling --------- Co-authored-by: Sam Saffron <sam.saffron@gmail.com>	2024-01-12 14:36:44 -03:00
Sam	8df966e9c5	FEATURE: smooth streaming of AI responses on the client (#413) This PR introduces 3 things: 1. A fake bot that can be used locally so you can test LLMs. To enable it in dev, use: SiteSetting.ai_bot_enabled_chat_bots = "fake" 2. More elegant, smooth streaming of progress on LLM completion. This leans on JavaScript to buffer and trickle LLM results through. It also amends it so the progress dot is much more consistently rendered. 3. It fixes the Claude dialect. Claude needs newlines in exactly the right spot; amended so it is happy. --------- Co-authored-by: Martin Brennan <martin@discourse.org>	2024-01-11 15:56:40 +11:00
Roman Rizzi	abde82c1f3	FIX: Use claude-2.1 to enable system prompts (#411)	2024-01-09 14:10:20 -03:00
Sam	b0a0cbe3ca	FIX: improve bot behavior (#408) * FIX: improve bot behavior - Provide more information to Gemini context post function execution - Use system prompts for Claude (fixes DALL-E) - Ensure Assistant is properly separated - Teach Claude to return arrays in JSON vs XML. Also refactors tests so we do not copy tool preamble everywhere * System msg is claude-2 only. fix typo --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-01-08 10:28:03 -03:00
Roman Rizzi	e0bf6adb5b	DEV: Tool support for the LLM service. (#366) This PR adds tool support to available LLMs. We'll buffer tool invocations and return them instead of making users of this service parse the response. It also adds support for conversation context in the generic prompt. It includes bot messages, user messages, and tool invocations, which we'll trim to make sure it doesn't exceed the prompt limit, then translate them to the correct dialect. Finally, it adds some buffering when reading chunks to handle cases when streaming is extremely slow.	2023-12-18 18:06:01 -03:00
Roman Rizzi	3064d4c288	REFACTOR: Summarization and HyDE now use an LLM abstraction. (#297) * DEV: One LLM abstraction to rule them all * REFACTOR: HyDE search uses new LLM abstraction * REFACTOR: Summarization uses the LLM abstraction * Updated documentation and made small fixes. Remove Bedrock claude-2 restriction	2023-11-23 12:58:54 -03:00

11 Commits