Commit Graph

11 Commits

Author SHA1 Message Date
Sam 8b81ff45b8
FIX: switch off native tools on Anthropic Claude Opus (#659)
Native tools do not work well on Opus.

Chain of Thought prompting means it consumes enormous amounts of
tokens and has poor latency.

This commit introduce and XML stripper to remove various chain of
thought XML islands from anthropic prompts when tools are involved.

This mean Opus native tools is now functions (albeit slowly)

From local testing XML just works better now.

Also fixes enum support in Anthropic native tools
2024-06-07 10:52:01 -03:00
Sam 3993c685e1
FEATURE: anthropic function calling (#654)
Adds support for native tool calling (both streaming and non streaming) for Anthropic.

This improves general tool support on the Anthropic models.
2024-06-06 08:34:23 +10:00
Sam f62703760f
FEATURE: add Claude 3 sonnet/haiku support for Amazon Bedrock (#534)
This PR consolidates the  implements new Anthropic Messages interface for Bedrock Claude endpoints and adds support for the new Claude 3 models (haiku, opus, sonnet).

Key changes:
- Renamed `AnthropicMessages` and `Anthropic` endpoint classes into a single `Anthropic` class (ditto for ClaudeMessages -> Claude)
- Updated `AwsBedrock` endpoints to use the new `/messages` API format for all Claude models
- Added `claude-3-haiku`, `claude-3-opus` and `claude-3-sonnet` model support in both Anthropic and AWS Bedrock endpoints
- Updated specs for the new consolidated endpoints and Claude 3 model support

This refactor removes support for old non messages API which has been deprecated by anthropic
2024-03-19 06:48:46 +11:00
Sam 79638c2f50
FIX: Tune function calling (#519)
Adds support for "name" on functions which can be used for tool calls

For function calls we need to keep track of id/name and previously
we only supported either

Also attempts to improve sql helper
2024-03-09 08:46:40 +11:00
Sam 05d8b021f1
FIX: scrub invalid prompts when truncating (#426)
When you trim a prompt we never want to have a state where there
is a "tool" reply without a corresponding tool call, it makes no
sense

Also

- GPT-4-Turbo is 128k, fix that
- Claude was not preserving username in prompt
- We were throwing away unicode usernames instead of adding to
message
2024-01-16 13:48:00 +11:00
Roman Rizzi 04eae76f68
REFACTOR: Represent generic prompts with an Object. (#416)
* REFACTOR: Represent generic prompts with an Object.

* Adds a bit more validation for clarity

* Rewrite bot title prompt and fix quirk handling

---------

Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
2024-01-12 14:36:44 -03:00
Sam 8df966e9c5
FEATURE: smooth streaming of AI responses on the client (#413)
This PR introduces 3 things:

1. Fake bot that can be used on local so you can test LLMs, to enable on dev use:

SiteSetting.ai_bot_enabled_chat_bots = "fake"

2. More elegant smooth streaming of progress on LLM completion

This leans on JavaScript to buffer and trickle llm results through. It also amends it so the progress dot is much 
more consistently rendered

3. It fixes the Claude dialect 

Claude needs newlines **exactly** at the right spot, amended so it is happy 

---------

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-01-11 15:56:40 +11:00
Roman Rizzi abde82c1f3
FIX: Use claude-2.1 to enable system prompts (#411) 2024-01-09 14:10:20 -03:00
Sam b0a0cbe3ca
FIX: improve bot behavior (#408)
* FIX: improve bot behavior

- Provide more information to Gemini context post function execution
- Use system prompts for Claude (fixes Dall E)
- Ensure Assistant is properly separated
- Teach Claude to return arrays in JSON vs XML

Also refactors tests so we do not copy tool preamble everywhere

* System msg is claude-2 only. fix typo

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-01-08 10:28:03 -03:00
Roman Rizzi e0bf6adb5b
DEV: Tool support for the LLM service. (#366)
This PR adds tool support to available LLMs. We'll buffer tool invocations and return them instead of making users of this service parse the response.

It also adds support for conversation context in the generic prompt. It includes bot messages, user messages, and tool invocations, which we'll trim to make sure it doesn't exceed the prompt limit, then translate them to the correct dialect.

Finally, It adds some buffering when reading chunks to handle cases when streaming is extremely slow.:M
2023-12-18 18:06:01 -03:00
Roman Rizzi 3064d4c288
REFACTOR: Summarization and HyDE now use an LLM abstraction. (#297)
* DEV: One LLM abstraction to rule them all

* REFACTOR: HyDE search uses new LLM abstraction

* REFACTOR: Summarization uses the LLM abstraction

* Updated documentation and made small fixes. Remove Bedrock claude-2 restriction
2023-11-23 12:58:54 -03:00