This re-implements tool support in DiscourseAi::Completions::Llm #generate
Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML
New implementation has the endpoints return ToolCall objects.
Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement
decode, decode_chunk (for streaming)
It is the implementers responsibility to figure out how to decode chunks, base no longer implements. To make this easy we ship a flexible json decoder which is easy to wire up.
Also (new)
Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM
Token accounting is fixed for vllm (we were not correctly counting tokens)
* FIX: Llm selector / forced tools / search tool
This fixes a few issues:
1. When search was not finding any semantic results we would break the tool
2. Gemin / Anthropic models did not implement forced tools previously despite it being an API option
3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly.
4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate cause this feature is really rare to need now, people who had it set probably did not need it.
5. Updates anthropic model names to latest release
* linting
* fix a couple of tests I missed
* clean up conditional
* DEV: Remove old code now that features rely on LlmModels.
* Hide old settings and migrate persona llm overrides
* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
Native tools do not work well on Opus.
Chain of Thought prompting means it consumes enormous amounts of
tokens and has poor latency.
This commit introduce and XML stripper to remove various chain of
thought XML islands from anthropic prompts when tools are involved.
This mean Opus native tools is now functions (albeit slowly)
From local testing XML just works better now.
Also fixes enum support in Anthropic native tools
This PR consolidates the implements new Anthropic Messages interface for Bedrock Claude endpoints and adds support for the new Claude 3 models (haiku, opus, sonnet).
Key changes:
- Renamed `AnthropicMessages` and `Anthropic` endpoint classes into a single `Anthropic` class (ditto for ClaudeMessages -> Claude)
- Updated `AwsBedrock` endpoints to use the new `/messages` API format for all Claude models
- Added `claude-3-haiku`, `claude-3-opus` and `claude-3-sonnet` model support in both Anthropic and AWS Bedrock endpoints
- Updated specs for the new consolidated endpoints and Claude 3 model support
This refactor removes support for old non messages API which has been deprecated by anthropic
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
This PR adds tool support to available LLMs. We'll buffer tool invocations and return them instead of making users of this service parse the response.
It also adds support for conversation context in the generic prompt. It includes bot messages, user messages, and tool invocations, which we'll trim to make sure it doesn't exceed the prompt limit, then translate them to the correct dialect.
Finally, It adds some buffering when reading chunks to handle cases when streaming is extremely slow.:M
Previous to this change we relied on explicit loading for a files in Discourse AI.
This had a few downsides:
- Busywork whenever you add a file (an extra require relative)
- We were not keeping to conventions internally ... some places were OpenAI others are OpenAi
- Autoloader did not work which lead to lots of full application broken reloads when developing.
This moves all of DiscourseAI into a Zeitwerk compatible structure.
It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)
To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi
Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections
Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.
* Revert "FIX: We don't need to prepend anthropic. to bedrock models (#308)"
This reverts commit 8a01751991.
* FIX: Bedrock uses slightly different model names
* DEV: One LLM abstraction to rule them all
* REFACTOR: HyDE search uses new LLM abstraction
* REFACTOR: Summarization uses the LLM abstraction
* Updated documentation and made small fixes. Remove Bedrock claude-2 restriction