This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes:
1. AI Artifacts System:
- Adds a new `AiArtifact` model and database migration
- Allows creation of web artifacts with HTML, CSS, and JavaScript content
- Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution
- Implements artifact rendering in iframes with sandbox protection
- New `CreateArtifact` tool for AI to generate interactive content
2. Tool System Improvements:
- Adds support for partial tool calls, allowing incremental updates during generation
- Better handling of tool call states and progress tracking
- Improved XML tool processing with CDATA support
- Fixes for tool parameter handling and duplicate invocations
3. LLM Provider Updates:
- Updates for Anthropic Claude models with correct token limits
- Adds support for native/XML tool modes in Gemini integration
- Adds new model configurations including Llama 3.1 models
- Improvements to streaming response handling
4. UI Enhancements:
- New artifact viewer component with expand/collapse functionality
- Security controls for artifact execution (click-to-run in strict mode)
- Improved dialog and response handling
- Better error management for tool execution
5. Security Improvements:
- Sandbox controls for artifact execution
- Public/private artifact sharing controls
- Security settings to control artifact behavior
- CSP and frame-options handling for artifacts
6. Technical Improvements:
- Better post streaming implementation
- Improved error handling in completions
- Better memory management for partial tool calls
- Enhanced testing coverage
7. Configuration:
- New site settings for artifact security
- Extended LLM model configurations
- Additional tool configuration options
This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.
Native tools do not work well on Opus.
Chain of Thought prompting means it consumes enormous amounts of
tokens and has poor latency.
This commit introduce and XML stripper to remove various chain of
thought XML islands from anthropic prompts when tools are involved.
This mean Opus native tools is now functions (albeit slowly)
From local testing XML just works better now.
Also fixes enum support in Anthropic native tools
Both endpoints provide OpenAI-compatible servers. The only difference is that Vllm doesn't support passing tools as a separate parameter. Even if the tool param is supported, it ultimately relies on the model's ability to handle native functions, which is not the case with the models we have today.
As a part of this change, we are dropping support for StableBeluga/Llama2 models. They don't have a chat_template, meaning the new API can translate them.
These changes let us remove some of our existing dialects and are a first step in our plan to support any LLM by defining them as data-driven concepts.
I rewrote the "translate" method to use a template method and extracted the tool support strategies into its classes to simplify the code.
Finally, these changes bring support for Ollama when running in dev mode. It only works with Mistral for now, but it will change soon..