- Added a new admin interface to track AI usage metrics, including tokens, features, and models.
- Introduced a new route `/admin/plugins/discourse-ai/ai-usage` and supporting API endpoint in `AiUsageController`.
- Implemented `AiUsageSerializer` for structuring AI usage data.
- Integrated CSS stylings for charts and tables under `stylesheets/modules/llms/common/usage.scss`.
- Enhanced backend with `AiApiAuditLog` model changes: added `cached_tokens` column (implemented with OpenAI for now) with relevant DB migration and indexing.
- Created `Report` module for efficient aggregation and filtering of AI usage metrics.
- Updated AI Bot title generation logic to log correctly to user vs bot
- Extended test coverage for the new tracking features, ensuring data consistency and access controls.
* FEATURE: allow mentioning an LLM mid conversation to switch
This is a edgecase feature that allow you to start a conversation
in a PM with LLM1 and then use LLM2 to evaluation or continue
the conversation
* FEATURE: allow auto silencing of spam accounts
New rule can also allow for silencing an account automatically
This can prevent spammers from creating additional posts.
This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes:
1. AI Artifacts System:
- Adds a new `AiArtifact` model and database migration
- Allows creation of web artifacts with HTML, CSS, and JavaScript content
- Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution
- Implements artifact rendering in iframes with sandbox protection
- New `CreateArtifact` tool for AI to generate interactive content
2. Tool System Improvements:
- Adds support for partial tool calls, allowing incremental updates during generation
- Better handling of tool call states and progress tracking
- Improved XML tool processing with CDATA support
- Fixes for tool parameter handling and duplicate invocations
3. LLM Provider Updates:
- Updates for Anthropic Claude models with correct token limits
- Adds support for native/XML tool modes in Gemini integration
- Adds new model configurations including Llama 3.1 models
- Improvements to streaming response handling
4. UI Enhancements:
- New artifact viewer component with expand/collapse functionality
- Security controls for artifact execution (click-to-run in strict mode)
- Improved dialog and response handling
- Better error management for tool execution
5. Security Improvements:
- Sandbox controls for artifact execution
- Public/private artifact sharing controls
- Security settings to control artifact behavior
- CSP and frame-options handling for artifacts
6. Technical Improvements:
- Better post streaming implementation
- Improved error handling in completions
- Better memory management for partial tool calls
- Enhanced testing coverage
7. Configuration:
- New site settings for artifact security
- Extended LLM model configurations
- Additional tool configuration options
This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.
This re-implements tool support in DiscourseAi::Completions::Llm #generate
Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML
New implementation has the endpoints return ToolCall objects.
Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement
decode, decode_chunk (for streaming)
It is the implementers responsibility to figure out how to decode chunks, base no longer implements. To make this easy we ship a flexible json decoder which is easy to wire up.
Also (new)
Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM
Token accounting is fixed for vllm (we were not correctly counting tokens)
Splits persona permissions so you can allow a persona on:
- chat dms
- personal messages
- topic mentions
- chat channels
(any combination is allowed)
Previously we did not have this flexibility.
Additionally, adds the ability to "tether" a language model to a persona so it will always be used by the persona. This allows people to use a cheaper language model for one group of people and more expensive one for other people
This introduces another configuration that allows operators to
limit the amount of interactions with forced tool usage.
Forced tools are very handy in initial llm interactions, but as
conversation progresses they can hinder by slowing down stuff
and adding confusion.
This adds chain halting (ability to terminate llm chain in a tool)
and the ability to create uploads in a tool
Together this lets us integrate custom image generators into a
custom tool.
* FEATURE: allows forced LLM tool use
Sometimes we need to force LLMs to use tools, for example in RAG
like use cases we may want to force an unconditional search.
The new framework allows you backend to force tool usage.
Front end commit to follow
* UI for forcing tools now works, but it does not react right
* fix bugs
* fix tests, this is now ready for review
Previously we waited 1 minute before automatically titling PMs
The new change introduces adding a title immediately after the the
llm replies
Prompt was also modified to include the LLM reply in title suggestion.
This helps situation like:
user: tell me a joke
llm: a very funy joke about horses
Then the title would be "A Funny Horse Joke"
Specs already covered some auto title logic, amended to also
catch the new message bus message we have been sending.
* DEV: Remove old code now that features rely on LlmModels.
* Hide old settings and migrate persona llm overrides
* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
Introduces custom AI tools functionality.
1. Why it was added:
The PR adds the ability to create, manage, and use custom AI tools within the Discourse AI system. This feature allows for more flexibility and extensibility in the AI capabilities of the platform.
2. What it does:
- Introduces a new `AiTool` model for storing custom AI tools
- Adds CRUD (Create, Read, Update, Delete) operations for AI tools
- Implements a tool runner system for executing custom tool scripts
- Integrates custom tools with existing AI personas
- Provides a user interface for managing custom tools in the admin panel
3. Possible use cases:
- Creating custom tools for specific tasks or integrations (stock quotes, currency conversion etc...)
- Allowing administrators to add new functionalities to AI assistants without modifying core code
- Implementing domain-specific tools for particular communities or industries
4. Code structure:
The PR introduces several new files and modifies existing ones:
a. Models:
- `app/models/ai_tool.rb`: Defines the AiTool model
- `app/serializers/ai_custom_tool_serializer.rb`: Serializer for AI tools
b. Controllers:
- `app/controllers/discourse_ai/admin/ai_tools_controller.rb`: Handles CRUD operations for AI tools
c. Views and Components:
- New Ember.js components for tool management in the admin interface
- Updates to existing AI persona management components to support custom tools
d. Core functionality:
- `lib/ai_bot/tool_runner.rb`: Implements the custom tool execution system
- `lib/ai_bot/tools/custom.rb`: Defines the custom tool class
e. Routes and configurations:
- Updates to route configurations to include new AI tool management pages
f. Migrations:
- `db/migrate/20240618080148_create_ai_tools.rb`: Creates the ai_tools table
g. Tests:
- New test files for AI tool functionality and integration
The PR integrates the custom tools system with the existing AI persona framework, allowing personas to use both built-in and custom tools. It also includes safety measures such as timeouts and HTTP request limits to prevent misuse of custom tools.
Overall, this PR significantly enhances the flexibility and extensibility of the Discourse AI system by allowing administrators to create and manage custom AI tools tailored to their specific needs.
Co-authored-by: Martin Brennan <martin@discourse.org>
* DRAFT: Create AI Bot users dynamically and support custom LlmModels
* Get user associated to llm_model
* Track enabled bots with attribute
* Don't store bot username. Minor touches to migrate default values in settings
* Handle scenario where vLLM uses a SRV record
* Made 3.5-turbo-16k the default version so we can remove hack
This is a rather huge refactor with 1 new feature (tool details can
be suppressed)
Previously we use the name "Command" to describe "Tools", this unifies
all the internal language and simplifies the code.
We also amended the persona UI to use less DToggles which aligns
with our design guidelines.
Co-authored-by: Martin Brennan <martin@discourse.org>
When the bot is @mentioned, we need to be a lot more careful
about constructing context otherwise bot gets ultra confused.
This changes multiple things:
1. We were omitting all thread first messages (fixed)
2. Include thread title (if available) in context
3. Construct context in a clearer way separating user request from data
Add support for chat with AI personas
- Allow enabling chat for AI personas that have an associated user
- Add new setting `allow_chat` to AI persona to enable/disable chat
- When a message is created in a DM channel with an allowed AI persona user, schedule a reply job
- AI replies to chat messages using the persona's `max_context_posts` setting to determine context
- Store tool calls and custom prompts used to generate a chat reply on the `ChatMessageCustomPrompt` table
- Add tests for AI chat replies with tools and context
At the moment unlike posts we do not carry tool calls in the context.
No @mention support yet for ai personas in channels, this is future work
This commit adds the ability to enable vision for AI personas, allowing them to understand images that are posted in the conversation.
For personas with vision enabled, any images the user has posted will be resized to be within the configured max_pixels limit, base64 encoded and included in the prompt sent to the AI provider.
The persona editor allows enabling/disabling vision and has a dropdown to select the max supported image size (low, medium, high). Vision is disabled by default.
This initial vision support has been tested and implemented with Anthropic's claude-3 models which accept images in a special format as part of the prompt.
Other integrations will need to be updated to support images.
Several specs were added to test the new functionality at the persona, prompt building and API layers.
- Gemini is omitted, pending API support for Gemini 1.5. Current Gemini bot is not performing well, adding images is unlikely to make it perform any better.
- Open AI is omitted, vision support on GPT-4 it limited in that the API has no tool support when images are enabled so we would need to full back to a different prompting technique, something that would add lots of complexity
---------
Co-authored-by: Martin Brennan <martin@discourse.org>
- Stop replying as bot, when human replies to another human
- Reply as correct persona when replying directly to a persona
- Fix paper cut where suppressing notifications was not doing so
Introduces a new AI Bot persona called 'GitHub Helper' which is specialized in assisting with GitHub-related tasks and questions. It includes the following key changes:
- Implements the GitHub Helper persona class with its system prompt and available tools
- Adds three new AI Bot tools for GitHub interactions:
- github_file_content: Retrieves content of files from a GitHub repository
- github_pull_request_diff: Retrieves the diff for a GitHub pull request
- github_search_code: Searches for code in a GitHub repository
- Updates the AI Bot dialects to support the new GitHub tools
- Implements multiple function calls for standard tool dialect
This provides new support for messages API from Claude.
It is required for latest model access.
Also corrects implementation of function calls.
* Fix message interleving
* fix broken spec
* add new models to automation
- FIX: only update system attributes when updating system persona
- FIX: update participant count by hand so bot messages show in inbox
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
* FIX: support multiple tool calls
Prior to this change we had a hard limit of 1 tool call per llm
round trip. This meant you could not google multiple things at
once or perform searches across two tools.
Also:
- Hint when Google stops working
- Log topic_id / post_id when performing completions
* Also track id for title
* DEV: improve internal design of ai persona and bug fix
- Fixes bug where OpenAI could not describe images
- Fixes bug where mentionable personas could not be mentioned unless overarching bot was enabled
- Improves internal design of playground and bot to allow better for non "bot" users
- Allow PMs directly to persona users (previously bot user would also have to be in PM)
- Simplify internal code
Co-authored-by: Martin Brennan <martin@discourse.org>
1. Personas are now optionally mentionable, meaning that you can mention them either from public topics or PMs
- Mentioning from PMs helps "switch" persona mid conversation, meaning if you want to look up sites setting you can invoke the site setting bot, or if you want to generate an image you can invoke dall e
- Mentioning outside of PMs allows you to inject a bot reply in a topic trivially
- We also add the support for max_context_posts this allow you to limit the amount of context you feed in, which can help control costs
2. Add support for a "random picker" tool that can be used to pick random numbers
3. Clean up routing ai_personas -> ai-personas
4. Add Max Context Posts so users can control how much history a persona can consume (this is important for mentionable personas)
Co-authored-by: Martin Brennan <martin@discourse.org>
Account properly for function calls, don't stream through <details> blocks
- Rush cooked content back to client
- Wait longer (up to 60 seconds) before giving up on streaming
- Clean up message bus channels so we don't have leftover data
- Make ai streamer much more reusable and much easier to read
- If buffer grows quickly, rush update so you are not artificially waiting
- Refine prompt interface
- Fix lost system message when prompt gets long
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
Previous to this change it was very hard to tell if completion was
stuck or not.
This introduces a "dot" that follows the completion and starts
flashing after 5 seconds.
* FIX: don't include <details> in context
We need to be careful adding <details> into context of conversations
it can cause LLMs to hallucinate results
* Fix Gemini multi-turn ctx flattening
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
It also corrects the syntax around tool support, which was wrong.
Gemini doesn't want us to include messages about previous tool invocations, so I had to shuffle around some code to send the response it generated from those invocations instead. For this, I created the "multi_turn" context, which bundles all the context involved in the interaction.
* DEV: AI bot migration to the Llm pattern.
We added tool and conversation context support to the Llm service in discourse-ai#366, meaning we met all the conditions to migrate this module.
This PR migrates to the new pattern, meaning adding a new bot now requires minimal effort as long as the service supports it. On top of this, we introduce the concept of a "Playground" to separate the PM-specific bits from the completion, allowing us to use the bot in other contexts like chat in the future. Commands are called tools, and we simplified all the placeholder logic to perform updates in a single place, making the flow more one-wayish.
* Followup fixes based on testing
* Cleanup unused inference code
* FIX: text-based tools could be in the middle of a sentence
* GPT-4-turbo support
* Use new LLM API