Commit Graph

105 Commits

Author SHA1 Message Date
Sam bc0657f478
FEATURE: AI Usage page (#964)
- Added a new admin interface to track AI usage metrics, including tokens, features, and models.
- Introduced a new route `/admin/plugins/discourse-ai/ai-usage` and supporting API endpoint in `AiUsageController`.
- Implemented `AiUsageSerializer` for structuring AI usage data.
- Integrated CSS stylings for charts and tables under `stylesheets/modules/llms/common/usage.scss`.
- Enhanced backend with `AiApiAuditLog` model changes: added `cached_tokens` column  (implemented with OpenAI for now) with relevant DB migration and indexing.
- Created `Report` module for efficient aggregation and filtering of AI usage metrics.
- Updated AI Bot title generation logic to log correctly to user vs bot
- Extended test coverage for the new tracking features, ensuring data consistency and access controls.
2024-11-29 06:26:48 +11:00
Keegan George 4da033c667
FIX: Double render error with thumbnail suggestions (#968)
This PR fixes a bug where a double render error appears in the logs when thumbnails are suggested
2024-11-27 15:12:27 -08:00
Sam 86cf4ccba7
FIX: automatically bust cache for share ai assets (#942)
* FIX: automatically bust cache for share ai assets

CDNs can be configured to strip query params in Discourse
hosting. This is generally safe, but in this case we had
no way of busting the cache using the path.

New design properly caches and properly breaks busts the
cache if asset changes so we don't need to worry about versions

* one day I will set up conditional lint on save :)
2024-11-22 11:23:15 +11:00
Sam 52c644798d
DEV: improve artifact presentation (#932)
1. Keep source in a "details" block after rendered so it does
not overwhelm users

2. Ensure artifacts are never indexed by robots

3. Cache break our CSS that changed recently
2024-11-20 18:53:19 +11:00
David Taylor b10be23533
FIX: Ensure artifacts are sandboxed, even when visited directly (#921)
It's important that artifacts are never given 'same origin' access to the forum domain, so that they cannot access cookies, or make authenticated HTTP requests. So even when visiting the URL directly, we need to wrap them in a sandboxed iframe.
2024-11-19 11:44:17 +00:00
Sam 3ae1e4eaf0
FIX: properly bypass CSP for artifacts (#920)
Was meant to be bypassed but was not implemented correctly
2024-11-19 20:25:07 +11:00
Sam 0d7f353284
FEATURE: AI artifacts (#898)
This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes:

1. AI Artifacts System:
   - Adds a new `AiArtifact` model and database migration
   - Allows creation of web artifacts with HTML, CSS, and JavaScript content
   - Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution
   - Implements artifact rendering in iframes with sandbox protection
   - New `CreateArtifact` tool for AI to generate interactive content

2. Tool System Improvements:
   - Adds support for partial tool calls, allowing incremental updates during generation
   - Better handling of tool call states and progress tracking
   - Improved XML tool processing with CDATA support
   - Fixes for tool parameter handling and duplicate invocations

3. LLM Provider Updates:
   - Updates for Anthropic Claude models with correct token limits
   - Adds support for native/XML tool modes in Gemini integration
   - Adds new model configurations including Llama 3.1 models
   - Improvements to streaming response handling

4. UI Enhancements:
   - New artifact viewer component with expand/collapse functionality
   - Security controls for artifact execution (click-to-run in strict mode)
   - Improved dialog and response handling
   - Better error management for tool execution

5. Security Improvements:
   - Sandbox controls for artifact execution
   - Public/private artifact sharing controls
   - Security settings to control artifact behavior
   - CSP and frame-options handling for artifacts

6. Technical Improvements:
   - Better post streaming implementation
   - Improved error handling in completions
   - Better memory management for partial tool calls
   - Enhanced testing coverage

7. Configuration:
   - New site settings for artifact security
   - Extended LLM model configurations
   - Additional tool configuration options

This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.
2024-11-19 09:22:39 +11:00
Sam 823e8ef490
FEATURE: partial tool call support for OpenAI and Anthropic (#908)
Implement streaming tool call implementation for Anthropic and Open AI.

When calling:

llm.generate(..., partial_tool_calls: true) do ...
Partials may contain ToolCall instances with partial: true, These tool calls are partially populated with json partially parsed.

So for example when performing a search you may get:

ToolCall(..., {search: "hello" })
ToolCall(..., {search: "hello world" })

The library used to parse json is:

https://github.com/dgraham/json-stream

We use a fork cause we need access to the internal buffer.

This prepares internals to perform partial tool calls, but does not implement it yet.
2024-11-14 06:58:24 +11:00
Rafael dos Santos Silva aef9a03d4c
FEATURE: Truncate AI Captions to a reasonable max size (#907) 2024-11-12 15:52:46 -03:00
Sam e817b7dc11
FEATURE: improve tool support (#904)
This re-implements tool support in DiscourseAi::Completions::Llm #generate

Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML

New implementation has the endpoints return ToolCall objects.

Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement

decode, decode_chunk (for streaming)

It is the implementers responsibility to figure out how to decode chunks, base no longer implements. To make this easy we ship a flexible json decoder which is easy to wire up.

Also (new)

    Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM
    Token accounting is fixed for vllm (we were not correctly counting tokens)
2024-11-12 08:14:30 +11:00
Sam 98022d7d96
FEATURE: support custom instructions for persona streaming (#890)
This allows us to inject information into the system prompt
which can help shape replies without repeating over and over
in messages.
2024-11-05 07:43:26 +11:00
Sam 34a59b623e
FIX: ensure replies are never double streamed (#879)
The custom field "discourse_ai_bypass_ai_reply" was added so
we can signal the post created hook to bypass replying even
if it thinks it should.

Otherwise there are cases where we double answer user questions
leading to much confusion.

This also slightly refactors code making the controller smaller
2024-10-30 20:24:39 +11:00
Sam be0b78cacd
FEATURE: new endpoint for directly accessing a persona (#876)
The new `/admin/plugins/discourse-ai/ai-personas/stream-reply.json` was added.

This endpoint streams data direct from a persona and can be used
to access a persona from remote systems leaving a paper trail in
PMs about the conversation that happened

This endpoint is only accessible to admins.

---------

Co-authored-by: Gabriel Grubba <70247653+Grubba27@users.noreply.github.com>
Co-authored-by: Keegan George <kgeorge13@gmail.com>
2024-10-30 10:28:20 +11:00
Sam 12869f2146
FIX: testing tool was not showing rag results (#867)
This changeset contains 4 fixes:

1. We were allowing running tests on unsaved tools,
this is problematic cause uploads are not yet associated or indexed
leading to confusing results. We now only show the test button when
tool is saved.


2. We were not properly scoping rag document fragements, this
meant that personas and ai tools could get results from other
unrelated tools, just to be filtered out later


3. index.search showed options as "optional" but implementation
required the second option

4. When testing tools searching through document fragments was
not working at all cause we did not properly load the tool
2024-10-25 16:01:25 +11:00
Rafael dos Santos Silva 96f5f8cbd0
FIX: Basic cleanup of AI Caption to remove line breaks and pipes (#857) 2024-10-23 18:38:29 -03:00
Sam a1f859a415
FEATURE: improve visibility of AI usage in LLM page (#845)
This changeset: 

1. Corrects some issues with "force_default_llm" not applying
2. Expands the LLM list page to show LLM usage
3. Clarifies better what "enabling a bot" on an llm means (you get it in the selector)
2024-10-22 11:16:02 +11:00
Rafael dos Santos Silva 792703c942
FEATURE: Discord Bot integration (#831)
This adds support for the a Discord bot that can search in a Discourse instance when invoked via slash commands in Discord Guild channel.
2024-10-16 12:41:18 -03:00
Sam bdf3b6268b
FEATURE: smarter persona tethering (#832)
Splits persona permissions so you can allow a persona on:

- chat dms
- personal messages
- topic mentions
- chat channels

(any combination is allowed)

Previously we did not have this flexibility.

Additionally, adds the ability to "tether" a language model to a persona so it will always be used by the persona. This allows people to use a cheaper language model for one group of people and more expensive one for other people
2024-10-16 07:20:31 +11:00
Roman Rizzi c7acb4a6a0
REFACTOR: Support of different summarization targets/prompts. (#835)
* DEV: Add summary types

* Refactor for different summary types

* Use enum for summary types

* Update lib/summarization/strategies/topic_summary.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/topic_gist.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/chat_messages.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Fix chat_messages single prompt

* Small tweak to the chat summarization prompt

---------

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-10-15 13:53:26 -03:00
Sam 6c4c96e83c
FEATURE: allow persona to only force tool calls on limited replies (#827)
This introduces another configuration that allows operators to
limit the amount of interactions with forced tool usage.

Forced tools are very handy in initial llm interactions, but as
conversation progresses they can hinder by slowing down stuff
and adding confusion.
2024-10-11 07:23:42 +11:00
Sam 545500b329
FEATURE: allows forced LLM tool use (#818)
* FEATURE: allows forced LLM tool use

Sometimes we need to force LLMs to use tools, for example in RAG
like use cases we may want to force an unconditional search.

The new framework allows you backend to force tool usage.

Front end commit to follow

* UI for forcing tools now works, but it does not react right

* fix bugs

* fix tests, this is now ready for review
2024-10-05 09:46:57 +10:00
Keegan George 110a1629aa
DEV: Update rate limits for image captioning (#816)
This PR updates the rate limits for AI helper so that image caption follows a specific rate limit of 20 requests per minute. This should help when uploading multiple files that need to be captioned. This PR also updates the UI so that it shows toast message with the extracted error message instead of having a blocking `popupAjaxError` error dialog.
---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-10-02 10:36:35 -07:00
Sam 5cbc9190eb
FEATURE: RAG search within tools (#802)
This allows custom tools access to uploads and sophisticated searches using embedding.

It introduces:

 - A shared front end for listing and uploading files (shared with personas)
 -  Backend implementation of index.search function within a custom tool.

Custom tools now may search through uploaded files

function invoke(params) {
   return index.search(params.query)
}

This means that RAG implementers now may preload tools with knowledge and have high fidelity over
the search.

The search function support

    specifying max results
    specifying a subset of files to search (from uploads)

Also

 - Improved documentation for tools (when creating a tool a preamble explains all the functionality)
  - uploads were a bit finicky, fixed an edge case where the UI would not show them as updated
2024-09-30 17:27:50 +10:00
Kris 18ecc843e5
UX: move templates to main LLM config tab, restyle (#813)
Restructures LLM config page so it is far clearer. 

Also corrects bugs around adding LLMs and having LLMs not editable post addition 
---------

Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
2024-09-30 17:15:11 +10:00
Sam 03eccbe392
FEATURE: Make tool support polymorphic (#798)
Polymorphic RAG means that we will be able to access RAG fragments both from AiPersona and AiCustomTool

In turn this gives us support for richer RAG implementations.
2024-09-16 08:17:17 +10:00
Sam a5b5c3bebe
PERF: speed up spec (#794)
~500ms -> ~100ms

It is still not a super fast spec given search is not free, but
it is a bit faster and clearer
2024-09-04 16:14:32 +10:00
Sam cabecb801e
FEATURE: disable rate limiting when skipping hyde (#793)
Embedding search is rate limited due to potentially expensive
hyde operation (which require LLM access).

Embedding generally is very cheap compared to it. (usually 100x cheaper)

This raises the limit to 100 per minute for embedding searches,
while keeping the old 4 per minute for HyDE powered search.
2024-09-04 15:51:01 +10:00
Roman Rizzi e408cd080c
FIX: coerce value before downcasing the hyde param (#787) 2024-08-30 12:13:29 -03:00
Sam eee8e72756
FEATURE: API scope for semantic search (#785)
The new API scope allows restricting access to semantic search
only.
2024-08-30 09:35:20 +10:00
Sam 0687ec75c3
FEATURE: allow embedding based search without hyde (#777)
This allows callers of embedding based search to bypass hyde.

Hyde will expand the search term using an LLM, but if an LLM is
performing the search we can skip this expansion.

It also introduced some tests for the controller which we did not have
2024-08-28 14:17:34 +10:00
Roman Rizzi 64641b6175
FEATURE: LLM Triage support for systemless models. (#757)
* FEATURE: LLM Triage support for systemless models.

This change adds support for OSS models without support for system messages. LlmTriage's system message field is no longer mandatory. We now send the post contents in a separate user message.

* Models using Ollama can also disable system prompts
2024-08-21 11:41:55 -03:00
Sam 14443bf890
FIX: more robust summary implementation (#750)
When navigating between topic we were not correctly resetting
internal state for summarization. This leads to a situation where
incorrect summaries can be displayed to users and wrong summaries
can be displayed.

Additionally our controller for grabbing summaries was always
streaming results via message bus, which could be delayed when
sidekiq is overloaded. We now will return the cached summary
right away if it is available direct from REST endpoint.
2024-08-13 08:47:47 -03:00
Keegan George f72ab12761
DEV: Clearly separate post/composer helper settings (#747) 2024-08-12 15:40:23 -07:00
Roman Rizzi 20efc9285e
FIX: Correctly save provider-specific params for new models. (#744)
Creating a new model, either manually or from presets, doesn't initialize the `provider_params` object, meaning their custom params won't persist.

Additionally, this change adds some validations for Bedrock params, which are mandatory, and a clear message when a completion fails because we cannot build the URL.
2024-08-07 16:08:56 -03:00
Roman Rizzi 7b4c099673
FIX: LlmModel validations. (#742)
- Validate fields to reduce the chance of breaking features by a misconfigured model.
- Fixed a bug where the URL might get deleted during an update.
- Display a warning when a model is currently in use.
2024-08-06 14:35:35 -03:00
Roman Rizzi 5c196bca89
FEATURE: Track if a model can do vision in the llm_models table (#725)
* FEATURE: Track if a model can do vision in the llm_models table

* Data migration
2024-07-24 16:29:47 -03:00
Roman Rizzi f328b81c78
FIX: Make sure custom tool enums follow json-schema. (#718)
Enums didn't work as expected because we the dialect couldn't translate
them correctly. It doesn't understand what "enum_values" is.
2024-07-16 14:23:17 -03:00
Sam 1320eed9b2
FEATURE: move summary to use llm_model (#699)
This allows summary to use the new LLM models and migrates of API key based model selection

Claude 3.5 etc... all work now. 

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-07-04 10:48:18 +10:00
Keegan George 1b0ba9197c
DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
Rafael dos Santos Silva a708d4dfa2
FIX: Use base64 encoded images in AI Image Caption via LLaVa (#693)
* FIX: Use base64 encoded images in AI Image Caption via LLaVa

This fixed a regression introduced in #646 where we started sending
schemaless URLs for our LLaVa service, which doesn't handle it well.

Moving to base64 encoded images solves:

- The service needing to download images
  Now the service running LLaVa doesn't need internet access

- Secure uploads compat
  Every image is treated the same, less branching for secure uploads

- Image Size problems
  Discourse is now responsible for ensure a max size for images

- Troublesome dev env
  Previously to this commit you would need a dev env that was internet
acessible to use llava image captions
2024-06-27 16:24:44 -03:00
Sam b863ddc94b
FEATURE: custom user defined tools (#677)
Introduces custom AI tools functionality. 

1. Why it was added:
   The PR adds the ability to create, manage, and use custom AI tools within the Discourse AI system. This feature allows for more flexibility and extensibility in the AI capabilities of the platform.

2. What it does:
   - Introduces a new `AiTool` model for storing custom AI tools
   - Adds CRUD (Create, Read, Update, Delete) operations for AI tools
   - Implements a tool runner system for executing custom tool scripts
   - Integrates custom tools with existing AI personas
   - Provides a user interface for managing custom tools in the admin panel

3. Possible use cases:
   - Creating custom tools for specific tasks or integrations (stock quotes, currency conversion etc...)
   - Allowing administrators to add new functionalities to AI assistants without modifying core code
   - Implementing domain-specific tools for particular communities or industries

4. Code structure:
   The PR introduces several new files and modifies existing ones:

   a. Models:
      - `app/models/ai_tool.rb`: Defines the AiTool model
      - `app/serializers/ai_custom_tool_serializer.rb`: Serializer for AI tools

   b. Controllers:
      - `app/controllers/discourse_ai/admin/ai_tools_controller.rb`: Handles CRUD operations for AI tools

   c. Views and Components:
      - New Ember.js components for tool management in the admin interface
      - Updates to existing AI persona management components to support custom tools 

   d. Core functionality:
      - `lib/ai_bot/tool_runner.rb`: Implements the custom tool execution system
      - `lib/ai_bot/tools/custom.rb`: Defines the custom tool class

   e. Routes and configurations:
      - Updates to route configurations to include new AI tool management pages

   f. Migrations:
      - `db/migrate/20240618080148_create_ai_tools.rb`: Creates the ai_tools table

   g. Tests:
      - New test files for AI tool functionality and integration

The PR integrates the custom tools system with the existing AI persona framework, allowing personas to use both built-in and custom tools. It also includes safety measures such as timeouts and HTTP request limits to prevent misuse of custom tools.

Overall, this PR significantly enhances the flexibility and extensibility of the Discourse AI system by allowing administrators to create and manage custom AI tools tailored to their specific needs.

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-06-27 17:27:40 +10:00
Roman Rizzi e39e0bdb4a
FIX: Move the bot user toggling to the controller. (#688)
Having this as a callback prevents deploys of sites with a vLLM SRV configured and pending migrations. Additionally, this fixes a bug where we didn't delete/deactivate the companion user after deleting an LLM.
2024-06-25 12:45:19 -03:00
Roman Rizzi f622e2644f
FEATURE: Store provider-specific parameters. (#686)
Previously, we stored request parameters like the OpenAI organization and Bedrock's access key and region as site settings. This change stores them in the `llm_models` table instead, letting us drop more settings while also becoming more flexible.
2024-06-25 08:26:30 +10:00
Roman Rizzi 8849caf136
DEV: Transition "Select model" settings to only use LlmModels (#675)
We no longer support the "provider:model" format in the "ai_helper_model" and
"ai_embeddings_semantic_search_hyde_model" settings. We'll migrate existing
values and work with our new data-driven LLM configs from now on.
2024-06-19 18:01:35 -03:00
Roman Rizzi 8d5f901a67
DEV: Rewire AI bot internals to use LlmModel (#638)
* DRAFT: Create AI Bot users dynamically and support custom LlmModels

* Get user associated to llm_model

* Track enabled bots with attribute

* Don't store bot username. Minor touches to migrate default values in settings

* Handle scenario where vLLM uses a SRV record

* Made 3.5-turbo-16k the default version so we can remove hack
2024-06-18 14:32:14 -03:00
Sam 52a7dd2a4b
FEATURE: optional tool detail blocks (#662)
This is a rather huge refactor with 1 new feature (tool details can
be suppressed)

Previously we use the name "Command" to describe "Tools", this unifies
all the internal language and simplifies the code.

We also amended the persona UI to use less DToggles which aligns
with our design guidelines.

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-06-11 18:14:14 +10:00
Sam 13840f68b3
FEATURE: restrict public sharing on login required sites (#649)
Initial implementation allowed internet wide sharing of
AI conversations, on sites that require login.

This feature can be an anti feature for private sites cause they
can not share conversations internally.

For now we are removing support for public sharing on login required
sites, if the community need the feature we can consider adding a
setting.
2024-05-29 11:04:47 +10:00
Sam b487de933d
FEATURE: add support for all vision models (#646)
Previoulsy on GPT-4-vision was supported, change introduces support
for Google/Anthropic and new OpenAI models

Additionally this makes vision work properly in dev environments
cause we sent the encoded payload via prompt vs sending urls
2024-05-28 10:31:15 -03:00
Roman Rizzi 333b331eb9
FEATURE: Allow deleting custom LLMs. (#643)
This change allows us to delete custom models. It checks if there is no module using them.

It also fixes a bug where the after-create transition wasn't working. While this prevents a model from being saved multiple times, endpoint validations are still needed (will be added in a separate PR).:
2024-05-27 16:44:08 -03:00
Keegan George a1c649965f
FEATURE: Auto image captions (#637) 2024-05-27 10:49:24 -07:00