This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:
**1. LLM Model Association for RAG and Personas:**
- **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM. Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
- **Migration:** Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
- **Model Changes:** The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier. `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
- **UI Updates:** The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled). The RAG options component displays an LLM selector.
- **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.
**2. PDF and Image Support for RAG:**
- **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
- **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled. Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
- **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
- **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
- **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
- **UI Updates:** The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.
**3. Refactoring and Improvements:**
- **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
- **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
- **Bot and Persona Updates:** Several updates were made across the codebase, changing the string based association to a LLM to the new model based.
- **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.
**4. Testing:**
- The PR introduces a new eval system for LLMs, this allows us to test how functionality works across various LLM providers. This lives in `/evals`
* Use AR model for embeddings features
* endpoints
* Embeddings CRUD UI
* Add presets. Hide a couple more settings
* system specs
* Seed embedding definition from old settings
* Generate search bit index on the fly. cleanup orphaned data
* support for seeded models
* Fix run test for new embedding
* fix selected model not set correctly
* FEATURE: smart date support for AI helper
This feature allows conversion of human typed in dates and times
to smart "Discourse" timezone friendly dates.
* fix specs and lint
* lint
* address feedback
* add specs
In a previous refactor, we moved the responsibility of querying and storing embeddings into the `Schema` class. Now, it's time for embedding generation.
The motivation behind these changes is to isolate vector characteristics in simple objects to later replace them with a DB-backed version, similar to what we did with LLM configs.
* REFACTOR: A Simpler way of interacting with embeddings' tables.
This change adds a new abstraction called `Schema`, which acts as a repository that supports the same DB features `VectorRepresentation::Base` has, with the exception that removes the need to have duplicated methods per embeddings table.
It is also a bit more flexible when performing a similarity search because you can pass it a block that gives you access to the builder, allowing you to add multiple joins/where conditions.
This PR fixes an issue where the tag suggester for edit title topic area was suggesting tags that are already assigned on a post. It also updates the amount of suggested tags to 7 so that there is still a decent amount of tags suggested when tags are already assigned.
This PR updates the logic for the location map so it permits only the desired prompts through to the composer/post menu. Anything else won't be shown by default.
This PR also adds relevant tests to prevent regression.
### 🔍 Overview
With the recent changes to allow DiscourseAi in the translator plugin, `detect_text_locale` was needed as a CompletionPrompt. However, it is leaking into composer/post helper menus. This PR ensures we don't not show it in those menus.
* FIX: Use base64 encoded images in AI Image Caption via LLaVa
This fixed a regression introduced in #646 where we started sending
schemaless URLs for our LLaVa service, which doesn't handle it well.
Moving to base64 encoded images solves:
- The service needing to download images
Now the service running LLaVa doesn't need internet access
- Secure uploads compat
Every image is treated the same, less branching for secure uploads
- Image Size problems
Discourse is now responsible for ensure a max size for images
- Troublesome dev env
Previously to this commit you would need a dev env that was internet
acessible to use llava image captions
Previoulsy on GPT-4-vision was supported, change introduces support
for Google/Anthropic and new OpenAI models
Additionally this makes vision work properly in dev environments
cause we sent the encoded payload via prompt vs sending urls
- Introduce new support for GPT4o (automation / bot / summary / helper)
- Properly account for token counts on OpenAI models
- Track feature that was used when generating AI completions
- Remove custom llm support for summarization as we need better interfaces to control registration and de-registration
- Adds support for sd3 and sd3 turbo models - this requires new endpoints
- Adds a hack to normalize arrays in the tool calls
- Removes some leftover code
- Adds support for aspect ratio as well so you can generate wide or tall images
Chat thread replies draft trigger the thread_created event, which we relied on
to trigger the AI generated title. Because of that we now will use the noisier
chat_message_created event, and manually check for thread and replies existence.
See https://github.com/discourse/discourse/pull/26033
* DEV: improve internal design of ai persona and bug fix
- Fixes bug where OpenAI could not describe images
- Fixes bug where mentionable personas could not be mentioned unless overarching bot was enabled
- Improves internal design of playground and bot to allow better for non "bot" users
- Allow PMs directly to persona users (previously bot user would also have to be in PM)
- Simplify internal code
Co-authored-by: Martin Brennan <martin@discourse.org>
* FEATURE: AI helper support in non English languages
This attempts some prompt engineering to coerce AI helper to answer
in the appropriate language.
Note mileage will vary, in testing GPT-4 produces the best results
GPT-3.5 can return OKish results.
* Extend non english support for GPT-4V image caption
* Update db/fixtures/ai_helper/603_completion_prompts.rb
---------
Co-authored-by: Rafael Silva <xfalcox@gmail.com>
This PR adds a new feature where you can generate captions for images in the composer using AI.
---------
Co-authored-by: Rafael Silva <xfalcox@gmail.com>
* FIX: Better AI chat thread titles
- Fix quote removal when multi-line
- Use XML tags for better LLM output parsing
- Use stop_sequences for faster and less wasteful LLM calls
- Adds truncation as the last line of defense
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
* FIX: AI helper not working correctly with mixtral
This PR introduces a new function on the generic llm called #generate
This will replace the implementation of completion!
#generate introduces a new way to pass temperature, max_tokens and stop_sequences
Then LLM implementers need to implement #normalize_model_params to
ensure the generic names match the LLM specific endpoint
This also adds temperature and stop_sequences to completion_prompts
this allows for much more robust completion prompts
* port everything over to #generate
* Fix translation
- On anthropic this no longer throws random "This is your translation:"
- On mixtral this actually works
* fix markdown table generation as well