12 Commits

Author SHA1 Message Date
Sam
ce79a18790
FEATURE: Native PDF support (#1127)
* FEATURE: Native PDF support

This amends it so we use the PDF Reader gem to extract text from PDFs.

* This means that our simple pdf eval passes at last

* fix spec

* skip test in CI

* test file support

* Update lib/utils/image_to_text.rb

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>

* address pr comments

---------

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2025-02-18 09:22:57 +11:00
Sam
5e80f93e4c
FEATURE: PDF support for rag pipeline (#1118)
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:

**1. LLM Model Association for RAG and Personas:**

-   **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM.  Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
-   **Migration:**  Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
-   **Model Changes:**  The `AiPersona` and `AiTool` models now `belongs_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier.  `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
-   **UI Updates:**  The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled).  The RAG options component displays an LLM selector.
-   **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.

**2. PDF and Image Support for RAG:**

-   **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
-   **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled.  Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
-   **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
-   **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
-   **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
-   **UI Updates:**  The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.

**3. Refactoring and Improvements:**

-   **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
-   **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
-   **Bot and Persona Updates:** Several updates were made across the codebase, replacing the string-based association to an LLM with the new model-based one.
-   **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.

**4. Testing:**

-    The PR introduces a new eval system for LLMs, which lets us test how functionality works across various LLM providers. It lives in `/evals`.
2025-02-14 12:15:07 +11:00
Roman Rizzi
f5cf1019fb
FEATURE: configurable embeddings (#1049)
* Use AR model for embeddings features

* endpoints

* Embeddings CRUD UI

* Add presets. Hide a couple more settings

* system specs

* Seed embedding definition from old settings

* Generate search bit index on the fly. cleanup orphaned data

* support for seeded models

* Fix run test for new embedding

* fix selected model not set correctly
2025-01-21 12:23:19 -03:00
Roman Rizzi
eae527f99d
REFACTOR: A Simpler way of interacting with embeddings tables. (#1023)
* REFACTOR: A Simpler way of interacting with embeddings' tables.

This change adds a new abstraction called `Schema`, which acts as a repository that supports the same DB features `VectorRepresentation::Base` has, except that it removes the need for duplicated methods per embeddings table.

It is also a bit more flexible when performing a similarity search because you can pass it a block that gives you access to the builder, allowing you to add multiple joins/where conditions.
2024-12-13 10:15:21 -03:00
Sam
5cbc9190eb
FEATURE: RAG search within tools (#802)
This gives custom tools access to uploads and sophisticated searches using embeddings.

It introduces:

 - A shared front end for listing and uploading files (shared with personas)
 - Backend implementation of the index.search function within a custom tool.

Custom tools may now search through uploaded files:

function invoke(params) {
   return index.search(params.query)
}

This means that RAG implementers may now preload tools with knowledge and have fine-grained control over
the search.

The search function supports:

    specifying max results
    specifying a subset of files to search (from uploads)

Also

 - Improved documentation for tools (when creating a tool, a preamble explains all the functionality)
 - Uploads were a bit finicky; fixed an edge case where the UI would not show them as updated
2024-09-30 17:27:50 +10:00
Sam
03eccbe392
FEATURE: Make tool support polymorphic (#798)
Polymorphic RAG means that we will be able to access RAG fragments both from AiPersona and AiCustomTool

In turn this gives us support for richer RAG implementations.
2024-09-16 08:17:17 +10:00
Roman Rizzi
283445cf81
FIX: RAG uploader must support multi-file indexing. (#592)
Updating the editing model's rag_uploads in the editor component broke multi-file uploading. Instead, we'll keep the uploads in the uploader and update the model when we finish.

This PR also fast-tracks the initial update so we can show feedback to the user quickly, and allows uploading MD files.

Bug reported on https://meta.discourse.org/t/discourse-ai-persona-upload-support/304049/11
2024-04-25 10:48:55 -03:00
Sam
a5e4ab2825
FIX: blank metadata leading to errors (#578)
A blank metadata block in RAG was leading to an error; this handles the edge case.
2024-04-17 13:46:40 +10:00
Sam
f6ac5cd0a8
FEATURE: allow tuning of RAG generation (#565)
* FEATURE: allow tuning of RAG generation

- change chunking to be token based vs char based (which is more accurate)
- allow control over overlap / tokens per chunk and conversation snippets inserted
- UI to control new settings

* improve ui a bit

* fix various reindex issues

* reduce concurrency

* try ultra low queue ... concurrency 1 is too slow.
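
The token-based chunking with overlap described in this commit can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: the method name `chunk_by_tokens` and the parameters `tokens_per_chunk` and `overlap_tokens` are hypothetical, and a plain whitespace split stands in for a real tokenizer.

```ruby
# Hypothetical helper: splits text into overlapping chunks measured in tokens.
# Assumption: a whitespace split stands in for a real tokenizer.
def chunk_by_tokens(text, tokens_per_chunk:, overlap_tokens:)
  if overlap_tokens.negative? || overlap_tokens >= tokens_per_chunk
    raise ArgumentError, "overlap must be >= 0 and smaller than the chunk size"
  end

  tokens = text.split
  step = tokens_per_chunk - overlap_tokens # how far the window advances each time
  chunks = []
  start = 0

  while start < tokens.length
    chunks << tokens[start, tokens_per_chunk].join(" ")
    # Stop once the current window already reaches the end of the text.
    break if start + tokens_per_chunk >= tokens.length
    start += step
  end

  chunks
end
```

Measuring chunks in tokens rather than characters keeps each chunk within a predictable budget for the embedding model, and the overlap repeats the tail of one chunk at the head of the next so that a sentence cut at a boundary still appears whole in at least one chunk.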
2024-04-12 10:32:46 -03:00
Roman Rizzi
aa8918911d
UX: Display the indexing progress for RAG uploads (#557)
2024-04-09 11:03:07 -03:00
Sam
830cc26075
FEATURE: Add metadata support for RAG (#553)
* FEATURE: Add metadata support for RAG

You may include non-indexed metadata in the RAG document by using

[[metadata ....]]

This information is attached to all the text below and provided to
the retriever.

This allows RAG to operate across a rich variety of contexts
without getting lost

Also:

- re-implemented chunking algorithm so it streams
- moved indexing to background low priority queue

* Baran gem no longer required.

* tokenizers is on 4.4 ... upgrade it ...
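
As a rough illustration of the `[[metadata ....]]` behaviour described in this commit, the sketch below pairs each run of text with the most recent metadata marker above it. The method name `split_with_metadata`, the regular expression, and the returned hash shape are all assumptions made for the example; the plugin's actual parser and marker grammar may differ.

```ruby
# Hypothetical parser: attaches the most recent [[metadata ...]] marker to the
# text that follows it. The exact marker grammar is an assumption.
METADATA_MARKER = /\[\[metadata\s+(.*?)\]\]/m

def split_with_metadata(document)
  segments = []
  current_metadata = nil

  # Because the pattern has a capture group, String#split returns text parts at
  # even indexes and the captured metadata strings at odd indexes.
  document.split(METADATA_MARKER).each_with_index do |part, index|
    if index.odd?
      current_metadata = part.strip
    else
      text = part.strip
      segments << { metadata: current_metadata, text: text } unless text.empty?
    end
  end

  segments
end
```

Keeping the metadata separate from the text means only the text needs to be embedded, while the metadata can still be handed to the retriever alongside each fragment.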
2024-04-04 11:02:16 -03:00
Roman Rizzi
1f1c94e5c6
FEATURE: AI Bot RAG support. (#537)
This PR lets you associate uploads to an AI persona, which we'll split and generate embeddings from. When building the system prompt to get a bot reply, we'll do a similarity search followed by a re-ranking (if available). This will let us find the most relevant fragments from the body of knowledge you associated with the persona, resulting in better, more informed responses.

For now, we'll only allow plain-text files, but this will change in the future.
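
The retrieval flow described above (a similarity search, then an optional re-ranking) might look roughly like the sketch below. It is illustrative only: it uses plain cosine similarity over in-memory arrays rather than the plugin's asymmetric search against the database, and the names `retrieve_fragments`, `candidates`, `limit`, and `reranker` are hypothetical.

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Hypothetical retrieval step: shortlist by similarity, then optionally re-rank.
# `fragments` is assumed to be an array of hashes with :text and :embedding keys;
# `reranker` is any callable that takes the shortlist and returns it reordered.
def retrieve_fragments(query_embedding, fragments, candidates: 10, limit: 3, reranker: nil)
  # max_by(n) returns the n best-scoring fragments, highest score first.
  shortlist = fragments.max_by(candidates) do |fragment|
    cosine_similarity(query_embedding, fragment[:embedding])
  end

  shortlist = reranker.call(shortlist) if reranker
  shortlist.first(limit)
end
```

Fetching more candidates than are finally needed gives the re-ranker room to promote fragments that the embedding search scored lower; when no re-ranker is available, the similarity order is used as is.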

Commits:

* FEATURE: RAG embeddings for the AI Bot

This first commit introduces a UI where admins can upload text files, which we'll store, split into fragments,
and generate embeddings from. In a later commit, we'll use those to give the bot additional information during
conversations.

* Basic asymmetric similarity search to provide guidance in system prompt

* Fix tests and lint

* Apply reranker to fragments

* Uploads filter, css adjustments and file validations

* Add placeholder for rag fragments

* Update annotations
2024-04-01 13:43:34 -03:00