Sam 5e80f93e4c
FEATURE: PDF support for rag pipeline (#1118)
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:

**1. LLM Model Association for RAG and Personas:**

-   **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM.  Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
-   **Migration:**  Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) that populates the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` from the existing `default_llm` and `question_consolidator_llm` string columns, plus a post-migration that removes the latter.
-   **Model Changes:**  The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier.  `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
-   **UI Updates:**  The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled).  The RAG options component displays an LLM selector.
-   **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.
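To illustrate the data migration's core step, here is a hypothetical sketch (not the actual migration code), assuming the legacy string columns store values like `custom:<id>`; the format and helper name are assumptions:

```ruby
# Hypothetical helper sketching how a legacy "custom:<id>" string could be
# mapped to the new integer llm_model id column. The "custom:" prefix is an
# assumption about the legacy format, not confirmed by this PR description.
def legacy_llm_to_id(value)
  return nil if value.nil? || value.empty?
  prefix, id = value.split(":", 2)
  return nil unless prefix == "custom" && id
  id.to_i
end

legacy_llm_to_id("custom:42") # => 42
legacy_llm_to_id(nil)         # => nil
```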

**2. PDF and Image Support for RAG:**

-   **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
-   **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled.  Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
-   **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
-   **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
-   **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
-   **UI Updates:**  The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.
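The PDF-to-PNG conversion above can be sketched as a single ImageMagick invocation. A minimal sketch of the command construction, assuming a rasterization density and output pattern (the plugin's actual flags are not shown in this PR description):

```ruby
require "shellwords"

# Hypothetical sketch of the ImageMagick command PdfToImages might build.
# The -density value and the page output pattern are assumptions.
def magick_pdf_to_png_command(pdf_path, out_dir, density: 300)
  [
    "magick",
    "-density", density.to_s,             # rasterization resolution (assumed)
    pdf_path,
    File.join(out_dir, "page-%04d.png"),  # one PNG per page
  ].shelljoin
end

puts magick_pdf_to_png_command("doc.pdf", "/tmp/pages")
```

In the plugin, the maximum PDF size and a conversion timeout are enforced around this step; a sketch like the one above would be wrapped in a timed, size-checked shell call.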

**3. Refactoring and Improvements:**

-   **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
-   **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
-   **Bot and Persona Updates:** Several updates across the codebase replace the string-based LLM association with the new model-based one.
-   **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
-   **Eval Script:** An evaluation script is included.
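A minimal sketch of the `values_for_serialization` shape described above, using a Struct stand-in for `LlmModel` (the real method queries the database; only the three listed fields reach the frontend):

```ruby
# Stand-in for LlmModel, purely for illustration.
LlmStub = Struct.new(:id, :name, :vision_enabled, keyword_init: true)

# Sketch of the simplified payload: id, name, and vision_enabled only,
# so serializers never expose the model's other configuration.
def values_for_serialization(models)
  models.map { |m| { id: m.id, name: m.name, vision_enabled: m.vision_enabled } }
end

models = [LlmStub.new(id: 1, name: "claude-3-5-sonnet", vision_enabled: true)]
values_for_serialization(models)
# => [{ id: 1, name: "claude-3-5-sonnet", vision_enabled: true }]
```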

**4. Testing:**

-   The PR introduces a new eval system for LLMs, which lets us test how functionality behaves across various LLM providers. This lives in `/evals`.
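To illustrate, here is a hypothetical eval definition whose keys mirror what `DiscourseAi::Evals::Eval` (shown below) reads from its YAML; the id, prompt, and regex values are invented:

```ruby
require "yaml"

# A hypothetical eval definition. The keys (id, name, description, type,
# vision, args, expected_output_regex) match the Eval initializer; the
# values are made up for illustration.
raw = <<~YAML
  id: summarize-short
  name: Short summarization
  description: Checks that the helper produces a one-line summary
  type: helper
  vision: false
  args:
    name: summarize
    input: "A long post about databases..."
  expected_output_regex: "^.{1,120}$"
YAML

yaml = YAML.safe_load(raw).transform_keys(&:to_sym)
# Eval compiles the regex with MULTILINE, as in the initializer below.
regex = Regexp.new(yaml[:expected_output_regex], Regexp::MULTILINE)
regex.match?("A concise one-line summary.") # => true
```
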
2025-02-14 12:15:07 +11:00


# frozen_string_literal: true
class DiscourseAi::Evals::Eval
  attr_reader :type,
              :path,
              :name,
              :description,
              :id,
              :args,
              :vision,
              :expected_output,
              :expected_output_regex

  def initialize(path:)
    @yaml = YAML.load_file(path).symbolize_keys
    @path = path
    @name = @yaml[:name]
    @id = @yaml[:id]
    @description = @yaml[:description]
    @vision = @yaml[:vision]
    @args = @yaml[:args]&.symbolize_keys
    @type = @yaml[:type]
    @expected_output = @yaml[:expected_output]
    @expected_output_regex = @yaml[:expected_output_regex]
    @expected_output_regex =
      Regexp.new(@expected_output_regex, Regexp::MULTILINE) if @expected_output_regex
    # resolve args[:path] relative to the eval file's own directory
    @args[:path] = File.expand_path(File.join(File.dirname(path), @args[:path])) if @args&.key?(
      :path,
    )
  end
  def run(llm:)
    result =
      case type
      when "helper"
        helper(llm, **args)
      when "pdf_to_text"
        pdf_to_text(llm, **args)
      when "image_to_text"
        image_to_text(llm, **args)
      end

    if expected_output
      if result == expected_output
        { result: :pass }
      else
        { result: :fail, expected_output: expected_output, actual_output: result }
      end
    elsif expected_output_regex
      if result.match?(expected_output_regex)
        { result: :pass }
      else
        { result: :fail, expected_output: expected_output_regex, actual_output: result }
      end
    else
      { result: :unknown, actual_output: result }
    end
  end
  def print
    puts "#{id}: #{description}"
  end

  def to_json
    {
      type: @type,
      path: @path,
      name: @name,
      description: @description,
      id: @id,
      args: @args,
      vision: @vision,
      expected_output: @expected_output,
      expected_output_regex: @expected_output_regex,
    }.compact
  end
  private

  def helper(llm, input:, name:)
    completion_prompt = CompletionPrompt.find_by(name: name)
    helper = DiscourseAi::AiHelper::Assistant.new(helper_llm: llm.llm_proxy)
    result =
      helper.generate_and_send_prompt(
        completion_prompt,
        input,
        current_user = Discourse.system_user,
        _force_default_locale = false,
      )
    result[:suggestions].first
  end

  def image_to_text(llm, path:)
    upload =
      UploadCreator.new(File.open(path), File.basename(path)).create_for(Discourse.system_user.id)

    text = +""
    DiscourseAi::Utils::ImageToText
      .new(upload: upload, llm_model: llm.llm_model, user: Discourse.system_user)
      .extract_text do |chunk, _error|
        if chunk
          text << chunk
          text << "\n\n"
        end
      end
    text
  ensure
    upload.destroy if upload
  end
  def pdf_to_text(llm, path:)
    upload =
      UploadCreator.new(File.open(path), File.basename(path)).create_for(Discourse.system_user.id)
    uploads =
      DiscourseAi::Utils::PdfToImages.new(
        upload: upload,
        user: Discourse.system_user,
      ).uploaded_pages

    text = +""
    uploads.each do |page_upload|
      DiscourseAi::Utils::ImageToText
        .new(upload: page_upload, llm_model: llm.llm_model, user: Discourse.system_user)
        .extract_text do |chunk, _error|
          if chunk
            text << chunk
            text << "\n\n"
          end
        end
      # destroy each per-page upload once its text is extracted; the
      # original PDF upload is cleaned up in the ensure block
      page_upload.destroy
    end
    text
  ensure
    upload.destroy if upload
  end
end