FEATURE: AI Bot RAG support. (#537)
This PR lets you associate uploads with an AI persona. We split each upload into fragments and generate embeddings from them. When building the system prompt for a bot reply, we run a similarity search followed by re-ranking (if a reranker is available). This finds the most relevant fragments from the body of knowledge associated with the persona, resulting in better-informed responses.
For now, we'll only allow plain-text files, but this will change in the future.
Commits:
* FEATURE: RAG embeddings for the AI Bot
This first commit introduces a UI where admins can upload text files, which we'll store, split into fragments,
and generate embeddings for. In a later commit, we'll use those to give the bot additional information during
conversations.
* Basic asymmetric similarity search to provide guidance in system prompt
* Fix tests and lint
* Apply reranker to fragments
* Uploads filter, css adjustments and file validations
* Add placeholder for rag fragments
* Update annotations
2024-04-01 13:43:34 -03:00
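
The retrieval flow described in the commit message (similarity search, then optional re-ranking) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the plugin's implementation: `cosine_similarity`, `top_fragments`, the toy embeddings, and the `reranker` callable are all hypothetical names chosen for the example.

```ruby
# Hypothetical sketch: rank stored fragments against a query embedding,
# then optionally re-rank the candidates. Not the plugin's actual code.

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# fragments: array of { text:, embedding: } hashes (toy data structure).
# reranker: optional callable taking (query_text, candidates) and returning
# the candidates in a new order; it is skipped when nil.
def top_fragments(query_embedding, fragments, limit: 2, query_text: nil, reranker: nil)
  candidates =
    fragments
      .sort_by { |f| -cosine_similarity(query_embedding, f[:embedding]) }
      .first(limit)

  reranker ? reranker.call(query_text, candidates) : candidates
end

fragments = [
  { text: "How to reset a password", embedding: [1.0, 0.0, 0.0] },
  { text: "Billing and invoices", embedding: [0.0, 1.0, 0.0] },
  { text: "Password strength rules", embedding: [0.9, 0.1, 0.0] },
]

results = top_fragments([1.0, 0.0, 0.0], fragments, limit: 2)
puts results.map { |f| f[:text] }
```

Keeping the reranker optional mirrors the "if available" wording: when no reranker is supplied, the similarity order is used as-is.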
# frozen_string_literal: true

module ::Jobs
  class DigestRagUpload < ::Jobs::Base
    CHUNK_SIZE = 1024
    CHUNK_OVERLAP = 64
    MAX_FRAGMENTS = 100_000

    # TODO(roman): Add a way to automatically recover from errors that leave uploads unindexed.
    def execute(args)
      return if (upload = Upload.find_by(id: args[:upload_id])).nil?

      target_type = args[:target_type]
      target_id = args[:target_id]

      return if !target_type || !target_id

      target = target_type.constantize.find_by(id: target_id)
      return if !target

      vector_rep = DiscourseAi::Embeddings::Vector.instance

      tokenizer = vector_rep.tokenizer
      chunk_tokens = target.rag_chunk_tokens
      overlap_tokens = target.rag_chunk_overlap_tokens

      fragment_ids = RagDocumentFragment.where(target: target, upload: upload).pluck(:id)

      # Check if this is the first time we process this upload.
      if fragment_ids.empty?
FEATURE: PDF support for rag pipeline (#1118)
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:
**1. LLM Model Association for RAG and Personas:**
- **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM. Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
- **Migration:** Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
- **Model Changes:** The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier. `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
- **UI Updates:** The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled). The RAG options component displays an LLM selector.
- **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.
**2. PDF and Image Support for RAG:**
- **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
- **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled. Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
- **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
- **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
- **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
- **UI Updates:** The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.
**3. Refactoring and Improvements:**
- **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
- **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
- **Bot and Persona Updates:** Several places across the codebase were updated to replace the string-based LLM association with the new model-based one.
- **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.
**4. Testing:**
- The PR introduces a new eval system for LLMs, which lets us test how functionality works across various LLM providers. It lives in `/evals`.
2025-02-14 12:15:07 +11:00
        document = get_uploaded_file(upload: upload, target: target)
        return if document.nil?

        RagDocumentFragment.publish_status(upload, { total: 0, indexed: 0, left: 0 })

        fragment_ids = []
        idx = 0

        ActiveRecord::Base.transaction do
          chunk_document(
            file: document,
            tokenizer: tokenizer,
            chunk_tokens: chunk_tokens,
            overlap_tokens: overlap_tokens,
          ) do |chunk, metadata|
            fragment_ids << RagDocumentFragment.create!(
              target: target,
              fragment: chunk,
              fragment_number: idx + 1,
              upload: upload,
              metadata: metadata,
            ).id

            idx += 1

            if idx > MAX_FRAGMENTS
              Rails.logger.warn("Upload #{upload.id} has too many fragments, truncating.")
              break
            end
          end
        end
      end

      fragment_ids.each_slice(50) do |slice|
        Jobs.enqueue(:generate_rag_embeddings, fragment_ids: slice)
      end
    end

    private

    def chunk_document(file:, tokenizer:, chunk_tokens:, overlap_tokens:)
      buffer = +""
      current_metadata = nil
      done = false
      overlap = ""

      # generally this will be plenty
      read_size = chunk_tokens * 10

      while buffer.present? || !done
        if buffer.length < read_size
          read = file.read(read_size)
          done = true if read.nil?

          read = Encodings.to_utf8(read) if read

          buffer << (read || "")
        end

        # At this point the buffer holds roughly read_size characters or more,
        # unless the file is exhausted.
        metadata_regex = /\[\[metadata (.*?)\]\]/m

        before_metadata, new_metadata, after_metadata = buffer.split(metadata_regex)
        to_chunk = nil

        if before_metadata.present?
          to_chunk = before_metadata
        elsif after_metadata.present?
          current_metadata = new_metadata
          to_chunk = after_metadata
          buffer = buffer.split(metadata_regex, 2).last
          overlap = ""
        else
          current_metadata = new_metadata
          buffer = buffer.split(metadata_regex, 2).last
          overlap = ""
          next
        end

        chunk, split_char = first_chunk(to_chunk, tokenizer: tokenizer, chunk_tokens: chunk_tokens)
        buffer = buffer[chunk.length..-1]

        processed_chunk = overlap + chunk

        processed_chunk.strip!
        processed_chunk.gsub!(/\n[\n]+/, "\n\n")

        yield processed_chunk, current_metadata

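        current_chunk_tokens = tokenizer.encode(chunk)
        # Array#last(n) yields [] when n is 0 and the whole array when it has fewer than n tokens.
        # The previous slice [-overlap_tokens..-1] returned the entire chunk for a zero overlap.
        overlap_token_ids = current_chunk_tokens.last(overlap_tokens)
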
        overlap = ""

        while overlap_token_ids.present?
          begin
            padding = split_char
            padding = " " if padding.empty?
            overlap = tokenizer.decode(overlap_token_ids) + padding
            break if overlap.encoding == Encoding::UTF_8
          rescue StandardError
            # it is possible that we truncated mid char
          end
          overlap_token_ids.shift
        end

        # remove the first word, it is probably truncated
        overlap = overlap.split(/\s/, 2).last.to_s.lstrip
      end
    end

    def first_chunk(text, chunk_tokens:, tokenizer:, splitters: ["\n\n", "\n", ".", ""])
      return text, " " if tokenizer.tokenize(text).length <= chunk_tokens

      splitters = splitters.find_all { |s| text.include?(s) }.compact

      buffer = +""
      split_char = nil

      splitters.each do |splitter|
        split_char = splitter

        text
          .split(split_char)
          .each do |part|
            break if tokenizer.tokenize(buffer + split_char + part).length > chunk_tokens
            buffer << split_char
            buffer << part
          end
        break if buffer.length > 0
      end

      [buffer, split_char]
    end

    def get_uploaded_file(upload:, target:)
      if %w[pdf png jpg jpeg].include?(upload.extension) && !SiteSetting.ai_rag_pdf_images_enabled
        raise Discourse::InvalidAccess.new(
                "The setting ai_rag_pdf_images_enabled is false, can not index images and pdfs.",
              )
      end
      if upload.extension == "pdf"
        pages =
          DiscourseAi::Utils::PdfToImages.new(
            upload: upload,
            user: Discourse.system_user,
          ).uploaded_pages

        return(
          DiscourseAi::Utils::ImageToText.as_fake_file(
            uploads: pages,
            llm_model: target.rag_llm_model,
            user: Discourse.system_user,
          )
        )
      end

      if %w[png jpg jpeg].include?(upload.extension)
        return(
          DiscourseAi::Utils::ImageToText.as_fake_file(
            uploads: [upload],
            llm_model: target.rag_llm_model,
            user: Discourse.system_user,
          )
        )
      end

      store = Discourse.store
      @file ||=
        if store.external?
          # Upload#filesize could be approximate.
          # Add two extra MB to make sure that we'll be able to download the upload.
          max_filesize = upload.filesize + 2.megabytes
          store.download(upload, max_file_size_kb: max_filesize)
        else
          File.open(store.path_for(upload))
        end
    end
  end
end
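
To make the idea of token-bounded chunks with overlap concrete, here is a minimal standalone sketch. It is a sliding-window simplification and not the job's exact approach (the job prepends decoded overlap text to the next chunk and uses a real tokenizer). The method name `chunk_with_overlap` and the whitespace "tokenizer" are assumptions made for the example.

```ruby
# Hypothetical sketch of token-bounded chunking with overlap.
# Tokens are whitespace-separated words here; a real tokenizer would differ.
def chunk_with_overlap(text, chunk_tokens:, overlap_tokens:)
  raise ArgumentError, "overlap must be smaller than chunk size" if overlap_tokens >= chunk_tokens

  tokens = text.split
  chunks = []
  start = 0

  while start < tokens.length
    chunks << tokens[start, chunk_tokens].join(" ")
    # Stop once this window reaches the end of the token list.
    break if start + chunk_tokens >= tokens.length
    # Step forward so the next window repeats the last overlap_tokens tokens.
    start += chunk_tokens - overlap_tokens
  end

  chunks
end

chunks = chunk_with_overlap("a b c d e f g h", chunk_tokens: 4, overlap_tokens: 1)
puts chunks.inspect
```

Each chunk after the first begins with the final token of the previous one, so context that straddles a boundary appears in both fragments.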
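
The `[[metadata ...]]` handling in `chunk_document` relies on how `String#split` treats a capturing group. The short example below illustrates that behavior in isolation; the sample strings are made up for the illustration.

```ruby
# Illustrative only: String#split includes captured groups in its result,
# which is what separates the tag contents from the surrounding text.
metadata_regex = /\[\[metadata (.*?)\]\]/m

sample = "intro text [[metadata title: Guide]] body text"
before, tag, after = sample.split(metadata_regex)

puts before.inspect
puts tag.inspect
puts after.inspect

# With a limit of 2 the text after the first tag stays in one piece,
# even when it contains further tags.
two_tags = "x [[metadata a]] y [[metadata b]] z"
rest = two_tags.split(metadata_regex, 2).last
puts rest.inspect
```

When the text starts with a tag, the first element is an empty string, which is why the job checks `before_metadata.present?` before falling through to the text after the tag.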