34 Commits

Author SHA1 Message Date
Sam
5e80f93e4c
FEATURE: PDF support for rag pipeline (#1118)
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:

**1. LLM Model Association for RAG and Personas:**

-   **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM.  Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
-   **Migration:**  Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
-   **Model Changes:**  The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier.  `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
-   **UI Updates:**  The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled).  The RAG options component displays an LLM selector.
-   **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.

**2. PDF and Image Support for RAG:**

-   **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
-   **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled.  Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
-   **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
-   **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
-   **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
-   **UI Updates:**  The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.

**3. Refactoring and Improvements:**

-   **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
-   **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
-   **Bot and Persona Updates:** Several updates were made across the codebase, changing the string based association to a LLM to the new model based.
-   **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.

**4. Testing:**

-    The PR introduces a new eval system for LLMs, this allows us to test how functionality works across various LLM providers. This lives in `/evals`
2025-02-14 12:15:07 +11:00
Roman Rizzi
e52045ebdc
DEV: Robust check for embeddings enabled (#1116) 2025-02-06 12:18:55 -03:00
Keegan George
13f9a1908a
DEV: Only permit config of allowed seeded models (#1103)
This update resolves a regression that was introduced in https://github.com/discourse/discourse-ai/pull/1036/files. Previously, only seeded models that were allowed could be configured for model settings. However, in our attempts to prevent unreachable LLM errors from not allowing settings to persist, it also unknowingly allowed seeded models that were not allowed to be configured. This update resolves this issue, while maintaining the ability to still set unreachable LLMs.
2025-01-31 13:01:15 -08:00
Roman Rizzi
1572068735
DEV: Improve embedding configs validations (#1101)
Before this change, we let you set the embeddings selected model back to " " even with embeddings enabled. This will leave the site in a broken state.

Additionally, it adds a fail-safe for these scenarios to avoid errors on the topics page.
2025-01-30 14:16:56 -03:00
Roman Rizzi
f5cf1019fb
FEATURE: configurable embeddings (#1049)
* Use AR model for embeddings features

* endpoints

* Embeddings CRUD UI

* Add presets. Hide a couple more settings

* system specs

* Seed embedding definition from old settings

* Generate search bit index on the fly. cleanup orphaned data

* support for seeded models

* Fix run test for new embedding

* fix selected model not set correctly
2025-01-21 12:23:19 -03:00
Keegan George
b480f13a0f
FIX: Prevent LLM enumerator from erroring when spam enabled (#1045)
This PR fixes an issue where LLM enumerator would error out when `SiteSetting.ai_spam_detection = true` but there was no `AiModerationSetting.spam` present.

Typically, we add an `LlmDependencyValidator` for the setting itself, however, since Spam is unique in that it has it's model set in `AiModerationSetting` instead of a `SiteSetting`, we'll add a simple check here to prevent erroring out.
2024-12-27 09:12:29 +11:00
Sam
47ecf86aa1
FIX: embedding validation (#1043) 2024-12-24 09:37:23 +11:00
Roman Rizzi
ceac6e5efb
FIX: Embeddings validator test needs to use the new Vector class. (#1041) 2024-12-23 14:19:22 -03:00
Keegan George
bdb8f1d5e0
FIX: Custom prefix causing allowed seeded LLMs not to be shown (#1039)
* FIX: Custom prefix causing allowed seeded LLMs not to be shown

* DEV: update spec

* not `_map` so should be string not array
2024-12-23 16:42:26 +11:00
Keegan George
059b3fabb8
DEV: Unreachable LLM error shouldn't prevent setting (#1036)
Previously we had the behaviour for model settings so that when you try and set a model, it runs a test and returns an error if it can't run the test successfully. The error then prevents you from setting the site setting.

This results in some issues when we try and automate things. This PR updates that so that the test runs and discreetly logs the changes, but doesn't prevent the setting from being set. Instead we rely on "run test" in the LLM config along with ProblemChecks to catch issues.
2024-12-20 11:52:11 -08:00
Sam
47f5da7e42
FEATURE: Add AI-powered spam detection for new user posts (#1004)
This introduces a comprehensive spam detection system that uses LLM models
to automatically identify and flag potential spam posts. The system is
designed to be both powerful and configurable while preventing false positives.

Key Features:
* Automatically scans first 3 posts from new users (TL0/TL1)
* Creates dedicated AI flagging user to distinguish from system flags
* Tracks false positives/negatives for quality monitoring
* Supports custom instructions to fine-tune detection
* Includes test interface for trying detection on any post

Technical Implementation:
* New database tables:
  - ai_spam_logs: Stores scan history and results
  - ai_moderation_settings: Stores LLM config and custom instructions
* Rate limiting and safeguards:
  - Minimum 10-minute delay between rescans
  - Only scans significant edits (>10 char difference)
  - Maximum 3 scans per post
  - 24-hour maximum age for scannable posts
* Admin UI features:
  - Real-time testing capabilities
  - 7-day statistics dashboard
  - Configurable LLM model selection
  - Custom instruction support

Security and Performance:
* Respects trust levels - only scans TL0/TL1 users
* Skips private messages entirely
* Stops scanning users after 3 successful public posts
* Includes comprehensive test coverage
* Maintains audit log of all scan attempts


---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2024-12-12 09:17:25 +11:00
Sam
a1f859a415
FEATURE: improve visibility of AI usage in LLM page (#845)
This changeset: 

1. Corrects some issues with "force_default_llm" not applying
2. Expands the LLM list page to show LLM usage
3. Clarifies better what "enabling a bot" on an llm means (you get it in the selector)
2024-10-22 11:16:02 +11:00
Rafael dos Santos Silva
792703c942
FEATURE: Discord Bot integration (#831)
This adds support for the a Discord bot that can search in a Discourse instance when invoked via slash commands in Discord Guild channel.
2024-10-16 12:41:18 -03:00
Roman Rizzi
ed97827f49
FIX: Correctly display errors when parent module needs to be disabled first (#788)
* FIX: Correctly display errors when parent module needs to be disabled first

* Update spec/configuration/llm_validator_spec.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

---------

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-08-30 17:16:11 -03:00
Rafael dos Santos Silva
a08d168740
FEATURE: Initial support for seeded LLMs (#756) 2024-08-28 15:57:58 -03:00
Keegan George
f72ab12761
DEV: Clearly separate post/composer helper settings (#747) 2024-08-12 15:40:23 -07:00
Roman Rizzi
bed044448c
DEV: Remove old code now that features rely on LlmModels. (#729)
* DEV: Remove old code now that features rely on LlmModels.

* Hide old settings and migrate persona llm overrides

* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
2024-07-30 13:44:57 -03:00
Roman Rizzi
5c196bca89
FEATURE: Track if a model can do vision in the llm_models table (#725)
* FEATURE: Track if a model can do vision in the llm_models table

* Data migration
2024-07-24 16:29:47 -03:00
Sam
1320eed9b2
FEATURE: move summary to use llm_model (#699)
This allows summary to use the new LLM models and migrates of API key based model selection

Claude 3.5 etc... all work now. 

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-07-04 10:48:18 +10:00
Keegan George
1b0ba9197c
DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
Roman Rizzi
091dc626e8
FIX: Make sure LlmEnumerator always return value hashes using symbols (#684) 2024-06-21 16:35:31 -03:00
Roman Rizzi
8849caf136
DEV: Transition "Select model" settings to only use LlmModels (#675)
We no longer support the "provider:model" format in the "ai_helper_model" and
"ai_embeddings_semantic_search_hyde_model" settings. We'll migrate existing
values and work with our new data-driven LLM configs from now on.
2024-06-19 18:01:35 -03:00
Roman Rizzi
8d5f901a67
DEV: Rewire AI bot internals to use LlmModel (#638)
* DRAFT: Create AI Bot users dynamically and support custom LlmModels

* Get user associated to llm_model

* Track enabled bots with attribute

* Don't store bot username. Minor touches to migrate default values in settings

* Handle scenario where vLLM uses a SRV record

* Made 3.5-turbo-16k the default version so we can remove hack
2024-06-18 14:32:14 -03:00
Sam
b487de933d
FEATURE: add support for all vision models (#646)
Previoulsy on GPT-4-vision was supported, change introduces support
for Google/Anthropic and new OpenAI models

Additionally this makes vision work properly in dev environments
cause we sent the encoded payload via prompt vs sending urls
2024-05-28 10:31:15 -03:00
Roman Rizzi
333b331eb9
FEATURE: Allow deleting custom LLMs. (#643)
This change allows us to delete custom models. It checks if there is no module using them.

It also fixes a bug where the after-create transition wasn't working. While this prevents a model from being saved multiple times, endpoint validations are still needed (will be added in a separate PR).:
2024-05-27 16:44:08 -03:00
Roman Rizzi
1d786fbaaf
FEATURE: Set endpoint credentials directly from LlmModel. (#625)
* FEATURE: Set endpoint credentials directly from LlmModel.

Drop Llama2Tokenizer since we no longer use it.

* Allow http for custom LLMs

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-05-16 09:50:22 -03:00
Sam
8eee6893d6
FEATURE: GPT4o support and better auditing (#618)
- Introduce new support for GPT4o (automation / bot / summary / helper)
- Properly account for token counts on OpenAI models
- Track feature that was used when generating AI completions
- Remove custom llm support for summarization as we need better interfaces to control registration and de-registration
2024-05-14 13:28:46 +10:00
Roman Rizzi
e22194f321
HACK: Llama3 support for summarization/AI helper. (#616)
There are still some limitations to which models we can support with the `LlmModel` class. This will enable support for Llama3 while we sort those out.
2024-05-13 15:54:42 -03:00
Roman Rizzi
62fc7d6ed0
FEATURE: Configurable LLMs. (#606)
This PR introduces the concept of "LlmModel" as a new way to quickly add new LLM models without making any code changes. We are releasing this first version and will add incremental improvements, so expect changes.

The AI Bot can't fully take advantage of this feature as users are hard-coded. We'll fix this in a separate PR.s
2024-05-13 12:46:42 -03:00
Sam
514823daca
FIX: streaming broken in bedrock when chunks are not aligned (#609)
Also

- Stop caching llm list - this cause llm list in persona to be incorrect
- Add more UI to debug screen so you can properly see raw response
2024-05-09 12:11:50 +10:00
Roman Rizzi
fba9c1bf2c
UX: Re-introduce embedding settings validations (#457)
* Revert "Revert "UX: Validate embeddings settings (#455)" (#456)"

This reverts commit 392e2e8aef7d5b0d988b3c3bc5cc19f1d83c4491.

* Resstore previous default
2024-02-01 16:54:09 -03:00
Roman Rizzi
392e2e8aef
Revert "UX: Validate embeddings settings (#455)" (#456)
This reverts commit 85fca89e011933a0479abaf4bf0945983fb948b8.
2024-02-01 14:06:51 -03:00
Roman Rizzi
85fca89e01
UX: Validate embeddings settings (#455) 2024-02-01 13:05:38 -03:00
Roman Rizzi
0634b85a81
UX: Validations to LLM-backed features (except AI Bot) (#436)
* UX: Validations to Llm-backed features (except AI Bot)

This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to explicit which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs.

Validations are:

* You must choose a model before enabling the feature.
* You must turn off the feature before setting the model to blank.
* You must configure each model settings before being able to select it.

* Add provider name to summarization options

* vLLM can technically support same models as HF

* Check we can talk to the selected model

* Check for Bedrock instead of anthropic as a site could have both creds setup
2024-01-29 16:04:25 -03:00