64 Commits

Author SHA1 Message Date
Keegan George
b675c4c39b
DEV: Remove custom prefix in specs 2025-07-16 11:39:55 -07:00
Keegan George
0fadf1da1a
DEV: update enumerator 2025-07-16 10:56:18 -07:00
Keegan George
2377b286dd
DEV: rely on default llm model in spec 2025-07-15 09:56:24 -07:00
Natalie Tay
d54cd1f602
DEV: Normalize locales that are similar (e.g. en and en_GB) so they do not get translated (#1495)
This commit
- normalizes locales like en_GB and variants to en. With this, the feature will not translate en_GB posts to en (or similarly pt_BR to pt_PT)
- consolidates whether the feature is enabled in `DiscourseAi::Translation.enabled?`
- similarly for backfill in `DiscourseAi::Translation.backfill_enabled?`
  - turns off backfill if `ai_translation_backfill_max_age_days` is 0, staying true to what the setting says. Set it to a high number to backfill everything
2025-07-09 22:21:51 +08:00
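A rough sketch of the base-locale comparison this commit describes; the helper below is illustrative only, not the plugin's exact code:

# Treat locales that share a language prefix as equivalent, so en_GB
# content is never "translated" into en (nor pt_BR into pt_PT).
def self.normalize_locale(locale)
  locale.to_s.split(/[_-]/).first.downcase
end

def self.translatable?(from, to)
  normalize_locale(from) != normalize_locale(to)
end

# translatable?("en_GB", "en")    => false, so no translation is enqueued
# translatable?("pt_BR", "pt_PT") => false
# translatable?("ja", "en")       => true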
Rafael dos Santos Silva
6247906c13
FEATURE: Seamless embedding model upgrades (#1486) 2025-07-04 16:44:03 -03:00
Sam
40fa527633
FIX: cross talk when in ai helper (#1478)
Prior to this change, we reused the same channels for proofreading progress and
AI helper progress.

The new changeset ensures each POST to stream progress gets a dedicated
message bus channel (sketched below).

This fixes a class of issues where the wrong information could be displayed
to end users on subsequent proofreading or helper calls.

* fix tests

* fix implementation (got to subscribe at 0)
2025-07-01 18:02:16 +10:00
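A sketch of the dedicated-channel scheme from the commit above; the channel path, payload shape, and id generation are assumptions:

# Each streaming POST gets its own message bus channel, keyed by a
# per-request id, so concurrent proofread/helper calls cannot cross.
progress_channel = "/discourse-ai/ai-helper/progress/#{SecureRandom.hex(16)}"

MessageBus.publish(progress_channel, { done: false, result: partial_text },
                   user_ids: [current_user.id])

# Per the fix note above, the client subscribes from position 0 so it also
# replays anything published before its subscription was established.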
Natalie Tay
740be26625
DEV: Also make sure locale detection skips PMs that are not group PMs when public content only (#1457)
In the earlier PR https://github.com/discourse/discourse-ai/pull/1432, when `SiteSetting.ai_translation_backfill_limit_to_public_content = false`, we **translate** PMs but **skip translating** PMs that do not involve groups.

This commit covers the missing case on **locale detection**.
2025-06-23 19:07:40 +08:00
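Both this commit and #1432 gate on the same condition; a hedged sketch of that guard (the method name is an assumption):

# Skip personal PMs even when private content may be processed; only PMs
# that involve at least one group qualify.
def self.process_private_topic?(topic)
  return false if SiteSetting.ai_translation_backfill_limit_to_public_content
  topic.allowed_groups.exists?
end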
Natalie Tay
e2d7ca0bb9
DEV: Indicate backfill rate for translations is hourly (#1451)
* DEV: Indicate backfill rate for translations is hourly

* add ai_translation_max_post_length

* default value update
2025-06-21 15:45:09 +08:00
Natalie Tay
3e87e92631
DEV: Remove 'experimental' from translation features (#1439)
* DEV: Remove 'experimental' from translation features

* include compat

* include compat
2025-06-19 12:23:56 +08:00
Natalie Tay
fc83bed7cd
FIX: When allowing private content translation, only translate group PMs and not personal PMs (#1432)
We want to avoid translating PMs that are not group PMs. This condition is applied when `SiteSetting.ai_translation_backfill_limit_to_public_content = false`
2025-06-13 00:55:52 +08:00
Natalie Tay
35d62a659b
FIX: Skip edits if localization exists (#1422)
We will fine-tune updating an outdated localization in the future. For now, we are seeing quick edits happen, and we need to prevent the job from being too trigger-happy.
2025-06-11 11:00:22 +08:00
Natalie Tay
8a3a247b11
DEV: Also detect locale of categories and do not translate if already in the locale (#1413)
Previously I had omitted adding `locale` to the category, since categories tended to be just a single word and it did not seem worth carrying locale information.

Because certain LLMs do poorly at translation, category descriptions got pretty messy. We added locale support in https://github.com/discourse/discourse/pull/32962.

This PR adds automatic locale detection and skips translation when the category is already in the target locale.
2025-06-06 22:41:48 +08:00
Roman Rizzi
0338dbea23 FEATURE: Use different personas to power AI helper features.
You can now edit each AI helper prompt individually through personas, limit access to specific groups, set different LLMs, etc.
2025-06-04 14:23:00 -03:00
Rafael dos Santos Silva
478f31de47
FEATURE: add inferred concepts system (#1330)
* FEATURE: add inferred concepts system

This commit adds a new inferred concepts system that:
- Creates a model for storing concept labels that can be applied to topics
- Provides AI personas for finding new concepts and matching existing ones
- Adds jobs for generating concepts from popular topics
- Includes a scheduled job that automatically processes engaging topics

* FEATURE: Extend inferred concepts to include posts

* Adds support for concepts to be inferred from and applied to posts
* Replaces daily task with one that handles both topics and posts
* Adds database migration for posts_inferred_concepts join table
* Updates PersonaContext to include inferred concepts

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
Co-authored-by: Keegan George <kgeorge13@gmail.com>
2025-06-02 14:29:20 -03:00
Natalie Tay
373e2305d6
FEATURE: Automatic translation and localization of posts, topics, categories (#1376)
Related: https://github.com/discourse/discourse-translator/pull/310

This commit includes all the jobs and event hooks to localize posts, topics, and categories.

A few notes:
- `feature_name: "translation"` because the site setting is `ai-translation` and module is `Translation`
- we will switch to proper ai-feature in the near future, and can consider using the persona_user as `localization.localizer_user_id`
- keeping things flat within the module for now as we will be moving to ai-feature soon and have to rearrange
- Settings renamed/introduced are:
  - ai_translation_backfill_rate (0)
  - ai_translation_backfill_limit_to_public_content (true)
  - ai_translation_backfill_max_age_days (5)
  - ai_translation_verbose_logs (false)
2025-05-29 17:28:06 +08:00
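A minimal sketch of the event-hook-to-job flow this commit wires up; the job name and the enabling check are assumptions:

# plugin.rb-style hook (illustrative): enqueue localization for new posts
# when the translation feature is enabled.
on(:post_created) do |post|
  Jobs.enqueue(:localize_post, post_id: post.id) if SiteSetting.ai_translation_enabled
end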
Sam
cf220c530c
FIX: Improve MessageBus efficiency and correctly stop streaming (#1362)
* FIX: Improve MessageBus efficiency and correctly stop streaming

This commit enhances the message bus implementation for AI helper streaming by:

- Adding client_id targeting for message bus publications to ensure only the requesting client receives streaming updates
- Limiting MessageBus backlog size (2) and age (60 seconds) to prevent Redis bloat
- Replacing clearTimeout with Ember's cancel method for proper runloop management; we were leaking a stop
- Adding tests for client-specific message delivery

These changes improve memory usage and make streaming more reliable by ensuring messages are properly directed to the requesting client.

* composer suggestion needed a fix as well.

* backlog size of 2 is risky here because the same channel name is reused between clients
2025-05-23 16:23:06 +10:00
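The targeted publication described above boils down to options on MessageBus.publish; the backlog values come from the commit text, the surrounding context is assumed:

MessageBus.publish(
  channel,
  payload,
  client_ids: [client_id],  # only the requesting client receives updates
  max_backlog_size: 2,      # cap retained messages to avoid Redis bloat
  max_backlog_age: 60,      # seconds
)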
Roman Rizzi
c0a2d4c935
DEV: Use structured responses for summaries (#1252)
* DEV: Use structured responses for summaries

* Fix system specs

* Make response_format a first class citizen and update endpoints to support it

* Response format can be specified in the persona

* lint

* switch to jsonb and make column nullable

* Reify structured output chunks. Move JSON parsing to the depths of Completion

* Switch to JsonStreamingTracker for partial JSON parsing
2025-05-06 10:09:39 -03:00
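As a rough illustration of a persona-level response format (the shape below is an assumption, not the plugin's exact schema), the persona tells the completion layer which keys to expect so structured chunks can be reified as they stream in:

# Hypothetical: a persona-defined response format that the completion
# layer turns into a structured-output request.
response_format = {
  type: "object",
  properties: { summary: { type: "string" } },
  required: ["summary"],
}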
Rafael dos Santos Silva
4eac377987
DEV: Zero delays on fake endpoint used in tests (#1311) 2025-05-05 17:47:32 -03:00
Keegan George
1300cc8a36
FEATURE: Add streaming to composer helper (#1256)
This update adds streaming to the AI helper inside the composer.
2025-04-14 08:18:50 -07:00
Keegan George
4de39a07e5
FEATURE: Configure persona backed features in admin panel (#1245)
In this feature update, we add the UI for easily configuring persona-backed AI features. The feature will remain hidden until structured responses are complete.
2025-04-10 08:16:31 -07:00
Roman Rizzi
fccd072f44
DEV: Don't use delays for streaming summaries. (#1244)
We started using a callback as a buffer in FoldContent, so the Fake endpoint attempts
to emulate delays in the streaming. However, we don't care about that in these specs.
2025-04-02 13:38:15 -03:00
Roman Rizzi
6765a13a40
FEATURE: Experimental search results from an AI Persona. (#1139)
* FEATURE: Experimental search results from an AI Persona.

When a user searches Discourse, we'll send the query to an AI Persona to provide additional context and enrich the results. The feature depends on the user being a member of a group to which the persona has access.

* Update assets/stylesheets/common/ai-blinking-animation.scss

Co-authored-by: Keegan George <kgeorge13@gmail.com>

---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
2025-02-20 14:37:58 -03:00
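The group dependency mentioned above reduces to a membership check, roughly (names assumed):

# Only enrich search results for users in a group the persona allows.
def ai_search_enabled_for?(user, persona)
  user.present? && user.in_any_groups?(persona.allowed_group_ids)
end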
Sam
ce79a18790
FEATURE: Native PDF support (#1127)
* FEATURE: Native PDF support

This amends it so we use the PDF Reader gem to extract text from PDFs

* This means that our simple pdf eval passes at last

* fix spec

* skip test in CI

* test file support

* Update lib/utils/image_to_text.rb

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>

* address pr comments

---------

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2025-02-18 09:22:57 +11:00
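Text extraction with the PDF Reader gem named in this commit is essentially:

require "pdf-reader"

reader = PDF::Reader.new("document.pdf")
text = reader.pages.map(&:text).join("\n")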
Sam
5e80f93e4c
FEATURE: PDF support for rag pipeline (#1118)
This PR introduces several enhancements and refactorings to the AI Persona and RAG (Retrieval-Augmented Generation) functionalities within the discourse-ai plugin. Here's a breakdown of the changes:

**1. LLM Model Association for RAG and Personas:**

-   **New Database Columns:** Adds `rag_llm_model_id` to both `ai_personas` and `ai_tools` tables. This allows specifying a dedicated LLM for RAG indexing, separate from the persona's primary LLM.  Adds `default_llm_id` and `question_consolidator_llm_id` to `ai_personas`.
-   **Migration:**  Includes a migration (`20250210032345_migrate_persona_to_llm_model_id.rb`) to populate the new `default_llm_id` and `question_consolidator_llm_id` columns in `ai_personas` based on the existing `default_llm` and `question_consolidator_llm` string columns, and a post migration to remove the latter.
-   **Model Changes:**  The `AiPersona` and `AiTool` models now `belong_to` an `LlmModel` via `rag_llm_model_id`. The `LlmModel.proxy` method now accepts an `LlmModel` instance instead of just an identifier.  `AiPersona` now has `default_llm_id` and `question_consolidator_llm_id` attributes.
-   **UI Updates:**  The AI Persona and AI Tool editors in the admin panel now allow selecting an LLM for RAG indexing (if PDF/image support is enabled).  The RAG options component displays an LLM selector.
-   **Serialization:** The serializers (`AiCustomToolSerializer`, `AiCustomToolListSerializer`, `LocalizedAiPersonaSerializer`) have been updated to include the new `rag_llm_model_id`, `default_llm_id` and `question_consolidator_llm_id` attributes.

**2. PDF and Image Support for RAG:**

-   **Site Setting:** Introduces a new hidden site setting, `ai_rag_pdf_images_enabled`, to control whether PDF and image files can be indexed for RAG. This defaults to `false`.
-   **File Upload Validation:** The `RagDocumentFragmentsController` now checks the `ai_rag_pdf_images_enabled` setting and allows PDF, PNG, JPG, and JPEG files if enabled.  Error handling is included for cases where PDF/image indexing is attempted with the setting disabled.
-   **PDF Processing:** Adds a new utility class, `DiscourseAi::Utils::PdfToImages`, which uses ImageMagick (`magick`) to convert PDF pages into individual PNG images. A maximum PDF size and conversion timeout are enforced.
-   **Image Processing:** A new utility class, `DiscourseAi::Utils::ImageToText`, is included to handle OCR for the images and PDFs.
-   **RAG Digestion Job:** The `DigestRagUpload` job now handles PDF and image uploads. It uses `PdfToImages` and `ImageToText` to extract text and create document fragments.
-   **UI Updates:**  The RAG uploader component now accepts PDF and image file types if `ai_rag_pdf_images_enabled` is true. The UI text is adjusted to indicate supported file types.

**3. Refactoring and Improvements:**

-   **LLM Enumeration:** The `DiscourseAi::Configuration::LlmEnumerator` now provides a `values_for_serialization` method, which returns a simplified array of LLM data (id, name, vision_enabled) suitable for use in serializers. This avoids exposing unnecessary details to the frontend.
-   **AI Helper:** The `AiHelper::Assistant` now takes optional `helper_llm` and `image_caption_llm` parameters in its constructor, allowing for greater flexibility.
-   **Bot and Persona Updates:** Several updates were made across the codebase, changing the string-based LLM association to the new model-based one.
-   **Audit Logs:** The `DiscourseAi::Completions::Endpoints::Base` now formats raw request payloads as pretty JSON for easier auditing.
- **Eval Script:** An evaluation script is included.

**4. Testing:**

-   The PR introduces a new eval system for LLMs, which allows us to test how functionality works across various LLM providers. This lives in `/evals`.
2025-02-14 12:15:07 +11:00
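A hedged sketch of the upload validation described in section 2 (constant and method names are illustrative):

# Allow PDF/image uploads for RAG only behind the hidden site setting.
RAG_IMAGE_TYPES = %w[pdf png jpg jpeg]

def validate_rag_upload!(filename)
  ext = File.extname(filename).delete_prefix(".").downcase
  if RAG_IMAGE_TYPES.include?(ext) && !SiteSetting.ai_rag_pdf_images_enabled
    raise Discourse::InvalidParameters.new(:file)
  end
end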
Roman Rizzi
1b1b44353b
FEATURE: Changes to summaries' outdated logic. (#1108)
Before this change, a summary was only considered outdated when new content appeared or, for topics with "best replies", when the query returned different results. The intent behind this change is to detect when a summary becomes outdated as a result of an edit.

Additionally, we are changing the backfill candidates query to compare "ai_summary_backfill_topic_max_age_days" against "last_posted_at" instead of "created_at", to catch long-lived, active topics. This was discussed here: https://meta.discourse.org/t/ai-summarization-backfill-is-stuck-keeps-regenerating-the-same-topic/347088/14?u=roman_rizzi
2025-02-04 09:31:11 -03:00
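The second change amounts to swapping the column the age cutoff compares against; roughly:

# Sketch: long-lived, still-active topics now qualify for backfill.
max_age = SiteSetting.ai_summary_backfill_topic_max_age_days.days.ago
candidates = Topic.where("last_posted_at > ?", max_age) # previously created_at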
Roman Rizzi
f5cf1019fb
FEATURE: configurable embeddings (#1049)
* Use AR model for embeddings features

* endpoints

* Embeddings CRUD UI

* Add presets. Hide a couple more settings

* system specs

* Seed embedding definition from old settings

* Generate search bit index on the fly. cleanup orphaned data

* support for seeded models

* Fix run test for new embedding

* fix selected model not set correctly
2025-01-21 12:23:19 -03:00
Roman Rizzi
46fcdb6ba5
FIX: Make summaries backfill job more resilient. (#1071)
To quickly select backfill candidates without comparing SHAs, we compare the last summarized post to the topic's highest_post_number. However, hiding or deleting a post and adding a small action will update this column, causing the job to stall and re-generate the same summary repeatedly until someone posts a regular reply. On top of this, this is not always true for topics with `best_replies`, as this last reply isn't necessarily included.

Since this is not evident at first glance and each summarization strategy picks its targets differently, I'm opting to simplify the backfill logic and how we track potential candidates.

The first step is dropping `content_range`, which serves no purpose; it's only there because summary caching was originally supposed to work differently. Instead, I'm replacing it with a column called `highest_target_number`, which tracks `highest_post_number` for topics and could track other things, like a channel's `message_count`, in the future.

Now that we have this column, when selecting potential backfill candidates we'll check whether the summary is truly outdated by comparing the SHAs; if it's not, we just update the column and move on.
2025-01-16 09:42:53 -03:00
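A sketch of the simplified check (the column name comes from the commit; the SHA helper is assumed):

def summary_outdated?(summary, topic)
  # cheap check first: nothing new happened at all
  return false if summary.highest_target_number == topic.highest_post_number

  if summary.original_content_sha == content_sha_for(topic)
    # content unchanged (e.g. a hidden post or a small action): just
    # advance the tracking column and move on
    summary.update!(highest_target_number: topic.highest_post_number)
    false
  else
    true
  end
end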
Roman Rizzi
94b85ece80
FIX: Make sure gists are at least five minutes old before updating them (#1029)
* FIX: Make sure gists are at least five minutes old before updating them

* Update app/jobs/regular/fast_track_topic_gist.rb

Co-authored-by: Keegan George <kgeorge13@gmail.com>

---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
2024-12-13 19:36:34 -03:00
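The guard itself is tiny; something like (attribute name assumed):

# Skip the fast-track job for gists younger than five minutes.
return if gist.created_at >= 5.minutes.ago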
Roman Rizzi
1c40a698ca
FIX: get strategy version through vector_rep (#1028) 2024-12-13 18:49:18 -03:00
Roman Rizzi
eae527f99d
REFACTOR: A Simpler way of interacting with embeddings tables. (#1023)
* REFACTOR: A Simpler way of interacting with embeddings' tables.

This change adds a new abstraction called `Schema`, which acts as a repository supporting the same DB features `VectorRepresentation::Base` has, except that it removes the need for duplicated methods per embeddings table.

It is also a bit more flexible when performing a similarity search because you can pass it a block that gives you access to the builder, allowing you to add multiple joins/where conditions.
2024-12-13 10:15:21 -03:00
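Per that description, a similarity search through `Schema` might look like this (the API shape is inferred from the commit text, not verified):

schema = DiscourseAi::Embeddings::Schema.for(Topic)
ids = schema.asymmetric_similarity_search(query_embedding) do |builder|
  builder.join("topics t ON t.id = topic_id")
  builder.where("t.visible")
end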
Roman Rizzi
ce6a2eca21
FEATURE: Backfill posts sentiment. (#982)
* FEATURE: Backfill posts sentiment.

It adds a scheduled job to backfill posts' sentiment, similar to our existing rake task, but with two settings to control the batch size and posts' max-age.

* Make sure model_name order is consistent.
2024-12-03 10:27:03 -03:00
Rafael dos Santos Silva
0ac18d157b
FEATURE: Adjustments to gist summaries (#988)
- makes visible to everyone by default
- backfills gists before full summaries
- adds configurable max age setting to backfill job
2024-12-02 15:22:35 -03:00
Rafael dos Santos Silva
23193ee6f2
FEATURE: Calculate gists from non-hot topics too (#958)
Also renames some settings to remove 'hot' references.
2024-11-26 13:44:12 -03:00
Roman Rizzi
fbc74c7467
FEATURE: Extend summary backfill to also generate gists (#896)
Updates default batch size to 0 and max to 10000
2024-11-07 13:40:18 -03:00
Roman Rizzi
9505a8976c
FEATURE: Automatically backfill regular summaries. (#892)
This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify.

We'll prioritize topics without summary over outdated ones.
2024-11-04 17:48:11 -03:00
Roman Rizzi
a2b1ea3c63
FEATURE: Fast-track gist regeneration when a hot topic gets a new post (#860)
* FEATURE: Fast-track gist regeneration when a hot topic gets a new post

* DEV: Introduce an upsert-like summarize

* FIX: Only enqueue fast-track gist for hot hot hot topics

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-10-25 12:38:49 -03:00
Roman Rizzi
e768fa877e
FIX: Don't regenerate up-to-date gists (#843) 2024-10-18 18:49:01 -03:00
Roman Rizzi
27b5542357
FEATURE: Generate topic gists for the hot topics list. (#837)
* Display gists in the hot topics list

* Adjust hot topics gist strategy and add a job to generate gists

* Replace setting with a configurable batch size

* Avoid loading summaries for other topic lists

* Tweak gist prompt to focus on latest posts in the context of the OP

* Remove serializer hack and rely on core change from discourse/discourse#29291

* Update lib/summarization/strategies/hot_topic_gists.rb

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>

---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2024-10-18 18:01:39 -03:00
Rafael dos Santos Silva
792703c942
FEATURE: Discord Bot integration (#831)
This adds support for a Discord bot that can search a Discourse instance when invoked via slash commands in a Discord guild channel.
2024-10-16 12:41:18 -03:00
Sam
5cbc9190eb
FEATURE: RAG search within tools (#802)
This gives custom tools access to uploads and sophisticated searches using embeddings.

It introduces:

 - A shared front end for listing and uploading files (shared with personas)
 -  Backend implementation of index.search function within a custom tool.

Custom tools may now search through uploaded files:

function invoke(params) {
  return index.search(params.query)
}

This means that RAG implementers may now preload tools with knowledge and retain
high fidelity over the search.

The search function supports:

- specifying max results
- specifying a subset of files to search (from uploads)

Also

- Improved documentation for tools (when creating a tool, a preamble explains all the functionality)
- uploads were a bit finicky; fixed an edge case where the UI would not show them as updated
2024-09-30 17:27:50 +10:00
Sam
03eccbe392
FEATURE: Make tool support polymorphic (#798)
Polymorphic RAG means that we will be able to access RAG fragments from both AiPersona and AiCustomTool.

In turn, this gives us support for richer RAG implementations.
2024-09-16 08:17:17 +10:00
Sam
584753cf60
FIX: we were never reindexing old content (#786)
* FIX: we were never reindexing old content

Embedding backfill contains logic for finding old content that has changed and then backfilling it.

Unfortunately, it unconditionally excluded all topics that already had embeddings, leading to no backfill ever happening (sketched below).

This change adds a test and ensures we backfill.

* over-select results; this ensures we will be more likely to find AI results when filtered
2024-08-30 14:37:55 +10:00
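In scope terms, the bug looks roughly like this (table and column names illustrative):

# Before (broken): any topic with an embedding row was excluded outright,
# so changed topics were never re-indexed.
scope = Topic.where("id NOT IN (SELECT topic_id FROM topic_embeddings)")

# After (conceptually): only exclude topics whose stored content digest
# still matches the current content.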
Keegan George
f72ab12761
DEV: Clearly separate post/composer helper settings (#747) 2024-08-12 15:40:23 -07:00
Keegan George
1d6a6c9f8f
FEATURE: Stream other post helper options (#745) 2024-08-08 11:32:39 -07:00
Sam
1320eed9b2
FEATURE: move summary to use llm_model (#699)
This allows summarization to use the new LLM models and migrates off API-key-based model selection

Claude 3.5 etc... all work now. 

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-07-04 10:48:18 +10:00
Keegan George
1b0ba9197c
DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
Roman Rizzi
8849caf136
DEV: Transition "Select model" settings to only use LlmModels (#675)
We no longer support the "provider:model" format in the "ai_helper_model" and
"ai_embeddings_semantic_search_hyde_model" settings. We'll migrate existing
values and work with our new data-driven LLM configs from now on.
2024-06-19 18:01:35 -03:00
Roman Rizzi
8d5f901a67
DEV: Rewire AI bot internals to use LlmModel (#638)
* DRAFT: Create AI Bot users dynamically and support custom LlmModels

* Get user associated to llm_model

* Track enabled bots with attribute

* Don't store bot username. Minor touches to migrate default values in settings

* Handle scenario where vLLM uses a SRV record

* Made 3.5-turbo-16k the default version so we can remove the hack
2024-06-18 14:32:14 -03:00
Sam
a5e4ab2825
FIX: blank metadata leading to errors (#578)
A blank metadata block in RAG was leading to an error; this handles the edge case.
2024-04-17 13:46:40 +10:00
Sam
f6ac5cd0a8
FEATURE: allow tuning of RAG generation (#565)
* FEATURE: allow tuning of RAG generation

- change chunking to be token-based vs. char-based (which is more accurate)
- allow control over overlap / tokens per chunk and conversation snippets inserted
- UI to control new settings

* improve ui a bit

* fix various reindex issues

* reduce concurrency

* try ultra low queue ... concurrency 1 is too slow.
2024-04-12 10:32:46 -03:00
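An illustrative token-based chunker with overlap, in the spirit of the first bullet (not the plugin's actual implementation):

# Split a token array into overlapping chunks; the overlap preserves
# context across chunk boundaries.
def chunk_tokens(tokens, tokens_per_chunk: 512, overlap: 64)
  step = tokens_per_chunk - overlap
  (0...tokens.length).step(step).map { |i| tokens[i, tokens_per_chunk] }
end

chunk_tokens((1..1000).to_a).length # => 3 chunks of at most 512 tokens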