discourse-ai

Commit Graph

Author	SHA1	Message	Date
Roman Rizzi	0abd4b1244	FIX: Sentiment classification results needs to be transformed before saving (#983 )	2024-11-29 17:31:56 -03:00
Sam	bc0657f478	FEATURE: AI Usage page (#964 ) - Added a new admin interface to track AI usage metrics, including tokens, features, and models. - Introduced a new route `/admin/plugins/discourse-ai/ai-usage` and supporting API endpoint in `AiUsageController`. - Implemented `AiUsageSerializer` for structuring AI usage data. - Integrated CSS stylings for charts and tables under `stylesheets/modules/llms/common/usage.scss`. - Enhanced backend with `AiApiAuditLog` model changes: added `cached_tokens` column (implemented with OpenAI for now) with relevant DB migration and indexing. - Created `Report` module for efficient aggregation and filtering of AI usage metrics. - Updated AI Bot title generation logic to log correctly to user vs bot - Extended test coverage for the new tracking features, ensuring data consistency and access controls.	2024-11-29 06:26:48 +11:00
Rafael dos Santos Silva	23193ee6f2	FEATURE: Calculate gists from non hot topics too (#958 ) Also renames some settings to remove 'hot' references.	2024-11-26 13:44:12 -03:00
Roman Rizzi	95762723de	PERF: Preload only gists when including summaries in topic list (#948 ) * PERF: Preload only gists when including summaries in topic list * Add unique index on summaries and dedup existing records * Make hot topics batch size setting hidden	2024-11-25 12:24:02 -03:00
Natalie Tay	f8231d259b	FEATURE: Add locale detection prompt from translator (#946 )	2024-11-25 08:33:54 +11:00
Sam	0d7f353284	FEATURE: AI artifacts (#898 ) This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes: 1. AI Artifacts System: - Adds a new `AiArtifact` model and database migration - Allows creation of web artifacts with HTML, CSS, and JavaScript content - Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution - Implements artifact rendering in iframes with sandbox protection - New `CreateArtifact` tool for AI to generate interactive content 2. Tool System Improvements: - Adds support for partial tool calls, allowing incremental updates during generation - Better handling of tool call states and progress tracking - Improved XML tool processing with CDATA support - Fixes for tool parameter handling and duplicate invocations 3. LLM Provider Updates: - Updates for Anthropic Claude models with correct token limits - Adds support for native/XML tool modes in Gemini integration - Adds new model configurations including Llama 3.1 models - Improvements to streaming response handling 4. UI Enhancements: - New artifact viewer component with expand/collapse functionality - Security controls for artifact execution (click-to-run in strict mode) - Improved dialog and response handling - Better error management for tool execution 5. Security Improvements: - Sandbox controls for artifact execution - Public/private artifact sharing controls - Security settings to control artifact behavior - CSP and frame-options handling for artifacts 6. Technical Improvements: - Better post streaming implementation - Improved error handling in completions - Better memory management for partial tool calls - Enhanced testing coverage 7. Configuration: - New site settings for artifact security - Extended LLM model configurations - Additional tool configuration options This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.	2024-11-19 09:22:39 +11:00
Roman Rizzi	9505a8976c	FEATURE: Automatically backfill regular summaries. (#892 ) This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify. We'll prioritize topics without summary over outdated ones.	2024-11-04 17:48:11 -03:00
Rafael dos Santos Silva	772ee934ab	Migrate sentiment to a TEI backend (#886 )	2024-11-04 09:14:34 -03:00
Sam	be0b78cacd	FEATURE: new endpoint for directly accessing a persona (#876 ) The new `/admin/plugins/discourse-ai/ai-personas/stream-reply.json` was added. This endpoint streams data direct from a persona and can be used to access a persona from remote systems leaving a paper trail in PMs about the conversation that happened This endpoint is only accessible to admins. --------- Co-authored-by: Gabriel Grubba <70247653+Grubba27@users.noreply.github.com> Co-authored-by: Keegan George <kgeorge13@gmail.com>	2024-10-30 10:28:20 +11:00
Bianca Nenciu	294c364a75	DEV: Fix mismatched column types (#868 ) The primary key is usually a bigint column, but the foreign key columns are usually of integer type. This can lead to issues when joining these columns due to mismatched types and different value ranges. This was using a temporary plugin / test API to make tests pass, but it is safe to alter "ai_document_fragment_embeddings" and "rag_document_fragments" tables because they usually have less than 1M rows and migration is going to be fast. Depending on the size of the community, "classification_results" table may have more than 1M rows and the migration will lock the table for a longer time. However, classification runs in background jobs and they will be automatically retried if they fail due to the lock, which makes it acceptable.	2024-10-28 15:36:42 +02:00
Sam	059d3b6fd2	FEATURE: better logging for automation reports (#853 ) A new feature_context json column was added to ai_api_audit_logs This allows us to store rich json like context on any LLM request made. This new field now stores automation id and name. Additionally allows llm_triage to specify maximum number of tokens This means that you can limit the cost of llm triage by scanning only first N tokens of a post.	2024-10-23 16:49:56 +11:00
Sam	bdf3b6268b	FEATURE: smarter persona tethering (#832 ) Splits persona permissions so you can allow a persona on: - chat dms - personal messages - topic mentions - chat channels (any combination is allowed) Previously we did not have this flexibility. Additionally, adds the ability to "tether" a language model to a persona so it will always be used by the persona. This allows people to use a cheaper language model for one group of people and more expensive one for other people	2024-10-16 07:20:31 +11:00
Roman Rizzi	c7acb4a6a0	REFACTOR: Support of different summarization targets/prompts. (#835 ) * DEV: Add summary types * Refactor for different summary types * Use enum for summary types * Update lib/summarization/strategies/topic_summary.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/topic_gist.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/chat_messages.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Fix chat_messages single prompt * Small tweak to the chat summarization prompt --------- Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>	2024-10-15 13:53:26 -03:00
Rafael dos Santos Silva	791fad1e6a	FEATURE: Index embeddings using bit vectors (#824 ) On very large sites, the rare cache misses for Related Topics can take around 200ms, which affects our p99 metric on the topic page. In order to mitigate this impact, we now have several tools at our disposal. First, one is to migrate the index embedding type from halfvec to bit and change the related topic query to leverage the new bit index by changing the search algorithm from inner product to Hamming distance. This will reduce our index sizes by 90%, severely reducing the impact of embeddings on our storage. By making the related query a bit smarter, we can have zero impact on recall by using the index to over-capture N2 results, then re-ordering those N2 using the full halfvec vectors and taking the top N. The expected impact is to go from 200ms to <20ms for cache misses and from a 2.5GB index to a 250MB index on a large site. Another tool is migrating our index type from IVFFLAT to HNSW, which can increase the cache misses performance even further, eventually putting us in the under 5ms territory. Co-authored-by: Roman Rizzi <roman@discourse.org>	2024-10-14 13:26:03 -03:00
Sam	6c4c96e83c	FEATURE: allow persona to only force tool calls on limited replies (#827 ) This introduces another configuration that allows operators to limit the amount of interactions with forced tool usage. Forced tools are very handy in initial llm interactions, but as conversation progresses they can hinder by slowing down stuff and adding confusion.	2024-10-11 07:23:42 +11:00
Sam	5cbc9190eb	FEATURE: RAG search within tools (#802 ) This allows custom tools access to uploads and sophisticated searches using embedding. It introduces: - A shared front end for listing and uploading files (shared with personas) - Backend implementation of index.search function within a custom tool. Custom tools now may search through uploaded files: `function invoke(params) { return index.search(params.query) }` This means that RAG implementers now may preload tools with knowledge and have high fidelity over the search. The search function supports: - specifying max results - specifying a subset of files to search (from uploads) Also: - Improved documentation for tools (when creating a tool a preamble explains all the functionality) - uploads were a bit finicky, fixed an edge case where the UI would not show them as updated	2024-09-30 17:27:50 +10:00
Sam	03eccbe392	FEATURE: Make tool support polymorphic (#798 ) Polymorphic RAG means that we will be able to access RAG fragments both from AiPersona and AiCustomTool In turn this gives us support for richer RAG implementations.	2024-09-16 08:17:17 +10:00
Keegan George	f72ab12761	DEV: Clearly separate post/composer helper settings (#747 )	2024-08-12 15:40:23 -07:00
Rafael dos Santos Silva	1686a8a683	DEV: Move to single table per embeddings type (#561 ) Also move us to halfvecs for speed and disk usage gains	2024-08-08 11:55:20 -03:00
Roman Rizzi	20efc9285e	FIX: Correctly save provider-specific params for new models. (#744 ) Creating a new model, either manually or from presets, doesn't initialize the `provider_params` object, meaning their custom params won't persist. Additionally, this change adds some validations for Bedrock params, which are mandatory, and a clear message when a completion fails because we cannot build the URL.	2024-08-07 16:08:56 -03:00
Roman Rizzi	bed044448c	DEV: Remove old code now that features rely on LlmModels. (#729 ) * DEV: Remove old code now that features rely on LlmModels. * Hide old settings and migrate persona llm overrides * Remove shadowing special URL + seeding code. Use srv:// prefix instead.	2024-07-30 13:44:57 -03:00
Natalie Tay	7cd7f71857	DEV: Promote historical post-deploy migrations (#728 )	2024-07-30 01:44:57 +08:00
Rafael dos Santos Silva	665637fbad	FIX: Properly fix ai_summaries table sequence (#727 ) * FIX: Properly fix ai_summaries table sequence Previous attempt at `3815360` could fail due to a race introduced in `1b0ba91` where summaries are migrated to core in a post_migrate erroneously.	2024-07-26 14:45:01 -03:00
Roman Rizzi	5c196bca89	FEATURE: Track if a model can do vision in the llm_models table (#725 ) * FEATURE: Track if a model can do vision in the llm_models table * Data migration	2024-07-24 16:29:47 -03:00
Martin Brennan	5c1ab85583	DEV: More topic title prompt tweaks (#712 ) Followup `8d4a67fbe2` The prompt worked better, but it took the instructions about never using lowercase a little too literally, it wasn't using it for things like LLM or Discourse, also it was almost always framing the title as questions so now I asked it for a mix of questions and statements because that's less ambiguous.	2024-07-11 10:14:53 +10:00
Martin Brennan	8d4a67fbe2	DEV: Tweak topic title generator prompt (#710 ) Changes the title generator prompt to avoid clickbait-y titles and also try to avoid AI's favourite title format, which is "Some Thing: Other Thing" Leaving the chat thread title generator for now, that's not as important, the bizarre titles add to the experience there.	2024-07-10 14:31:59 +10:00
Roman Rizzi	5cb91217bd	FIX: Flaky SRV-backed model seeding. (#708 ) * Seeding the SRV-backed model should happen inside an initializer. * Keep the model up to date when the hidden setting changes. * Use the correct Mixtral model name and fix previous data migration. * URL validation should trigger only when we attempt to update it.	2024-07-08 18:47:10 -03:00
Sam	38153608f8	FIX: repair id sequence identity on summary table (#701 ) 1. Repairs the identity on the summary table, we migrated data without resetting it. 2. Adds an index into ai_summary table to match expected retrieval pattern	2024-07-04 12:23:46 +10:00
Sam	1320eed9b2	FEATURE: move summary to use llm_model (#699 ) This allows summary to use the new LLM models and migrates off API key based model selection. Claude 3.5 etc... all work now. --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-07-04 10:48:18 +10:00
Keegan George	1b0ba9197c	DEV: Add summarization logic from core (#658 )	2024-07-02 08:51:59 -07:00
Sam	b863ddc94b	FEATURE: custom user defined tools (#677 ) Introduces custom AI tools functionality. 1. Why it was added: The PR adds the ability to create, manage, and use custom AI tools within the Discourse AI system. This feature allows for more flexibility and extensibility in the AI capabilities of the platform. 2. What it does: - Introduces a new `AiTool` model for storing custom AI tools - Adds CRUD (Create, Read, Update, Delete) operations for AI tools - Implements a tool runner system for executing custom tool scripts - Integrates custom tools with existing AI personas - Provides a user interface for managing custom tools in the admin panel 3. Possible use cases: - Creating custom tools for specific tasks or integrations (stock quotes, currency conversion etc...) - Allowing administrators to add new functionalities to AI assistants without modifying core code - Implementing domain-specific tools for particular communities or industries 4. Code structure: The PR introduces several new files and modifies existing ones: a. Models: - `app/models/ai_tool.rb`: Defines the AiTool model - `app/serializers/ai_custom_tool_serializer.rb`: Serializer for AI tools b. Controllers: - `app/controllers/discourse_ai/admin/ai_tools_controller.rb`: Handles CRUD operations for AI tools c. Views and Components: - New Ember.js components for tool management in the admin interface - Updates to existing AI persona management components to support custom tools d. Core functionality: - `lib/ai_bot/tool_runner.rb`: Implements the custom tool execution system - `lib/ai_bot/tools/custom.rb`: Defines the custom tool class e. Routes and configurations: - Updates to route configurations to include new AI tool management pages f. Migrations: - `db/migrate/20240618080148_create_ai_tools.rb`: Creates the ai_tools table g. Tests: - New test files for AI tool functionality and integration The PR integrates the custom tools system with the existing AI persona framework, allowing personas to use both built-in and custom tools. It also includes safety measures such as timeouts and HTTP request limits to prevent misuse of custom tools. Overall, this PR significantly enhances the flexibility and extensibility of the Discourse AI system by allowing administrators to create and manage custom AI tools tailored to their specific needs. Co-authored-by: Martin Brennan <martin@discourse.org>	2024-06-27 17:27:40 +10:00
Loïc Guitaut	e26c5986f2	DEV: Use Rails 7.0 instead of 7.1 in post-migrations	2024-06-26 18:41:38 +02:00
Loïc Guitaut	6f5873b072	DEV: Use Rails 7.0 instead of 7.1 in migrations	2024-06-26 18:32:11 +02:00
Roman Rizzi	f622e2644f	FEATURE: Store provider-specific parameters. (#686 ) Previously, we stored request parameters like the OpenAI organization and Bedrock's access key and region as site settings. This change stores them in the `llm_models` table instead, letting us drop more settings while also becoming more flexible.	2024-06-25 08:26:30 +10:00
Roman Rizzi	558574fa87	DEV: Use LlmModels as options in automation rules (#676 )	2024-06-21 08:07:17 +10:00
Roman Rizzi	8849caf136	DEV: Transition "Select model" settings to only use LlmModels (#675 ) We no longer support the "provider:model" format in the "ai_helper_model" and "ai_embeddings_semantic_search_hyde_model" settings. We'll migrate existing values and work with our new data-driven LLM configs from now on.	2024-06-19 18:01:35 -03:00
Sam	0d6d9a6ef5	FEATURE: allow access to private topics if tool permits (#673 ) Previously read tool only had access to public topics, this allows access to all topics user has access to, if admin opts for the option Also - Fixes VLLM migration - Display which llms have bot enabled	2024-06-19 15:49:36 +10:00
Roman Rizzi	8d5f901a67	DEV: Rewire AI bot internals to use LlmModel (#638 ) * DRAFT: Create AI Bot users dynamically and support custom LlmModels * Get user associated to llm_model * Track enabled bots with attribute * Don't store bot username. Minor touches to migrate default values in settings * Handle scenario where vLLM uses a SRV record * Made 3.5-turbo-16k the default version so we can remove hack	2024-06-18 14:32:14 -03:00
Sam	5abf80cb4e	FIX: do not mark column read only so certain deployments work (#663 ) In some cases we may be deploying migrations, seeding and then running post migrations; we need this to work, so we give up on this small window of protection	2024-06-11 21:32:49 +10:00
Sam	52a7dd2a4b	FEATURE: optional tool detail blocks (#662 ) This is a rather huge refactor with 1 new feature (tool details can be suppressed) Previously we used the name "Command" to describe "Tools", this unifies all the internal language and simplifies the code. We also amended the persona UI to use less DToggles which aligns with our design guidelines. Co-authored-by: Martin Brennan <martin@discourse.org>	2024-06-11 18:14:14 +10:00
Loïc Guitaut	dd4e305ff7	DEV: Update rubocop-discourse to version 3.8.0 (#641 )	2024-05-28 11:15:42 +02:00
Keegan George	a1c649965f	FEATURE: Auto image captions (#637 )	2024-05-27 10:49:24 -07:00
Sam	baf88e7cfc	FEATURE: improve logging by including llm name (#640 ) Log the language model name when logging api requests	2024-05-27 16:46:01 +10:00
Roman Rizzi	1d786fbaaf	FEATURE: Set endpoint credentials directly from LlmModel. (#625 ) * FEATURE: Set endpoint credentials directly from LlmModel. Drop Llama2Tokenizer since we no longer use it. * Allow http for custom LLMs --------- Co-authored-by: Rafael Silva <xfalcox@gmail.com>	2024-05-16 09:50:22 -03:00
Sam	8eee6893d6	FEATURE: GPT4o support and better auditing (#618 ) - Introduce new support for GPT4o (automation / bot / summary / helper) - Properly account for token counts on OpenAI models - Track feature that was used when generating AI completions - Remove custom llm support for summarization as we need better interfaces to control registration and de-registration	2024-05-14 13:28:46 +10:00
Roman Rizzi	62fc7d6ed0	FEATURE: Configurable LLMs. (#606 ) This PR introduces the concept of "LlmModel" as a new way to quickly add new LLM models without making any code changes. We are releasing this first version and will add incremental improvements, so expect changes. The AI Bot can't fully take advantage of this feature as users are hard-coded. We'll fix this in a separate PR.	2024-05-13 12:46:42 -03:00
Sam	e4b326c711	FEATURE: support Chat with AI Persona via a DM (#488 ) Add support for chat with AI personas - Allow enabling chat for AI personas that have an associated user - Add new setting `allow_chat` to AI persona to enable/disable chat - When a message is created in a DM channel with an allowed AI persona user, schedule a reply job - AI replies to chat messages using the persona's `max_context_posts` setting to determine context - Store tool calls and custom prompts used to generate a chat reply on the `ChatMessageCustomPrompt` table - Add tests for AI chat replies with tools and context At the moment unlike posts we do not carry tool calls in the context. No @mention support yet for ai personas in channels, this is future work	2024-05-06 09:49:02 +10:00
Sam	32b3004ce9	FEATURE: Add Question Consolidator for robust Upload support in Personas (#596 ) This commit introduces a new feature for AI Personas called the "Question Consolidator LLM". The purpose of the Question Consolidator is to consolidate a user's latest question into a self-contained, context-rich question before querying the vector database for relevant fragments. This helps improve the quality and relevance of the retrieved fragments. Previous to this change we used the last 10 interactions, this is not ideal cause the RAG would "lock on" to an answer. EG: - User: how many cars are there in europe - Model: detailed answer about cars in europe including the term car and vehicle many times - User: Nice, what about trains are there in the US In the above example "trains" and "US" becomes very low signal given there are pages and pages talking about cars and europe. This means retrieval is suboptimal. Instead, we pass the history to the "question consolidator", it would simply consolidate the question to "How many trains are there in the United States", which would make it far easier for the vector db to find relevant content. The llm used for question consolidator can often be less powerful than the model you are talking to, we recommend using lighter weight and fast models cause the task is very simple. This is configurable from the persona ui. This PR also removes support for {uploads} placeholder, this is too complicated to get right and we want freedom to shift RAG implementation. Key changes: 1. Added a new `question_consolidator_llm` column to the `ai_personas` table to store the LLM model used for question consolidation. 2. Implemented the `QuestionConsolidator` module which handles the logic for consolidating the user's latest question. It extracts the relevant user and model messages from the conversation history, truncates them if needed to fit within the token limit, and generates a consolidated question prompt. 3. Updated the `Persona` class to use the Question Consolidator LLM (if configured) when crafting the RAG fragments prompt. It passes the conversation context to the consolidator to generate a self-contained question. 4. Added UI elements in the AI Persona editor to allow selecting the Question Consolidator LLM. Also made some UI tweaks to conditionally show/hide certain options based on persona configuration. 5. Wrote unit tests for the QuestionConsolidator module and updated existing persona tests to cover the new functionality. This feature enables AI Personas to better understand the context and intent behind a user's question by consolidating the conversation history into a single, focused question. This can lead to more relevant and accurate responses from the AI assistant.	2024-04-30 13:49:21 +10:00
Sam	f6ac5cd0a8	FEATURE: allow tuning of RAG generation (#565 ) * FEATURE: allow tuning of RAG generation - change chunking to be token based vs char based (which is more accurate) - allow control over overlap / tokens per chunk and conversation snippets inserted - UI to control new settings * improve ui a bit * fix various reindex issues * reduce concurrency * try ultra low queue ... concurrency 1 is too slow.	2024-04-12 10:32:46 -03:00
Rafael dos Santos Silva	eb93b21769	FEATURE: Add BGE-M3 embeddings support (#569 ) BAAI/bge-m3 is an interesting model that is multilingual and has a context size of 8192. Even with a 16x larger context, it's only 4x slower to compute its embeddings in the worst-case scenario. Also includes a minor refactor of the rake task, including setting model and concurrency levels when running the backfill task.	2024-04-10 17:24:01 -03:00
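
Note on commit 791fad1e6a (bit-vector index): the message describes a two-stage lookup, where a cheap Hamming-distance search over bit vectors over-captures candidates and the full vectors then re-order them before taking the top N. The sketch below illustrates that idea in plain Python and is not the plugin's code (the plugin is Ruby on PostgreSQL). The function names, the sign-based binarization, and the `overcapture` factor are assumptions made for illustration only; a real index would store the bit vectors instead of recomputing them per query.

```python
from typing import List, Sequence, Tuple

Item = Tuple[str, Sequence[float]]


def to_bits(vec: Sequence[float]) -> int:
    """Pack a vector into an int: bit i is set when component i is positive.
    (Sign-based binarization is an assumption for this sketch.)"""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two packed bit vectors differ."""
    return bin(a ^ b).count("1")


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact similarity on the full-precision vectors."""
    return sum(x * y for x, y in zip(a, b))


def related(query: Sequence[float], corpus: List[Item], n: int,
            overcapture: int = 2) -> List[str]:
    """Return the ids of the top-n items for `query`.

    Stage 1 shortlists n * overcapture candidates by Hamming distance.
    Stage 2 re-orders that shortlist by inner product on the full vectors.
    """
    q_bits = to_bits(query)
    shortlist = sorted(
        corpus, key=lambda item: hamming(q_bits, to_bits(item[1]))
    )[: n * overcapture]
    reranked = sorted(
        shortlist, key=lambda item: inner_product(query, item[1]), reverse=True
    )
    return [item_id for item_id, _ in reranked[:n]]


# Hypothetical toy data: "d" matches the query's sign pattern but is weak,
# while "b" differs in one bit but has a much larger inner product.
corpus = [
    ("a", [1.0, 1.0, 1.0]),
    ("b", [0.9, 0.8, -0.1]),
    ("c", [-1.0, -1.0, -1.0]),
    ("d", [0.1, 0.1, 0.1]),
]
query = [1.0, 1.0, 1.0]

print(related(query, corpus, n=2, overcapture=1))  # ['a', 'd']
print(related(query, corpus, n=2, overcapture=2))  # ['a', 'b']
```

With no over-capture the bit-only shortlist drops "b", which the exact ranking would have placed second. Widening the shortlist and re-ranking recovers it, which is the recall argument the commit message makes.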

105 Commits