discourse-ai

Commit Graph

Author	SHA1	Message	Date
Kelv	5e87a50202	DEV: Update more deprecated Font Awesome icon names (#1005 ) * DEV: Update more deprecated Font Awesome icon names * update to trash-can	2024-12-06 07:45:05 +11:00
Kris	70fae2b699	FIX: close link in shared conversation model (#1007 )	2024-12-05 15:19:36 -05:00
Roman Rizzi	4ba74511c2	FIX: Make sure limits are updated and applied on each step (#1002 )	2024-12-05 10:31:39 -03:00
Roman Rizzi	ce6a2eca21	FEATURE: Backfill posts sentiment. (#982 ) * FEATURE: Backfill posts sentiment. It adds a scheduled job to backfill posts' sentiment, similar to our existing rake task, but with two settings to control the batch size and posts' max-age. * Make sure model_name order is consistent.	2024-12-03 10:27:03 -03:00
Sam	7c65dd171f	FIX: regression, no longer sending examples to AI helper (#993 ) For a while now we have not been sending the examples to AI helper, which can lead to inconsistent results. Note: this also means that in non-English we did not send English results, so this may end up reducing performance. That said, the first thing we need to do is fix the regression.	2024-12-03 16:03:46 +11:00
Sam	117c06220e	FEATURE: allow artifacts to be updated (#980 ) Add support for versioned artifacts with improved diff handling * Add versioned artifacts support allowing artifacts to be updated and tracked - New `ai_artifact_versions` table to store version history - Support for updating artifacts through a new `UpdateArtifact` tool - Add version-aware artifact rendering in posts - Include change descriptions for version tracking * Enhance artifact rendering and security - Add support for module-type scripts and external JS dependencies - Expand CSP to allow trusted CDN sources (unpkg, cdnjs, jsdelivr, googleapis) - Improve JavaScript handling in artifacts * Implement robust diff handling system (this is dormant but ready to use once LLMs catch up) - Add new DiffUtils module for applying changes to artifacts - Support for unified diff format with multiple hunks - Intelligent handling of whitespace and line endings - Comprehensive error handling for diff operations * Update routes and UI components - Add versioned artifact routes - Update markdown processing for versioned artifacts Also - Tweaks summary prompt - Improves upload support in custom tool to also provide urls	2024-12-03 07:23:31 +11:00
Rafael dos Santos Silva	0ac18d157b	FEATURE: Adjustments to gist summaries (#988 ) - makes visible to everyone by default - backfills gists before full summaries - adds configurable max age setting to backfill job	2024-12-02 15:22:35 -03:00
Rafael dos Santos Silva	3828370679	DEV: Cleanup deprecations (#952 )	2024-12-02 14:18:03 -03:00
Sam	bc0657f478	FEATURE: AI Usage page (#964 ) - Added a new admin interface to track AI usage metrics, including tokens, features, and models. - Introduced a new route `/admin/plugins/discourse-ai/ai-usage` and supporting API endpoint in `AiUsageController`. - Implemented `AiUsageSerializer` for structuring AI usage data. - Integrated CSS stylings for charts and tables under `stylesheets/modules/llms/common/usage.scss`. - Enhanced backend with `AiApiAuditLog` model changes: added `cached_tokens` column (implemented with OpenAI for now) with relevant DB migration and indexing. - Created `Report` module for efficient aggregation and filtering of AI usage metrics. - Updated AI Bot title generation logic to log correctly to user vs bot - Extended test coverage for the new tracking features, ensuring data consistency and access controls.	2024-11-29 06:26:48 +11:00
Roman Rizzi	c980c34d77	REFACTOR: Simplify sentiment classification (#977 ) This change adds a simpler class for sentiment classification, replacing the soon-to-be removed `Classificator` hierarchy. Additionally, it adds a method for classifying concurrently, speeding up the backfill rake task.	2024-11-28 15:38:23 -03:00
Rafael dos Santos Silva	4980a4b2f7	FIX: Multiple concurrent summaries could result in pg index errors (#973 )	2024-11-28 11:53:04 -03:00
Keegan George	4da033c667	FIX: Double render error with thumbnail suggestions (#968 ) This PR fixes a bug where a double render error appears in the logs when thumbnails are suggested	2024-11-27 15:12:27 -08:00
Rafael dos Santos Silva	0d3e6b2726	FIX: Fix ordering of random post embeddings backfill (#965 ) * FIX: Fix ordering of random post embeddings backfill * fix annotations --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-11-27 17:01:54 -03:00
Roman Rizzi	ef07fcb308	FIX: Skip records without content to classify (#960 )	2024-11-26 15:54:20 -03:00
Roman Rizzi	ddf2bf7034	DEV: Backfill embeddings concurrently. (#941 ) We are adding a new method for generating and storing embeddings in bulk, which relies on `Concurrent::Promises::Future`. Generating an embedding consists of three steps: 1. Prepare text 2. HTTP call to retrieve the vector 3. Save to DB. Each one is independently executed on whatever thread the pool gives us. We are bringing a custom thread pool instead of the global executor since we want control over how many threads we spawn to limit concurrency. We also avoid firing thousands of HTTP requests when working with large batches.	2024-11-26 14:12:32 -03:00
Rafael dos Santos Silva	23193ee6f2	FEATURE: Calculate gists from non hot topics too (#958 ) Also renames some settings to remove 'hot' references.	2024-11-26 13:44:12 -03:00
Roman Rizzi	95762723de	PERF: Preload only gists when including summaries in topic list (#948 ) * PERF: Preload only gists when including summaries in topic list * Add unique index on summaries and dedup existing records * Make hot topics batch size setting hidden	2024-11-25 12:24:02 -03:00
Natalie Tay	f8231d259b	FEATURE: Add locale detection prompt from translator (#946 )	2024-11-25 08:33:54 +11:00
Sam	86cf4ccba7	FIX: automatically bust cache for share ai assets (#942 ) * FIX: automatically bust cache for share ai assets CDNs can be configured to strip query params in Discourse hosting. This is generally safe, but in this case we had no way of busting the cache using the path. New design properly caches and properly busts the cache if the asset changes, so we don't need to worry about versions * one day I will set up conditional lint on save :)	2024-11-22 11:23:15 +11:00
Sam	52c644798d	DEV: improve artifact presentation (#932 ) 1. Keep source in a "details" block after rendered so it does not overwhelm users 2. Ensure artifacts are never indexed by robots 3. Cache break our CSS that changed recently	2024-11-20 18:53:19 +11:00
Sam	2652716398	UX: improve artifact styling add direct share link (#930 ) Also remove unneeded sandboxing given this is all handled by artifacts directly	2024-11-20 13:13:03 +11:00
David Taylor	b10be23533	FIX: Ensure artifacts are sandboxed, even when visited directly (#921 ) It's important that artifacts are never given 'same origin' access to the forum domain, so that they cannot access cookies, or make authenticated HTTP requests. So even when visiting the URL directly, we need to wrap them in a sandboxed iframe.	2024-11-19 11:44:17 +00:00
Sam	3ae1e4eaf0	FIX: properly bypass CSP for artifacts (#920 ) Was meant to be bypassed but was not implemented correctly	2024-11-19 20:25:07 +11:00
Sam	755b63f31f	FEATURE: Add support for Mistral models (#919 ) Adds support for mistral models (pixtral and mistral large now have presets) Also corrects token accounting in AWS bedrock models	2024-11-19 17:28:09 +11:00
Sam	0d7f353284	FEATURE: AI artifacts (#898 ) This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes: 1. AI Artifacts System: - Adds a new `AiArtifact` model and database migration - Allows creation of web artifacts with HTML, CSS, and JavaScript content - Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution - Implements artifact rendering in iframes with sandbox protection - New `CreateArtifact` tool for AI to generate interactive content 2. Tool System Improvements: - Adds support for partial tool calls, allowing incremental updates during generation - Better handling of tool call states and progress tracking - Improved XML tool processing with CDATA support - Fixes for tool parameter handling and duplicate invocations 3. LLM Provider Updates: - Updates for Anthropic Claude models with correct token limits - Adds support for native/XML tool modes in Gemini integration - Adds new model configurations including Llama 3.1 models - Improvements to streaming response handling 4. UI Enhancements: - New artifact viewer component with expand/collapse functionality - Security controls for artifact execution (click-to-run in strict mode) - Improved dialog and response handling - Better error management for tool execution 5. Security Improvements: - Sandbox controls for artifact execution - Public/private artifact sharing controls - Security settings to control artifact behavior - CSP and frame-options handling for artifacts 6. Technical Improvements: - Better post streaming implementation - Improved error handling in completions - Better memory management for partial tool calls - Enhanced testing coverage 7. Configuration: - New site settings for artifact security - Extended LLM model configurations - Additional tool configuration options This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.	2024-11-19 09:22:39 +11:00
Sam	e817b7dc11	FEATURE: improve tool support (#904 ) This re-implements tool support in DiscourseAi::Completions::Llm #generate. Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML. New implementation has the endpoints return ToolCall objects. Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement decode, decode_chunk (for streaming). It is the implementer's responsibility to figure out how to decode chunks, base no longer implements. To make this easy we ship a flexible json decoder which is easy to wire up. Also (new): Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM. Token accounting is fixed for vllm (we were not correctly counting tokens)	2024-11-12 08:14:30 +11:00
Roman Rizzi	fbc74c7467	FEATURE: Extend summary backfill to also generate gists (#896 ) Updates default batch size to 0 and max to 10000	2024-11-07 13:40:18 -03:00
Keegan George	99282612a9	DEV: Prefer ENV key for seeded models (#893 ) This PR ensures we prefer getting the API key from environment variables when it is a seeded model.	2024-11-05 06:19:13 -08:00
Roman Rizzi	9505a8976c	FEATURE: Automatically backfill regular summaries. (#892 ) This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify. We'll prioritize topics without summary over outdated ones.	2024-11-04 17:48:11 -03:00
Sam	98022d7d96	FEATURE: support custom instructions for persona streaming (#890 ) This allows us to inject information into the system prompt which can help shape replies without repeating over and over in messages.	2024-11-05 07:43:26 +11:00
Sam	c352054d4e	FIX: encode parameters returned from LLMs correctly (#889 ) Fixes encoding of params on LLM function calls. Previously we would improperly return results if a function parameter returned an HTML tag. Additionally adds some missing HTTP verbs to tool calls.	2024-11-04 10:07:17 +11:00
Sam	34a59b623e	FIX: ensure replies are never double streamed (#879 ) The custom field "discourse_ai_bypass_ai_reply" was added so we can signal the post created hook to bypass replying even if it thinks it should. Otherwise there are cases where we double answer user questions leading to much confusion. This also slightly refactors code making the controller smaller	2024-10-30 20:24:39 +11:00
Sam	be0b78cacd	FEATURE: new endpoint for directly accessing a persona (#876 ) The new `/admin/plugins/discourse-ai/ai-personas/stream-reply.json` was added. This endpoint streams data direct from a persona and can be used to access a persona from remote systems, leaving a paper trail in PMs about the conversation that happened. This endpoint is only accessible to admins. --------- Co-authored-by: Gabriel Grubba <70247653+Grubba27@users.noreply.github.com> Co-authored-by: Keegan George <kgeorge13@gmail.com>	2024-10-30 10:28:20 +11:00
David Taylor	945f04b089	DEV: Update plugin annotations (#871 )	2024-10-28 14:07:09 +00:00
Roman Rizzi	a2b1ea3c63	FEATURE: Fast-track gist regeneration when a hot topic gets a new post (#860 ) * FEATURE: Fast-track gist regeneration when a hot topic gets a new post * DEV: Introduce an upsert-like summarize * FIX: Only enqueue fast-track gist for hot hot hot topics --------- Co-authored-by: Rafael Silva <xfalcox@gmail.com>	2024-10-25 12:38:49 -03:00
Roman Rizzi	ec97996905	FIX/REFACTOR: FoldContent revamp (#866 ) * FIX/REFACTOR: FoldContent revamp We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we cannot send the original post separately. This was important for letting the model focus on what's new in the topic. The algorithm doesn’t give us full control over how prompts are written, and figuring out how to format the content isn't straightforward. This means we're having to use more complicated workarounds, like regex. To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize. Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with. * Fix fold docs * Use #shift instead of #pop to get the first elem, not the last	2024-10-25 11:51:17 -03:00
Sam	12869f2146	FIX: testing tool was not showing rag results (#867 ) This changeset contains 4 fixes: 1. We were allowing running tests on unsaved tools, this is problematic cause uploads are not yet associated or indexed leading to confusing results. We now only show the test button when tool is saved. 2. We were not properly scoping rag document fragments, this meant that personas and ai tools could get results from other unrelated tools, just to be filtered out later 3. index.search showed options as "optional" but implementation required the second option 4. When testing tools searching through document fragments was not working at all cause we did not properly load the tool	2024-10-25 16:01:25 +11:00
Sam	4923837165	FIX: Llm selector / forced tools / search tool (#862 ) * FIX: Llm selector / forced tools / search tool This fixes a few issues: 1. When search was not finding any semantic results we would break the tool 2. Gemini / Anthropic models did not implement forced tools previously despite it being an API option 3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly. 4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate cause this feature is really rare to need now, people who had it set probably did not need it. 5. Updates anthropic model names to latest release * linting * fix a couple of tests I missed * clean up conditional	2024-10-25 06:24:53 +11:00
Sam	059d3b6fd2	FEATURE: better logging for automation reports (#853 ) A new feature_context json column was added to ai_api_audit_logs. This allows us to store rich json like context on any LLM request made. This new field now stores automation id and name. Additionally allows llm_triage to specify maximum number of tokens. This means that you can limit the cost of llm triage by scanning only first N tokens of a post.	2024-10-23 16:49:56 +11:00
Sam	a1f859a415	FEATURE: improve visibility of AI usage in LLM page (#845 ) This changeset: 1. Corrects some issues with "force_default_llm" not applying 2. Expands the LLM list page to show LLM usage 3. Clarifies better what "enabling a bot" on an llm means (you get it in the selector)	2024-10-22 11:16:02 +11:00
Roman Rizzi	6d504ab80d	FEATURE: Make hot topic gists opt-in. (#846 ) This change restricts gists to members of specific groups. It also fixes a bug where other lists could display the gist if available.	2024-10-21 15:15:25 -03:00
Roman Rizzi	e768fa877e	FIX: Don't regenerate up to date gists (#843 )	2024-10-18 18:49:01 -03:00
Roman Rizzi	27b5542357	FEATURE: Generate topic gists for the hot topics list. (#837 ) * Display gists in the hot topics list * Adjust hot topics gist strategy and add a job to generate gists * Replace setting with a configurable batch size * Avoid loading summaries for other topic lists * Tweak gist prompt to focus on latest posts in the context of the OP * Remove serializer hack and rely on core change from discourse/discourse#29291 * Update lib/summarization/strategies/hot_topic_gists.rb Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com> --------- Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>	2024-10-18 18:01:39 -03:00
Rafael dos Santos Silva	792703c942	FEATURE: Discord Bot integration (#831 ) This adds support for a Discord bot that can search in a Discourse instance when invoked via slash commands in a Discord guild channel.	2024-10-16 12:41:18 -03:00
Sam	bdf3b6268b	FEATURE: smarter persona tethering (#832 ) Splits persona permissions so you can allow a persona on: - chat dms - personal messages - topic mentions - chat channels (any combination is allowed) Previously we did not have this flexibility. Additionally, adds the ability to "tether" a language model to a persona so it will always be used by the persona. This allows people to use a cheaper language model for one group of people and more expensive one for other people	2024-10-16 07:20:31 +11:00
Roman Rizzi	c7acb4a6a0	REFACTOR: Support of different summarization targets/prompts. (#835 ) * DEV: Add summary types * Refactor for different summary types * Use enum for summary types * Update lib/summarization/strategies/topic_summary.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/topic_gist.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/chat_messages.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Fix chat_messages single prompt * Small tweak to the chat summarization prompt --------- Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>	2024-10-15 13:53:26 -03:00
Rafael dos Santos Silva	791fad1e6a	FEATURE: Index embeddings using bit vectors (#824 ) On very large sites, the rare cache misses for Related Topics can take around 200ms, which affects our p99 metric on the topic page. In order to mitigate this impact, we now have several tools at our disposal. First, one is to migrate the index embedding type from halfvec to bit and change the related topic query to leverage the new bit index by changing the search algorithm from inner product to Hamming distance. This will reduce our index sizes by 90%, severely reducing the impact of embeddings on our storage. By making the related query a bit smarter, we can have zero impact on recall by using the index to over-capture N2 results, then re-ordering those N2 using the full halfvec vectors and taking the top N. The expected impact is to go from 200ms to <20ms for cache misses and from a 2.5GB index to a 250MB index on a large site. Another tool is migrating our index type from IVFFLAT to HNSW, which can increase the cache misses performance even further, eventually putting us in the under 5ms territory. Co-authored-by: Roman Rizzi <roman@discourse.org>	2024-10-14 13:26:03 -03:00
Hoa Nguyen	94010a5f78	FEATURE: Tools for models from Ollama provider (#819 ) Adds support for Ollama function calling	2024-10-11 07:25:53 +11:00
Sam	6c4c96e83c	FEATURE: allow persona to only force tool calls on limited replies (#827 ) This introduces another configuration that allows operators to limit the amount of interactions with forced tool usage. Forced tools are very handy in initial llm interactions, but as conversation progresses they can hinder by slowing down stuff and adding confusion.	2024-10-11 07:23:42 +11:00
Sam	e1a0eb6131	FEATURE: support chain halting and upload creation support (#821 ) This adds chain halting (ability to terminate llm chain in a tool) and the ability to create uploads in a tool. Together this lets us integrate custom image generators into a custom tool.	2024-10-09 08:17:45 +11:00

205 Commits