discourse-ai

Commit Graph

Author	SHA1	Message	Date
Roman Rizzi	1c40a698ca	FIX: get strategy version through vector_rep (#1028 )	2024-12-13 18:49:18 -03:00
Roman Rizzi	eae527f99d	REFACTOR: A Simpler way of interacting with embeddings tables. (#1023 ) * REFACTOR: A Simpler way of interacting with embeddings' tables. This change adds a new abstraction called `Schema`, which acts as a repository that supports the same DB features `VectorRepresentation::Base` has, except that it removes the need to have duplicated methods per embeddings table. It is also a bit more flexible when performing a similarity search because you can pass it a block that gives you access to the builder, allowing you to add multiple joins/where conditions.	2024-12-13 10:15:21 -03:00
Krzysztof Kotlarek	04c4ff8cf0	UX: No admin header for edit personas tools or llms (#1021 ) In this PR, we added functionality to hide the admin header for edit/new actions - https://github.com/discourse/discourse/pull/30175 To make it work properly, we have to rename `show` to `edit` which is also a more accurate name.	2024-12-12 10:48:58 +11:00
Sam	47c1ea337e	FIX: allow scanning of trashed posts and deleted users for test (#1024 ) When a post is trashed we should still be allowed to scan it post.topic will be nil for a trashed topic even if post is trashed	2024-12-12 10:26:05 +11:00
Sam	47f5da7e42	FEATURE: Add AI-powered spam detection for new user posts (#1004 ) This introduces a comprehensive spam detection system that uses LLM models to automatically identify and flag potential spam posts. The system is designed to be both powerful and configurable while preventing false positives. Key Features: * Automatically scans first 3 posts from new users (TL0/TL1) * Creates dedicated AI flagging user to distinguish from system flags * Tracks false positives/negatives for quality monitoring * Supports custom instructions to fine-tune detection * Includes test interface for trying detection on any post Technical Implementation: * New database tables: - ai_spam_logs: Stores scan history and results - ai_moderation_settings: Stores LLM config and custom instructions * Rate limiting and safeguards: - Minimum 10-minute delay between rescans - Only scans significant edits (>10 char difference) - Maximum 3 scans per post - 24-hour maximum age for scannable posts * Admin UI features: - Real-time testing capabilities - 7-day statistics dashboard - Configurable LLM model selection - Custom instruction support Security and Performance: * Respects trust levels - only scans TL0/TL1 users * Skips private messages entirely * Stops scanning users after 3 successful public posts * Includes comprehensive test coverage * Maintains audit log of all scan attempts --------- Co-authored-by: Keegan George <kgeorge13@gmail.com> Co-authored-by: Martin Brennan <martin@discourse.org>	2024-12-12 09:17:25 +11:00
Martin Brennan	ae80494448	UX: Improve rough edges of AI usage page (#1014 ) * UX: Improve rough edges of AI usage page * Ensure all text uses I18n * Change from <button> usage to <DButton> * Use <AdminConfigAreaCard> in place of custom card styles * Format numbers nicely using our number format helper, show full values on hover using title attr * Ensure 0 is always shown for counters, instead of being blank * FEATURE: Load usage data after page load Use ConditionalLoadingSpinner to hide load of usage data, this prevents us hanging on page load with a white screen. * UX: Split users table, and add empty placeholders and page subheader * DEV: Test fix	2024-12-12 08:55:24 +11:00
Keegan George	a4440c507b	UX: Make sentiment trends more readable (#1018 ) Instead of a stacked chart showing a separate series for positive and negative, this PR introduces a simplification to the overall sentiment dashboard. It condenses the sentiment into a single series of the difference `positive - negative` instead. This should make the data easier to scan for trends.	2024-12-11 09:13:18 -08:00
Roman Rizzi	5fc7a730ef	FIX: Triage rule should append selected tags instead of replacing them (#1022 )	2024-12-11 11:19:44 -03:00
Roman Rizzi	6da35d8e66	FIX: Gemini inference client was missing #instance (#1019 )	2024-12-10 15:42:31 -03:00
Keegan George	700e9de073	Revert "UX: Make sentiment trends more readable in time series data (#1013 )" (#1016 ) This reverts commit `375dd702b2`.	2024-12-10 08:15:27 -08:00
Keegan George	375dd702b2	UX: Make sentiment trends more readable in time series data (#1013 ) Instead of a stacked chart showing a separate series for positive and negative, this PR introduces a simplification to the overall sentiment dashboard. It condenses the sentiment into a single series of the difference `positive - negative` instead. This should make the data easier to scan for trends.	2024-12-10 07:22:41 -08:00
Sam	7ca21cc329	FEATURE: first class support for OpenRouter (#1011 ) * FEATURE: first class support for OpenRouter This new implementation supports picking quantization and provider pref Also: - Improve logging for summary generation - Improve error message when contacting LLMs fails * Better support for full screen artifacts on iPad Support back button to close full screen	2024-12-10 05:59:19 +11:00
Roman Rizzi	085dde7042	FEATURE: Select stop sequences from triage script (#1010 )	2024-12-06 11:13:47 -03:00
Roman Rizzi	7ebbcd2de3	FIX: Make sure prompt uploads get included in the prompt when triaging (#1008 )	2024-12-05 21:04:35 -03:00
Sam	a55216773a	FEATURE: Amazon Nova support via bedrock (#997 ) Refactor dialect selection and add Nova API support Change dialect selection to use llm_model object instead of just provider name Add support for Amazon Bedrock's Nova API with native tools Implement Nova-specific message processing and formatting Update specs for Nova and AWS Bedrock endpoints Enhance AWS Bedrock support to handle Nova models Fix Gemini beta API detection logic	2024-12-06 07:45:58 +11:00
Roman Rizzi	b32b1cf241	FIX: Add a digest check to avoid repeatedly generating embeddings (bulk) (#1001 )	2024-12-04 17:47:28 -03:00
Keegan George	d6beac48f8	DEV: Improve explain suggestion footnote replacement (#999 ) Previously, when clicking add footnote on an explain suggestion it would replace the selected word by finding the first occurrence of the word. This results in issues when there is more than one occurrence of a word in a post. This is not trivial to solve, so this PR instead prevents incorrect text replacements by only allowing the replacement if it's unique. We use the same logic here that we use to determine if something can be fast edited. In this PR we also update tests for post helper explain suggestions. For a while, we haven't had tests here: due to streaming/timing issues, we've been skipping our system specs. In this PR, we add acceptance tests to handle this, which gives us improved ability to publish message bus updates in the testing environment so that it can be better tested without issues.	2024-12-04 11:41:34 -08:00
Kris	8203bdfbc9	UX: move topic summary from DMenu to DModal (#992 ) Co-authored-by: Keegan George <kgeorge13@gmail.com>	2024-12-03 13:30:15 -05:00
Roman Rizzi	ce6a2eca21	FEATURE: Backfill posts sentiment. (#982 ) * FEATURE: Backfill posts sentiment. It adds a scheduled job to backfill posts' sentiment, similar to our existing rake task, but with two settings to control the batch size and posts' max-age. * Make sure model_name order is consistent.	2024-12-03 10:27:03 -03:00
Sam	7c65dd171f	FIX: regression, no longer sending examples to AI helper (#993 ) For a while now we have not been sending the examples to AI helper, which can lead to inconsistent results. Note: this also means that in non English we did not send English results, so this may end up reducing performance That said first thing we need to do is fix the regression.	2024-12-03 16:03:46 +11:00
Rafael dos Santos Silva	e3f5e86dc5	FIX: AI Automation scripts were broken when using seeded models (#991 )	2024-12-02 19:07:05 -03:00
Keegan George	fc88bb08ab	FIX: Tag suggester is suggesting already assigned tags (#990 ) This PR fixes an issue where the tag suggester for edit title topic area was suggesting tags that are already assigned on a post. It also updates the amount of suggested tags to 7 so that there is still a decent amount of tags suggested when tags are already assigned.	2024-12-03 07:25:04 +11:00
Sam	117c06220e	FEATURE: allow artifacts to be updated (#980 ) Add support for versioned artifacts with improved diff handling * Add versioned artifacts support allowing artifacts to be updated and tracked - New `ai_artifact_versions` table to store version history - Support for updating artifacts through a new `UpdateArtifact` tool - Add version-aware artifact rendering in posts - Include change descriptions for version tracking * Enhance artifact rendering and security - Add support for module-type scripts and external JS dependencies - Expand CSP to allow trusted CDN sources (unpkg, cdnjs, jsdelivr, googleapis) - Improve JavaScript handling in artifacts * Implement robust diff handling system (this is dormant but ready to use once LLMs catch up) - Add new DiffUtils module for applying changes to artifacts - Support for unified diff format with multiple hunks - Intelligent handling of whitespace and line endings - Comprehensive error handling for diff operations * Update routes and UI components - Add versioned artifact routes - Update markdown processing for versioned artifacts Also - Tweaks summary prompt - Improves upload support in custom tool to also provide urls	2024-12-03 07:23:31 +11:00
Rafael dos Santos Silva	0ac18d157b	FEATURE: Adjustments to gist summaries (#988 ) - makes visible to everyone by default - backfills gists before full summaries - adds configurable max age setting to backfill job	2024-12-02 15:22:35 -03:00
Rafael dos Santos Silva	3828370679	DEV: Cleanup deprecations (#952 )	2024-12-02 14:18:03 -03:00
Roman Rizzi	0abd4b1244	FIX: Sentiment classification results needs to be transformed before saving (#983 )	2024-11-29 17:31:56 -03:00
Sam	0cb2c413ba	FEATURE: exclude muted categories from category suggester (#979 ) The logic here is that users do not particularly care about topics in the category so we can exclude them from tag and category suggestions	2024-11-29 12:17:28 +11:00
Sam	bc0657f478	FEATURE: AI Usage page (#964 ) - Added a new admin interface to track AI usage metrics, including tokens, features, and models. - Introduced a new route `/admin/plugins/discourse-ai/ai-usage` and supporting API endpoint in `AiUsageController`. - Implemented `AiUsageSerializer` for structuring AI usage data. - Integrated CSS stylings for charts and tables under `stylesheets/modules/llms/common/usage.scss`. - Enhanced backend with `AiApiAuditLog` model changes: added `cached_tokens` column (implemented with OpenAI for now) with relevant DB migration and indexing. - Created `Report` module for efficient aggregation and filtering of AI usage metrics. - Updated AI Bot title generation logic to log correctly to user vs bot - Extended test coverage for the new tracking features, ensuring data consistency and access controls.	2024-11-29 06:26:48 +11:00
Roman Rizzi	c980c34d77	REFACTOR: Simplify sentiment classification (#977 ) This change adds a simpler class for sentiment classification, replacing the soon-to-be removed `Classificator` hierarchy. Additionally, it adds a method for classifying concurrently, speeding up the backfill rake task.	2024-11-28 15:38:23 -03:00
Keegan George	f1c7ee8624	DEV: Better control what prompts can appear in post/composer (#969 ) This PR updates the logic for the location map so it permits only the desired prompts through to the composer/post menu. Anything else won't be shown by default. This PR also adds relevant tests to prevent regression.	2024-11-27 16:14:21 -08:00
Keegan George	4da033c667	FIX: Double render error with thumbnail suggestions (#968 ) This PR fixes a bug where a double render error appears in the logs when thumbnails are suggested	2024-11-27 15:12:27 -08:00
Sam	fbcc8e493a	FEATURE: Skip PM scanning in LLM triage by default (#966 ) Usually people do not want to scan personal messages. Sometimes they may. In that case they can enable triage on personal messages.	2024-11-28 09:25:29 +11:00
Keegan George	6b7d7c1179	REFACTOR: Helper suggestions (#914 ) This PR adds some updates to the Helper suggestions to improve its functionality and modernize some of the codebase.	2024-11-27 12:21:03 -08:00
Martin Brennan	2f7895bb91	UX: Applying more admin UI guidelines (#956 ) This commit applies further admin UI guidelines, now that they have been more fleshed out in core, to the AI admin UI: * Tools * LLMs * Personas The changes include but are not limited to: * Applying the table CSS classes, for desktop and mobile * Adding a description and learn more link for each tab * Adding an empty list placeholder with CTA using `AdminConfigAreaEmptyList` * Replacing custom headings with `AdminPageSubheader`	2024-11-27 13:34:56 +10:00
Roman Rizzi	ef07fcb308	FIX: Skip records without content to classify (#960 )	2024-11-26 15:54:20 -03:00
Roman Rizzi	ddf2bf7034	DEV: Backfill embeddings concurrently. (#941 ) We are adding a new method for generating and storing embeddings in bulk, which relies on `Concurrent::Promises::Future`. Generating an embedding consists of three steps: Prepare text HTTP call to retrieve the vector Save to DB. Each one is independently executed on whatever thread the pool gives us. We are bringing a custom thread pool instead of the global executor since we want control over how many threads we spawn to limit concurrency. We also avoid firing thousands of HTTP requests when working with large batches.	2024-11-26 14:12:32 -03:00
Rafael dos Santos Silva	23193ee6f2	FEATURE: Calculate gists from non hot topics too (#958 ) Also renames some settings to remove 'hot' references.	2024-11-26 13:44:12 -03:00
Sam	54f2d34ccb	DEV: skip flakey spec (#955 ) This spec fails inconsistently with: -fragment-n14 +You are a helpful Discourse assistant. +You _understand_ and generate Discourse Markdown. +You live in a Discourse Forum Message. + +You live in the forum with the URL: http://test.localhost +The title of your site: test site title +The description is: test site description +The participants in this conversation are: joe, jane +The date now is: 2024-11-25 20:23:02 UTC, much has changed since you were trained. + +You were trained on OLD data, lean on search to get up to date information about this forum +When searching try to SIMPLIFY search terms +Discourse search joins all terms with AND. Reduce and simplify terms to find more results.<guidance> +The following texts will give you additional guidance for your response. +We included them because we believe they are relevant to this conversation topic. + +Texts: + +fragment-n10 +fragment-n9 +fragment-n8 +fragment-n7 +fragment-n6 +fragment-n5 +fragment-n4 +fragment-n3 +fragment-n2 +fragment-n1 +</guidance>	2024-11-26 07:49:33 +11:00
Sam	616b990894	FEATURE: LLM mentions and auto silence (#949 ) * FEATURE: allow mentioning an LLM mid conversation to switch This is an edge-case feature that allows you to start a conversation in a PM with LLM1 and then use LLM2 to evaluate or continue the conversation * FEATURE: allow auto silencing of spam accounts New rule can also allow for silencing an account automatically This can prevent spammers from creating additional posts.	2024-11-26 07:19:56 +11:00
Rafael dos Santos Silva	6c25718a7f	FEATURE: Add links to filtered emotion view on emotion dashboard table (#953 )	2024-11-25 15:51:01 -03:00
Roman Rizzi	79021252e9	REFACTOR: Tidy-up embedding endpoints config. (#937 ) Two changes worth mentioning: `#instance` returns a fully configured embedding endpoint ready to use. All endpoints respond to the same method and have the same signature - `perform!(text)` This makes it easier to reuse them when generating embeddings in bulk.	2024-11-25 13:12:43 -03:00
Natalie Tay	f8231d259b	FEATURE: Add locale detection prompt from translator (#946 )	2024-11-25 08:33:54 +11:00
Roman Rizzi	e54f2da1a5	FIX: Unnecessarily complex preloading accidentally filters some topics. (#945 ) The `topic_query_create_list_topics` modifier we append was always meant to avoid an N+1 situation when serializing gists. However, I tried to be too smart and only preload these, which resulted in some topics with only regular summaries getting removed from the list. This issue became apparent now that we are adding gists to other lists besides hot. Let's simplify the preloading, which still solves the N+1 issue, and let the serializer get the needed summary.	2024-11-22 12:07:27 -03:00
Joffrey JAFFEUX	2cc8115b48	FIX: disables temporarily ai_summaries filtering (#943 )	2024-11-22 08:34:54 +01:00
Sam	86cf4ccba7	FIX: automatically bust cache for share ai assets (#942 ) * FIX: automatically bust cache for share ai assets CDNs can be configured to strip query params in Discourse hosting. This is generally safe, but in this case we had no way of busting the cache using the path. New design properly caches and properly busts the cache if the asset changes so we don't need to worry about versions * one day I will set up conditional lint on save :)	2024-11-22 11:23:15 +11:00
Sam	d56ed53eb1	FIX: cancel functionality regressed (#938 ) The cancel messaging was not floating correctly to the HTTP call, leading to completions that were impossible to cancel This is now fully tested as well.	2024-11-21 17:51:45 +11:00
Sam	52c644798d	DEV: improve artifact presentation (#932 ) 1. Keep source in a "details" block after rendered so it does not overwhelm users 2. Ensure artifacts are never indexed by robots 3. Cache break our CSS that changed recently	2024-11-20 18:53:19 +11:00
Sam	a0aec48606	FIX: gists are not html safe (#931 ) Also allow "Everyone" in ai_hot_topic_gists_allowed_groups	2024-11-20 10:54:49 +11:00
Roman Rizzi	530a795d43	FIX: Instruct AR that we want to use ai_summaries for filtering. (#927 ) We use `includes` instead of `joins` because we want to eager-load summaries, avoiding an extra query when summarizing. However, Rails will complain unless you explicitly inform them you plan to use that inside a `WHERE` clause.	2024-11-19 17:32:13 -03:00
Roman Rizzi	fb80d776d8	FEATURE: Enable gists on all topic lists (#922 )	2024-11-19 11:04:34 -03:00
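
The scan safeguards listed in commit 47f5da7e42 (trust level, private messages, post count, post age, scan count, rescan delay, edit size) can be read as a single eligibility check. The following is a minimal sketch of that reading, not the plugin's implementation: `ScanCandidate`, `scannable?`, every field name, and the constants are hypothetical, and it assumes `public_post_count` includes the post being considered.

```ruby
# Hypothetical sketch of the safeguards described in commit 47f5da7e42.
# None of these names come from the plugin; they exist only for illustration.
ScanCandidate = Struct.new(
  :trust_level,             # 0..4, the author's trust level
  :private_message,         # true when the post is in a PM
  :public_post_count,       # author's public posts, including this one (assumption)
  :post_age_seconds,        # seconds since the post was created
  :prior_scans,             # how many times this post was already scanned
  :seconds_since_last_scan, # nil when never scanned
  :edit_char_diff,          # characters changed by the latest edit
  keyword_init: true
)

MAX_TRUST_LEVEL  = 1            # only TL0/TL1 users are scanned
MAX_PUBLIC_POSTS = 3            # only the first 3 posts are scanned
MAX_POST_AGE     = 24 * 60 * 60 # 24-hour maximum age
MAX_SCANS        = 3            # maximum 3 scans per post
MIN_RESCAN_DELAY = 10 * 60      # minimum 10 minutes between rescans
MIN_EDIT_DIFF    = 10           # rescans need a >10 character difference

def scannable?(candidate)
  return false if candidate.trust_level > MAX_TRUST_LEVEL
  return false if candidate.private_message
  return false if candidate.public_post_count > MAX_PUBLIC_POSTS
  return false if candidate.post_age_seconds > MAX_POST_AGE
  return false if candidate.prior_scans >= MAX_SCANS
  # The first scan is not subject to the rescan rules.
  return true if candidate.prior_scans.zero?

  candidate.seconds_since_last_scan >= MIN_RESCAN_DELAY &&
    candidate.edit_char_diff > MIN_EDIT_DIFF
end
```

Ordering the cheap, user-level checks first means a post from an established user is rejected before any per-post scan history is consulted.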


495 Commits
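
Commit ddf2bf7034 describes generating embeddings in bulk as three steps (prepare text, HTTP call, save) run on a dedicated thread pool so that concurrency stays bounded. The commit uses `Concurrent::Promises::Future` from concurrent-ruby; the sketch below swaps in plain `Thread` and `Queue` from the standard library to stay self-contained, so it illustrates the bounded-concurrency idea rather than the actual code. `bulk_embed`, `pool_size`, and the block standing in for the HTTP call are all hypothetical names.

```ruby
# Hypothetical sketch: bounded-concurrency bulk embedding with stdlib threads.
# The block passed to bulk_embed stands in for the HTTP call that returns a vector.
def bulk_embed(texts, pool_size: 4, &fetch_vector)
  jobs = Queue.new
  texts.each_with_index { |text, index| jobs << [index, text] }
  jobs.close # once drained, a closed queue makes pop return nil

  results = Array.new(texts.size)
  store_lock = Mutex.new

  # A fixed number of workers caps how many requests are in flight at once,
  # no matter how large the batch is.
  workers = Array.new(pool_size) do
    Thread.new do
      while (job = jobs.pop)
        index, text = job
        prepared = text.strip                   # step 1: prepare text
        vector = fetch_vector.call(prepared)    # step 2: retrieve the vector
        store_lock.synchronize { results[index] = vector } # step 3: save
      end
    end
  end

  workers.each(&:join)
  results
end
```

A call such as `bulk_embed(posts, pool_size: 2) { |text| client.embed(text) }` (with `client` being whatever HTTP wrapper is at hand) would never issue more than two requests at a time, which matches the commit's stated goal of avoiding thousands of simultaneous HTTP requests on large batches.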