Since translations only need a single key returned, there is little point in using structured output. This PR also includes some prompt updates dealing with quotes, details, and code.
Related: #1502
This does mean reverting discourse/discourse-translator#257, but we can see how it goes.
Structured outputs are prone to formatting issues, especially around newlines
and custom pieces of text that need escaping.
This change avoids using them for automation reporting.
In particular, prior to this fix, o4-mini based reports were broken.
This commit
- normalizes locales like en_GB and other regional variants to their base locale (en). With this, the feature will not translate en_GB posts to en (or, similarly, pt_BR posts to pt_PT); see the sketch after this list
- consolidates the check for whether the feature is enabled into `DiscourseAi::Translation.enabled?`
- does the same for backfill in `DiscourseAi::Translation.backfill_enabled?`
- turns off backfill when `ai_translation_backfill_max_age_days` is 0, so the setting behaves as its name implies. Set it to a high number to backfill everything
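A minimal sketch of the normalization and the backfill guard; the `normalize_locale` helper and the module layout are assumptions for illustration, while `enabled?`, `backfill_enabled?`, and the site setting name come from this commit:

```ruby
module DiscourseAi
  module Translation
    # Hypothetical helper for illustration: reduce regional variants to their
    # base locale so en_GB and en (or pt_BR and pt_PT) are treated as the
    # same language and never translated into each other.
    def self.normalize_locale(locale)
      locale.to_s.split(/[_-]/).first.to_s.downcase
    end

    # Backfill is off when the max-age setting is 0; a large value effectively
    # backfills everything. Assumes `enabled?` holds the consolidated
    # feature check mentioned above.
    def self.backfill_enabled?
      enabled? && SiteSetting.ai_translation_backfill_max_age_days.to_i > 0
    end
  end
end
```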
This update fixes a regression from https://github.com/discourse/discourse-ai/pull/1484, which caused AI helper title suggestions to return numerous duplicate titles because the code looped through structured responses incorrectly.
* FIX: make AI helper more robust
- If the structured output JSON is malformed, fall back to a more forgiving parser
- Gemini 2.5 Flash does not support temperature; support opting out of it
- Evals for the assistant were broken; fix the interface
- Add some missing LLMs
- The translator was not mapped correctly to the feature; fix that
- Don't mix XML into the translator prompt
* lint
* correct logic
* simplify code
* implement best-effort JSON parsing directly in the structured output object (sketch below)
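A minimal sketch of the best-effort parsing idea, assuming a plain fallback that retries on the outermost JSON object; the plugin's structured output object may use a dedicated forgiving parser instead:

```ruby
require "json"

# Illustration only: strict parse first, then trim to the outermost JSON
# object and retry, returning nil if nothing parseable is found.
def best_effort_json_parse(raw)
  JSON.parse(raw)
rescue JSON::ParserError
  opening = raw.index("{")
  closing = raw.rindex("}")
  return nil if opening.nil? || closing.nil? || closing < opening

  begin
    JSON.parse(raw[opening..closing])
  rescue JSON::ParserError
    nil
  end
end

best_effort_json_parse("Sure! Here you go:\n{\"title\": \"Hello\"}\n")
# => { "title" => "Hello" }
```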
A more deterministic way of making sure the LLM detects the correct language (instead of relying on the prompt to tell the LLM to ignore unwanted content) is to take the cooked post and remove those elements before detection.
In this commit
- we remove quotes, image captions, etc. and only take the remaining text, falling back to the unmodified cooked when nothing is left (see the sketch after this list)
- and update prompts related to detection and translation
- /152465/12
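A minimal sketch of that stripping step, assuming Nokogiri and hypothetical CSS selectors; the elements the plugin actually removes may differ:

```ruby
require "nokogiri"

# Illustration only: drop quoted content, image captions/lightboxes and code
# before language detection, keeping the remaining text and falling back to
# the original cooked HTML when nothing is left.
def text_for_language_detection(cooked)
  fragment = Nokogiri::HTML5.fragment(cooked)
  fragment.css("aside.quote, blockquote, .lightbox-wrapper, img, pre, code").each(&:remove)
  text = fragment.text.strip
  text.empty? ? cooked : text
end
```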
Also renames the Mixtral tokenizer to Mistral.
See gem at github.com/discourse/discourse_ai-tokenizers
Co-authored-by: Roman Rizzi <roman@discourse.org>
Prior to this change we reused channels for proofreading progress and
AI helper progress.
The new changeset ensures each POST that streams progress gets a dedicated
message bus channel.
This fixes a class of issues where the wrong information could be displayed
to end users on subsequent proofreading or helper calls (see the sketch below).
* fix tests
* fix implementation (need to subscribe starting at message id 0)
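A minimal sketch of the per-request channel idea; channel naming, payload shape, and method names are assumptions, not the plugin's exact implementation:

```ruby
require "securerandom"

# Illustration only: every streaming POST gets its own MessageBus channel,
# so progress from one request can never show up in another.
def start_streaming(user)
  channel = "/discourse-ai/ai-helper/progress/#{SecureRandom.hex(16)}"

  MessageBus.publish(channel, { done: false, diff: "partial text…" }, user_ids: [user.id])
  MessageBus.publish(channel, { done: true }, user_ids: [user.id])

  # Returned to the client, which subscribes starting at message id 0 so it
  # does not miss anything published before its subscription was set up.
  { progress_channel: channel }
end
```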
When an invalid model is set for embeddings, topics do not load even when embeddings are disabled.
Error:
## RuntimeError in TopicsController#show
Invalid embeddings selected model
This commit checks for valid settings before attempting to load related topics.
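A minimal sketch of the guard, using hypothetical helper names to stand in for however the plugin resolves the configured embeddings model:

```ruby
# Illustration only: skip the related-topics lookup unless embeddings are
# enabled and the configured model resolves, instead of raising
# "Invalid embeddings selected model" while rendering the topic.
def load_related_topics(topic)
  return [] unless SiteSetting.ai_embeddings_enabled

  begin
    model = resolve_embeddings_model # hypothetical; raises on invalid settings
  rescue StandardError
    return []
  end

  find_related_topics(topic, model) # hypothetical query helper
end
```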
* FIX: normalize keys in structured output
Previously we did not validate the hash passed in to structured
outputs, which could be either string keyed or symbol keyed.
In particular, this broke structured outputs for Gemini in some cases.
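A minimal sketch of the normalization; the plugin's structured output class may implement this differently:

```ruby
# Illustration only: accept either string or symbol keys by recursively
# symbolizing the hash before it is used.
def normalize_keys(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k.to_sym] = normalize_keys(v) }
  when Array
    value.map { |v| normalize_keys(v) }
  else
    value
  end
end

normalize_keys({ "title" => "Hello", tags: [{ "name" => "ai" }] })
# => { title: "Hello", tags: [{ name: "ai" }] }
```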
* comment out flaky spec
- allows features to have multiple LLMs and multiple personas
- sorts the module list
- adds Bot as a first-class module
- fixes an issue where the search module was always configured
- adds some tests
- Add support for `chain.streamCustomRaw(text)`, which can be used to stream text from a JS tool directly to the composer
- Add support for LLM params in `llm.generate`, which unlocks things like structured outputs
- Add `discourse.createStagedUser`, `discourse.createTopic` and `discourse.createPost` for content creation
* FIX: A typo in bot filtration in ai-bot-header-icon
* FIX: Show header icon when there's only one persona with a default LLM set
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
In discourse/discourse-translator#249 we introduced splitting content (post.raw) prior to sending it for translation, as we were using a sync API.
Now that we're streaming thanks to #1424, we chunk based on `LlmModel.max_output_tokens`.
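A minimal sketch of the chunking, assuming a paragraph-based split and a tokenizer with a `size` method; the names here are assumptions rather than the plugin's exact code:

```ruby
# Illustration only: grow each chunk paragraph by paragraph while it stays
# under the model's max_output_tokens, then start a new chunk.
def chunk_for_translation(raw, llm_model, tokenizer)
  limit = llm_model.max_output_tokens
  chunks = [""]

  raw.split(/\n{2,}/).each do |paragraph|
    candidate = chunks.last.empty? ? paragraph : "#{chunks.last}\n\n#{paragraph}"

    if tokenizer.size(candidate) <= limit || chunks.last.empty?
      chunks[-1] = candidate
    else
      chunks << paragraph
    end
  end

  chunks.reject(&:empty?)
end
```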
The new implementation uses core's concurrent job queue, which is more
robust and predictable than the one shipped in Concurrent (a generic sketch of the pattern follows these notes).
Additionally:
- Trickles through updates during bulk classification
- Reports errors if we fail during a bulk classification
* push concurrency down to 40. 100 feels quite high.
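A generic sketch of the bounded-concurrency pattern; this is not core's actual job queue, and `classify` and `report_progress` are hypothetical callbacks:

```ruby
# Illustration only: classify items with a fixed number of worker threads,
# report progress as results trickle in, and collect errors instead of
# aborting the whole bulk run.
def classify_concurrently(items, concurrency: 40)
  queue = Queue.new
  items.each { |item| queue << item }
  errors = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        item = begin
          queue.pop(true)
        rescue ThreadError
          break
        end

        begin
          classify(item)        # hypothetical per-item classification
          report_progress(item) # hypothetical progress/trickle callback
        rescue StandardError => e
          errors << [item, e]
        end
      end
    end
  end

  workers.each(&:join)
  Array.new(errors.size) { errors.pop }
end
```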
## 🔍 Overview
This update fixes an issue where message bus streaming-related specs
were not working correctly. To do so, we pass the `last_id` when
subscribing to `MessageBus`, which allows us to unskip those broken
tests.
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>