This index speeds up queries that join the `topics` table against the `ai_topics_embeddings` table on the `topic_id` column. A number of queries also filter on `ai_topics_embeddings.model_id`, so we include that column in the index as well.
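A minimal sketch of what such a migration could look like, assuming standard Rails conventions; the index name and exact options are illustrative, not the shipped migration:

```ruby
# frozen_string_literal: true

class AddTopicModelIndexToAiTopicsEmbeddings < ActiveRecord::Migration[7.0]
  def change
    # Covers the join on topic_id and the common filter on model_id.
    add_index :ai_topics_embeddings,
              %i[topic_id model_id],
              name: "index_ai_topics_embeddings_on_topic_and_model"
  end
end
```

Putting `topic_id` first keeps the index usable for joins that do not filter on `model_id` at all.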
We enforced a hard limit of 700 tokens in this script, which is not enough for thinking models, which can quickly use all of them.
A temporary solution could be bumping the limit, but there is no guarantee we won't hit it again, and it's hard to find one value that fits all scenarios. Another alternative could be removing the limit and relying on the LLM config's `max_output_token`, but if you want different rules with different limits, you are forced to duplicate the config each time.
Considering all this, we are adding a dedicated field for the limit to the triage script, giving you an easy way to tweak it to your needs. If left empty, no limit is applied.
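A minimal sketch of the intended resolution, with hypothetical names: the script's own field wins when present, and an empty field means no cap at all.

```ruby
# Hypothetical helper: returns nil (no limit) when the triage script's
# max-tokens field is left empty.
def triage_max_tokens(field_value)
  value = field_value.to_s.strip
  value.empty? ? nil : value.to_i
end
```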
We're seeing some LLMs use 65,000+ tokens to translate raw text that is only 10-1000 characters long.
This PR passes a `max_tokens` value to the LLM API for each translation, sized according to the length of the text.
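A minimal sketch of a length-based cap; the tokens-per-character ratio, multiplier, headroom, and ceiling below are hypothetical, not the values the PR ships.

```ruby
# Size the completion budget from the input: translations rarely need
# orders of magnitude more tokens than the source text.
def translation_max_tokens(text)
  input_tokens = (text.length / 4.0).ceil # ~4 characters per token, rough rule of thumb
  [input_tokens * 4 + 100, 8_000].min     # multiplier, headroom, and ceiling are assumptions
end
```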
Since we use the value for `days.ago` and are seeing `PG::DatetimeFieldOverflow: ERROR: timestamp out of range: "5473790-07-13 08:43:28.497823 BC"`, we now set min/max limits on the site setting.
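A minimal sketch of the idea, with a hypothetical bound: clamp the value before it reaches `days.ago`, since PostgreSQL timestamps cannot represent dates that far in the past.

```ruby
require "active_support/all"

MAX_AGE_DAYS = 10_000 # hypothetical upper bound

def safe_cutoff(setting_value)
  setting_value.to_i.clamp(1, MAX_AGE_DAYS).days.ago
end
```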
Since translations only require a single key back, there is little point in using structured output. This PR also includes some prompt updates dealing with quotes, details, and code.
Related: #1502
This does mean reverting discourse/discourse-translator#257, but we can see how it goes.
Structured outputs are prone to formatting issues, especially around newlines
and custom pieces of text that need escaping.
This change avoids using them for the automation reporting.
Before this fix, o4-mini-based reports in particular were broken.
Try to fix a flaky spec in /ai_bot/homepage_spec.rb by using Ember data
rather than inspecting the DOM directly to see whether there
are any in-progress uploads.
Also adds the missing translation for the in-progress uploads warning.
This commit
- normalizes locales like en_GB and variants to en. With this, the feature will not translate en_GB posts to en (or, similarly, pt_BR to pt_PT); see the sketch after this list
- consolidates whether the feature is enabled in `DiscourseAi::Translation.enabled?`
- similarly for backfill in `DiscourseAi::Translation.backfill_enabled?`
- turns off backfill when `ai_translation_backfill_max_age_days` is 0, keeping the setting true to its name. Set it to a high number to backfill everything
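A minimal sketch of the locale normalization; the real helper lives inside `DiscourseAi::Translation` and may handle more edge cases.

```ruby
def normalize_locale(locale)
  locale.to_s.split(/[_-]/).first.to_s.downcase
end

normalize_locale("en_GB") # => "en"
normalize_locale("pt_BR") # => "pt"
```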
This update ensures that the `topic_id` related to an error during summarization is surfaced in the logs, which should help track down the cause of those errors.
This update fixes a regression from https://github.com/discourse/discourse-ai/pull/1484, which caused AI helper title suggestions to produce numerous non-unique titles because the code looped through structured responses incorrectly.
* FIX: make AI helper more robust
- If JSON from structured output is broken, lean on a more forgiving parser
- Gemini 2.5 Flash does not support temperature; support opting out
- Evals for the assistant were broken; fix the interface
- Add some missing LLMs
- The translator was not mapped correctly to the feature; fix that
- Don't mix XML into the prompt for the translator
* lint
* correct logic
* simplify code
* implement best-effort JSON parsing directly in the structured output object (a sketch follows)
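A minimal sketch of the best-effort approach, assuming the common failure mode is prose or markdown fences wrapped around the JSON object:

```ruby
require "json"

def best_effort_parse(raw)
  JSON.parse(raw)
rescue JSON::ParserError
  # Retry on just the outermost object, dropping any surrounding noise.
  candidate = raw[/\{.*\}/m]
  begin
    candidate ? JSON.parse(candidate) : nil
  rescue JSON::ParserError
    nil
  end
end
```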
A more deterministic way of making sure the LLM detects the correct language (instead of relying on the prompt to tell the LLM to ignore unwanted content) is to take the cooked and remove those elements ourselves.
In this commit
- we remove quotes, image captions, etc., and only take the remaining text, falling back to the unadulterated cooked (see the sketch after this list)
- and update prompts related to detection and translation
- /152465/12
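A minimal sketch of the cleanup step; the selectors are assumptions about Discourse's cooked markup, not the exact list the commit strips.

```ruby
require "nokogiri"

def language_detection_text(cooked)
  fragment = Nokogiri::HTML.fragment(cooked)
  fragment.css("aside.quote, img, .lightbox-wrapper, pre, code").each(&:remove)
  text = fragment.text.strip
  text.empty? ? cooked : text # fall back to the unadulterated cooked
end
```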
Also renames the Mixtral tokenizer to Mistral.
See the gem at github.com/discourse/discourse_ai-tokenizers
Co-authored-by: Roman Rizzi <roman@discourse.org>
Prior to this change, we reused channels for proofreading progress and
AI helper progress.
The new changeset ensures each POST that streams progress gets a dedicated
message bus channel.
This fixes a class of issues where the wrong information could be displayed
to end users on subsequent proofreading or helper calls.
* fix tests
* fix implementation (must subscribe starting at message id 0)
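A minimal sketch with hypothetical names, assuming the message_bus gem: each POST mints its own channel once, and all progress for that request is published there.

```ruby
require "securerandom"

def new_progress_channel
  "/discourse-ai/ai-helper/stream/#{SecureRandom.hex(8)}"
end

def publish_progress(bus, channel, partial_text, done: false)
  bus.publish(channel, { done: done, result: partial_text })
end
```

The client receives the channel name in the POST response and subscribes starting at message id 0, so no messages published before the subscription are missed.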
Introducing a typo isn't the right way to bypass the check that blocks the term "private messages".
Nice try, though ;)
I changed it to "personal messages".
When an invalid model is set for embeddings, topics do not load even if embeddings is disabled.
Error:
## RuntimeError in TopicsController#show
Invalid embeddings selected model
This commit checks for valid settings before attempting to load related topics.
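A minimal sketch of the guard, with hypothetical names: verify the settings resolve to a valid model before attempting to load related topics, rather than raising inside `TopicsController#show`.

```ruby
def related_topics_available?(embeddings_enabled, selected_model, valid_models)
  embeddings_enabled && valid_models.include?(selected_model)
end
```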
* FIX: normalize keys in structured output
Previously we did not validate the hash passed in to structured
outputs, which could be either string-keyed or symbol-keyed.
Specifically, this broke structured outputs for Gemini in some cases.
* comment out flake
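A minimal sketch of the normalization, assuming we settle on symbol keys; the real code may normalize recursively or in the other direction.

```ruby
def normalize_keys(schema)
  schema.transform_keys(&:to_sym)
end

normalize_keys("type" => "object") # => { type: "object" }
normalize_keys(type: "object")     # => { type: "object" }
```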
- allows features to have multiple llms and multiple personas
- sorts module list
- adds Bot as a first class module
- fixes issue where search module was always configured
- some tests
- Add support for `chain.streamCustomRaw(text)`, which can be used to stream text from a JS tool directly to the composer
- Add support for LLM params in `llm.generate`, which unlocks things like structured outputs
- Add discourse.createStagedUser, discourse.createTopic and discourse.createPost - for content creation
* FIX: A typo in bot filtering in ai-bot-header-icon
* FIX: Show header icon when there's only one persona with a default LLM set
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>