A more deterministic way to make sure the LLM detects the correct language, instead of relying on the prompt to tell the LLM to ignore unwanted content, is to take the cooked HTML and strip the unwanted elements before detection.
In this commit:
- we remove quotes, image captions, etc. and use only the remaining text, falling back to the unadulterated cooked HTML
- we update the prompts related to detection and translation
- /152465/12
Related: https://github.com/discourse/discourse-translator/pull/310
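
The stripping step described above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the plugin's actual implementation. The tag set in `SKIP_TAGS`, the function name `text_for_detection`, and the rule "fall back when nothing is left" are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical set of elements to drop before language detection.
# The commit mentions quotes and image captions; the exact selectors
# used by the plugin are not shown here.
SKIP_TAGS = {"aside", "blockquote", "figcaption"}


class _TextExtractor(HTMLParser):
    """Collects text that is not nested inside any SKIP_TAGS element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # number of currently open skipped elements
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def text_for_detection(cooked: str) -> str:
    """Return the cooked HTML's own text, or the raw cooked HTML if none is left."""
    parser = _TextExtractor()
    parser.feed(cooked)
    parser.close()
    # Collapse runs of whitespace so the detector sees clean text.
    text = " ".join("".join(parser.parts).split())
    # Assumed fallback rule: use the unmodified cooked HTML when stripping
    # removed everything (for example, a post that is only a quote).
    return text if text else cooked
```

With this sketch, a post that quotes French text but is itself written in English would hand only the English text to the detector, while a quote-only post is passed through unchanged.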
This commit includes all the jobs and event hooks to localize posts, topics, and categories.
A few notes:
- `feature_name: "translation"` is used because the site setting is `ai-translation` and the module is `Translation`
- we will switch to a proper ai-feature in the near future, and can consider using the persona_user as `localization.localizer_user_id`
- things are kept flat within the module for now, as we will be moving to ai-feature soon and will have to rearrange then
- Settings renamed/introduced are:
  - `ai_translation_backfill_rate` (0)
  - `ai_translation_backfill_limit_to_public_content` (true)
  - `ai_translation_backfill_max_age_days` (5)
  - `ai_translation_backfill_verbose_logs` (false)
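
To illustrate how the backfill settings might interact, here is a small Python sketch. It is not taken from the plugin. It assumes that the values in parentheses are defaults, that the rate is the maximum number of items picked per run with `0` disabling the backfill, and that the age limit applies to the creation date. The `Post` type and `select_for_backfill` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class BackfillSettings:
    # Values mirror the numbers listed above, assumed to be defaults.
    rate: int = 0
    limit_to_public_content: bool = True
    max_age_days: int = 5
    verbose_logs: bool = False


@dataclass
class Post:
    # Hypothetical minimal stand-in for a localizable record.
    id: int
    created_at: datetime
    is_public: bool


def select_for_backfill(posts: List[Post], settings: BackfillSettings, now: datetime) -> List[Post]:
    """Pick the posts one backfill run would process under the assumed semantics."""
    if settings.rate <= 0:
        return []  # assumed: a rate of 0 turns the backfill off
    cutoff = now - timedelta(days=settings.max_age_days)
    eligible = [
        p for p in posts
        if p.created_at >= cutoff
        and (p.is_public or not settings.limit_to_public_content)
    ]
    # Newest first, capped at the per-run rate.
    eligible.sort(key=lambda p: p.created_at, reverse=True)
    selected = eligible[: settings.rate]
    if settings.verbose_logs:
        print(f"backfill: {len(selected)} of {len(posts)} posts selected")
    return selected
```

Under these assumptions the listed values leave the backfill inactive until the rate is raised above zero.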