Before this change, we let you set the selected embeddings model back to a blank value (" ") even with embeddings enabled, which would leave the site in a broken state.
Additionally, this adds a fail-safe for these scenarios to avoid errors on the topics page.
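For illustration, a minimal sketch of the kind of guard involved; the setting and model names here are assumptions, not the actual plugin code:
```
# Hedged sketch: bail out of embeddings work when no model is selected,
# instead of raising and breaking pages such as the topics list.
def selected_embedding_model
  id = SiteSetting.ai_embeddings_selected_model # assumed setting name
  return nil if id.blank?

  EmbeddingDefinition.find_by(id: id) # assumed ActiveRecord model
end
```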
We have removed this flag in core; all plugins now use the "top mode" for their navigation. The core change is backwards-compatible while we remove the remaining usage from plugins.
This change fixes two different problems.
First, we add a data migration to migrate the configuration of sites using OpenAI's embedding model. There was a window between the embedding config changes and #1087 where sites could end up in a broken state due to an unconfigured selected model setting, as reported on https://meta.discourse.org/t/-/348964
The second fix drops the pre-seeded search indexes of the models we didn't migrate and corrects the ones whose dimensions don't match. Since the index uses the model ID, new embedding configs could end up using one of these indexes even when the dimensions no longer match.
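To make the second fix concrete, here is a hedged sketch of the kind of migration involved; the table/index naming and the placeholder IDs are assumptions, not the actual migration:
```
# Hypothetical migration sketch: drop search indexes left behind by seeded
# embedding models we didn't migrate, so a new config reusing one of those
# model IDs can't pick up an index with mismatched dimensions.
class CleanupOrphanedEmbeddingIndexes < ActiveRecord::Migration[7.2]
  def up
    unmigrated_seeded_ids = [-1, -2] # placeholder IDs, for illustration only

    unmigrated_seeded_ids.each do |id|
      # assumed naming convention: one search index per embedding model ID
      execute "DROP INDEX IF EXISTS ai_topics_embeddings_#{id}_search_bit"
    end
  end

  def down
    raise ActiveRecord::IrreversibleMigration
  end
end
```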
This should give us a better idea of how our scanner is faring across sites.
```
# HELP discourse_discourse_ai_spam_detection AI spam scanning statistics
# TYPE discourse_discourse_ai_spam_detection counter
discourse_discourse_ai_spam_detection{db="default",type="scanned"} 16
discourse_discourse_ai_spam_detection{db="default",type="is_spam"} 7
discourse_discourse_ai_spam_detection{db="default",type="false_positive"} 1
discourse_discourse_ai_spam_detection{db="default",type="false_negative"} 2
```
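For context, a rough sketch of what each counter type measures; the model and column names (AiSpamLog, disagreed_at, missed) are assumptions about the spam log schema, not the plugin's actual code:
```
# Illustrative only: how the counter types map to scanning outcomes.
scanned        = AiSpamLog.count                                   # posts the scanner looked at
is_spam        = AiSpamLog.where(is_spam: true).count              # posts judged as spam
# staff disagreed with a spam verdict => false positive
false_positive = AiSpamLog.where(is_spam: true).where.not(disagreed_at: nil).count
# spam the scanner missed and staff flagged later (assumed column)
false_negative = AiSpamLog.where(is_spam: false, missed: true).count
```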
We have a flag to signal that we are shortening a model's embeddings.
It is currently only used with OpenAI's text-embedding-3-* models, but we plan to use it for other services.
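As a rough illustration (the attribute names on the embedding definition are assumptions, not the plugin's exact schema), the flag boils down to whether we ask the provider for truncated embeddings:
```
# Hedged sketch: OpenAI's text-embedding-3-* models accept a `dimensions`
# parameter that returns a shortened embedding; the flag controls whether we send it.
def embedding_request_payload(embedding_def, text)
  payload = { model: embedding_def.provider_model_name, input: text }
  payload[:dimensions] = embedding_def.dimensions if embedding_def.shortened_dimensions?
  payload
end
```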
* Use AR model for embeddings features (see the sketch after this list)
* Endpoints
* Embeddings CRUD UI
* Add presets. Hide a couple more settings
* System specs
* Seed embedding definition from old settings
* Generate search bit index on the fly. Clean up orphaned data
* Support for seeded models
* Fix run test for new embedding
* Fix selected model not set correctly
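For the first item, a hedged sketch of what an ActiveRecord-backed embedding definition might look like; the class, table, and column names are assumptions, not the actual schema:
```
# Hypothetical AR model replacing the pile of embeddings site settings.
class EmbeddingDefinition < ActiveRecord::Base
  self.table_name = "embedding_definitions" # assumed table name

  validates :display_name, :provider, :url, presence: true
  validates :dimensions, numericality: { only_integer: true, greater_than: 0 }
end
```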
This adds registration IP, last known IP, and email to the scanning context.
This provides additional hints for the spam scanner about possibly malicious users:
for example, a user registered in India but replying from Australia, or
an email address that is clearly a throwaway.
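Roughly, the extra context amounts to something like the following (field names are assumptions; the actual prompt formatting differs):
```
# Hedged sketch of the extra user hints passed along with the post content.
def user_scanning_context(user)
  {
    email: user.email,
    registration_ip: user.registration_ip_address&.to_s,
    last_known_ip: user.ip_address&.to_s,
  }
end
```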
To quickly select backfill candidates without comparing SHAs, we compare the last summarized post to the topic's highest_post_number. However, hiding or deleting a post and adding a small action will update this column, causing the job to stall and re-generate the same summary repeatedly until someone posts a regular reply. On top of that, this is not always true for topics using `best_replies`, as the last reply isn't necessarily included.
Since this is not evident at first glance and each summarization strategy picks its targets differently, I'm opting to simplify the backfill logic and how we track potential candidates.
The first step is dropping `content_range`, which serves no purpose; it's only there because summary caching was supposed to work differently at the beginning. Instead, I'm replacing it with a column called `highest_target_number`, which tracks `highest_post_number` for topics and could track other things, like a channel's `message_count`, in the future.
Now that we have this column, when selecting potential backfill candidates we check whether the summary is truly outdated by comparing the SHAs; if it's not, we just update the column and move on.
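A rough sketch of the simplified flow; the model, column, and helper names are assumptions standing in for the actual implementation:
```
# Hedged sketch: cheap candidate selection via highest_target_number, then a
# SHA comparison decides between updating the column and re-summarizing.
def backfill(summary, topic)
  return if summary.highest_target_number >= topic.highest_post_number

  if summary.original_content_sha == content_sha_for(topic) # content unchanged
    # The bump came from something like a small action; record it and move on.
    summary.update!(highest_target_number: topic.highest_post_number)
  else
    regenerate_summary!(summary, topic) # hypothetical helper
  end
end
```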
When enabling the spam scanner there may be old, unscanned posts.
This can create a risky situation where the scanner operates on
legitimate posts and false positives hide long-standing content.
To keep this a lot safer, we no longer try to hide old posts by
suspected spammers.
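As a sketch of the safety check (the age threshold and method names are assumptions, not the actual implementation):
```
# Hedged sketch: skip the auto-hide step for old posts, so a false positive
# can't take down content that has been live for a long time.
OLD_POST_CUTOFF = 1.day

def hide_if_spam(post)
  return if post.created_at < OLD_POST_CUTOFF.ago # old post: don't auto-hide
  post.hide!(PostActionType.types[:spam])
end
```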
This update fixes an issue where the composer helper menu was not being shown on tablets in desktop mode. Updating the `z-index` to use the modal-dialog case is more appropriate here.
Adds a comprehensive quota management system for LLM models that allows:
- Setting per-group (applied per user in the group) token and usage limits with configurable durations
- Tracking and enforcing token/usage limits across user groups
- Quota reset periods (hourly, daily, weekly, or custom)
- Admin UI for managing quotas with real-time updates
This system provides granular control over LLM API usage by allowing admins
to define limits on both total tokens and number of requests per group.
It supports multiple concurrent quotas per model and automatically handles
quota resets.
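A hedged sketch of the shape of the quota check (class and column names here are assumptions, not the shipped schema):
```
# Hypothetical per-group quota model: a request is allowed only if the group's
# token total and request count within the rolling window are under the limits.
class LlmQuota < ActiveRecord::Base
  belongs_to :group
  belongs_to :llm_model

  # assumed columns: max_tokens, max_usages, duration_seconds
  def exceeded_for?(user)
    window = LlmQuotaUsage.where(user: user, llm_quota: self)
                          .where("created_at > ?", duration_seconds.seconds.ago)
    window.sum(:total_tokens) >= max_tokens || window.count >= max_usages
  end
end
```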
Co-authored-by: Keegan George <kgeorge13@gmail.com>
Disabling streaming is required for models such as o1 that do not have streaming enabled yet.
It is also good to have this option around in case various APIs decide not to support streaming endpoints, so Discourse AI can continue to work just as it did before.
Also: fixes an issue where sharing artifacts would miss the viewport, leading to tiny artifacts on mobile.
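Roughly, the streaming switch described above looks like this sketch; the flag name and helper methods are assumptions rather than the exact implementation:
```
# Hedged sketch: with streaming disabled we make one blocking request and
# deliver the whole completion, instead of consuming partial SSE chunks.
def perform_completion(llm_model, payload, &blk)
  if llm_model.disable_streaming
    response = blocking_request(payload.merge(stream: false)) # hypothetical helper
    blk.call(extract_completion_text(response))               # hypothetical helper
  else
    streaming_request(payload.merge(stream: true)) { |partial| blk.call(partial) }
  end
end
```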
This update adds some structure for handling errors in the spam config while also handling a specific error related to the spam scanning user not being an admin account.
The seeded LLM setting `SiteSetting.ai_spam_detection_model_allowed_seeded_models` returns a _string_ with IDs separated by pipes. Running `map` on it returns an array of strings. We were previously checking for the ID with the custom prefix identifier, but we should instead be checking the stringified ID.
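In code terms, the corrected check is roughly the following (the wrapper method is illustrative, not the actual code):
```
# The setting is a pipe-delimited string of seeded model IDs, so compare
# against the stringified ID rather than a prefixed identifier.
def seeded_model_allowed?(llm_model)
  allowed_ids = SiteSetting.ai_spam_detection_model_allowed_seeded_models.split("|")
  allowed_ids.include?(llm_model.id.to_s)
end
```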