discourse-ai/db/migrate
Sam 47f5da7e42
FEATURE: Add AI-powered spam detection for new user posts (#1004)
This introduces a spam detection system that uses LLMs to automatically
identify and flag potential spam posts. The system is configurable and
includes safeguards intended to limit false positives.

Key Features:
* Automatically scans first 3 posts from new users (TL0/TL1)
* Creates dedicated AI flagging user to distinguish from system flags
* Tracks false positives/negatives for quality monitoring
* Supports custom instructions to fine-tune detection
* Includes test interface for trying detection on any post

Technical Implementation:
* New database tables:
  - ai_spam_logs: Stores scan history and results
  - ai_moderation_settings: Stores LLM config and custom instructions
* Rate limiting and safeguards:
  - Minimum 10-minute delay between rescans
  - Only scans significant edits (>10 char difference)
  - Maximum 3 scans per post
  - 24-hour maximum age for scannable posts
* Admin UI features:
  - Real-time testing capabilities
  - 7-day statistics dashboard
  - Configurable LLM model selection
  - Custom instruction support

Security and Performance:
* Respects trust levels - only scans TL0/TL1 users
* Skips private messages entirely
* Stops scanning users after 3 successful public posts
* Includes comprehensive test coverage
* Maintains audit log of all scan attempts
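
The rate limits and safeguards listed above can be read as a single eligibility check. The following is a minimal sketch under stated assumptions, not the plugin's actual code: `PostInfo`, `scannable?`, and the constant names are hypothetical, and the "significant edit" rule is approximated here by comparing post lengths, which may differ from how the real implementation measures the difference.

```ruby
# Hypothetical sketch of the scan eligibility rules described in the commit
# message. None of these names are taken from the discourse-ai plugin.

MIN_RESCAN_INTERVAL = 10 * 60       # seconds: minimum 10-minute delay between rescans
MIN_EDIT_DELTA      = 10            # characters: an edit must differ by more than this
MAX_SCANS_PER_POST  = 3             # maximum 3 scans per post
MAX_POST_AGE        = 24 * 60 * 60  # seconds: 24-hour maximum age for scannable posts
MAX_TRUST_LEVEL     = 1             # only TL0/TL1 authors are scanned

# Illustrative value object standing in for a post and its scan history.
PostInfo = Struct.new(
  :trust_level, :private_message, :created_at,
  :scan_count, :last_scanned_at, :last_scanned_length, :current_length,
  keyword_init: true
)

# Returns true when the post may be (re)scanned at time `now`.
def scannable?(post, now: Time.now)
  return false if post.trust_level > MAX_TRUST_LEVEL      # respects trust levels
  return false if post.private_message                    # skips private messages
  return false if now - post.created_at > MAX_POST_AGE    # too old to scan
  return true  if post.scan_count.zero?                   # first scan needs no edit check
  return false if post.scan_count >= MAX_SCANS_PER_POST   # scan budget exhausted
  return false if now - post.last_scanned_at < MIN_RESCAN_INTERVAL

  # Assumption: "significant edit" is approximated by the change in length.
  (post.current_length - post.last_scanned_length).abs > MIN_EDIT_DELTA
end
```

Under these assumptions, a post scanned once 15 minutes ago and since edited by 50 characters qualifies for a rescan, while the same post edited by only 5 characters does not. The per-user limit (stop after 3 successful public posts) is not modeled in this sketch.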


---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2024-12-12 09:17:25 +11:00
20230224165056_create_classification_results_table.rb DEV: Dedicated table for saving classification results (#1) 2023-02-27 16:21:40 -03:00
20230307125342_created_model_accuracy_table.rb FEATURE: Use dedicated reviewables for AI flags. (#4) 2023-03-07 15:39:28 -03:00
20230314184514_migrate_discourse_ai_reviewables.rb DEV: Promote historical post-deploy migrations (#728) 2024-07-30 01:44:57 +08:00
20230316160714_create_completion_prompt_table.rb REFACTOR: Store prompts in a dedicated table. (#14) 2023-03-17 15:14:19 -03:00
20230320122645_delete_duplicated_seeded_prompts.rb FIX: Remove old seeded prompts. (#17) 2023-03-20 12:55:38 -03:00
20230320185619_multi_message_completion_prompts.rb FIX: Allow null messages to migrate existing rows (#22) 2023-03-21 12:33:30 -03:00
20230320191928_drop_completion_prompt_value.rb DEV: Promote historical post-deploy migrations (#728) 2024-07-30 01:44:57 +08:00
20230322142028_make_dropped_value_column_nullable.rb FIX: Remove `null: false` from dropped column (#24) 2023-03-22 11:40:20 -03:00
20230406135943_add_provider_to_completion_prompts.rb FEATURE: Anthropic Claude for AIHelper and Summarization modules (#39) 2023-04-10 11:04:42 -03:00
20230424055354_create_ai_api_audit_logs.rb FEATURE: add a table to audit OpenAI usage (#45) 2023-04-26 11:44:29 +10:00
20230519003106_post_custom_prompts.rb FEATURE: add support for GPT <-> Forum integration 2023-05-20 17:45:54 +10:00
20230710171141_enable_pg_vector_extension.rb FEATURE: Support for locally infered embeddings in 100 languages (#115) 2023-07-27 15:50:03 -03:00
20230710171142_create_ai_topic_embeddings_table.rb DEV: Migrations shouldn't rely on the app (#253) 2023-10-16 18:50:37 -03:00
20230710171143_migrate_embeddings_from_dedicated_database.rb DEV: Update rubocop-discourse to version 3.8.0 (#641) 2024-05-28 11:15:42 +02:00
20230727170222_create_multilingual_topic_embeddings_table.rb DEV: Migrations shouldn't rely on the app (#253) 2023-10-16 18:50:37 -03:00
20230831033812_rename_ai_helper_add_ai_pm_to_header_setting.rb DEV: rename ai_helper_add_ai_pm_to_header -> ai_bot_add_to_header (#177) 2023-08-31 14:42:28 +10:00
20231003155701_create_bge_topic_embeddings_table.rb DEV: Migrations shouldn't rely on the app (#253) 2023-10-16 18:50:37 -03:00
20231031050538_add_topic_id_post_id_to_ai_audit_log.rb FEATURE: support topic_id and post_id logging in ai audit log (#274) 2023-11-01 08:41:31 +11:00
20231109011155_create_ai_personas.rb FEATURE: basic infrastructure for custom personas (#288) 2023-11-10 11:39:49 +11:00
20231117050928_add_system_and_priority_to_ai_personas.rb FEATURE: UI to update ai personas on admin page (#290) 2023-11-21 16:56:43 +11:00
20231120033747_remove_site_settings.rb FEATURE: UI to update ai personas on admin page (#290) 2023-11-21 16:56:43 +11:00
20231123224203_switch_to_generic_completion_prompts.rb DEV: Promote historical post-deploy migrations (#728) 2024-07-30 01:44:57 +08:00
20231128151234_recreate_generate_titles_prompt.rb DEV: Promote historical post-deploy migrations (#728) 2024-07-30 01:44:57 +08:00
20231202013850_convert_ai_personas_commands_to_json.rb DEV: Fix various typos (#434) 2024-01-19 12:51:26 +01:00
20231227223301_create_gemini_topic_embeddings_table.rb FEATURE: Support for Gemini Embeddings (#382) 2023-12-28 10:28:01 -03:00
20231228213036_create_ai_post_embeddings_tables.rb FEATURE: Per post embeddings (#387) 2023-12-29 12:28:45 -03:00
20240104013944_add_params_to_completion_prompt.rb FIX: AI helper not working correctly with mixtral (#399) 2024-01-04 09:53:47 -03:00
20240126013358_create_openai_text_embedding_tables.rb FEATURE: add support for new OpenAI embedding models (#445) 2024-01-29 13:24:30 -03:00
20240202010752_add_temperature_top_p_to_ai_personas.rb FEATURE: allow personas to supply top_p and temperature params (#459) 2024-02-03 07:09:34 +11:00
20240209044519_add_user_id_mentionable_default_llm_to_ai_personas.rb FEATURE: mentionable personas and random picker tool, context limits (#466) 2024-02-15 16:37:59 +11:00
20240213051213_add_limits_to_ai_persona.rb FEATURE: mentionable personas and random picker tool, context limits (#466) 2024-02-15 16:37:59 +11:00
20240309034751_add_shared_ai_conversations.rb FEATURE: Share conversations with AI via a URL (#521) 2024-03-12 16:51:41 +11:00
20240309034752_create_rag_document_fragment_table.rb FEATURE: AI Bot RAG support. (#537) 2024-04-01 13:43:34 -03:00
20240313165121_embedding_tables_for_rag_uploads.rb FEATURE: AI Bot RAG support. (#537) 2024-04-01 13:43:34 -03:00
20240322035907_add_images_to_ai_personas.rb FEATURE: Add vision support to AI personas (Claude 3) (#546) 2024-03-27 14:30:11 +11:00
20240404000838_add_metadata_to_rag_document_frament.rb FEATURE: Add metadata support for RAG (#553) 2024-04-04 11:02:16 -03:00
20240409035951_add_rag_params_to_ai_persona.rb FEATURE: allow tuning of RAG generation (#565) 2024-04-12 10:32:46 -03:00
20240410170000_add_embeddings_tablesfor_bge_m3.rb FEATURE: Add BGE-M3 embeddings support (#569) 2024-04-10 17:24:01 -03:00
20240424220101_add_auto_image_caption_to_user_options.rb FEATURE: Auto image captions (#637) 2024-05-27 10:49:24 -07:00
20240429065155_add_consolidated_question_llm_to_ai_persona.rb FEATURE: Add Question Consolidator for robust Upload support in Personas (#596) 2024-04-30 13:49:21 +10:00
20240503034946_add_allow_chat_to_ai_persona.rb FEATURE: support Chat with AI Persona via a DM (#488) 2024-05-06 09:49:02 +10:00
20240503042558_add_chat_message_custom_prompt.rb FEATURE: support Chat with AI Persona via a DM (#488) 2024-05-06 09:49:02 +10:00
20240504222307_create_llm_model_table.rb FEATURE: Configurable LLMs. (#606) 2024-05-13 12:46:42 -03:00
20240514001334_add_feature_name_to_ai_api_audit_log.rb FEATURE: GPT4o support and better auditing (#618) 2024-05-14 13:28:46 +10:00
20240514171609_add_endpoint_data_to_llm_model.rb FEATURE: Set endpoint credentials directly from LlmModel. (#625) 2024-05-16 09:50:22 -03:00
20240527054218_add_language_model_to_ai_audit_logs.rb FEATURE: improve logging by including llm name (#640) 2024-05-27 16:46:01 +10:00
20240528132059_add_companion_user_to_llm_model.rb DEV: Rewire AI bot internals to use LlmModel (#638) 2024-06-18 14:32:14 -03:00
20240606151348_create_ai_summaries_table.rb DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
20240609061418_tool_details_and_command_removal.rb FIX: do not mark column read only so certain deployments work (#663) 2024-06-11 21:32:49 +10:00
20240611170904_upgrade_pgvector_070.rb DEV: Move to single table per embeddings type (#561) 2024-08-08 11:55:20 -03:00
20240611170905_move_embeddings_to_single_table_per_type.rb FEATURE: Index embeddings using bit vectors (#824) 2024-10-14 13:26:03 -03:00
20240618080148_create_ai_tools.rb FEATURE: custom user defined tools (#677) 2024-06-27 17:27:40 +10:00
20240624135356_llm_model_custom_params.rb DEV: Use Rails 7.0 instead of 7.1 in migrations 2024-06-26 18:32:11 +02:00
20240704020102_reset_identity_on_ai_summary.rb FIX: repair id sequence identity on summary table (#701) 2024-07-04 12:23:46 +10:00
20240719143453_llm_model_vision_enabled.rb FEATURE: Track if a model can do vision in the llm_models table (#725) 2024-07-24 16:29:47 -03:00
20240726164937_fix_ai_summaries_sequence.rb FIX: Properly fix ai_summaries table sequence (#727) 2024-07-26 14:45:01 -03:00
20240807150605_add_default_to_provider_params.rb FIX: Correctly save provider-specific params for new models. (#744) 2024-08-07 16:08:56 -03:00
20240909180908_add_ai_summary_type_column.rb REFACTOR: Support of different summarization targets/prompts. (#835) 2024-10-15 13:53:26 -03:00
20240912052713_add_target_to_rag_document_fragment.rb FEATURE: Make tool support polymorphic (#798) 2024-09-16 08:17:17 +10:00
20240913054440_add_rag_columns_to_ai_tools.rb FEATURE: RAG search within tools (#802) 2024-09-30 17:27:50 +10:00
20241008054440_create_binary_indexes_for_embeddings.rb FEATURE: Index embeddings using bit vectors (#824) 2024-10-14 13:26:03 -03:00
20241009230724_add_forced_tool_count_to_ai_personas.rb FEATURE: allow persona to only force tool calls on limited replies (#827) 2024-10-11 07:23:42 +11:00
20241014010245_ai_persona_chat_topic_refactor.rb FEATURE: smarter persona tethering (#832) 2024-10-16 07:20:31 +11:00
20241023033955_add_feature_context_to_ai_api_log.rb FEATURE: better logging for automation reports (#853) 2024-10-23 16:49:56 +11:00
20241025135522_alter_ai_ids_to_bigint.rb DEV: Fix mismatched column types (#868) 2024-10-28 15:36:42 +02:00
20241028034232_add_unique_ai_stream_conversation_user_id_index.rb FEATURE: new endpoint for directly accessing a persona (#876) 2024-10-30 10:28:20 +11:00
20241031145203_track_ai_summary_origin.rb FEATURE: Automatically backfill regular summaries. (#892) 2024-11-04 17:48:11 -03:00
20241031180044_set_origin_for_existing_ai_summaries.rb FEATURE: Automatically backfill regular summaries. (#892) 2024-11-04 17:48:11 -03:00
20241104053017_add_ai_artifacts.rb FEATURE: AI artifacts (#898) 2024-11-19 09:22:39 +11:00
20241125132452_unique_ai_summaries.rb PERF: Preload only gists when including summaries in topic list (#948) 2024-11-25 12:24:02 -03:00
20241126033812_rename_ai_gist_batch_setting.rb FEATURE: Calculate gists from non hot topics too (#958) 2024-11-26 13:44:12 -03:00
20241128010221_add_cached_tokens_to_ai_api_audit_log.rb FEATURE: AI Usage page (#964) 2024-11-29 06:26:48 +11:00
20241129190708_fix_classification_data.rb FIX: Sentiment classification results needs to be transformed before saving (#983) 2024-11-29 17:31:56 -03:00
20241130003808_add_artifact_versions.rb FEATURE: allow artifacts to be updated (#980) 2024-12-03 07:23:31 +11:00
20241206030229_add_ai_moderation_settings.rb FEATURE: Add AI-powered spam detection for new user posts (#1004) 2024-12-12 09:17:25 +11:00
20241206051225_add_ai_spam_logs.rb FEATURE: Add AI-powered spam detection for new user posts (#1004) 2024-12-12 09:17:25 +11:00