# discourse-ai/config/settings.yml


discourse_ai:
discourse_ai_enabled:
default: true
client: true
ai_artifact_security:
client: true
type: enum
default: "strict"
choices:
- "disabled"
- "lax"
- "strict"
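# Controls how AI-generated web artifacts (HTML/CSS/JS) may execute. Hedged
# summary of the levels, based on the artifact feature's sandboxed-iframe design:
#   strict   - artifacts render sandboxed and scripts run only after an
#              explicit click-to-run
#   lax      - artifacts run automatically, still inside a sandboxed iframe
#   disabled - artifacts are never executed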
ai_sentiment_enabled:
default: false
client: true
ai_sentiment_model_configs:
default: ""
json_schema: DiscourseAi::Sentiment::SentimentSiteSettingJsonSchema
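# Illustrative value only -- the authoritative shape is defined by
# DiscourseAi::Sentiment::SentimentSiteSettingJsonSchema; the field names
# below are assumptions, not guaranteed:
#   [{"model_name": "cardiffnlp/twitter-roberta-base-sentiment-latest",
#     "endpoint": "https://sentiment.example.com", "api_key": "..."}]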
ai_sentiment_backfill_maximum_posts_per_hour:
default: 2500
min: 0
max: 10000
hidden: true
ai_sentiment_backfill_post_max_age_days:
default: 60
hidden: true
ai_openai_image_generation_url: "https://api.openai.com/v1/images/generations"
ai_openai_image_edit_url: "https://api.openai.com/v1/images/edits"
ai_openai_embeddings_url:
hidden: true
default: "https://api.openai.com/v1/embeddings"
ai_openai_organization:
default: ""
hidden: true
ai_openai_api_key:
default: ""
secret: true
ai_stability_api_key:
default: ""
secret: true
ai_stability_api_url:
default: "https://api.stability.ai"
ai_stability_engine:
default: "stable-diffusion-xl-1024-v1-0"
type: enum
choices:
- "sd3"
- "sd3-turbo"
- "stable-diffusion-xl-1024-v1-0"
- "stable-diffusion-768-v2-1"
- "stable-diffusion-v1-5"
ai_hugging_face_tei_endpoint:
hidden: true
default: ""
ai_hugging_face_tei_endpoint_srv:
default: ""
hidden: true
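# The *_srv variants resolve the service host from a DNS SRV record instead of
# a fixed URL, which suits clustered installs; configure either the plain
# endpoint or its _srv counterpart, not both. (Hedged summary -- see the plugin
# source for authoritative lookup behavior.)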
ai_hugging_face_tei_api_key:
default: ""
hidden: true
ai_hugging_face_tei_reranker_endpoint:
default: ""
ai_hugging_face_tei_reranker_endpoint_srv:
default: ""
hidden: true
ai_hugging_face_tei_reranker_api_key: ""
ai_google_custom_search_api_key:
default: ""
secret: true
ai_google_custom_search_cx:
default: ""
ai_cloudflare_workers_account_id:
default: ""
secret: true
hidden: true
ai_cloudflare_workers_api_token:
default: ""
secret: true
hidden: true
ai_gemini_api_key:
default: ""
hidden: true
ai_strict_token_counting:
default: false
hidden: true
ai_helper_enabled:
default: false
client: true
validator: "DiscourseAi::Configuration::LlmDependencyValidator"
composer_ai_helper_allowed_groups:
type: group_list
list_type: compact
default: "3|14" # 3: @staff, 14: @trust_level_4
allow_any: false
refresh: true
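# group_list settings store Discourse group IDs as a pipe-delimited string:
# "3|14" grants access to @staff (3) and @trust_level_4 (14). The automatic
# group IDs (everyone: 0, admins: 1, moderators: 2, staff: 3,
# trust_level_0..4: 10..14) follow Discourse core's Group::AUTO_GROUPS mapping.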
ai_helper_allowed_in_pm:
default: false
client: true
ai_helper_model:
default: ""
allow_any: false
type: enum
enum: "DiscourseAi::Configuration::LlmEnumerator"
validator: "DiscourseAi::Configuration::LlmValidator"
ai_helper_custom_prompts_allowed_groups:
type: group_list
list_type: compact
default: "3" # 3: @staff
allow_any: false
refresh: true
post_ai_helper_allowed_groups:
type: group_list
list_type: compact
default: "3|14" # 3: @staff, 14: @trust_level_4
allow_any: false
refresh: true
ai_helper_automatic_chat_thread_title:
default: false
ai_helper_automatic_chat_thread_title_delay:
default: 5
ai_helper_illustrate_post_model:
default: disabled
type: enum
choices:
- stable_diffusion_xl
- dall_e_3
- disabled
ai_helper_enabled_features:
client: true
default: "suggestions|context_menu"
type: list
list_type: compact
allow_any: false
refresh: true
choices:
- "suggestions"
- "context_menu"
- "image_caption"
ai_helper_image_caption_model:
default: ""
type: enum
enum: "DiscourseAi::Configuration::LlmVisionEnumerator"
ai_auto_image_caption_allowed_groups:
client: true
type: group_list
list_type: compact
default: "10" # 10: @trust_level_0
allow_any: false
refresh: true
ai_helper_model_allowed_seeded_models:
default: ""
hidden: true
type: list
list_type: compact
ai_helper_image_caption_model_allowed_seeded_models:
default: ""
hidden: true
type: list
list_type: compact
ai_embeddings_enabled:
default: false
client: true
validator: "DiscourseAi::Configuration::EmbeddingsModuleValidator"
ai_embeddings_selected_model:
type: enum
default: ""
allow_any: false
enum: "DiscourseAi::Configuration::EmbeddingDefsEnumerator"
validator: "DiscourseAi::Configuration::EmbeddingDefsValidator"
ai_embeddings_per_post_enabled:
default: false
hidden: true
ai_embeddings_generate_for_pms: false
ai_embeddings_semantic_related_topics_enabled:
default: false
client: true
ai_embeddings_semantic_related_topics: 5
ai_embeddings_semantic_related_include_closed_topics: true
ai_embeddings_backfill_batch_size:
default: 250
hidden: true
ai_embeddings_semantic_search_enabled:
default: false
client: true
validator: "DiscourseAi::Configuration::LlmDependencyValidator"
ai_embeddings_semantic_search_hyde_model:
default: ""
type: enum
allow_any: false
enum: "DiscourseAi::Configuration::LlmEnumerator"
validator: "DiscourseAi::Configuration::LlmValidator"
ai_embeddings_semantic_search_hyde_model_allowed_seeded_models:
default: ""
hidden: true
type: list
list_type: compact
ai_embeddings_semantic_quick_search_enabled:
default: false
client: true
hidden: true
ai_embeddings_discourse_service_api_endpoint:
default: ""
hidden: true
ai_embeddings_discourse_service_api_endpoint_srv:
default: ""
hidden: true
ai_embeddings_discourse_service_api_key:
hidden: true
default: ""
secret: true
ai_embeddings_model:
hidden: true
type: enum
default: "bge-large-en"
allow_any: false
choices:
- all-mpnet-base-v2
- text-embedding-ada-002
- text-embedding-3-small
- text-embedding-3-large
- multilingual-e5-large
- bge-large-en
- gemini
- bge-m3
ai_embeddings_pg_connection_string:
default: ""
hidden: true
ai_summarization_enabled:
default: false
client: true
validator: "DiscourseAi::Configuration::LlmDependencyValidator"
area: "ai-features/summarization"
ai_summarization_model:
default: ""
allow_any: false
type: enum
enum: "DiscourseAi::Configuration::LlmEnumerator"
validator: "DiscourseAi::Configuration::LlmValidator"
hidden: true
ai_summarization_persona:
default: "-11"
type: enum
enum: "DiscourseAi::Configuration::PersonaEnumerator"
area: "ai-features/summarization"
ai_pm_summarization_allowed_groups:
type: group_list
list_type: compact
default: ""
area: "ai-features/summarization"
ai_custom_summarization_allowed_groups: # Deprecated. TODO(roman): Remove 2025-09-01
type: group_list
list_type: compact
default: "3|13" # 3: @staff, 13: @trust_level_3
hidden: true
ai_summary_gists_enabled:
default: false
area: "ai-features/gists"
ai_summary_gists_persona:
default: "-12"
type: enum
enum: "DiscourseAi::Configuration::PersonaEnumerator"
area: "ai-features/gists"
ai_summary_gists_allowed_groups: # Deprecated. TODO(roman): Remove 2025-09-01
type: group_list
list_type: compact
default: "0" # 0: everyone
hidden: true
ai_summarization_model_allowed_seeded_models:
default: ""
hidden: true
type: list
list_type: compact
ai_summary_backfill_topic_max_age_days:
default: 30
min: 1
max: 10000
area: "ai-features/summarization"
ai_summary_backfill_maximum_topics_per_hour:
default: 0
min: 0
max: 10000
area: "ai-features/summarization"
ai_summary_backfill_minimum_word_count:
default: 200
area: "ai-features/summarization"
ai_bot_enabled:
default: false
client: true
area: "ai-features/discoveries"
ai_bot_enable_chat_warning:
default: false
client: true
ai_bot_debugging_allowed_groups:
type: group_list
list_type: compact
default: ""
allow_any: false
ai_bot_allowed_groups:
type: group_list
list_type: compact
default: "3|14" # 3: @staff, 14: @trust_level_4
ai_bot_public_sharing_allowed_groups:
client: false
type: group_list
list_type: compact
default: "1|2" # 1: admins, 2: moderators
allow_any: false
refresh: true
ai_bot_add_to_header:
default: true
client: true
ai_bot_github_access_token:
default: ""
secret: true
ai_bot_allowed_seeded_models:
default: ""
hidden: true
type: list
list_type: compact
ai_bot_discover_persona:
default: ""
type: enum
client: true
enum: "DiscourseAi::Configuration::PersonaEnumerator"
area: "ai-features/discoveries"
ai_automation_max_triage_per_minute:
default: 60
hidden: true
ai_automation_max_triage_per_post_per_minute:
default: 2
hidden: true
ai_automation_allowed_seeded_models:
default: ""
hidden: true
type: list
list_type: compact
ai_discord_search_enabled:
default: false
client: true
area: "ai-features/discord_search"
ai_discord_app_id:
default: ""
client: false
area: "ai-features/discord_search"
ai_discord_app_public_key:
default: ""
client: false
area: "ai-features/discord_search"
ai_discord_search_mode:
default: "search"
type: enum
choices:
- search
- persona
area: "ai-features/discord_search"
ai_discord_search_persona:
default: ""
type: enum
enum: "DiscourseAi::Configuration::PersonaEnumerator"
area: "ai-features/discord_search"
ai_discord_allowed_guilds:
type: list
list_type: compact
default: ""
area: "ai-features/discord_search"
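# Hedged summary of the two modes above: "search" answers Discord queries with
# regular forum search results, while "persona" routes them through the AI
# persona selected in ai_discord_search_persona. See the plugin's Discord
# integration for authoritative behavior.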
ai_spam_detection_enabled:
default: false
hidden: true
ai_spam_detection_user_id:
default: ""
hidden: true
ai_spam_detection_model_allowed_seeded_models:
default: ""
hidden: true
type: list
ai_rag_pdf_images_enabled:
default: false
hidden: true
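# When enabled, RAG document uploads accept PDF, PNG, JPG, and JPEG files in
# addition to text; PDF pages are converted to images (via ImageMagick) and
# OCR'd into document fragments before indexing.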
ai_bot_enable_dedicated_ux:
default: true
client: true
ai_translation_enabled:
default: false
client: true
validator: "DiscourseAi::Configuration::LlmDependencyValidator"
ai_translation_model:
default: ""
type: enum
allow_any: false
enum: "DiscourseAi::Configuration::LlmEnumerator"
validator: "DiscourseAi::Configuration::LlmValidator"