Commit Graph

314 Commits

Author SHA1 Message Date
Roman Rizzi 6af415f7f0
FEATURE: Triage rule can skip posts created via email (#775) 2024-08-27 12:04:55 -03:00
Roman Rizzi c6aeabbfc0
FIX: Malformed message in systemless + inline img scenario (#771) 2024-08-23 16:41:57 -03:00
Roman Rizzi eac83eb619
FIX: Triage's search_for_text should be case-insensitive (#767) 2024-08-22 18:32:42 -03:00
Roman Rizzi 64641b6175
FEATURE: LLM Triage support for systemless models. (#757)
* FEATURE: LLM Triage support for systemless models.

This change adds support for OSS models without support for system messages. LlmTriage's system message field is no longer mandatory. We now send the post contents in a separate user message.

* Models using Ollama can also disable system prompts
2024-08-21 11:41:55 -03:00
Sam 97fc822cb6
FEATURE: Allow specific groups access to summary feature on PMs (#760)
New `ai_pm_summarization_allowed_groups` can be used to allow
visibility of the summarization feature on PMs.

This can be useful on forums where a lot of communication happens
inside PMs.
2024-08-21 07:58:24 +10:00
Roman Rizzi f789d3ee96
FIX: Triage-flagged posts didn't have a score. (#752)
The score will contain the LLM result, and make sure the flag isn't displayed when a minimum score threshold is present.
2024-08-14 15:54:09 -03:00
Keegan George f72ab12761
DEV: Clearly separate post/composer helper settings (#747) 2024-08-12 15:40:23 -07:00
Sam cf1e15ef12
FIX: gemini 0801 tool calls (#748)
Gemini experimental model requires tool_config.

Previously defaults would apply.

This corrects prompts containing multiple tools on gemini.
2024-08-12 16:10:16 +10:00
Keegan George 0be292f247
FIX: `auto_image_caption` not always present for current user. (#746) 2024-08-08 13:37:26 -07:00
Rafael dos Santos Silva 1686a8a683
DEV: Move to single table per embeddings type (#561)
Also move us to halfvecs for speed and disk usage gains
2024-08-08 11:55:20 -03:00
Keegan George 2ff3fe3a9f
UX: Use stacked line chart for post sentiment (#737) 2024-08-02 14:23:29 -07:00
Sam 948cf893a9
FIX: Add tool support to open ai compatible dialect and vllm (#734)
* FIX: Add tool support to open ai compatible dialect and vllm

Automatic tools are in progress in vllm see: https://github.com/vllm-project/vllm/pull/5649

Even when they are supported, initial support will be uneven, only some models have native tool support
notably mistral which has some special tokens for tool support.

After the above PR lands in vllm we will still need to swap to
XML based tools on models without native tool support.

* fix specs
2024-08-02 09:52:33 -03:00
Roman Rizzi bed044448c
DEV: Remove old code now that features rely on LlmModels. (#729)
* DEV: Remove old code now that features rely on LlmModels.

* Hide old settings and migrate persona llm overrides

* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
2024-07-30 13:44:57 -03:00
Roman Rizzi 5c196bca89
FEATURE: Track if a model can do vision in the llm_models table (#725)
* FEATURE: Track if a model can do vision in the llm_models table

* Data migration
2024-07-24 16:29:47 -03:00
Roman Rizzi 0a8195242b
FIX: Limit system message size to 60% of available tokens. (#714)
Using RAG fragments can lead to considerably big system messages, which becomes problematic when models have a smaller context window.

Before this change, we only look at the rest of the conversation to make sure we don't surpass the limit, which could lead to two unwanted scenarios when having large system messages:

All other messages are excluded due to size.
The system message already exceeds the limit.

As a result, I'm putting a hard-limit of 60% of available tokens. We don't want to aggresively truncate because if rag fragments are included, the system message contains a lot of context to improve the model response, but we also want to make room for the recent messages in the conversation.
2024-07-12 15:09:01 -03:00
PangBo 4ebbdc043e
fix: locale handling in assistant.rb (#705) 2024-07-05 11:16:09 +02:00
Roman Rizzi 442681a3d3
FIX: Mixtral models have system role support. (#703)
Using assistant role for system produces an error because
they expect alternating roles like user/assistant/user and so on.
Prompts cannot start with the assistant role.
2024-07-04 13:23:03 -03:00
Keegan George eab2f74b58
DEV: Use site locale for composer helper translations (#698) 2024-07-04 08:23:37 -07:00
Sam 1320eed9b2
FEATURE: move summary to use llm_model (#699)
This allows summary to use the new LLM models and migrates of API key based model selection

Claude 3.5 etc... all work now. 

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-07-04 10:48:18 +10:00
Roman Rizzi fc081d9da6
FIX: Restore ability to fold summaries, which was accidentally removed (#700) 2024-07-03 18:10:31 -03:00
Keegan George 1b0ba9197c
DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
Sam b671ffe7fa
FIX: info not working, not suppressing hidden tags from report (#696)
2 small fixes

1. The info button was not properly working post refactor
2. Suppress any secured tags from report input
2024-07-02 16:38:33 +10:00
Sam b863ddc94b
FEATURE: custom user defined tools (#677)
Introduces custom AI tools functionality. 

1. Why it was added:
   The PR adds the ability to create, manage, and use custom AI tools within the Discourse AI system. This feature allows for more flexibility and extensibility in the AI capabilities of the platform.

2. What it does:
   - Introduces a new `AiTool` model for storing custom AI tools
   - Adds CRUD (Create, Read, Update, Delete) operations for AI tools
   - Implements a tool runner system for executing custom tool scripts
   - Integrates custom tools with existing AI personas
   - Provides a user interface for managing custom tools in the admin panel

3. Possible use cases:
   - Creating custom tools for specific tasks or integrations (stock quotes, currency conversion etc...)
   - Allowing administrators to add new functionalities to AI assistants without modifying core code
   - Implementing domain-specific tools for particular communities or industries

4. Code structure:
   The PR introduces several new files and modifies existing ones:

   a. Models:
      - `app/models/ai_tool.rb`: Defines the AiTool model
      - `app/serializers/ai_custom_tool_serializer.rb`: Serializer for AI tools

   b. Controllers:
      - `app/controllers/discourse_ai/admin/ai_tools_controller.rb`: Handles CRUD operations for AI tools

   c. Views and Components:
      - New Ember.js components for tool management in the admin interface
      - Updates to existing AI persona management components to support custom tools 

   d. Core functionality:
      - `lib/ai_bot/tool_runner.rb`: Implements the custom tool execution system
      - `lib/ai_bot/tools/custom.rb`: Defines the custom tool class

   e. Routes and configurations:
      - Updates to route configurations to include new AI tool management pages

   f. Migrations:
      - `db/migrate/20240618080148_create_ai_tools.rb`: Creates the ai_tools table

   g. Tests:
      - New test files for AI tool functionality and integration

The PR integrates the custom tools system with the existing AI persona framework, allowing personas to use both built-in and custom tools. It also includes safety measures such as timeouts and HTTP request limits to prevent misuse of custom tools.

Overall, this PR significantly enhances the flexibility and extensibility of the Discourse AI system by allowing administrators to create and manage custom AI tools tailored to their specific needs.

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-06-27 17:27:40 +10:00
Sam af4f871096
FIX: never provide tools with invalid UTF-8 strings (#692)
Previous to this change, on truncation we could return invalid
UTF-8 strings to caller

This also allows tools to read up to 30 megs vs the old 4 megs.
2024-06-27 14:06:52 +10:00
Sam 1d5fa0ce6c
FIX: when creating an llm we were not creating user (#685)
This meant that if you toggle ai user early it surprisingly did
not work.

Also remove safety settings from gemini, it is overly cautious
2024-06-24 09:59:42 +10:00
Roman Rizzi 8849caf136
DEV: Transition "Select model" settings to only use LlmModels (#675)
We no longer support the "provider:model" format in the "ai_helper_model" and
"ai_embeddings_semantic_search_hyde_model" settings. We'll migrate existing
values and work with our new data-driven LLM configs from now on.
2024-06-19 18:01:35 -03:00
Sam 0d6d9a6ef5
FEATURE: allow access to private topics if tool permits (#673)
Previously read tool only had access to public topics, this allows
access to all topics user has access to, if admin opts for the option
Also

- Fixes VLLM migration
- Display which llms have bot enabled
2024-06-19 15:49:36 +10:00
Krzysztof Kotlarek 3c45335936
DEV: disable flaky playground_spec (#672) 2024-06-19 10:56:15 +10:00
Roman Rizzi 8d5f901a67
DEV: Rewire AI bot internals to use LlmModel (#638)
* DRAFT: Create AI Bot users dynamically and support custom LlmModels

* Get user associated to llm_model

* Track enabled bots with attribute

* Don't store bot username. Minor touches to migrate default values in settings

* Handle scenario where vLLM uses a SRV record

* Made 3.5-turbo-16k the default version so we can remove hack
2024-06-18 14:32:14 -03:00
Sam 460f5c4553
FIX: display search correctly, bug when stripping XML (#668)
- Display filtered search correctly, so it is not confusing
- When XML stripping, if a chunk was `<` it would crash
- SQL Helper improved to be better aware of Data Explorer
2024-06-14 15:28:40 +10:00
Sam f642a27f11
FIX: Dall E / Artist broken when tool_details is disabled (#667)
We were missing logic to handle custom_html from tools

This also fixes image generation in chat
2024-06-12 17:58:28 +10:00
Sam 52a7dd2a4b
FEATURE: optional tool detail blocks (#662)
This is a rather huge refactor with 1 new feature (tool details can
be suppressed)

Previously we use the name "Command" to describe "Tools", this unifies
all the internal language and simplifies the code.

We also amended the persona UI to use less DToggles which aligns
with our design guidelines.

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-06-11 18:14:14 +10:00
Sam 8b81ff45b8
FIX: switch off native tools on Anthropic Claude Opus (#659)
Native tools do not work well on Opus.

Chain of Thought prompting means it consumes enormous amounts of
tokens and has poor latency.

This commit introduce and XML stripper to remove various chain of
thought XML islands from anthropic prompts when tools are involved.

This mean Opus native tools is now functions (albeit slowly)

From local testing XML just works better now.

Also fixes enum support in Anthropic native tools
2024-06-07 10:52:01 -03:00
Sam 3993c685e1
FEATURE: anthropic function calling (#654)
Adds support for native tool calling (both streaming and non streaming) for Anthropic.

This improves general tool support on the Anthropic models.
2024-06-06 08:34:23 +10:00
Sam 564d2de534
FEATURE: Add native Cohere tool support (#655)
Add native Cohere tool support

- Introduce CohereTools class for tool translation and result processing
- Update Command dialect to integrate with CohereTools
- Modify Cohere endpoint to support passing tools and processing tool calls
- Add spec for testing tool triggering with Cohere endpoint
2024-06-04 08:59:15 +10:00
Sam 834fea672f
FEATURE: improved tooling (#651)
1. New tool to easily find files (and default branch) in a Github repo
2. Improved read tool with clearer params and larger context

* limit can totally mess up the richness semantic search adds, so include the results unconditionally.
2024-05-30 06:33:50 +10:00
Sam 309280cbb6
FEATURE: add aspect ratio support to DallE 3 (#647)
DallE 3 supports tall/square and wide images.

This adds support to the 3 variants. (wide / tall / square)
2024-05-28 16:21:40 +10:00
Sam baf88e7cfc
FEATURE: improve logging by including llm name (#640)
Log the language model name when logging api requests
2024-05-27 16:46:01 +10:00
Sam d5c23f01ff
FIX: correct gemini streaming implementation (#632)
This also implements image support and gemini-flash support
2024-05-22 16:35:29 +10:00
Sam d4116ecfac
FEATURE: Add support for contextualizing a DM to a bot (#627)
This brings the context of the current topic on screen into chat
2024-05-21 17:17:02 +10:00
Sam 232f12eba6
FEATURE: JavaScript evaluation tool (#630)
This is similar to code interpreter by ChatGPT, except that it uses
JavaScript as the execution engine.

Safeguards were added to ensure memory is constrained and evaluation
times out.
2024-05-21 07:57:01 +10:00
Roman Rizzi 1d786fbaaf
FEATURE: Set endpoint credentials directly from LlmModel. (#625)
* FEATURE: Set endpoint credentials directly from LlmModel.

Drop Llama2Tokenizer since we no longer use it.

* Allow http for custom LLMs

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-05-16 09:50:22 -03:00
Bianca Nenciu d5e30592f3
FIX: Load categories from search response (#612)
When lazy load categories is enabled, the list of categories does not
have to fetched from the "site.json" endpoint because it is already
returned by "search.json".

This commit reverts commits 5056502 and 3e54697 because iterating over
all pages of categories is not really necessary.
2024-05-14 17:13:25 +03:00
Sam 8eee6893d6
FEATURE: GPT4o support and better auditing (#618)
- Introduce new support for GPT4o (automation / bot / summary / helper)
- Properly account for token counts on OpenAI models
- Track feature that was used when generating AI completions
- Remove custom llm support for summarization as we need better interfaces to control registration and de-registration
2024-05-14 13:28:46 +10:00
Roman Rizzi 62fc7d6ed0
FEATURE: Configurable LLMs. (#606)
This PR introduces the concept of "LlmModel" as a new way to quickly add new LLM models without making any code changes. We are releasing this first version and will add incremental improvements, so expect changes.

The AI Bot can't fully take advantage of this feature as users are hard-coded. We'll fix this in a separate PR.s
2024-05-13 12:46:42 -03:00
Sam 0069256efd
FIX: improve function call parsing (#613)
- support " / ' wrapped values
- coerce integer to integer
- enforce enum at boundary
2024-05-13 19:40:11 +10:00
Sam 61890b667c
FEATURE: search command now support searching in context of user (#610)
This optional feature allows search to be performed in the context
of the user that executed it.

By default we do not allow this behavior cause it means llm gets
access to potentially secure data.
2024-05-10 11:32:34 +10:00
Sam 514823daca
FIX: streaming broken in bedrock when chunks are not aligned (#609)
Also

- Stop caching llm list - this cause llm list in persona to be incorrect
- Add more UI to debug screen so you can properly see raw response
2024-05-09 12:11:50 +10:00
Sam cf34838a09
FIX: context repairs for @mentioned bot (#608)
When the bot is @mentioned, we need to be a lot more careful
about constructing context otherwise bot gets ultra confused.

This changes multiple things:

1. We were omitting all thread first messages (fixed)
2. Include thread title (if available) in context
3. Construct context in a clearer way separating user request from data
2024-05-08 18:44:04 +10:00
Roman Rizzi 4f1a3effe0
REFACTOR: Migrate Vllm/TGI-served models to the OpenAI format. (#588)
Both endpoints provide OpenAI-compatible servers. The only difference is that Vllm doesn't support passing tools as a separate parameter. Even if the tool param is supported, it ultimately relies on the model's ability to handle native functions, which is not the case with the models we have today.

As a part of this change, we are dropping support for StableBeluga/Llama2 models. They don't have a chat_template, meaning the new API can translate them.

These changes let us remove some of our existing dialects and are a first step in our plan to support any LLM by defining them as data-driven concepts.

 I rewrote the "translate" method to use a template method and extracted the tool support strategies into its classes to simplify the code.

Finally, these changes bring support for Ollama when running in dev mode. It only works with Mistral for now, but it will change soon..
2024-05-07 10:02:16 -03:00
Joffrey JAFFEUX dacc1b9f28
DEV: updates spec following changes in chat (#604)
Also makes this spec to use the `update_message!` helper.
2024-05-07 15:01:06 +02:00
Sam ab78d9b597
REFACTOR: Simplify tool invocation by removing bot_user and llm parameters (#603)
* Well, it was quite a journey but now tools have "context" which
can be critical for the stuff they generate

This entire change was so Dall E and Artist generate images in the correct context

* FIX: improve error handling around image generation

- also corrects image markdown and clarifies code

* fix spec
2024-05-07 21:55:46 +10:00
Sam 88c7427fab
FEATURE: allow @mentioning an ai bot in a channel (#602)
if a persona is mentionable and allows chat allow it to be mentioned in a chat channel
2024-05-07 10:30:39 +10:00
Sam e4b326c711
FEATURE: support Chat with AI Persona via a DM (#488)
Add support for chat with AI personas

- Allow enabling chat for AI personas that have an associated user
- Add new setting `allow_chat` to AI persona to enable/disable chat
- When a message is created in a DM channel with an allowed AI persona user, schedule a reply job
- AI replies to chat messages using the persona's `max_context_posts` setting to determine context
- Store tool calls and custom prompts used to generate a chat reply on the `ChatMessageCustomPrompt` table
- Add tests for AI chat replies with tools and context

At the moment unlike posts we do not carry tool calls in the context.

No @mention support yet for ai personas in channels, this is future work
2024-05-06 09:49:02 +10:00
Sam 6623928b95
FIX: call after tool calls failing on OpenAI / Gemini (#599)
A recent change meant that llm instance got cached internally, repeat calls
to inference would cache data in Endpoint object leading model to
failures.

Both Gemini and Open AI expect a clean endpoint object cause they
set data.

This amends internals to make sure llm.generate will always operate
on clean objects
2024-05-01 17:50:58 +10:00
Sam 32b3004ce9
FEATURE: Add Question Consolidator for robust Upload support in Personas (#596)
This commit introduces a new feature for AI Personas called the "Question Consolidator LLM". The purpose of the Question Consolidator is to consolidate a user's latest question into a self-contained, context-rich question before querying the vector database for relevant fragments. This helps improve the quality and relevance of the retrieved fragments.

Previous to this change we used the last 10 interactions, this is not ideal cause the RAG would "lock on" to an answer. 

EG:

- User: how many cars are there in europe
- Model: detailed answer about cars in europe including the term car and vehicle many times
- User: Nice, what about trains are there in the US

In the above example "trains" and "US" becomes very low signal given there are pages and pages talking about cars and europe. This mean retrieval is sub optimal. 

Instead, we pass the history to the "question consolidator", it would simply consolidate the question to "How many trains are there in the United States", which would make it fare easier for the vector db to find relevant content. 

The llm used for question consolidator can often be less powerful than the model you are talking to, we recommend using lighter weight and fast models cause the task is very simple. This is configurable from the persona ui.

This PR also removes support for {uploads} placeholder, this is too complicated to get right and we want freedom to shift RAG implementation. 

Key changes:

1. Added a new `question_consolidator_llm` column to the `ai_personas` table to store the LLM model used for question consolidation.

2. Implemented the `QuestionConsolidator` module which handles the logic for consolidating the user's latest question. It extracts the relevant user and model messages from the conversation history, truncates them if needed to fit within the token limit, and generates a consolidated question prompt.

3. Updated the `Persona` class to use the Question Consolidator LLM (if configured) when crafting the RAG fragments prompt. It passes the conversation context to the consolidator to generate a self-contained question.

4. Added UI elements in the AI Persona editor to allow selecting the Question Consolidator LLM. Also made some UI tweaks to conditionally show/hide certain options based on persona configuration.

5. Wrote unit tests for the QuestionConsolidator module and updated existing persona tests to cover the new functionality.

This feature enables AI Personas to better understand the context and intent behind a user's question by consolidating the conversation history into a single, focused question. This can lead to more relevant and accurate responses from the AI assistant.
2024-04-30 13:49:21 +10:00
Sam 85734fef52
FIX: properly cache user locale (#593)
This blob is localized according to user locale, so we can end up
bleeding incorrect data in the cache
2024-04-26 09:28:35 -03:00
Roman Rizzi 0c4069ab3f
DEV: Remove non-LLM-based summarization strategies. (#589)
We removed these services from our hosting two weeks ago. It's safe to assume everyone has moved to other LLM-based options.
2024-04-23 12:11:04 -03:00
Sam 4d8b7742da
FIX: many missing topics when categories excluded (#585)
We were forgetting about the NULL parent_category_id handling in
our check for sub categories
2024-04-23 08:53:51 +10:00
Rafael dos Santos Silva 595cde0fd6
FIX: Users with empty locales would error out during prompt localization (#584) 2024-04-22 13:55:10 -03:00
Sam 5ab86923ff
FIX: when excluding categories also exclude children (#583)
This allows you to exclude trees of categories in a simple way

It also means you can no longer exclude "just the parent" but
this is a fair compromise.
2024-04-22 16:05:24 +10:00
Sam a223d18f1a
FIX: more robust function call support (#581)
For quite a few weeks now, some times, when running function calls
on Anthropic we would get a "stray" - "calls" line.

This has been enormously frustrating!

I have been unable to find the source of the bug so instead decoupled
the implementation and create a very clear "function call normalizer"

This new class is extensively tested and guards against the type of
edge cases we saw pre-normalizer.

This also simplifies the implementation of "endpoint" which no longer
needs to handle all this complex logic.
2024-04-19 06:54:54 +10:00
Sam 4a29f8ed1c
FEATURE: Enhance AI debugging capabilities and improve interface adjustments (#577)
* FIX: various RAG edge cases

- Nicer text to describe RAG, avoids the word RAG
- Do not attempt to save persona when removing uploads and it is not created
- Remove old code that avoided touching rag params on create

* FIX: Missing pause button for persona users

* Feature: allow specific users to debug ai request / response chains

This can help users easily tune RAG and figure out what is going
on with requests.

* discourse helper so it does not explode

* fix test

* simplify implementation
2024-04-15 23:22:06 +10:00
Rafael dos Santos Silva 6090580e36
FEATURE: Add basic connection check to DNS SRV resources (#563) 2024-04-12 10:39:19 -03:00
Sam f6ac5cd0a8
FEATURE: allow tuning of RAG generation (#565)
* FEATURE: allow tuning of RAG generation

- change chunking to be token based vs char based (which is more accurate)
- allow control over overlap / tokens per chunk and conversation snippets inserted
- UI to control new settings

* improve ui a bit

* fix various reindex issues

* reduce concurrency

* try ultra low queue ... concurrency 1 is too slow.
2024-04-12 10:32:46 -03:00
Rafael dos Santos Silva 253e0b7b39
FEATURE: Mixtral/Mistral/Haiku Automation Support (#571)
Adds new models to automation, and makes LLM output parsing more robust.
2024-04-11 09:50:46 -03:00
Sam 23d12c8927
FEATURE: GPT-4 turbo vision support (#575)
Recent release of GPT-4 turbo adds vision support, this adds
the pipeline for sending images to Open AI.
2024-04-11 16:22:59 +10:00
Sam a77658e2b1
FIX: tools broke on Claude with no params (#574)
Some tools may have no params, allow that
2024-04-11 15:17:56 +10:00
Sam 3b0cfdbe5c
FIX: throwing away first file in diff (#573)
We were chucking out the first file in a PR diff due to a
logic bug
2024-04-11 13:26:58 +10:00
Sam 0cbbf130b9
FIX: never mention the word JSON in tool preamble (#572)
Just having the word JSON can confuse models when we expect them
to deal solely in XML

Instead provide an example of how string arrays should be returned

Technically the tool framework supports int arrays and more, but
our current implementation only does string arrays.

Also tune the prompt construction not to give any tips about arrays
if none exist
2024-04-11 11:24:22 +10:00
Sam 7f16d3ad43
FEATURE: Cohere Command R support (#558)
- Added Cohere Command models (Command, Command Light, Command R, Command R Plus) to the available model list
- Added a new site setting `ai_cohere_api_key` for configuring the Cohere API key
- Implemented a new `DiscourseAi::Completions::Endpoints::Cohere` class to handle interactions with the Cohere API, including:
   - Translating request parameters to the Cohere API format
   - Parsing Cohere API responses 
   - Supporting streaming and non-streaming completions
   - Supporting "tools" which allow the model to call back to discourse to lookup additional information
- Implemented a new `DiscourseAi::Completions::Dialects::Command` class to translate between the generic Discourse AI prompt format and the Cohere Command format
- Added specs covering the new Cohere endpoint and dialect classes
- Updated `DiscourseAi::AiBot::Bot.guess_model` to map the new Cohere model to the appropriate bot user

In summary, this PR adds support for using the Cohere Command family of models with the Discourse AI plugin. It handles configuring API keys, making requests to the Cohere API, and translating between Discourse's generic prompt format and Cohere's specific format. Thorough test coverage was added for the new functionality.
2024-04-11 07:24:17 +10:00
Bianca Nenciu 505650205d
FIX: Fetch categories data using specific endpoint (#543)
It used to fetch it from /site.json, but /categories.json is the more
appropriate one. This one also implements pagination, so we have to do
one request per page.
2024-04-08 11:33:20 +03:00
Sam 6f5f34184b
FEATURE: add Claude 3 Haiku bot support (#552)
it is close in performance to GPT 4 at a fraction of the cost,
nice to add it to the mix.

Also improves a test case to simulate streaming, I am hunting for
the "calls" word that is jumping into function calls and can't quite
find it.
2024-04-03 16:06:27 +11:00
Roman Rizzi 1f1c94e5c6
FEATURE: AI Bot RAG support. (#537)
This PR lets you associate uploads to an AI persona, which we'll split and generate embeddings from. When building the system prompt to get a bot reply, we'll do a similarity search followed by a re-ranking (if available). This will let us find the most relevant fragments from the body of knowledge you associated with the persona, resulting in better, more informed responses.

For now, we'll only allow plain-text files, but this will change in the future.

Commits:

* FEATURE: RAG embeddings for the AI Bot

This first commit introduces a UI where admins can upload text files, which we'll store, split into fragments,
and generate embeddings of. In a next commit, we'll use those to give the bot additional information during
conversations.

* Basic asymmetric similarity search to provide guidance in system prompt

* Fix tests and lint

* Apply reranker to fragments

* Uploads filter, css adjustments and file validations

* Add placeholder for rag fragments

* Update annotations
2024-04-01 13:43:34 -03:00
Sam fb81307c59
FEATURE: web browsing tool (#548)
This pull request makes several improvements and additions to the GitHub-related tools and personas in the `discourse-ai` repository:

1. It adds the `WebBrowser` tool to the  `Researcher` persona, allowing the AI to visit web pages, retrieve HTML content, extract the main content, and convert it to plain text.

2. It updates the `GithubFileContent`, `GithubPullRequestDiff`, and `GithubSearchCode` tools to handle HTTP responses more robustly (introducing size limits). 

3. It refactors the `send_http_request` method in the `Tool` class to follow redirects when specified, and to read the response body in chunks to avoid memory issues with large responses. (only for WebBrowser)

4. It updates the system prompt for the `Researcher` persona to provide more detailed guidance on when to use Google search vs web browsing, and how to optimize tool usage and reduce redundant requests.

5. It adds a new `web_browser_spec.rb` file with tests for the `WebBrowser` tool, covering various scenarios like handling different HTML structures and following redirects.
2024-03-28 16:01:58 +11:00
Sam 61e4c56e1a
FEATURE: Add vision support to AI personas (Claude 3) (#546)
This commit adds the ability to enable vision for AI personas, allowing them to understand images that are posted in the conversation.

For personas with vision enabled, any images the user has posted will be resized to be within the configured max_pixels limit, base64 encoded and included in the prompt sent to the AI provider.

The persona editor allows enabling/disabling vision and has a dropdown to select the max supported image size (low, medium, high). Vision is disabled by default.

This initial vision support has been tested and implemented with Anthropic's claude-3 models which accept images in a special format as part of the prompt.

Other integrations will need to be updated to support images.

Several specs were added to test the new functionality at the persona, prompt building and API layers.

 - Gemini is omitted, pending API support for Gemini 1.5. Current Gemini bot is not performing well, adding images is unlikely to make it perform any better.

 - Open AI is omitted, vision support on GPT-4 it limited in that the API has no tool support when images are enabled so we would need to full back to a different prompting technique, something that would add lots of complexity


---------

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-03-27 14:30:11 +11:00
Natalie Tay f3b3312ce0
FIX: Avoid replying to the reply user for llm_triage automation (#544)
Avoid replying to the reply user. This causes an infinite conversation from the reply_user to the reply_user
2024-03-22 12:34:18 +08:00
Sam 41f1530078
FIX: mention suppression was not working right (#538)
We were only suppressing non mentions, ones that become spans.

@sam in the test was not resolving to a mention cause the user
did not exist.

depends on: https://github.com/discourse/discourse/pull/26253 for tests to pass.
2024-03-20 13:00:39 +11:00
Sam cc0369dd39
FEATURE: friendlier reply behavior in bot PMs (#535)
- Stop replying as bot, when human replies to another human
- Reply as correct persona when replying directly to a persona
- Fix paper cut where suppressing notifications was not doing so
2024-03-19 20:15:12 +11:00
Sam f62703760f
FEATURE: add Claude 3 sonnet/haiku support for Amazon Bedrock (#534)
This PR consolidates the  implements new Anthropic Messages interface for Bedrock Claude endpoints and adds support for the new Claude 3 models (haiku, opus, sonnet).

Key changes:
- Renamed `AnthropicMessages` and `Anthropic` endpoint classes into a single `Anthropic` class (ditto for ClaudeMessages -> Claude)
- Updated `AwsBedrock` endpoints to use the new `/messages` API format for all Claude models
- Added `claude-3-haiku`, `claude-3-opus` and `claude-3-sonnet` model support in both Anthropic and AWS Bedrock endpoints
- Updated specs for the new consolidated endpoints and Claude 3 model support

This refactor removes support for old non messages API which has been deprecated by anthropic
2024-03-19 06:48:46 +11:00
Sam d7ed8180af
FEATURE: allow suppression of notifications from report generation (#533)
* FEATURE: allow suppression of notifications from report generation

Previously we needed to do this by hand, unfortunately this uses up
too many tokens and is very hard to discover.

New option means that we can trivially disable notifications without
needing any prompt engineering.

* URI.parse is safer, use it
2024-03-16 08:05:03 +11:00
Sam a03bc6ddec
FEATURE: Share conversations with AI via a URL (#521)
This allows users to share a static page of an AI conversation with
the rest of the world.

By default this feature is disabled, it is enabled by turning on
ai_bot_allow_public_sharing via site settings

Precautions are taken when sharing

1. We make a carbonite copy
2. We minimize work generating page
3. We limit to 100 interactions
4. Many security checks - including disallowing if there is a mix
of users in the PM.

* Bonus commit, large PRs like this PR did not work with github tool
large objects would destroy context


Co-authored-by: Martin Brennan <martin@discourse.org>
2024-03-12 16:51:41 +11:00
Sam 79638c2f50
FIX: Tune function calling (#519)
Adds support for "name" on functions which can be used for tool calls

For function calls we need to keep track of id/name and previously
we only supported either

Also attempts to improve sql helper
2024-03-09 08:46:40 +11:00
Sam 2ad743d246
FEATURE: Add GitHub Helper AI Bot persona and tools (#513)
Introduces a new AI Bot persona called 'GitHub Helper' which is specialized in assisting with GitHub-related tasks and questions. It includes the following key changes:

- Implements the GitHub Helper persona class with its system prompt and available tools
   
- Adds three new AI Bot tools for GitHub interactions:
  - github_file_content: Retrieves content of files from a GitHub repository
  - github_pull_request_diff: Retrieves the diff for a GitHub pull request
  - github_search_code: Searches for code in a GitHub repository
    
- Updates the AI Bot dialects to support the new GitHub tools

- Implements multiple function calls for standard tool dialect
2024-03-08 06:37:23 +11:00
Loïc Guitaut 6ae4218a96 DEV: Fix new Rubocop offenses 2024-03-06 15:23:29 +01:00
Sam 8b382d6098
FEATURE: support for claude opus and sonnet (#508)
This provides new support for messages API from Claude.

It is required for latest model access.

Also corrects implementation of function calls.

* Fix message interleving

* fix broken spec

* add new models to automation
2024-03-06 06:04:37 +11:00
Sam b7a96e3bcb
FIX: avoid all bot feedback loops (#507)
We need to ensure that under no circumstances feedback loops between
bots will emerge cause this can eat up a lot of tokens
2024-03-05 10:02:49 +11:00
Sam 77cf9e2cff
FIX: system persona non English save, missing bot pms
- FIX: only update system attributes when updating system persona
- FIX: update participant count by hand so bot messages show in inbox
 

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
2024-03-04 09:56:59 +11:00
Sam c02794cf2e
FIX: support multiple tool calls (#502)
* FIX: support multiple tool calls

Prior to this change we had a hard limit of 1 tool call per llm
round trip. This meant you could not google multiple things at
once or perform searches across two tools.

Also:

- Hint when Google stops working
- Log topic_id / post_id when performing completions

* Also track id for title
2024-03-02 07:53:21 +11:00
Sam 59bab2bba3
FIX: stream messages when directly PMing a persona (#500)
previous to this fix we did not consider personas a bot in the
front end
2024-03-01 07:53:42 +11:00
Sam 9fb1430e40
FIX: support spaces within arguments for Open AI (#499)
Previous to this fix if a tool call ever streamed a SPACE alone,
we would eat it and ignore it, breaking params

Also fixes some tests to ensure they are actually called :)
2024-02-29 12:47:34 +11:00
Rafael dos Santos Silva 1b72a00d2c
FEATURE: Option for AI triage to send a post to the review queue (#498)
Option for AI triage to send a post to the review queue
2024-02-29 12:33:28 +11:00
Sam 484fd1435b
DEV: improve internal design of ai persona and bug fix (#495)
* DEV: improve internal design of ai persona and bug fix

- Fixes bug where OpenAI could not describe images
- Fixes bug where mentionable personas could not be mentioned unless overarching bot was enabled
- Improves internal design of playground and bot to allow better for non "bot" users
- Allow PMs directly to persona users (previously bot user would also have to be in PM)
- Simplify internal code


Co-authored-by: Martin Brennan <martin@discourse.org>
2024-02-28 16:46:32 +11:00
Sam d036f3fb8e
FEATURE: AI helper support in non English languages (#489)
* FEATURE: AI helper support in non English languages

This attempts some prompt engineering to coerce AI helper to answer
in the appropriate language.

Note mileage will vary, in testing GPT-4 produces the best results
GPT-3.5 can return OKish results.

* Extend non english support for GPT-4V image caption

* Update db/fixtures/ai_helper/603_completion_prompts.rb

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-02-27 16:31:51 -03:00
Sam aabff87501
FIX: image generation in gemini was broken (#490)
We need to inject blank model answers after tool calls if absent
otherwise model will reject it.
2024-02-27 18:24:30 +11:00
Roman Rizzi 94ba0dadc2
SECURITY: Place a SSRF protection when calling services from the plugin. (#485)
The Faraday adapter and `FinalDestionation::HTTP` will protect us from admin-initiated SSRF attacks when interacting with the external services powering this plugin features.:
2024-02-21 17:14:50 -03:00
Sam 1f74a77e17
DEV: correct flaky spec (#475)
We were not properly expiring prompt cache
2024-02-19 15:21:55 +11:00
Sam 0fb87b00e2
FEATURE: new Discourse Helper persona (#473)
This persona searches Discourse Meta for help with Discourse and
points users at relevant posts.

It is somewhat similar to using "Forum Helper" on meta, with the
notable difference that we can not lean on semantic search so using
some prompt engineering we try to keep it simple.
2024-02-19 14:52:12 +11:00
Keegan George d66915ecc1
DEV: Make prompts available on `CurrentUserSerializer` (#472) 2024-02-16 10:57:14 -08:00
Sam 3a8d95f6b2
FEATURE: mentionable personas and random picker tool, context limits (#466)
1. Personas are now optionally mentionable, meaning that you can mention them either from public topics or PMs
       - Mentioning from PMs helps "switch" persona mid conversation, meaning if you want to look up sites setting you can invoke the site setting bot, or if you want to generate an image you can invoke dall e
        - Mentioning outside of PMs allows you to inject a bot reply in a topic trivially
     - We also add the support for max_context_posts this allow you to limit the amount of context you feed in, which can help control costs

2. Add support for a "random picker" tool that can be used to pick random numbers 

3. Clean up routing ai_personas -> ai-personas

4. Add Max Context Posts so users can control how much history a persona can consume (this is important for mentionable personas) 

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-02-15 16:37:59 +11:00