We were only suppressing non mentions, ones that become spans.
@sam in the test was not resolving to a mention cause the user
did not exist.
depends on: https://github.com/discourse/discourse/pull/26253 for tests to pass.
- Stop replying as bot, when human replies to another human
- Reply as correct persona when replying directly to a persona
- Fix paper cut where suppressing notifications was not doing so
This PR consolidates the implements new Anthropic Messages interface for Bedrock Claude endpoints and adds support for the new Claude 3 models (haiku, opus, sonnet).
Key changes:
- Renamed `AnthropicMessages` and `Anthropic` endpoint classes into a single `Anthropic` class (ditto for ClaudeMessages -> Claude)
- Updated `AwsBedrock` endpoints to use the new `/messages` API format for all Claude models
- Added `claude-3-haiku`, `claude-3-opus` and `claude-3-sonnet` model support in both Anthropic and AWS Bedrock endpoints
- Updated specs for the new consolidated endpoints and Claude 3 model support
This refactor removes support for old non messages API which has been deprecated by anthropic
* FEATURE: allow suppression of notifications from report generation
Previously we needed to do this by hand, unfortunately this uses up
too many tokens and is very hard to discover.
New option means that we can trivially disable notifications without
needing any prompt engineering.
* URI.parse is safer, use it
This allows users to share a static page of an AI conversation with
the rest of the world.
By default this feature is disabled, it is enabled by turning on
ai_bot_allow_public_sharing via site settings
Precautions are taken when sharing
1. We make a carbonite copy
2. We minimize work generating page
3. We limit to 100 interactions
4. Many security checks - including disallowing if there is a mix
of users in the PM.
* Bonus commit, large PRs like this PR did not work with github tool
large objects would destroy context
Co-authored-by: Martin Brennan <martin@discourse.org>
Adds support for "name" on functions which can be used for tool calls
For function calls we need to keep track of id/name and previously
we only supported either
Also attempts to improve sql helper
Introduces a new AI Bot persona called 'GitHub Helper' which is specialized in assisting with GitHub-related tasks and questions. It includes the following key changes:
- Implements the GitHub Helper persona class with its system prompt and available tools
- Adds three new AI Bot tools for GitHub interactions:
- github_file_content: Retrieves content of files from a GitHub repository
- github_pull_request_diff: Retrieves the diff for a GitHub pull request
- github_search_code: Searches for code in a GitHub repository
- Updates the AI Bot dialects to support the new GitHub tools
- Implements multiple function calls for standard tool dialect
This provides new support for messages API from Claude.
It is required for latest model access.
Also corrects implementation of function calls.
* Fix message interleving
* fix broken spec
* add new models to automation
- FIX: only update system attributes when updating system persona
- FIX: update participant count by hand so bot messages show in inbox
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
* FIX: support multiple tool calls
Prior to this change we had a hard limit of 1 tool call per llm
round trip. This meant you could not google multiple things at
once or perform searches across two tools.
Also:
- Hint when Google stops working
- Log topic_id / post_id when performing completions
* Also track id for title
Previous to this fix if a tool call ever streamed a SPACE alone,
we would eat it and ignore it, breaking params
Also fixes some tests to ensure they are actually called :)
* DEV: improve internal design of ai persona and bug fix
- Fixes bug where OpenAI could not describe images
- Fixes bug where mentionable personas could not be mentioned unless overarching bot was enabled
- Improves internal design of playground and bot to allow better for non "bot" users
- Allow PMs directly to persona users (previously bot user would also have to be in PM)
- Simplify internal code
Co-authored-by: Martin Brennan <martin@discourse.org>
* FEATURE: AI helper support in non English languages
This attempts some prompt engineering to coerce AI helper to answer
in the appropriate language.
Note mileage will vary, in testing GPT-4 produces the best results
GPT-3.5 can return OKish results.
* Extend non english support for GPT-4V image caption
* Update db/fixtures/ai_helper/603_completion_prompts.rb
---------
Co-authored-by: Rafael Silva <xfalcox@gmail.com>
The Faraday adapter and `FinalDestionation::HTTP` will protect us from admin-initiated SSRF attacks when interacting with the external services powering this plugin features.:
This persona searches Discourse Meta for help with Discourse and
points users at relevant posts.
It is somewhat similar to using "Forum Helper" on meta, with the
notable difference that we can not lean on semantic search so using
some prompt engineering we try to keep it simple.
1. Personas are now optionally mentionable, meaning that you can mention them either from public topics or PMs
- Mentioning from PMs helps "switch" persona mid conversation, meaning if you want to look up sites setting you can invoke the site setting bot, or if you want to generate an image you can invoke dall e
- Mentioning outside of PMs allows you to inject a bot reply in a topic trivially
- We also add the support for max_context_posts this allow you to limit the amount of context you feed in, which can help control costs
2. Add support for a "random picker" tool that can be used to pick random numbers
3. Clean up routing ai_personas -> ai-personas
4. Add Max Context Posts so users can control how much history a persona can consume (this is important for mentionable personas)
Co-authored-by: Martin Brennan <martin@discourse.org>
* FIX: Better AI chat thread titles
- Fix quote removal when multi-line
- Use XML tags for better LLM output parsing
- Use stop_sequences for faster and less wasteful LLM calls
- Adds truncation as the last line of defense
* FEATURE: allow personas to supply top_p and temperature params
Code assistance generally are more focused at a lower temperature
This amends it so SQL Helper runs at 0.2 temperature vs the more
common default across LLMs of 1.0.
Reduced temperature leads to more focused, concise and predictable
answers for the SQL Helper
* fix tests
* This is not perfect, but far better than what we do today
Instead of fishing for
1. Draft sequence
2. Draft body
We skip (2), this means the composer "only" needs 1 http request to
open, we also want to eliminate (1) but it is a bit of a trickier
core change, may figure out how to pull it off (defer it to first draft save)
Value of bot drafts < value of opening bot conversations really fast
1. on failure we were queuing a job to generate embeddings, it had the wrong params. This is both fixed and covered in a test.
2. backfill embedding in the order of bumped_at, so newest content is embedded first, cover with a test
3. add a safeguard for hidden site setting that only allows batches of 50k in an embedding job run
Previously old embeddings were updated in a random order, this changes it so we update in a consistent order
* UX: Validations to Llm-backed features (except AI Bot)
This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to explicit which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs.
Validations are:
* You must choose a model before enabling the feature.
* You must turn off the feature before setting the model to blank.
* You must configure each model settings before being able to select it.
* Add provider name to summarization options
* vLLM can technically support same models as HF
* Check we can talk to the selected model
* Check for Bedrock instead of anthropic as a site could have both creds setup
When you trim a prompt we never want to have a state where there
is a "tool" reply without a corresponding tool call, it makes no
sense
Also
- GPT-4-Turbo is 128k, fix that
- Claude was not preserving username in prompt
- We were throwing away unicode usernames instead of adding to
message
Account properly for function calls, don't stream through <details> blocks
- Rush cooked content back to client
- Wait longer (up to 60 seconds) before giving up on streaming
- Clean up message bus channels so we don't have leftover data
- Make ai streamer much more reusable and much easier to read
- If buffer grows quickly, rush update so you are not artificially waiting
- Refine prompt interface
- Fix lost system message when prompt gets long
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
This PR introduces 3 things:
1. Fake bot that can be used on local so you can test LLMs, to enable on dev use:
SiteSetting.ai_bot_enabled_chat_bots = "fake"
2. More elegant smooth streaming of progress on LLM completion
This leans on JavaScript to buffer and trickle llm results through. It also amends it so the progress dot is much
more consistently rendered
3. It fixes the Claude dialect
Claude needs newlines **exactly** at the right spot, amended so it is happy
---------
Co-authored-by: Martin Brennan <martin@discourse.org>
Previous to this change it was very hard to tell if completion was
stuck or not.
This introduces a "dot" that follows the completion and starts
flashing after 5 seconds.
* FIX: improve bot behavior
- Provide more information to Gemini context post function execution
- Use system prompts for Claude (fixes Dall E)
- Ensure Assistant is properly separated
- Teach Claude to return arrays in JSON vs XML
Also refactors tests so we do not copy tool preamble everywhere
* System msg is claude-2 only. fix typo
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
We thought Azure's latest API version didn't have tool support yet, but I didn't understand it was complaining about a required field in the tool call message.
* FIX: don't include <details> in context
We need to be careful adding <details> into context of conversations
it can cause LLMs to hallucinate results
* Fix Gemini multi-turn ctx flattening
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>