* DEV: Remove old code now that features rely on LlmModels.
* Hide old settings and migrate persona llm overrides
* Remove shadowing special URL + seeding code. Use srv:// prefix instead.
* DRAFT: Create AI Bot users dynamically and support custom LlmModels
* Get user associated to llm_model
* Track enabled bots with attribute
* Don't store bot username. Minor touches to migrate default values in settings
* Handle scenario where vLLM uses a SRV record
* Made 3.5-turbo-16k the default version so we can remove hack
* Well, it was quite a journey but now tools have "context" which
can be critical for the stuff they generate
This entire change was so Dall E and Artist generate images in the correct context
* FIX: improve error handling around image generation
- also corrects image markdown and clarifies code
* fix spec
* FEATURE: allow personas to supply top_p and temperature params
Code assistance generally are more focused at a lower temperature
This amends it so SQL Helper runs at 0.2 temperature vs the more
common default across LLMs of 1.0.
Reduced temperature leads to more focused, concise and predictable
answers for the SQL Helper
* fix tests
* This is not perfect, but far better than what we do today
Instead of fishing for
1. Draft sequence
2. Draft body
We skip (2), this means the composer "only" needs 1 http request to
open, we also want to eliminate (1) but it is a bit of a trickier
core change, may figure out how to pull it off (defer it to first draft save)
Value of bot drafts < value of opening bot conversations really fast
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
Previous to this change it was very hard to tell if completion was
stuck or not.
This introduces a "dot" that follows the completion and starts
flashing after 5 seconds.
It also corrects the syntax around tool support, which was wrong.
Gemini doesn't want us to include messages about previous tool invocations, so I had to shuffle around some code to send the response it generated from those invocations instead. For this, I created the "multi_turn" context, which bundles all the context involved in the interaction.
* DEV: AI bot migration to the Llm pattern.
We added tool and conversation context support to the Llm service in discourse-ai#366, meaning we met all the conditions to migrate this module.
This PR migrates to the new pattern, meaning adding a new bot now requires minimal effort as long as the service supports it. On top of this, we introduce the concept of a "Playground" to separate the PM-specific bits from the completion, allowing us to use the bot in other contexts like chat in the future. Commands are called tools, and we simplified all the placeholder logic to perform updates in a single place, making the flow more one-wayish.
* Followup fixes based on testing
* Cleanup unused inference code
* FIX: text-based tools could be in the middle of a sentence
* GPT-4-turbo support
* Use new LLM API
Previous to this changeset we used a custom system for tools/command
support for Anthropic.
We defined commands by using !command as a signal to execute it
Following Anthropic Claude 2.1, there is an official supported syntax (beta)
for tools execution.
eg:
```
+ <function_calls>
+ <invoke>
+ <tool_name>image</tool_name>
+ <parameters>
+ <prompts>
+ [
+ "an oil painting",
+ "a cute fluffy orange",
+ "3 apple's",
+ "a cat"
+ ]
+ </prompts>
+ </parameters>
+ </invoke>
+ </function_calls>
```
This implements the spec per Anthropic, it should be stable enough
to also work on other LLMs.
Keep in mind that OpenAI is not impacted here at all, as it has its
own custom system for function calls.
Additionally:
- Fixes the title system prompt so it works with latest Anthropic
- Uses new spec for "system" messages by Anthropic
- Tweak forum helper persona to guide Anthropic a tiny be better
Overall results are pretty awesome and Anthropic Claude performs
really well now on Discourse
Introduces a UI to manage customizable personas (admin only feature)
Part of the change was some extensive internal refactoring:
- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure
- name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
Also fixes it so users without bot in header can send it messages.
Previous to this change we would seed all bots with database seeds.
This lead to lots of confusion for people who do not enable ai bot.
Instead:
1. We do not seed any bots **until** user enables the ai_bot_enabled setting
2. If it is disabled we will
a. If no messages were created by bot - delete it
b. Otherwise we will deactivate account
* FIX: properly truncate !command prompts
### What is going on here?
Previous to this change where a command was issued by the LLM it
could hallucinate a continuation eg:
```
This is what tags are
!tags
some nonsense here
```
This change introduces safeguards so `some nonsense here` does not
creep in to the prompt history, poisoning the llm results
This in effect grounds the llm a lot better and results in the llm
forgetting less about results.
The change only impacts Claude at the moment, but will also improve
stuff for llama 2 in future.
Also, this makes it significantly easier to test the bot framework
without an llm cause we avoid a whole bunch of complex stubbing
* blank is not a valid bot response, do not inject into prompt
* FIX: Made bot more robust
This is a collection of small fixes
- Display "Searching for: ..." while searching instead of showing found 0 results.
- Only allow 5 commands in lang chain - 6 feels like too much
- On the 5th command stop informing the engine about functions, so it is forced to complete
- Add another 30 tokens of buffer and explain why
- Typo in command prompt
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
previously you would have to wait quite a while to see the prompt this implements
a very basic implementation of progress so you can see the API is working.
Also:
- Fix google progress.
- Handle the incredibly rare, zero results from google.
- Simplify command so it is less error prone
- replace invoke and attache results with a invoke
- ensure invoke can only ever be run once
- pass in all the information a command needs in constructor
- use new pattern throughout
- test invocation in isolation
The command framework had some confusing dispatching where it would dispatch
JSON blobs, this meant there was lots of parsing required in every command
The refactor handles transforming the args prior to dispatch which makes
consuming far simpler
This is also general prep to supporting some basic command framework in other
llms.
* FEATURE: add ai_bot_enabled_chat commands and tune search
This allows admins to disable/enable GPT command integrations.
Also hones search results which were looping cause the result did not denote
the failure properly (it lost context)
* include more context for google command
include more context for time command
* type
```
prompt << build_message(bot_user.username, reply)
```
Would store a "cooked" prompt which is invalid, instead just store the raw
values which are later passed to build_message
Additionally:
1. Disable summary command which needs honing
2. Stop storing decorations (searched for X) in prompt which leads to straying
3. Ship username directly to model, avoiding "user: content" in prompts. This
was causing GPT to stray
Given latest GPT 3.5 16k which is both better steered and supports functions
we can now support rich bot integration.
Clunky system message based steering is removed and instead we use the
function framework provided by Open AI
Previous to this change we were chaining stuff too late and would execute
commands serially leading to very unexpected results
This corrects this and allows us to run stuff like:
> Search google 3/4 times on various permutations of
QUERY and answer this question.
We limit at 5 commands to ensure there are not pathological user cases
where you lean on the LLM to flood us with results.
For the time being smart commands only work consistently on GPT 4.
Avoid using any smart commands on the earlier models.
Additionally adds better error handling to Claude which sometimes streams
partial json and slightly tunes the search command.
* FIX: guide GPT 3.5 better
This limits search results to 10 cause we were blowing the whole token
budget on search results, additionally it includes a quick exchange at
the start of a session to try and guide GPT 3.5 to follow instructions
Sadly GPT 3.5 drifts off very quickly but this does improve stuff a bit.
It also attempts to correct some issues with anthropic, though it still is
surprisingly hard to ground
* add status:public, this is a bit of a hack but ensures that we can search
for any filter provided
* fix specs
* FEATURE: introduce a more efficient formatter
Previous formatting style was space inefficient given JSON consumes lots
of tokens, the new format is now used consistently across commands
Also fixes
- search limited to 10
- search breaking on limit: non existent directive
* Slight improvement to summarizer
Stop blowing up context with custom prompts
* ensure we include the guiding message
* correct spec
* langchain style summarizer ...
much more accurate (albeit more expensive)
* lint
This change-set connects GPT based chat with the forum it runs on. Allowing it to perform search, lookup tags and categories and summarize topics.
The integration is currently restricted to public portions of the forum.
Changes made:
- Do not run ai reply job for small actions
- Improved composable system prompt
- Trivial summarizer for topics
- Image generator
- Google command for searching via Google
- Corrected trimming of posts raw (was replacing with numbers)
- Bypass of problem specs
The feature works best with GPT-4
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
* FEATURE: Less friction for starting a conversation with an AI bot.
This PR adds a new header icon as a shortcut to start a conversation with one of our AI Bots. After clicking and selecting one from the dropdown menu, we'll open the composer with some fields already filled (recipients and title).
If you leave the title as is, we'll queue a job after five minutes to update it using a bot suggestion.
* Update assets/javascripts/initializers/ai-bot-replies.js
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
* Update assets/javascripts/initializers/ai-bot-replies.js
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
---------
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>