Account properly for function calls, don't stream through <details> blocks
- Rush cooked content back to client
- Wait longer (up to 60 seconds) before giving up on streaming
- Clean up message bus channels so we don't have leftover data
- Make ai streamer much more reusable and much easier to read
- If buffer grows quickly, rush update so you are not artificially waiting
- Refine prompt interface
- Fix lost system message when prompt gets long
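The "rush" behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual streamer: the class name `BufferedStreamer`, the `interval` and `rush_size` parameters, and the injectable `clock` are all assumptions made for the example.

```ruby
# Hypothetical sketch: names and thresholds are illustrative, not the plugin's API.
class BufferedStreamer
  def initialize(interval: 0.5, rush_size: 200,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 &on_flush)
    @interval = interval     # normal wait between updates, in seconds
    @rush_size = rush_size   # buffer length that triggers an early update
    @clock = clock
    @on_flush = on_flush
    @buffer = +""
    @last_flush = @clock.call
  end

  # Append a partial completion. Flush when the interval has elapsed,
  # or immediately when the buffer has grown past rush_size.
  def <<(chunk)
    @buffer << chunk
    flush if @buffer.length >= @rush_size || @clock.call - @last_flush >= @interval
    self
  end

  # Send whatever is buffered to the callback and reset the timer.
  def flush
    return if @buffer.empty?

    @on_flush.call(@buffer.dup)
    @buffer.clear
    @last_flush = @clock.call
  end
end
```

A caller would pass a block that publishes the text to the client and call `flush` once more when the completion ends.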
This PR introduces 3 things:
1. A fake bot that can be used locally to test LLMs. To enable it in development, use:
SiteSetting.ai_bot_enabled_chat_bots = "fake"
2. More elegant smooth streaming of progress on LLM completion
This leans on JavaScript to buffer and trickle LLM results through. It also makes the progress dot render much
more consistently.
3. It fixes the Claude dialect
Claude needs newlines **exactly** in the right spots; the dialect was amended to provide them.
---------
Co-authored-by: Martin Brennan <martin@discourse.org>
Prior to this change, it was very hard to tell whether a completion was
stuck.
This introduces a "dot" that follows the completion and starts
flashing after 5 seconds.
It also corrects the syntax around tool support, which was wrong.
Gemini doesn't want us to include messages about previous tool invocations, so I had to shuffle around some code to send the response it generated from those invocations instead. For this, I created the "multi_turn" context, which bundles all the context involved in the interaction.
Chrome and Firefox work with the standard clipboardCopy, but because
we are making an ajax call, Safari fails: it detects too much
delay.
To avoid this issue we pass in promises instead, which are acceptable and
work on iOS.
* FEATURE: allow easy sharing of bot conversations
* Lean on new core API i
* Added system spec for copy functionality
* Update assets/javascripts/initializers/ai-bot-replies.js
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* discourse later instead of setTimeout
* Update spec/system/ai_bot/share_spec.rb
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* feedback from review
just check the whole payload
* remove unneeded code
* fix spec
---------
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
We were limiting to 20 results unconditionally because we had to make
sure search always fit in an 8k context window.
Models such as GPT 3.5 Turbo (16k) and GPT 4 Turbo / Claude 2.1 (over 150k)
allow us to return a lot more results.
This gives the model a much richer understanding because the context is far
larger.
This also allows a persona to tweak this number. In some cases an admin
may want to be conservative and save on tokens by limiting results.
This also tweaks the description of the `limit` param, which GPT-4 liked to set, to tell
the model to use it only when it needs to (and to describe the default behavior).
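As a rough sketch of the idea, the result limit could be derived from the model's context window, with an optional persona override. The method name, the per-result token cost, and the reserved-token figure below are assumptions chosen for illustration; the source does not give the real values.

```ruby
# Hypothetical sketch: constants and the method name are assumptions.
TOKENS_PER_RESULT = 300  # assumed rough token cost of one search result
MIN_RESULTS = 20         # the old unconditional limit, sized for an 8k window

# Returns how many search results to hand to the model.
# persona_limit, when set, wins so an admin can save on tokens.
def max_search_results(context_window_tokens:, persona_limit: nil, reserved_tokens: 2_000)
  return persona_limit if persona_limit

  budget = context_window_tokens - reserved_tokens
  [budget / TOKENS_PER_RESULT, MIN_RESULTS].max
end
```

With these assumed numbers, an 8k window still yields 20 results, while larger windows scale up unless the persona caps them.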
Personas now support providing options for commands.
This PR introduces a single option "base_query" for the SearchCommand. When supplied, all searches the persona performs will also include the pre-supplied filter.
This can allow personas to search a subset of the forum (such as documentation)
This system is extensible: we can add options to any command trivially.
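A minimal sketch of how a "base_query" option might be combined with the model's query. The helper `build_search_query` and the options hash shape are hypothetical; the real SearchCommand internals are not shown here, and the `#documentation` filter is only an illustrative value.

```ruby
# Hypothetical sketch: not the real SearchCommand implementation.
# Prepends the persona's base_query (if any) to the query the model asked for.
def build_search_query(query, options = {})
  base_query = options[:base_query].to_s.strip
  [base_query, query.to_s.strip].reject(&:empty?).join(" ")
end

# Example: a persona restricted to an illustrative documentation filter.
puts build_search_query("install guide", { base_query: "#documentation" })
```

Without a `base_query`, the query passes through unchanged, so personas that do not set the option behave as before.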
* FEATURE: User sentiment on profile summary page
This introduces a new user stat in a user profile summary page.
It will show neutral, positive, or negative according to the dominant
sentiment in the user's last interactions.
The user-stat widget is only rendered for staff.
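One way to pick a dominant sentiment is sketched below. It assumes each recent interaction has already been classified into per-label scores; that hash shape, the method name, and the `:neutral` fallback for users with no data are assumptions, not details from the source.

```ruby
# Hypothetical sketch: input shape and fallback are assumptions.
# classifications: array of hashes such as
#   { positive: 0.7, negative: 0.1, neutral: 0.2 }
def dominant_sentiment(classifications)
  return :neutral if classifications.empty?

  # Sum each label's score across all interactions.
  totals = Hash.new(0.0)
  classifications.each do |scores|
    scores.each { |label, score| totals[label] += score }
  end

  # The label with the largest total is the dominant one.
  totals.max_by { |_label, total| total }.first
end
```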
Co-authored-by: Keegan George <kgeorge13@gmail.com>
* FEATURE: Azure OpenAI support for DALL*E 3
Prior to this there was no way to add an inference endpoint for
DALL*E on Azure because it requires custom URLs.
Also:
- Fix: on save, editing a persona would revert priority and enabled
- More forgiving parsing in command framework for array function calls
- By default generate HD images - they tend to be a bit better
- Improve DALL*E prompt, which was getting very annoying by always echoing what it was about to do
- Add a bit of a sleep between retries on image generation
- Fix error handling in image_command
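The "sleep between retries" item can be sketched as a small helper. The name `with_retries`, its defaults, and the injectable `sleeper` are assumptions for illustration and are not taken from image_command.

```ruby
# Hypothetical sketch: helper name and defaults are assumptions.
# Runs the block, retrying on errors and pausing between attempts.
def with_retries(attempts: 3, delay: 2, sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    raise if tries >= attempts  # out of attempts: surface the error

    sleeper.call(delay)         # brief pause before trying again
    retry
  end
end
```

Re-raising after the last attempt keeps the failure visible to the caller, so error handling can report it instead of silently returning nothing.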
* FIX: no selected persona should pick first prioritized one
Previously we were looking at `.personaId`, but there is only an
`id` attribute, so it failed.
* FEATURE: new DALL-E-3 persona
This persona generates images using DALL-E-3 API and is enabled
by default
Keep in mind that we are still waiting on seeds/gen_id, so we
cannot retain style consistently between turns.
This will change as soon as a new OpenAI API provides the missing
parameters.
Co-authored-by: Martin Brennan <martin@discourse.org>
People tend to keep to 1 persona when working with the bot,
this adds local browser memory for the last persona you interacted
with so you do not need to select it over and over again.
This is per browser, not per user memory.
Also, clean up tests so they do not need to require stubs, which
were breaking the build.
---------
Co-authored-by: Martin Brennan <martin@discourse.org>
Introduces a UI to manage customizable personas (admin only feature)
Part of the change was some extensive internal refactoring:
- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model; system personas (artist/creative etc...) are all seeded. We now ensure name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model via the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs
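The required_commands idea from the list above can be sketched as a simple filter. The `StockPersona` struct, the `available_personas` helper, and the command name "image" are hypothetical stand-ins, not the plugin's real classes.

```ruby
# Hypothetical sketch: struct, helper, and command names are assumptions.
StockPersona = Struct.new(:name, :required_commands, keyword_init: true)

# A stock persona shows up only when every command it requires is configured.
def available_personas(personas, configured_commands)
  personas.select do |persona|
    (persona.required_commands - configured_commands).empty?
  end
end
```

A persona with no required commands is always available, while one that needs an unconfigured command stays hidden until that command is set up.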
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>