Personas now support providing options for commands.
This PR introduces a single option "base_query" for the SearchCommand. When supplied all searches the persona will perform will also include the pre-supplied filter.
This can allow personas to search a subset of the forum (such as documentation)
This system is extensible we can add options to any command trivially.
c.f. de983796e1b66aa2ab039a4fb6e32cec8a65a098
There will soon be additional login_required checks
for Guardian, and the intent of many checks by automated
systems is better fulfilled by using BasicUser, which
simulates a logged in TL0 forum user, rather than an
anon user.
Previous to this change we relied on explicit loading for a files in Discourse AI.
This had a few downsides:
- Busywork whenever you add a file (an extra require relative)
- We were not keeping to conventions internally ... some places were OpenAI others are OpenAi
- Autoloader did not work which lead to lots of full application broken reloads when developing.
This moves all of DiscourseAI into a Zeitwerk compatible structure.
It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)
To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi
Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections
Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.
We must ensure we can isolate titles, and the models sometimes ignore the example we give them.
Additionally, anons can generate HyDE posts, so we need to check if user is nil when attempting to log requests.
* FEATURE: Azure OpenAI support for DALL*E 3
Previous to this there was no way to add an inference endpoint for
DALL*E on Azure cause it requires custom URLs
Also:
- On save, when editing a persona it would revert priority and enabled
- More forgiving parsing in command framework for array function calls
- By default generate HD images - they tend to be a bit better
- Improve DALL*E prompt which was getting very annoying and always echoing what it is about to do
- Add a bit of a sleep between retries on image generation
- Fix error handling in image_command
* FIX: no selected persona should pick first prioritized one
Previously we were looking at `.personaId` but there is only an
id attribute so it failed
* FEATURE: new DALL-E-3 persona
This persona generates images using DALL-E-3 API and is enabled
by default
Keep in mind that we are still waiting on seeds/gen_id so we can
not retain style consistently between turns.
This will change as soon as a new Open AI API provides the missing
parameters
Co-authored-by: Martin Brennan <martin@discourse.org>
Previous to this changeset we used a custom system for tools/command
support for Anthropic.
We defined commands by using !command as a signal to execute it
Following Anthropic Claude 2.1, there is an official supported syntax (beta)
for tools execution.
eg:
```
+ <function_calls>
+ <invoke>
+ <tool_name>image</tool_name>
+ <parameters>
+ <prompts>
+ [
+ "an oil painting",
+ "a cute fluffy orange",
+ "3 apple's",
+ "a cat"
+ ]
+ </prompts>
+ </parameters>
+ </invoke>
+ </function_calls>
```
This implements the spec per Anthropic, it should be stable enough
to also work on other LLMs.
Keep in mind that OpenAI is not impacted here at all, as it has its
own custom system for function calls.
Additionally:
- Fixes the title system prompt so it works with latest Anthropic
- Uses new spec for "system" messages by Anthropic
- Tweak forum helper persona to guide Anthropic a tiny be better
Overall results are pretty awesome and Anthropic Claude performs
really well now on Discourse
* Revert "FIX: We don't need to prepend anthropic. to bedrock models (#308)"
This reverts commit 8a01751991.
* FIX: Bedrock uses slightly different model names
* DEV: One LLM abstraction to rule them all
* REFACTOR: HyDE search uses new LLM abstraction
* REFACTOR: Summarization uses the LLM abstraction
* Updated documentation and made small fixes. Remove Bedrock claude-2 restriction
People tend to keep to 1 persona when working with the bot,
this adds local browser memory for the last persona you interacted
with so you do not need to select it over and over again.
This is per browser, not per user memory.
Also... clean up tests so they do not need to require stubs which
were breaking the build
---------
Co-authored-by: Martin Brennan <martin@discourse.org>
Introduces a UI to manage customizable personas (admin only feature)
Part of the change was some extensive internal refactoring:
- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure
- name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
- New AiPersona model which can store custom personas
- Persona are restricted via group security
- They can contain custom system messages
- They can support a list of commands optionally
To avoid expensive DB calls in the serializer a Multisite friendly Hash was introduced (which can be expired on transaction commit)
This PR aims to clarify sentiment reports by replacing averages with a count of posts that have one of their values above a threshold (60), meaning we have some level of confidence they are, in fact, positive or negative.
Same thing happen with post emotions, with the difference that a post can have multiple values above it (30). Additionally, we dropped the "Neutral" axis.
We also reworded the tooltip next to each report title, and added an early return to signal we have no data available instead of displaying an empty chart.
This PR adds new reports for displaying information about post sentiments grouped by date and emotions group by TL.
Depends on discourse/discourse#24274
Function calling will start hallucinating if you reshape results.
Previously we were morphing from:
`{ prompts: ["prompt 1", "prompt 2"] }`
to
`{ prompts: { prompt: "prompt 1", seed: 222}, { ... `
This meant that over a few call sequences function_call starts hallucinating an incorrect shape.
This change grounds us even on GPT-3.5
This allows for 2 big features:
1. Artist can ship up to 4 prompts for image generation
2. Artist can regenerate images cause it is aware of seed
This allows for iteration on images maintaining visual style
Adds an AI Helper function when selecting text while viewing a topic.
---------
Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Roman Rizzi <roman@discourse.org>
Also fixes it so users without bot in header can send it messages.
Previous to this change we would seed all bots with database seeds.
This lead to lots of confusion for people who do not enable ai bot.
Instead:
1. We do not seed any bots **until** user enables the ai_bot_enabled setting
2. If it is disabled we will
a. If no messages were created by bot - delete it
b. Otherwise we will deactivate account
Under certain cases, for example:
```
there is this japanese band called kirimi, tell me more about them, try searching 3 times and at least 2 times in japanese before answering.
```
Results come back with blank snippets. This adds protection so this
is allowed and code does not simply blow up.
Per: https://platform.openai.com/docs/api-reference/authentication
There is an organization option which is useful for large orgs
> For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota.
llm_triage supported claude 2 in triage, this implements it
OpenAI rate limits frequently, this introduces some exponential
backoff (3 attempts - 3 seconds, 9 and 27)
Also reduces temp of classifiers so they have consistent behavior
The new automation rule can be used to perform llm based classification and categorization of topics.
You specify a system prompt (which has %%POST%% as an input), if it returns a particular piece of text then we will apply rules such as tagging, hiding, replying or categorizing.
This can be used as a spam filter, a "oops you are in the wrong place" filter and so on.
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
This adds a new creative persona that has access to the underlying
model and no external integrations.
It allows people to use Claude/GPT models in a Discourse agnostic
way.
* FIX: properly truncate !command prompts
### What is going on here?
Previous to this change where a command was issued by the LLM it
could hallucinate a continuation eg:
```
This is what tags are
!tags
some nonsense here
```
This change introduces safeguards so `some nonsense here` does not
creep in to the prompt history, poisoning the llm results
This in effect grounds the llm a lot better and results in the llm
forgetting less about results.
The change only impacts Claude at the moment, but will also improve
stuff for llama 2 in future.
Also, this makes it significantly easier to test the bot framework
without an llm cause we avoid a whole bunch of complex stubbing
* blank is not a valid bot response, do not inject into prompt
We pass the text to the current LLM and ask them to generate a StableDifussion prompt.
We'll use that to generate 4 samples, temporarily creating uploads and returning their short URLs.
* FIX: Made bot more robust
This is a collection of small fixes
- Display "Searching for: ..." while searching instead of showing found 0 results.
- Only allow 5 commands in lang chain - 6 feels like too much
- On the 5th command stop informing the engine about functions, so it is forced to complete
- Add another 30 tokens of buffer and explain why
- Typo in command prompt
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
Note, we perform permission checks on tag list against anon
to ensure we do not disclose information about private tags
to the llm which could get extracted.
In specific scenarios (no special filters or limits) we will also
always include 5 semantic results (at least) with every query.
This effectively means that all very wide queries will always return
20 results, regardless of how complex they are.
Also:
FIX: embedding backfill rake task not working
We renamed internals, this corrects the implementation
* FEATURE: HyDE-powered semantic search.
It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.
We'll use a HyDE-backed approach for semantic search, which consists on generating an hypothetical document from a given keywords, which gets transformed into a vector and used in a asymmetric similarity topic search.
This PR also reorganizes the internals to have less moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.
Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.
* Missing translation and rate limiting
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
The researcher persona has access to Google and can perform
various internet research tasks. At the moment it can not read
web pages, but that is under consideration
Previous to this change we relied on client side settings to
determine if an end user has access to the ai bot.
This meant that if a user was not aware they are a member of a
group (as it is with restricted visibility ones) they would not
see the bot button.
All checking has now moved to the server side, and tests were
added to cover.
This refactor changes it so we only include minimal data in the
system prompt which leaves us lots of tokens for specific searches
The new search command allows us to pull in settings on demand
Descriptions are include in short search results, and names only
in longer results
Also:
* In dev it is important to tell when calls are made to open ai
this adds a console log to increase awareness around token usage
* PERF: stop counting tokens so often
This changes it so we only count tokens once per response
Previously each time we heard back from open ai we would count
tokens, leading to uneeded delays
* bug fix, commands may reach in for tokenizer
* add logging to console for anthropic calls as well
* Update lib/shared/inference/openai_completions.rb
Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
Also adds ai_bot_enabled_personas so admins can tweak which stock
personas are enabled.
The new persona has a full listing of all site settings and is
able to get context for each setting.
This means you can ask it to search through settings for something
relevant.
Security wise there is no access to actual configuration of settings
just to the names / description and implementation.
Previously this was part of the forum helper persona however it
just clashes too much with other behaviors, isolating it makes
it far more powerful.
* sneaking this one in, user_emails is a non obvious table in our
structure.
usually one would assume users has emails so the clarifies a bit
better. plus it is a very common table to hit.
This splits out a bunch of code that used to live inside bots
into a dedicated concept called a Persona.
This allows us to start playing with multiple personas for the bot
Ships with:
artist - for making images
sql helper - for helping with data explorer
general - for everything and anything
Also includes a few fixes that make the generic LLM function implementation more robust
This command can be used to extract information about a discourse
site setting directly from source.
To operate it needs the rg binary in the container.
This fixes 2 big issues:
1. No matter how hard you try, grounding anthropic title prompt
is just too hard. This works around by only looking at the last
sentence it returns and treating as title
2. Non English locales would be stuck with "generic" title, this
ensures every bot message gets a title, using a custom field to
track
Also, slightly tunes some anthropic prompts.
Open AI support function calling, this has a very specific shape
that other LLMs have not quite adopted.
This simulates a command framework using system prompts on LLMs
that are not open AI.
Features include:
- Smart system prompt to steer the LLM
- Parameter validation (we ensure all the params are specified correctly)
This is being tested on Anthropic at the moment and intial results
are promising.
Azure requires a single HTTP endpoint per type of completion.
The settings: `ai_openai_gpt35_16k_url` and `ai_openai_gpt4_32k_url` can be
used now to configure the extra endpoints
This amends token limit which was off a bit due to function calls and fixes
a minor JS issue where we were not testing for a property
previously you would have to wait quite a while to see the prompt this implements
a very basic implementation of progress so you can see the API is working.
Also:
- Fix google progress.
- Handle the incredibly rare, zero results from google.
- Simplify command so it is less error prone
- replace invoke and attache results with a invoke
- ensure invoke can only ever be run once
- pass in all the information a command needs in constructor
- use new pattern throughout
- test invocation in isolation
- Attempt to hint reading is done by sending complete:true
- Do not include post_number in result unless it was sent in
- Rush visual feedback when a command is run (ensure we always revise)
- Include hyperlink in read command description
- Stop round tripping to GPT after image generation (speeds up images by a lot)
- Add a test for image command
This command is useful for reading a topics content. It allows us to perform
critical analysis or suggest answers.
Given 8k token limit in GPT-4 I hardcoded reading to 1500 tokens, but we can
follow up and allow larger windows on models that support more tokens.
On local testing even in this limited form this can be very useful.
* FIX: Google command was including full payload
Additionally there was no truncating happening meaning you could blow token
budget easily on a single search.
This made Google search mostly useless and it would mean that after using
Google we would revert to a clean slate which is very confusing.
* no need for nil there
The command framework had some confusing dispatching where it would dispatch
JSON blobs, this meant there was lots of parsing required in every command
The refactor handles transforming the args prior to dispatch which makes
consuming far simpler
This is also general prep to supporting some basic command framework in other
llms.
TopicQuery already provides a lot of safeguards and options for filtering topic, and enforcing permissions. It makes sense to rely on it as other plugins like discourse-assign do.
As a bonus, we now have access to the current_user while serializing these topics, so users will see things like unread posts count just like we do for the lists.
Claude 1 costs the same and is less good than Claude 2. Make use of Claude
2 in all spots ...
This also fixes streaming so it uses the far more efficient streaming protocol.