* FIX: properly truncate !command prompts
### What is going on here?
Previous to this change where a command was issued by the LLM it
could hallucinate a continuation eg:
```
This is what tags are
!tags
some nonsense here
```
This change introduces safeguards so `some nonsense here` does not
creep in to the prompt history, poisoning the llm results
This in effect grounds the llm a lot better and results in the llm
forgetting less about results.
The change only impacts Claude at the moment, but will also improve
stuff for llama 2 in future.
Also, this makes it significantly easier to test the bot framework
without an llm cause we avoid a whole bunch of complex stubbing
* blank is not a valid bot response, do not inject into prompt
We pass the text to the current LLM and ask them to generate a StableDifussion prompt.
We'll use that to generate 4 samples, temporarily creating uploads and returning their short URLs.
* FIX: Made bot more robust
This is a collection of small fixes
- Display "Searching for: ..." while searching instead of showing found 0 results.
- Only allow 5 commands in lang chain - 6 feels like too much
- On the 5th command stop informing the engine about functions, so it is forced to complete
- Add another 30 tokens of buffer and explain why
- Typo in command prompt
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
I thought this wasn't neccessary and we could safely rely on the appEvent during the initial search.
It only fires if #searchEnabled is true, meaning the search term is valid.
Note, we perform permission checks on tag list against anon
to ensure we do not disclose information about private tags
to the llm which could get extracted.
In specific scenarios (no special filters or limits) we will also
always include 5 semantic results (at least) with every query.
This effectively means that all very wide queries will always return
20 results, regardless of how complex they are.
Also:
FIX: embedding backfill rake task not working
We renamed internals, this corrects the implementation
* FEATURE: HyDE-powered semantic search.
It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.
We'll use a HyDE-backed approach for semantic search, which consists on generating an hypothetical document from a given keywords, which gets transformed into a vector and used in a asymmetric similarity topic search.
This PR also reorganizes the internals to have less moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.
Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.
* Missing translation and rate limiting
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
The researcher persona has access to Google and can perform
various internet research tasks. At the moment it can not read
web pages, but that is under consideration
Previous to this change we relied on client side settings to
determine if an end user has access to the ai bot.
This meant that if a user was not aware they are a member of a
group (as it is with restricted visibility ones) they would not
see the bot button.
All checking has now moved to the server side, and tests were
added to cover.
This refactor changes it so we only include minimal data in the
system prompt which leaves us lots of tokens for specific searches
The new search command allows us to pull in settings on demand
Descriptions are include in short search results, and names only
in longer results
Also:
* In dev it is important to tell when calls are made to open ai
this adds a console log to increase awareness around token usage
* PERF: stop counting tokens so often
This changes it so we only count tokens once per response
Previously each time we heard back from open ai we would count
tokens, leading to uneeded delays
* bug fix, commands may reach in for tokenizer
* add logging to console for anthropic calls as well
* Update lib/shared/inference/openai_completions.rb
Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
Also adds ai_bot_enabled_personas so admins can tweak which stock
personas are enabled.
The new persona has a full listing of all site settings and is
able to get context for each setting.
This means you can ask it to search through settings for something
relevant.
Security wise there is no access to actual configuration of settings
just to the names / description and implementation.
Previously this was part of the forum helper persona however it
just clashes too much with other behaviors, isolating it makes
it far more powerful.
* sneaking this one in, user_emails is a non obvious table in our
structure.
usually one would assume users has emails so the clarifies a bit
better. plus it is a very common table to hit.
This splits out a bunch of code that used to live inside bots
into a dedicated concept called a Persona.
This allows us to start playing with multiple personas for the bot
Ships with:
artist - for making images
sql helper - for helping with data explorer
general - for everything and anything
Also includes a few fixes that make the generic LLM function implementation more robust