llm_triage nominally supported Claude 2, but the support was never implemented; this implements it.
OpenAI rate limits frequently, so this introduces exponential
backoff (3 attempts, sleeping 3, 9, and 27 seconds).
Also reduces the temperature of the classifiers so they behave more consistently.
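A minimal sketch of that retry/backoff loop, assuming a hypothetical `classify!` helper and a `RateLimitError` raised on HTTP 429 (neither is the plugin's actual API):

```ruby
# Illustrative only: retry up to 3 times on rate limits, backing off
# exponentially, and keep classifier temperature low for consistency.
RateLimitError = Class.new(StandardError)

def classify_with_backoff(post)
  retries = 0
  begin
    classify!(post, temperature: 0.0) # low temperature => consistent classifications
  rescue RateLimitError
    raise if retries >= 3
    retries += 1
    sleep(3**retries) # back off 3s, then 9s, then 27s
    retry
  end
end
```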
The new automation rule can be used to perform LLM-based classification and categorization of topics.
You specify a system prompt (which takes %%POST%% as an input); if the model returns a particular piece of text, we apply actions such as tagging, hiding, replying, or categorizing.
This can be used as a spam filter, an "oops, you are in the wrong place" filter, and so on.
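A hedged sketch of the rule's flow; `llm_completion` and the action helpers are hypothetical stand-ins, not the plugin's real API:

```ruby
# Substitute the post into the configured system prompt, ask the model,
# and act only when the reply contains the configured trigger text.
def triage(post, system_prompt:, search_for_text:, action:)
  prompt = system_prompt.sub("%%POST%%", post.raw)
  reply = llm_completion(prompt, temperature: 0.0)

  return unless reply.to_s.downcase.include?(search_for_text.downcase)

  case action
  when :hide  then hide_post(post)
  when :tag   then tag_topic(post.topic, "needs-review")
  when :reply then reply_to(post, "It looks like this may be in the wrong place.")
  end
end
```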
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
If a module's LLM model is set to claude-2 and the ai_bedrock variables are all present, we will use AWS Bedrock instead of Anthropic's own APIs.
This is quite hacky, but will allow us to test the waters with AWS Bedrock early access with every module.
This "same module, completely different API" situation is quite different from what we had with the OpenAI/Azure separation, so it's more food for thought for when we start working on the LLM abstraction layer later this year.
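A rough sketch of that dispatch, with illustrative setting names (not necessarily the plugin's exact ones):

```ruby
# Route claude-2 requests to AWS Bedrock only when every ai_bedrock_*
# credential is configured; otherwise keep using Anthropic's API.
def completion_endpoint(model, settings)
  bedrock_keys = %i[ai_bedrock_access_key_id ai_bedrock_secret_access_key ai_bedrock_region]
  bedrock_configured = settings.values_at(*bedrock_keys).all? { |v| !v.to_s.empty? }

  model == "claude-2" && bedrock_configured ? :aws_bedrock : :anthropic
end
```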
This adds a new creative persona that has access to the underlying
model and no external integrations.
It allows people to use Claude/GPT models in a Discourse-agnostic
way.
* FIX: properly truncate !command prompts
### What is going on here?
Prior to this change, when a command was issued by the LLM it
could hallucinate a continuation, e.g.:
```
This is what tags are
!tags
some nonsense here
```
This change introduces safeguards so `some nonsense here` does not
creep into the prompt history and poison the LLM results.
In effect this grounds the LLM much better, so it forgets less
about the results it has seen.
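A small sketch of the safeguard, with an illustrative command list:

```ruby
# Keep the reply only up to (and including) the first !command line so any
# hallucinated continuation never reaches the prompt history.
COMMANDS = %w[!tags !search !categories]

def truncate_at_command(reply)
  lines = reply.lines
  idx = lines.index { |line| COMMANDS.any? { |cmd| line.strip.start_with?(cmd) } }
  idx ? lines[0..idx].join : reply
end

truncate_at_command("This is what tags are\n!tags\nsome nonsense here\n")
# => "This is what tags are\n!tags\n"
```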
The change only impacts Claude at the moment, but will also improve
things for Llama 2 in the future.
Also, this makes it significantly easier to test the bot framework
without an LLM, because we avoid a whole bunch of complex stubbing.
* A blank reply is not a valid bot response, so do not inject it into the prompt.
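A minimal sketch of that guard over a hypothetical message history:

```ruby
# Skip blank replies entirely instead of feeding them back to the model.
def append_bot_reply(history, reply)
  return history if reply.to_s.strip.empty?

  history << { role: "assistant", content: reply }
end
```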
We pass the text to the current LLM and ask it to generate a Stable Diffusion prompt.
We'll use that to generate 4 samples, temporarily creating uploads and returning their short URLs.
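Loosely, the flow looks like the sketch below; every helper name here is hypothetical:

```ruby
# Ask the LLM for a Stable Diffusion prompt, render four samples, and
# return short URLs for the temporary uploads.
def illustrate(text)
  sd_prompt = llm_completion("Write a Stable Diffusion prompt for: #{text}")

  4.times.map do
    image = generate_stable_diffusion_image(sd_prompt)
    create_temporary_upload(image).short_url
  end
end
```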
* FIX: Made bot more robust
This is a collection of small fixes
- Display "Searching for: ..." while searching instead of showing found 0 results.
- Only allow 5 commands in the lang chain - 6 feels like too much
- On the 5th command, stop informing the engine about functions so it is forced to complete (see the sketch after this list)
- Add another 30 tokens of buffer and explain why
- Typo in command prompt
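A sketch of the five-command cap mentioned above; `available_functions`, `llm_completion`, `extract_command`, and `run_command` are illustrative names:

```ruby
MAX_COMMANDS = 5

def run_chain(prompt)
  reply = nil

  MAX_COMMANDS.times do |round|
    # on the final round, advertise no functions so the model must complete
    functions = round == MAX_COMMANDS - 1 ? [] : available_functions
    reply = llm_completion(prompt, functions: functions)

    command = extract_command(reply)
    break if command.nil?

    prompt += run_command(command)
  end

  reply
end
```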
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
I thought this wasn't necessary and we could safely rely on the appEvent during the initial search.
It only fires if #searchEnabled is true, meaning the search term is valid.
Note: we perform permission checks on the tag list as an anonymous user
to ensure we do not disclose information about private tags
to the LLM, where it could later be extracted.
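Roughly, the check looks like this; `Guardian.new` with no user is Discourse's anonymous guardian, while `visible_tag_names_for` is a hypothetical helper:

```ruby
# Build the tag list the LLM sees with anonymous permissions so hidden or
# private tags are never disclosed in the prompt.
def tag_list_for_llm
  anon_guardian = Guardian.new # no user => anonymous permission checks
  visible_tag_names_for(anon_guardian).join(", ")
end
```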
In specific scenarios (no special filters or limits) we will also
always include at least 5 semantic results with every query.
This effectively means that all very wide queries will always return
20 results, regardless of how complex they are.
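A sketch of how that semantic top-up could work, with hypothetical helpers:

```ruby
MIN_SEMANTIC_RESULTS = 5

# When the query carries no special filters or limits, blend in at least
# five semantically similar topics alongside the keyword results.
def blended_results(query, filtered:)
  results = keyword_results(query)
  return results if filtered

  (results + semantic_results(query, limit: MIN_SEMANTIC_RESULTS)).uniq
end
```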
Also:
FIX: embedding backfill rake task not working
We renamed internals; this corrects the implementation.
* FEATURE: HyDE-powered semantic search.
It relies on the new outlet added in discourse/discourse#23390 to display semantic search results in an unobtrusive way.
We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords; that document is then transformed into a vector and used in an asymmetric similarity topic search.
This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAO-ish classes for vector-related operations like transformations and querying.
Completions and vectors created by HyDE will remain cached in Redis for now, but we could later use Postgres instead.
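At a high level the HyDE flow reads like the sketch below; `llm_completion`, `embed`, `nearest_topics`, and `cache_fetch` are hypothetical stand-ins, with the cache standing in for the Redis layer mentioned above:

```ruby
# keywords -> hypothetical document -> embedding -> asymmetric similarity
# search against topic embeddings, with intermediate results cached.
def hyde_search(keywords)
  doc = cache_fetch("hyde-doc:#{keywords}") do
    llm_completion("Write a forum post about: #{keywords}")
  end

  vector = cache_fetch("hyde-vec:#{keywords}") { embed(doc) }

  nearest_topics(vector, limit: 20)
end
```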
* Missing translation and rate limiting
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>