This PR adds new reports for displaying information about post sentiments grouped by date and emotions group by TL.
Depends on discourse/discourse#24274
Function calling will start hallucinating if you reshape results.
Previously we were morphing from:
`{ prompts: ["prompt 1", "prompt 2"] }`
to
`{ prompts: { prompt: "prompt 1", seed: 222}, { ... `
This meant that over a few call sequences function_call starts hallucinating an incorrect shape.
This change grounds us even on GPT-3.5
This allows for 2 big features:
1. Artist can ship up to 4 prompts for image generation
2. Artist can regenerate images cause it is aware of seed
This allows for iteration on images maintaining visual style
To ease the administrative burden of enabling the embeddings model, this change introduces automatic backfill when the setting is enabled. It also moves the topic visit embedding creation to a lower priority queue in sidekiq and adds an option to skip embedding computation and persistence when we match on the digest.
Previous to this change image generation did not work on multisite
There was a background thread generating the images and it was
getting site settings from the default site in the cluster
This also removes referer header which is not needed
Adds an AI Helper function when selecting text while viewing a topic.
---------
Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Roman Rizzi <roman@discourse.org>
Also fixes it so users without bot in header can send it messages.
Previous to this change we would seed all bots with database seeds.
This lead to lots of confusion for people who do not enable ai bot.
Instead:
1. We do not seed any bots **until** user enables the ai_bot_enabled setting
2. If it is disabled we will
a. If no messages were created by bot - delete it
b. Otherwise we will deactivate account
This PR addresses the effort to use one icon representing discourse-ai.
Removed discourse-sparkles from discourse-ai, now included in core ``vendor/assets/svg-icons/discourse-additional.svg``
Under certain cases, for example:
```
there is this japanese band called kirimi, tell me more about them, try searching 3 times and at least 2 times in japanese before answering.
```
Results come back with blank snippets. This adds protection so this
is allowed and code does not simply blow up.
Per: https://platform.openai.com/docs/api-reference/authentication
There is an organization option which is useful for large orgs
> For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota.
llm_triage supported claude 2 in triage, this implements it
OpenAI rate limits frequently, this introduces some exponential
backoff (3 attempts - 3 seconds, 9 and 27)
Also reduces temp of classifiers so they have consistent behavior
The new automation rule can be used to perform llm based classification and categorization of topics.
You specify a system prompt (which has %%POST%% as an input), if it returns a particular piece of text then we will apply rules such as tagging, hiding, replying or categorizing.
This can be used as a spam filter, a "oops you are in the wrong place" filter and so on.
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
If a module LLM model is set to claude-2 and the ai_bedrock variables are all present we will use AWS Bedrock instead of Antrhopic own APIs.
This is quite hacky, but will allow us to test the waters with AWS Bedrock early access with every module.
This situation of "same module, completely different API" is quite a bit far from what we had in the OpenAI/Azure separation, so it's more food for thought for when we start working on the LLM abstraction layer soon this year.
This adds a new creative persona that has access to the underlying
model and no external integrations.
It allows people to use Claude/GPT models in a Discourse agnostic
way.
* FIX: properly truncate !command prompts
### What is going on here?
Previous to this change where a command was issued by the LLM it
could hallucinate a continuation eg:
```
This is what tags are
!tags
some nonsense here
```
This change introduces safeguards so `some nonsense here` does not
creep in to the prompt history, poisoning the llm results
This in effect grounds the llm a lot better and results in the llm
forgetting less about results.
The change only impacts Claude at the moment, but will also improve
stuff for llama 2 in future.
Also, this makes it significantly easier to test the bot framework
without an llm cause we avoid a whole bunch of complex stubbing
* blank is not a valid bot response, do not inject into prompt
We pass the text to the current LLM and ask them to generate a StableDifussion prompt.
We'll use that to generate 4 samples, temporarily creating uploads and returning their short URLs.
* FIX: Made bot more robust
This is a collection of small fixes
- Display "Searching for: ..." while searching instead of showing found 0 results.
- Only allow 5 commands in lang chain - 6 feels like too much
- On the 5th command stop informing the engine about functions, so it is forced to complete
- Add another 30 tokens of buffer and explain why
- Typo in command prompt
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
Note, we perform permission checks on tag list against anon
to ensure we do not disclose information about private tags
to the llm which could get extracted.
In specific scenarios (no special filters or limits) we will also
always include 5 semantic results (at least) with every query.
This effectively means that all very wide queries will always return
20 results, regardless of how complex they are.
Also:
FIX: embedding backfill rake task not working
We renamed internals, this corrects the implementation
* FEATURE: HyDE-powered semantic search.
It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.
We'll use a HyDE-backed approach for semantic search, which consists on generating an hypothetical document from a given keywords, which gets transformed into a vector and used in a asymmetric similarity topic search.
This PR also reorganizes the internals to have less moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.
Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.
* Missing translation and rate limiting
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
The researcher persona has access to Google and can perform
various internet research tasks. At the moment it can not read
web pages, but that is under consideration