This PR adds AI semantic search to the search pop available on every page.
It depends on several new and optional settings, like per post embeddings and a reranker model, so this is an experimental endeavour.
---------
Co-authored-by: Rafael Silva <xfalcox@gmail.com>
* DEV: improve internal design of ai persona and bug fix
- Fixes bug where OpenAI could not describe images
- Fixes bug where mentionable personas could not be mentioned unless overarching bot was enabled
- Improves internal design of playground and bot to allow better for non "bot" users
- Allow PMs directly to persona users (previously bot user would also have to be in PM)
- Simplify internal code
Co-authored-by: Martin Brennan <martin@discourse.org>
Utilizes the check for secure upload permissions from core PR
https://github.com/discourse/discourse/pull/25758 and cleans up
controller codes and spec code to reuse existing code and better
reflect reality.
This PR adds a new feature where you can generate captions for images in the composer using AI.
---------
Co-authored-by: Rafael Silva <xfalcox@gmail.com>
1. Personas are now optionally mentionable, meaning that you can mention them either from public topics or PMs
- Mentioning from PMs helps "switch" persona mid conversation, meaning if you want to look up sites setting you can invoke the site setting bot, or if you want to generate an image you can invoke dall e
- Mentioning outside of PMs allows you to inject a bot reply in a topic trivially
- We also add the support for max_context_posts this allow you to limit the amount of context you feed in, which can help control costs
2. Add support for a "random picker" tool that can be used to pick random numbers
3. Clean up routing ai_personas -> ai-personas
4. Add Max Context Posts so users can control how much history a persona can consume (this is important for mentionable personas)
Co-authored-by: Martin Brennan <martin@discourse.org>
* FEATURE: allow personas to supply top_p and temperature params
Code assistance generally are more focused at a lower temperature
This amends it so SQL Helper runs at 0.2 temperature vs the more
common default across LLMs of 1.0.
Reduced temperature leads to more focused, concise and predictable
answers for the SQL Helper
* fix tests
* This is not perfect, but far better than what we do today
Instead of fishing for
1. Draft sequence
2. Draft body
We skip (2), this means the composer "only" needs 1 http request to
open, we also want to eliminate (1) but it is a bit of a trickier
core change, may figure out how to pull it off (defer it to first draft save)
Value of bot drafts < value of opening bot conversations really fast
The idea is to increase the frequency so we can run with smaller batch sizes.
Big batches cause problems when running backups, so it's better to have shorter but
more frequent jobs.
1. on failure we were queuing a job to generate embeddings, it had the wrong params. This is both fixed and covered in a test.
2. backfill embedding in the order of bumped_at, so newest content is embedded first, cover with a test
3. add a safeguard for hidden site setting that only allows batches of 50k in an embedding job run
Previously old embeddings were updated in a random order, this changes it so we update in a consistent order
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
Followup 2636efcd1b,
whenever ruby code was changed locally this would break
module loading, giving an "uninitialized constant
DiscourseAi::Embeddings::EntryPoint::SemanticRelated" error.
* DEV: AI bot migration to the Llm pattern.
We added tool and conversation context support to the Llm service in discourse-ai#366, meaning we met all the conditions to migrate this module.
This PR migrates to the new pattern, meaning adding a new bot now requires minimal effort as long as the service supports it. On top of this, we introduce the concept of a "Playground" to separate the PM-specific bits from the completion, allowing us to use the bot in other contexts like chat in the future. Commands are called tools, and we simplified all the placeholder logic to perform updates in a single place, making the flow more one-wayish.
* Followup fixes based on testing
* Cleanup unused inference code
* FIX: text-based tools could be in the middle of a sentence
* GPT-4-turbo support
* Use new LLM API
* FIX: AI helper not working correctly with mixtral
This PR introduces a new function on the generic llm called #generate
This will replace the implementation of completion!
#generate introduces a new way to pass temperature, max_tokens and stop_sequences
Then LLM implementers need to implement #normalize_model_params to
ensure the generic names match the LLM specific endpoint
This also adds temperature and stop_sequences to completion_prompts
this allows for much more robust completion prompts
* port everything over to #generate
* Fix translation
- On anthropic this no longer throws random "This is your translation:"
- On mixtral this actually works
* fix markdown table generation as well
Currently we're seeing 500s when related_topics are getting rendered. We should get the topic's category rather than on the array.
```
ActionView::Template::Error (undefined method `category' for [#<Topic id ... ]
```
Personas now support providing options for commands.
This PR introduces a single option "base_query" for the SearchCommand. When supplied all searches the persona will perform will also include the pre-supplied filter.
This can allow personas to search a subset of the forum (such as documentation)
This system is extensible we can add options to any command trivially.
Previous to this change we relied on explicit loading for a files in Discourse AI.
This had a few downsides:
- Busywork whenever you add a file (an extra require relative)
- We were not keeping to conventions internally ... some places were OpenAI others are OpenAi
- Autoloader did not work which lead to lots of full application broken reloads when developing.
This moves all of DiscourseAI into a Zeitwerk compatible structure.
It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)
To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi
Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections
Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.
We must ensure we can isolate titles, and the models sometimes ignore the example we give them.
Additionally, anons can generate HyDE posts, so we need to check if user is nil when attempting to log requests.
Introduces a UI to manage customizable personas (admin only feature)
Part of the change was some extensive internal refactoring:
- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure
- name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
- New AiPersona model which can store custom personas
- Persona are restricted via group security
- They can contain custom system messages
- They can support a list of commands optionally
To avoid expensive DB calls in the serializer a Multisite friendly Hash was introduced (which can be expired on transaction commit)
This PR adds new reports for displaying information about post sentiments grouped by date and emotions group by TL.
Depends on discourse/discourse#24274
Adds an AI Helper function when selecting text while viewing a topic.
---------
Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Roman Rizzi <roman@discourse.org>
We pass the text to the current LLM and ask them to generate a StableDifussion prompt.
We'll use that to generate 4 samples, temporarily creating uploads and returning their short URLs.
* FEATURE: HyDE-powered semantic search.
It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.
We'll use a HyDE-backed approach for semantic search, which consists on generating an hypothetical document from a given keywords, which gets transformed into a vector and used in a asymmetric similarity topic search.
This PR also reorganizes the internals to have less moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.
Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.
* Missing translation and rate limiting
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
* FEATURE: Embeddings to main db
This commit moves our embeddings store from an external configurable PostgreSQL
instance back into the main database. This is done to simplify the setup.
There is a migration that will try to import the external embeddings into
the main DB if it is configured and there are rows.
It removes support from embeddings models that aren't all_mpnet_base_v2 or OpenAI
text_embedding_ada_002. However it will now be easier to add new models.
It also now takes into account:
- topic title
- topic category
- topic tags
- replies (as much as the model allows)
We introduce an interface so we can eventually support multiple strategies
for handling long topics.
This PR severely damages the semantic search performance, but this is a
temporary until we can get adapt HyDE to make semantic search use the same
embeddings we have for semantic related with good performance.
Here we also have some ground work to add post level embeddings, but this
will be added in a future PR.
Please note that this PR will also block Discourse from booting / updating if
this plugin is installed and the pgvector extension isn't available on the
PostgreSQL instance Discourse uses.