This provides new support for messages API from Claude.
It is required for latest model access.
Also corrects implementation of function calls.
* Fix message interleving
* fix broken spec
* add new models to automation
This PR adds a new feature where you can generate captions for images in the composer using AI.
---------
Co-authored-by: Rafael Silva <xfalcox@gmail.com>
1. Personas are now optionally mentionable, meaning that you can mention them either from public topics or PMs
- Mentioning from PMs helps "switch" persona mid conversation, meaning if you want to look up sites setting you can invoke the site setting bot, or if you want to generate an image you can invoke dall e
- Mentioning outside of PMs allows you to inject a bot reply in a topic trivially
- We also add the support for max_context_posts this allow you to limit the amount of context you feed in, which can help control costs
2. Add support for a "random picker" tool that can be used to pick random numbers
3. Clean up routing ai_personas -> ai-personas
4. Add Max Context Posts so users can control how much history a persona can consume (this is important for mentionable personas)
Co-authored-by: Martin Brennan <martin@discourse.org>
- Allow users to supply top_p and temperature values, which means people can fine tune randomness
- Fix bad localization string
- Fix bad remapping of max tokens in gemini
- Add support for top_p as a general param to llms
- Amend system prompt so persona stops treating a user as an adversary
* UX: Validations to Llm-backed features (except AI Bot)
This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to explicit which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs.
Validations are:
* You must choose a model before enabling the feature.
* You must turn off the feature before setting the model to blank.
* You must configure each model settings before being able to select it.
* Add provider name to summarization options
* vLLM can technically support same models as HF
* Check we can talk to the selected model
* Check for Bedrock instead of anthropic as a site could have both creds setup
We were not validating input for generate leading to 2 tests not
failing correctly despite functionality being broken.
This ensures that input is validated,and in turn fixes the broken
specs
Account properly for function calls, don't stream through <details> blocks
- Rush cooked content back to client
- Wait longer (up to 60 seconds) before giving up on streaming
- Clean up message bus channels so we don't have leftover data
- Make ai streamer much more reusable and much easier to read
- If buffer grows quickly, rush update so you are not artificially waiting
- Refine prompt interface
- Fix lost system message when prompt gets long
* REFACTOR: Represent generic prompts with an Object.
* Adds a bit more validation for clarity
* Rewrite bot title prompt and fix quirk handling
---------
Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
It also corrects the syntax around tool support, which was wrong.
Gemini doesn't want us to include messages about previous tool invocations, so I had to shuffle around some code to send the response it generated from those invocations instead. For this, I created the "multi_turn" context, which bundles all the context involved in the interaction.
* DEV: AI bot migration to the Llm pattern.
We added tool and conversation context support to the Llm service in discourse-ai#366, meaning we met all the conditions to migrate this module.
This PR migrates to the new pattern, meaning adding a new bot now requires minimal effort as long as the service supports it. On top of this, we introduce the concept of a "Playground" to separate the PM-specific bits from the completion, allowing us to use the bot in other contexts like chat in the future. Commands are called tools, and we simplified all the placeholder logic to perform updates in a single place, making the flow more one-wayish.
* Followup fixes based on testing
* Cleanup unused inference code
* FIX: text-based tools could be in the middle of a sentence
* GPT-4-turbo support
* Use new LLM API
* FIX: AI helper not working correctly with mixtral
This PR introduces a new function on the generic llm called #generate
This will replace the implementation of completion!
#generate introduces a new way to pass temperature, max_tokens and stop_sequences
Then LLM implementers need to implement #normalize_model_params to
ensure the generic names match the LLM specific endpoint
This also adds temperature and stop_sequences to completion_prompts
this allows for much more robust completion prompts
* port everything over to #generate
* Fix translation
- On anthropic this no longer throws random "This is your translation:"
- On mixtral this actually works
* fix markdown table generation as well
This PR adds tool support to available LLMs. We'll buffer tool invocations and return them instead of making users of this service parse the response.
It also adds support for conversation context in the generic prompt. It includes bot messages, user messages, and tool invocations, which we'll trim to make sure it doesn't exceed the prompt limit, then translate them to the correct dialect.
Finally, It adds some buffering when reading chunks to handle cases when streaming is extremely slow.:M
Previous to this change we relied on explicit loading for a files in Discourse AI.
This had a few downsides:
- Busywork whenever you add a file (an extra require relative)
- We were not keeping to conventions internally ... some places were OpenAI others are OpenAi
- Autoloader did not work which lead to lots of full application broken reloads when developing.
This moves all of DiscourseAI into a Zeitwerk compatible structure.
It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)
To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi
Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections
Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.
* DEV: One LLM abstraction to rule them all
* REFACTOR: HyDE search uses new LLM abstraction
* REFACTOR: Summarization uses the LLM abstraction
* Updated documentation and made small fixes. Remove Bedrock claude-2 restriction