Commit Graph

102 Commits

Author SHA1 Message Date
Sam bd6f5caeac
FEATURE: Stable diffusion 3 support (#582)
- Adds support for sd3 and sd3 turbo models - this requires new endpoints
- Adds a hack to normalize arrays in the tool calls
- Removes some leftover code
- Adds support for aspect ratio as well so you can generate wide or tall images
2024-04-19 18:08:16 +10:00
Sam 50be66ee63
FEATURE: Gemini 1.5 pro support and Claude Opus bedrock support (#580)
- Updated AI Bot to only support Gemini 1.5 (used to support 1.0) - 1.0 was removed because it is not appropriate for Bot usage
- Summaries and automation can now lean on Gemini 1.5 pro
- Amazon added support for Claude 3 Opus, added internal support for it on bedrock
2024-04-17 15:37:19 +10:00
Sam 4a29f8ed1c
FEATURE: Enhance AI debugging capabilities and improve interface adjustments (#577)
* FIX: various RAG edge cases

- Nicer text to describe RAG, avoids the word RAG
- Do not attempt to save persona when removing uploads and it is not created
- Remove old code that avoided touching rag params on create

* FIX: Missing pause button for persona users

* Feature: allow specific users to debug ai request / response chains

This can help users easily tune RAG and figure out what is going
on with requests.

* discourse helper so it does not explode

* fix test

* simplify implementation
2024-04-15 23:22:06 +10:00
Sam 7f16d3ad43
FEATURE: Cohere Command R support (#558)
- Added Cohere Command models (Command, Command Light, Command R, Command R Plus) to the available model list
- Added a new site setting `ai_cohere_api_key` for configuring the Cohere API key
- Implemented a new `DiscourseAi::Completions::Endpoints::Cohere` class to handle interactions with the Cohere API, including:
   - Translating request parameters to the Cohere API format
   - Parsing Cohere API responses 
   - Supporting streaming and non-streaming completions
   - Supporting "tools" which allow the model to call back to Discourse to look up additional information
- Implemented a new `DiscourseAi::Completions::Dialects::Command` class to translate between the generic Discourse AI prompt format and the Cohere Command format
- Added specs covering the new Cohere endpoint and dialect classes
- Updated `DiscourseAi::AiBot::Bot.guess_model` to map the new Cohere model to the appropriate bot user

In summary, this PR adds support for using the Cohere Command family of models with the Discourse AI plugin. It handles configuring API keys, making requests to the Cohere API, and translating between Discourse's generic prompt format and Cohere's specific format. Thorough test coverage was added for the new functionality.
2024-04-11 07:24:17 +10:00
Rafael dos Santos Silva eb93b21769
FEATURE: Add BGE-M3 embeddings support (#569)
BAAI/bge-m3 is an interesting model that is multilingual and has a
context size of 8192. Even with a 16x larger context, it's only 4x slower
to compute its embeddings in the worst-case scenario.

Also includes a minor refactor of the rake task, including setting model
and concurrency levels when running the backfill task.
2024-04-10 17:24:01 -03:00
Rafael dos Santos Silva 969fbae21e
DEV: Hide quick search setting since it's experimental (#559) 2024-04-05 12:12:37 -03:00
Sam 6f5f34184b
FEATURE: add Claude 3 Haiku bot support (#552)
it is close in performance to GPT 4 at a fraction of the cost,
nice to add it to the mix.

Also improves a test case to simulate streaming, I am hunting for
the "calls" word that is jumping into function calls and can't quite
find it.
2024-04-03 16:06:27 +11:00
Rafael dos Santos Silva 3b8f900486
FIX: Handle unicode on tokenizer (#515)
* FIX: Handle unicode on tokenizer

Our fast-track code broke when strings contained characters that are longer in tokens than
in UTF-8.

Admins can set `DISCOURSE_AI_STRICT_TOKEN_COUNTING: true` in app.yml to ensure token counting is strict, even if slower.
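
The failure mode can be illustrated with a minimal sketch. Everything here is hypothetical: the toy tokenizer, the `fits_in_limit?` name and the exact fast-path check are assumptions made for illustration, not the plugin's code. The fast path trusts the character count, which breaks as soon as one character costs several tokens.

```ruby
# Hypothetical toy tokenizer (an assumption, not the plugin's tokenizer):
# ASCII characters cost one token, any other character costs three.
class ToyTokenizer
  def encode(text)
    text.each_char.flat_map { |c| c.ascii_only? ? [c] : [c, c, c] }
  end
end

# Returns true when `text` fits within `max_tokens`.
# The fast path assumes tokens never outnumber characters and skips
# tokenizing; `strict: true` always runs the tokenizer, which is slower.
def fits_in_limit?(tokenizer, text, max_tokens, strict: false)
  return true if !strict && text.length <= max_tokens

  tokenizer.encode(text).length <= max_tokens
end
```

With five accented characters and a limit of 8, the fast path reports that the text fits, while the strict path counts 15 tokens and rejects it.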


Co-authored-by: wozulong <sidle.pax_0e@icloud.com>
2024-03-14 17:33:30 -03:00
Sam a03bc6ddec
FEATURE: Share conversations with AI via a URL (#521)
This allows users to share a static page of an AI conversation with
the rest of the world.

This feature is disabled by default; it is enabled by turning on
ai_bot_allow_public_sharing via site settings

Precautions are taken when sharing

1. We make a carbonite copy
2. We minimize work generating page
3. We limit to 100 interactions
4. Many security checks - including disallowing if there is a mix
of users in the PM.

* Bonus commit: large PRs like this one did not work with the GitHub tool,
because large objects would destroy the context


Co-authored-by: Martin Brennan <martin@discourse.org>
2024-03-12 16:51:41 +11:00
Keegan George b515b4f66d
FEATURE: AI Quick Semantic Search (#501)
This PR adds AI semantic search to the search popup available on every page.

It depends on several new and optional settings, like per post embeddings and a reranker model, so this is an experimental endeavour.


---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-03-08 13:02:50 -03:00
Sam 2ad743d246
FEATURE: Add GitHub Helper AI Bot persona and tools (#513)
Introduces a new AI Bot persona called 'GitHub Helper' which is specialized in assisting with GitHub-related tasks and questions. It includes the following key changes:

- Implements the GitHub Helper persona class with its system prompt and available tools
   
- Adds three new AI Bot tools for GitHub interactions:
  - github_file_content: Retrieves content of files from a GitHub repository
  - github_pull_request_diff: Retrieves the diff for a GitHub pull request
  - github_search_code: Searches for code in a GitHub repository
    
- Updates the AI Bot dialects to support the new GitHub tools

- Implements multiple function calls for standard tool dialect
2024-03-08 06:37:23 +11:00
Sam 8b382d6098
FEATURE: support for claude opus and sonnet (#508)
This provides new support for messages API from Claude.

It is required for latest model access.

Also corrects implementation of function calls.

* Fix message interleaving

* fix broken spec

* add new models to automation
2024-03-06 06:04:37 +11:00
Rafael dos Santos Silva cf19ce0d72
FEATURE: Handle secure uploads in image caption (#476) 2024-02-19 18:08:19 -03:00
Keegan George a9b2d6a30a
FEATURE: AI image caption (#470)
This PR adds a new feature where you can generate captions for images in the composer using AI.

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-02-19 14:56:28 -03:00
Krzysztof Kotlarek dd6b073fc3
DEV: Make more group-based settings client: false (#474)
Affects the following settings:

ai_toxicity_groups_bypass
ai_helper_allowed_groups
ai_helper_custom_prompts_allowed_groups
post_ai_helper_allowed_groups

This turns off client: true for these group-based settings,
because there is no guarantee that the current user gets all
their group memberships serialized to the client. Better to check
server-side first.
2024-02-19 13:26:24 +11:00
Keegan George 944fd6569c
DEV: Add granular control for AI composer helper features (#458) 2024-02-01 14:58:04 -08:00
Roman Rizzi fba9c1bf2c
UX: Re-introduce embedding settings validations (#457)
* Revert "Revert "UX: Validate embeddings settings (#455)" (#456)"

This reverts commit 392e2e8aef.

* Restore previous default
2024-02-01 16:54:09 -03:00
Roman Rizzi 392e2e8aef
Revert "UX: Validate embeddings settings (#455)" (#456)
This reverts commit 85fca89e01.
2024-02-01 14:06:51 -03:00
Roman Rizzi 85fca89e01
UX: Validate embeddings settings (#455) 2024-02-01 13:05:38 -03:00
Rafael dos Santos Silva 9543ded3ee
DEV: Make per post embeddings a hidden setting (#450) 2024-01-30 15:51:54 -03:00
Roman Rizzi 0634b85a81
UX: Validations to LLM-backed features (except AI Bot) (#436)
* UX: Validations to Llm-backed features (except AI Bot)

This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to make explicit which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs.

Validations are:

* You must choose a model before enabling the feature.
* You must turn off the feature before setting the model to blank.
* You must configure each model's settings before being able to select it.
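
The three rules above could be sketched roughly as follows. This is an illustration only: `LlmFeatureSettings`, its methods and the `configured_models` list are hypothetical names, and the plugin's real validators hook into site settings rather than a plain object.

```ruby
# Hypothetical model of one LLM-backed feature and its model setting.
class LlmFeatureSettings
  attr_reader :enabled, :model

  # configured_models: names of models whose own settings are filled in.
  def initialize(configured_models:)
    @configured_models = configured_models
    @enabled = false
    @model = ""
  end

  # Rule 1: a model must be chosen before the feature is enabled.
  def enable!
    raise ArgumentError, "choose a model first" if @model.empty?

    @enabled = true
  end

  def disable!
    @enabled = false
  end

  # Rule 2: the feature must be off before the model is blanked.
  # Rule 3: only configured models can be selected.
  def model=(name)
    name = name.to_s.strip
    if name.empty?
      raise ArgumentError, "turn the feature off first" if @enabled
    elsif !@configured_models.include?(name)
      raise ArgumentError, "#{name} is not configured"
    end
    @model = name
  end
end
```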

* Add provider name to summarization options

* vLLM can technically support same models as HF

* Check we can talk to the selected model

* Check for Bedrock instead of anthropic as a site could have both creds setup
2024-01-29 16:04:25 -03:00
Sam b2b01185f2
FEATURE: add support for new OpenAI embedding models (#445)
* FEATURE: add support for new OpenAI embedding models

This adds support for just released text_embedding_3_small and large

Note, we have not yet implemented truncation support which is a
new API feature. (triggered using dimensions)

* Tiny side fix, recalc bots when ai is enabled or disabled

* FIX: downsample to 2000 items per vector which is a pgvector limitation
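
The commit does not say how the downsampling is done. One simple possibility, shown purely as an illustration, is to average contiguous buckets of the longer vector; the `downsample` helper below is hypothetical.

```ruby
# Hypothetical helper: shrinks `vector` to at most `target` items by
# averaging contiguous buckets. Shorter vectors are returned unchanged.
def downsample(vector, target = 2000)
  return vector if vector.length <= target

  Array.new(target) do |i|
    # Integer bucket boundaries; each bucket holds at least one item
    # because vector.length > target.
    start = (i * vector.length) / target
    stop = ((i + 1) * vector.length) / target
    bucket = vector[start...stop]
    bucket.sum / bucket.length.to_f
  end
end
```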
2024-01-29 13:24:30 -03:00
Rafael dos Santos Silva 04bc402aae
FEATURE: Setting to control per post embeddings (#439)
* FEATURE: Setting to control per post embeddings
2024-01-23 22:09:27 -03:00
Rafael dos Santos Silva 3be76ebd7a
FEATURE: Move the default embeddings model to bge-large-en (#417) 2024-01-11 14:16:25 -03:00
Rafael dos Santos Silva 8fcba12fae
FEATURE: Support for SRV records for Discourse services (#414)
This allows admins to configure services with multiple backends using DNS SRV records. This PR also adds support for shared secret auth via headers for TEI and vLLM endpoints, so they are in line with the other ones.
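
As a rough sketch of what SRV-based discovery involves, the snippet below resolves SRV records with Ruby's standard `resolv` library and picks one backend. The helper names are hypothetical, and the selection rule is a simplification (lowest priority, then highest weight); RFC 2782 prescribes weighted random selection among records of equal priority.

```ruby
require "resolv"

# Hypothetical helper: fetches SRV records for `name` using the stdlib resolver.
def lookup_srv(name)
  Resolv::DNS.open do |dns|
    dns.getresources(name, Resolv::DNS::Resource::IN::SRV)
  end
end

# Picks one backend from SRV-like records (anything responding to
# priority, weight, target and port). Returns "host:port", or nil when
# there are no records.
def pick_backend(records)
  best = records.min_by { |r| [r.priority, -r.weight] }
  best && "#{best.target}:#{best.port}"
end
```

Keeping the lookup and the selection separate makes the selection rule easy to exercise without network access.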
2024-01-10 19:23:07 -03:00
Keegan George 7201d482d5
FEATURE: Add DallE support to AI helper's illustrate post (#404) 2024-01-05 09:03:23 -08:00
Roman Rizzi 971e03bdf2
FEATURE: AI Bot Gemini support. (#402)
It also corrects the syntax around tool support, which was wrong.

Gemini doesn't want us to include messages about previous tool invocations, so I had to shuffle around some code to send the response it generated from those invocations instead. For this, I created the "multi_turn" context, which bundles all the context involved in the interaction.
2024-01-04 18:15:34 -03:00
Roman Rizzi aa56baad37
FEATURE: Add Mixtral support for AI Bot (#396) 2024-01-04 12:22:43 -03:00
Rafael dos Santos Silva 1287ef4428
FEATURE: Support for Gemini Embeddings (#382) 2023-12-28 10:28:01 -03:00
Rafael dos Santos Silva 76f7940b55
Revert "FEATURE: User sentiment on profile summary page (#329)" (#383)
This reverts commit 71c5077228.
2023-12-28 11:01:57 +11:00
Rafael dos Santos Silva 5db7bf6e68
Mixtral (#376)
Add both Mistral and Mixtral support. Also includes vLLM-openAI inference support.

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-12-26 14:49:55 -03:00
Rafael dos Santos Silva 4d7ccdda2f
FEATURE: DNS SRV support for TEI (#363) 2023-12-18 13:21:21 -03:00
Rafael dos Santos Silva 83744bf192
FEATURE: Support for Gemini in AiHelper / Search / Summarization (#358) 2023-12-15 14:32:01 -03:00
Sam 3c9901d43a
FEATURE: implement GPT-4 turbo support (#345)
Keep in mind:

- GPT-4 is only going to be fully released next year - so this hardcodes preview model for now
- Fixes streaming bugs which became a big problem with GPT-4 turbo
- Adds Azure endpoint for turbo as well

Co-authored-by: Martin Brennan <martin@discourse.org>
2023-12-11 14:59:57 +11:00
Rafael dos Santos Silva 7ac1dbb6b6
FEATURE: Llama2 support in AiHelper (#339) 2023-12-06 17:47:53 -03:00
Rafael dos Santos Silva 71c5077228
FEATURE: User sentiment on profile summary page (#329)
* FEATURE: User sentiment on profile summary page

This introduces a new user stat in a user profile summary page.

It will show either neutral/positive/negative according to the dominant
sentiment in the user's last interactions.

The user-stat widget is only rendered for staff.


Co-authored-by: Keegan George <kgeorge13@gmail.com>
2023-12-04 18:17:43 -03:00
Rafael dos Santos Silva fd0fb58eca
FEATURE: HuggingFace Text Embeddings Inference compatibility (#323)
* FEATURE: HuggingFace Text Embeddings Inference compatibility

* lint
2023-11-28 17:05:26 -03:00
Sam 5a4598a7b4
FEATURE: Azure OpenAI support for DALL*E 3 (#313)
* FEATURE: Azure OpenAI support for DALL*E 3

Previously there was no way to add an inference endpoint for
DALL*E on Azure because it requires custom URLs

Also:

- On save, when editing a persona it would revert priority and enabled
- More forgiving parsing in command framework for array function calls
- By default generate HD images - they tend to be a bit better
- Improve DALL*E prompt which was getting very annoying and always echoing what it is about to do
- Add a bit of a sleep between retries on image generation
- Fix error handling in image_command
2023-11-27 13:01:05 +11:00
Sam 5b5edb22c6
FEATURE: UI to update ai personas on admin page (#290)
Introduces a UI to manage customizable personas (admin only feature)

Part of the change was some extensive internal refactoring:

- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs

---------

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2023-11-21 16:56:43 +11:00
Rafael dos Santos Silva 3c55ea8fc0
FEATURE: Automatic Chat Thread titles (#269)
* FEATURE: Automatic Chat Thread titles

* do not gen title for empty threads

* make it default disabled for now
2023-10-30 11:56:33 -03:00
Rafael dos Santos Silva 818b20fb6f
FEATURE: Make embeddings turn-key (#261)
To ease the administrative burden of enabling the embeddings model, this change introduces automatic backfill when the setting is enabled. It also moves the topic visit embedding creation to a lower priority queue in sidekiq and adds an option to skip embedding computation and persistence when we match on the digest.
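
The digest check can be pictured with a small in-memory sketch. The `EmbeddingStore` class, the choice of SHA1 and the injected embedder block are all assumptions for illustration; the plugin persists embeddings in the database.

```ruby
require "digest"

# Hypothetical in-memory store: recomputes an embedding only when the
# text's digest differs from the one saved alongside the stored vector.
class EmbeddingStore
  def initialize(&embedder)
    @embedder = embedder # block that turns text into a vector
    @rows = {}           # topic_id => { digest:, vector: }
  end

  def upsert(topic_id, text)
    digest = Digest::SHA1.hexdigest(text)
    row = @rows[topic_id]
    # Digest match: skip both computation and persistence.
    return row[:vector] if row && row[:digest] == digest

    vector = @embedder.call(text)
    @rows[topic_id] = { digest: digest, vector: vector }
    vector
  end
end
```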
2023-10-26 12:07:37 -03:00
Rafael dos Santos Silva 0e5764617a
FEATURE: AI helper on posts (#244)
Adds an AI Helper function when selecting text while viewing a topic.

---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Roman Rizzi <roman@discourse.org>
2023-10-23 11:41:36 -03:00
Sam 9242da545e
FEATURE: support OpenAI-Organization header (#245)
Per: https://platform.openai.com/docs/api-reference/authentication

There is an organization option which is useful for large orgs

> For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota.
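
A minimal sketch of attaching the header with Ruby's `net/http` follows. The `build_openai_request` helper and its parameters are hypothetical; only the `OpenAI-Organization` header name comes from the documentation quoted above. The request is built but not sent.

```ruby
require "net/http"
require "uri"
require "json"

# Hypothetical helper: builds a chat completions request and adds the
# OpenAI-Organization header only when an organization is configured.
def build_openai_request(api_key:, organization: nil, payload: {})
  uri = URI("https://api.openai.com/v1/chat/completions")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["Authorization"] = "Bearer #{api_key}"
  request["OpenAI-Organization"] = organization if organization && !organization.empty?
  request.body = JSON.generate(payload)
  request
end
```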
2023-10-06 10:23:18 +11:00
Rafael dos Santos Silva 84cc369552
FEATURE: Bge-large-en embeddings via Cloudflare Workers AI API (#241)
* FEATURE: Bge-large-en embeddings via Cloudflare Workers AI API

* forgot a file

* lint
2023-10-04 13:47:51 -03:00
Rafael dos Santos Silva 102f47c1c4
FEATURE: Allow Anthropic inference via AWS Bedrock (#235)
If a module's LLM model is set to claude-2 and the ai_bedrock variables are all present we will use AWS Bedrock instead of Anthropic's own APIs.

This is quite hacky, but will allow us to test the waters with AWS Bedrock early access with every module.

This situation of "same module, completely different API" is quite far from what we had in the OpenAI/Azure separation, so it's more food for thought for when we start working on the LLM abstraction layer later this year.
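
The selection rule from the first paragraph might look roughly like this. The setting keys and the `use_bedrock?` name are hypothetical stand-ins for the ai_bedrock variables, which the commit message does not list.

```ruby
# Hypothetical names for the ai_bedrock settings (assumptions).
BEDROCK_KEYS = %i[access_key_id secret_access_key region].freeze

# True when the model is claude-2 and every Bedrock setting is present,
# in which case requests would go to AWS Bedrock rather than Anthropic.
def use_bedrock?(model, settings)
  return false unless model == "claude-2"

  BEDROCK_KEYS.all? { |key| !settings[key].to_s.strip.empty? }
end
```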
2023-10-02 12:58:36 -03:00
Sam ed7d1f06d1
FIX: improve token counting (#234)
We were running out of tokens under certain conditions (really long
chains)

Add more buffer.
2023-09-28 15:32:22 +10:00
Sam aa463d64f1
FEATURE: Add creative persona (#231)
This adds a new creative persona that has access to the underlying
model and no external integrations.

It allows people to use Claude/GPT models in a Discourse agnostic
way.
2023-09-27 10:48:38 +10:00
Keegan George 2e5a39360a
FEATURE: Create custom prompts with composer AI helper (#214)
* DEV: Add icon support

* DEV: Add basic setup for custom prompt menu

* FEATURE: custom prompt backend

* fix custom prompt param check

* fix custom prompt replace

* WIP

* fix custom prompt usage

* fixes

* DEV: Update front-end

* DEV: No more custom prompt state

* DEV: Add specs

* FIX: Title/Category/Tag suggestions

Suggestion dropdowns broke because `messages_with_user_input(user_input)` expects a hash now.

* DEV: Apply syntax tree

* DEV: Restrict custom prompts to configured groups

* oops

* fix tests

* lint

* I love tests

* lint is cool tho

---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2023-09-25 15:12:54 -03:00
Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords, which gets transformed into a vector and used in an asymmetric similarity topic search.

This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.

Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.
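
The flow described above can be sketched in a few lines. All names are hypothetical: `generate` stands in for the LLM completion, `embed` for the embeddings model, a plain Hash for the Redis cache, and an in-memory Hash of topic vectors with cosine similarity for the database search.

```ruby
# Cosine similarity between two equal-length vectors.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Hypothetical HyDE search: keywords -> hypothetical document -> vector
# -> nearest topics.
class HydeSearch
  # generate: callable, keywords -> hypothetical document text
  # embed:    callable, text -> vector
  # topics:   Hash of topic_id => vector
  # cache:    Hash standing in for Redis
  def initialize(generate:, embed:, topics:, cache: {})
    @generate = generate
    @embed = embed
    @topics = topics
    @cache = cache
  end

  def search(keywords, limit: 3)
    # Reuse the cached vector when the same keywords were seen before.
    vector = @cache[keywords] ||= @embed.call(@generate.call(keywords))
    @topics
      .sort_by { |_id, topic_vector| -cosine(vector, topic_vector) }
      .first(limit)
      .map(&:first)
  end
end
```

The search is asymmetric in the sense that the query side embeds a generated document rather than the raw keywords, while the topic side embeds real content.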

* Missing translation and rate limiting

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00
Sam e3abbd9f46
FEATURE: add researcher persona (#181)
The researcher persona has access to Google and can perform
various internet research tasks. At the moment it cannot read
web pages, but that is under consideration.
2023-09-04 12:05:27 +10:00