Commit Graph

53 Commits

Author SHA1 Message Date
Rafael dos Santos Silva 5c50d2aa09
FEATURE: Use stop_sequences for faster HyDE searches with Claude (#203) 2023-09-06 10:06:31 -03:00
Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords; that document is transformed into a vector and used in an asymmetric similarity topic search.
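
As a rough illustration of the flow described above, here is a minimal sketch and not the plugin's implementation. The helper names (`generate_hypothetical_document`, `embed`, `hyde_search`) are hypothetical, the LLM call is replaced by a template string, and the embedding is a toy bag-of-words vector rather than a real model.

```python
import math
from collections import Counter


def generate_hypothetical_document(keywords: str) -> str:
    # Assumption: stands in for an LLM completion that writes a plausible
    # document answering the search keywords.
    return f"A forum topic discussing {keywords} with questions and answers about {keywords}."


def embed(text: str) -> Counter:
    # Assumption: toy bag-of-words vector; a real setup would call an
    # embeddings model here.
    return Counter(text.lower().replace(".", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(keywords: str, topics: list[str], limit: int = 3) -> list[str]:
    # The query vector comes from the hypothetical document, not from the
    # raw keywords, which is what makes the search asymmetric.
    query_vector = embed(generate_hypothetical_document(keywords))
    ranked = sorted(topics, key=lambda t: cosine(query_vector, embed(t)), reverse=True)
    return ranked[:limit]
```

For example, `hyde_search("email settings", ["How to configure email settings", "Best pizza recipes"], limit=1)` ranks the email topic first.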

This PR also reorganizes the internals to have less moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.

Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.

* Missing translation and rate limiting

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00
Sam 181113159b
FIX: setting explorer was exceeding token budget
This refactor changes it so we only include minimal data in the
system prompt which leaves us lots of tokens for specific searches

The new search command allows us to pull in settings on demand

Descriptions are included in short search results, and names only
in longer results

Also: 

* In dev it is important to tell when calls are made to open ai
this adds a console log to increase awareness around token usage

* PERF: stop counting tokens so often

This changes it so we only count tokens once per response

Previously each time we heard back from open ai we would count
tokens, leading to unneeded delays

* bug fix, commands may reach in for tokenizer

* add logging to console for anthropic calls as well

* Update lib/shared/inference/openai_completions.rb

Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
2023-09-01 11:48:51 +10:00
Sam db19e37748
FEATURE: add initial support for personas (#172)
This splits out a bunch of code that used to live inside bots
into a dedicated concept called a Persona.

This allows us to start playing with multiple personas for the bot

Ships with:

artist - for making images
sql helper - for helping with data explorer
general - for everything and anything
 
Also includes a few fixes that make the generic LLM function implementation more robust
2023-08-30 16:15:03 +10:00
Sam 7d943be7b2
FIX: automatic bot titles missing sometimes (#151)
This fixes 2 big issues:

1. No matter how hard you try, grounding the anthropic title prompt
is just too hard. This works around it by only looking at the last
sentence it returns and treating that as the title

2. Non English locales would be stuck with "generic" title, this
ensures every bot message gets a title, using a custom field to
track
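
The first workaround above (keep only the last sentence of the reply and treat it as the title) could look roughly like this. It is a sketch under assumptions: `extract_title` is a hypothetical helper, and the sentence splitting is a simple regex rather than whatever the plugin actually uses.

```python
import re


def extract_title(completion: str) -> str:
    # Split on sentence-ending punctuation followed by whitespace, or on
    # line breaks, then keep only the final non-empty piece.
    pieces = re.split(r"(?<=[.!?])\s+|\n+", completion.strip())
    sentences = [p.strip() for p in pieces if p.strip()]
    if not sentences:
        return ""
    # Drop surrounding quotes and a trailing period from the chosen sentence.
    return sentences[-1].strip(" \"'").rstrip(".")
```

A chatty reply such as `Sure! Here is a title:` followed by `"Debugging Redis timeouts"` on a later line yields `Debugging Redis timeouts`.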

Also, slightly tunes some anthropic prompts.
2023-08-24 07:20:24 +10:00
Sam f0e1c72aa7
FEATURE: implement command framework for non Open AI (#147)
Open AI supports function calling, which has a very specific shape
that other LLMs have not quite adopted.

This simulates a command framework using system prompts on LLMs
that are not open AI.

Features include:

- Smart system prompt to steer the LLM
- Parameter validation (we ensure all the params are specified correctly)
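
A minimal sketch of the idea, assuming a made-up wire format (`!name {json arguments}`) and made-up command definitions; the real prompt and syntax used by the plugin are not shown in this log.

```python
import json

# Hypothetical command registry: name -> description and typed parameters.
COMMANDS = {
    "search": {"description": "Search forum topics", "params": {"query": str, "limit": int}},
    "time": {"description": "Current time in a timezone", "params": {"timezone": str}},
}


def build_system_prompt(commands: dict) -> str:
    # Steers the LLM towards emitting a single, machine-parseable line.
    lines = ["To run a command, reply with one line: !name {json arguments}", "Available commands:"]
    for name, spec in commands.items():
        params = ", ".join(f"{p}: {t.__name__}" for p, t in spec["params"].items())
        lines.append(f"!{name} ({params}) - {spec['description']}")
    return "\n".join(lines)


def parse_command(reply: str, commands: dict):
    # Returns (name, args) for a valid command, None for plain text,
    # and raises ValueError when a command is malformed.
    lines = reply.strip().splitlines()
    if not lines or not lines[0].startswith("!"):
        return None
    name, _, raw_args = lines[0][1:].partition(" ")
    if name not in commands:
        raise ValueError(f"unknown command: {name}")
    args = json.loads(raw_args or "{}")  # JSONDecodeError is a ValueError
    expected = commands[name]["params"]
    if not isinstance(args, dict) or set(args) != set(expected):
        raise ValueError(f"expected parameters {sorted(expected)}")
    for param, kind in expected.items():
        if not isinstance(args[param], kind):
            raise ValueError(f"{param} must be of type {kind.__name__}")
    return name, args
```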

This is being tested on Anthropic at the moment and initial results
are promising.
2023-08-23 07:49:36 +10:00
Sam b4477ecdcd
FEATURE: support 16k and 32k variants for Azure GPT (#140)
Azure requires a single HTTP endpoint per type of completion.

The settings: `ai_openai_gpt35_16k_url` and `ai_openai_gpt4_32k_url` can be
used now to configure the extra endpoints

This amends token limit which was off a bit due to function calls and fixes
a minor JS issue where we were not testing for a property
2023-08-17 11:00:11 +10:00
Roman Rizzi b076e43d67
FEATURE: streaming mode for the FoldContent strategy. (#134) 2023-08-11 15:08:54 -03:00
Rafael dos Santos Silva eb7fff3a55
FEATURE: Add support for StableBeluga and Upstage Llama2 instruct (#126)
* FEATURE: Add support for StableBeluga and Upstage Llama2 instruct

This means we support all models in the top3 of the Open LLM Leaderboard

Since some of those models have RoPE, we now have a setting so you can
customize the token limit depending on which model you use.
2023-08-03 15:29:30 -03:00
Rafael dos Santos Silva 8b157feea5
FEATURE: Compatibility with protected Hugging Face Endpoints (#123)
* FEATURE: Compatibility with protected Hugging Face Endpoints
2023-08-02 17:00:00 -03:00
Sam 602bb843ea
FEATURE: add support for final stable diffusion xl model (#122) 2023-08-02 16:53:28 -03:00
Rafael dos Santos Silva 3e7c99de89
FEATURE: Support for locally inferred embeddings in 100 languages (#115)
* FEATURE: Support for locally inferred embeddings in 100 languages

* add table
2023-07-27 15:50:03 -03:00
Rafael dos Santos Silva b25daed60b
FEATURE: Llama2 for summarization (#116) 2023-07-27 13:55:32 -03:00
Sam 4b0c077ce5
FEATURE: port to use claude-2 for chat bot (#114)
Claude 1 costs the same and is less good than Claude 2. Make use of Claude
2 in all spots ...

This also fixes streaming so it uses the far more efficient streaming protocol.
2023-07-27 11:24:44 +10:00
Rafael dos Santos Silva 5e3f4e1b78
FEATURE: Embeddings to main db (#99)
* FEATURE: Embeddings to main db

This commit moves our embeddings store from an external configurable PostgreSQL
instance back into the main database. This is done to simplify the setup.

There is a migration that will try to import the external embeddings into
the main DB if it is configured and there are rows.

It removes support for embedding models that aren't all_mpnet_base_v2 or OpenAI
text_embedding_ada_002. However it will now be easier to add new models.

It also now takes into account:
  - topic title
  - topic category
  - topic tags
  - replies (as much as the model allows)

We introduce an interface so we can eventually support multiple strategies
for handling long topics.

This PR severely damages the semantic search performance, but this is
temporary until we can adapt HyDE to make semantic search use the same
embeddings we have for semantic related with good performance.

Here we also have some ground work to add post level embeddings, but this
will be added in a future PR.

Please note that this PR will also block Discourse from booting / updating if 
this plugin is installed and the pgvector extension isn't available on the 
PostgreSQL instance Discourse uses.
2023-07-13 12:41:36 -03:00
Rafael dos Santos Silva 9d10a152b9
FEATURE: Claude 2 for summarization and AIHelper (#101) 2023-07-13 12:32:08 -03:00
Roman Rizzi 1b568f2391
FIX: Claude's max_tokens_to_sample is a required field (#97) 2023-06-27 14:42:33 -03:00
Roman Rizzi 9a79afcdbf
DEV: Better strategies for summarization (#88)
* DEV: Better strategies for summarization

The strategy responsibility needs to be "Given a collection of texts, I know how to summarize them most efficiently, using the minimum amount of requests and maximizing token usage".

There are different token limits for each model, so it all boils down to two different strategies:

1. Fold all these texts into a single one, doing the summarization in chunks, and then build a summary from those.
2. Build it by combining texts in a single prompt, and truncate it according to your token limits.

While the latter is less than ideal, we need it for "bart-large-cnn-samsum" and "flan-t5-base-samsum", both with low limits. The rest will rely on folding.
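
The two strategies can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, tokens are approximated by whitespace-separated words, and `summarize` is any callable that sends one prompt to a model and returns its summary.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def fold_strategy(texts: list[str], summarize: Callable[[str], str], token_limit: int) -> str:
    # Pack texts into chunks that fit the limit, summarize each chunk,
    # then summarize the concatenated partial summaries.
    chunks, current = [], []
    for text in texts:
        if current and count_tokens(" ".join(current + [text])) > token_limit:
            chunks.append(" ".join(current))
            current = []
        current.append(text)  # an oversized single text becomes its own chunk
    if current:
        chunks.append(" ".join(current))
    partial_summaries = [summarize(chunk) for chunk in chunks]
    return summarize(" ".join(partial_summaries))


def truncate_strategy(texts: list[str], summarize: Callable[[str], str], token_limit: int) -> str:
    # Combine everything into one prompt and cut it at the token limit,
    # so only a single request is made.
    combined = " ".join(texts).split()
    return summarize(" ".join(combined[:token_limit]))
```

Folding costs one request per chunk plus a final one, while truncation always costs a single request but discards whatever falls past the limit.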

* Expose summarized chunks to users
2023-06-27 12:26:33 -03:00
Sam d1ab79e82f
FEATURE: Add Azure cognitive service support (#93)
The new site settings:

ai_openai_gpt35_url : distribution for GPT 16k
ai_openai_gpt4_url: distribution for GPT 4
ai_openai_embeddings_url: distribution for ada2

If untouched we will simply use OpenAI endpoints.

Azure requires 1 URL per model, OpenAI allows a single URL to serve multiple models. Hence the new settings.
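
A sketch of how such per-model settings might be resolved, assuming an empty setting means "use OpenAI". The setting names come from this commit; the model-to-setting mapping and the `endpoint_for` helper are hypothetical.

```python
# OpenAI serves several models from the same URL.
OPENAI_DEFAULTS = {
    "gpt-3.5-turbo-16k": "https://api.openai.com/v1/chat/completions",
    "gpt-4": "https://api.openai.com/v1/chat/completions",
    "text-embedding-ada-002": "https://api.openai.com/v1/embeddings",
}

# Assumption: which site setting overrides which model.
SETTING_FOR_MODEL = {
    "gpt-3.5-turbo-16k": "ai_openai_gpt35_url",
    "gpt-4": "ai_openai_gpt4_url",
    "text-embedding-ada-002": "ai_openai_embeddings_url",
}


def endpoint_for(model: str, site_settings: dict) -> str:
    # Azure needs one deployment URL per model, so each model has its own
    # override; an unset or blank override falls back to OpenAI.
    override = site_settings.get(SETTING_FOR_MODEL[model])
    return override or OPENAI_DEFAULTS[model]
```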
2023-06-21 10:39:51 +10:00
Sam 70c158cae1
FEATURE: add full bot support for GPT 3.5 (#87)
Given latest GPT 3.5 16k which is both better steered and supports functions
we can now support rich bot integration.

Clunky system message based steering is removed and instead we use the
function framework provided by Open AI
2023-06-20 08:45:31 +10:00
Rafael dos Santos Silva e457c687ca
FIX: OpenAI Tokenizer was failing to truncate mid emojis (#91)
* FIX: OpenAI Tokenizer was failing to truncate mid emojis

* Update spec/shared/tokenizer.rb

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>

---------

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
2023-06-16 15:15:36 -03:00
Sam 840968630e
FEATURE: disable smart commands on Claude and GPT 3.5 (#84)
For the time being smart commands only work consistently on GPT 4.
Avoid using any smart commands on the earlier models.

Additionally adds better error handling to Claude which sometimes streams
partial json and slightly tunes the search command.
2023-06-01 09:10:33 +10:00
Rafael dos Santos Silva b213fe7f94
FIX: Give up trying to reuse the DB connection and rely on pgbouncer (#79) 2023-05-23 15:12:59 -03:00
Rafael dos Santos Silva 262ed4753e
FEATURE: Basic StableDiffusion text2img support (#72) 2023-05-20 09:38:08 +10:00
Rafael dos Santos Silva 739b314312
Fixes for embeddings and truncate (#67) 2023-05-18 09:21:28 +10:00
Rafael dos Santos Silva 3c9513e754
Refinements to embeddings and tokenizers (#61)
* Refinements to embeddings and tokenizers

* lint

* Truncate with tokenizers for summary

* fix
2023-05-15 15:10:42 -03:00
Rafael dos Santos Silva 66bf4c74c6
FEATURE: Handle invalid media in NSFW module (#57)
* FEATURE: Handle invalid media in NSFW module

* fix lint
2023-05-11 15:35:39 -03:00
Roman Rizzi 7e3cb0ea16
FEATURE: Multi-model support for the AI Bot module. (#56)
We'll create one bot user for each available model. When listed in the `ai_bot_enabled_chat_bots` setting, they will reply.

This PR lets us use Claude-v1 in stream mode.
2023-05-11 10:03:03 -03:00
Sam e76fc77189
fixes (#53)
* Minor... use username suggester in case username already exists

* FIX: ensure we truncate long prompts

Previously we

1. Used raw length instead of token counts for counting length
2. We totally dropped a prompt if it was too long

New implementation will truncate "raw" if it gets too long maintaining
meaning.
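
For illustration, truncating by token count instead of raw length, and shortening rather than dropping the prompt, might look like the sketch below. The whitespace tokenizer and both helper names are assumptions, not the plugin's code.

```python
def tokenize(text: str) -> list[str]:
    # Assumption: whitespace tokens stand in for a real model tokenizer.
    return text.split()


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    tokens = tokenize(text)
    if len(tokens) <= max_tokens:
        return text
    return " ".join(tokens[:max_tokens])


def build_prompt(instructions: str, raw: str, max_tokens: int) -> str:
    # Reserve room for the instructions, then fit as much of the post's
    # "raw" text as the remaining token budget allows.
    budget = max_tokens - len(tokenize(instructions))
    if budget <= 0:
        raise ValueError("instructions alone exceed the token budget")
    return instructions + "\n" + truncate_to_tokens(raw, budget)
```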
2023-05-06 07:31:53 -03:00
Roman Rizzi 71b105a1bb
FEATURE: Introduce the ai-bot module (#52)
This module lets you chat with our GPT bot inside a PM. The bot only replies to members of the groups listed on the ai_bot_allowed_groups setting and only if you invite it to participate in the PM.
2023-05-05 15:28:31 -03:00
Sam 2cd60a4b3b
FEATURE: add a table to audit OpenAI usage (#45)
Still need to build a job to purge logs
2023-04-26 11:44:29 +10:00
Sam 057fbe1ce6
FEATURE: add internal support for streaming mode (#42)
Also adds some tests around completions and supports additional params
such as top_p, temperature and max_tokens

This also migrates off Faraday to using Net::HTTP directly
2023-04-21 16:54:25 +10:00
Rafael dos Santos Silva 9783e3b025
FEATURE: Add a basic tokenizer API (#37)
* FEATURE: Add a basic tokenizer API

* Add tests

* lint
2023-04-19 11:55:59 -03:00
Rafael dos Santos Silva bb0b829634
FEATURE: Anthropic Claude for AIHelper and Summarization modules (#39) 2023-04-10 11:04:42 -03:00
Rafael dos Santos Silva 5549e4d5b3
FEATURE: Chat channel summarization. (#32)
* start summary module

* chat channel summarization

* FEATURE: modal for channel summarization

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-04-04 11:24:09 -03:00
Rafael dos Santos Silva b942a18298
FEATURE: Support for GPT-4 in AI Helper module (#29) 2023-03-28 23:22:34 -03:00
Roman Rizzi 4c960970fa
DEV: Log information about errors from the completions OpenAI API (#26) 2023-03-22 16:00:28 -03:00
Sam 1d14f7ffaf
FEATURE: Add a markdown table AI helper (#25) 2023-03-22 13:16:29 -03:00
Roman Rizzi 39f7f1f29e
FEATURE: Prompts can consist of multiple messages. (#21)
A prompt with multiple messages leads to better results, as the AI can learn from the given examples. Alongside this change, we provide a better default proofreading prompt.
2023-03-21 12:04:59 -03:00
Roman Rizzi fea9041ee1
DEV: Use 10s timeout when using the completions API (#19) 2023-03-20 16:43:51 -03:00
Joffrey JAFFEUX edfdc6dfae
DEV: applies chat namespacing (#12) 2023-03-17 15:15:38 +01:00
Rafael dos Santos Silva 80d662e9e8
FEATURE: Semantic Suggested Topics (#10) 2023-03-15 17:21:45 -03:00
Roman Rizzi f99fe7e1ed
FEATURE: Composer AI helper (#8)
* FEATURE: Composer AI helper

This change introduces a new composer button for the group members listed in the `ai_helper_allowed_groups` site setting.

Users can use ChatGPT to review, improve, or translate their posts to English.

* Add a safeguard for PMs and don't rely on parentView
2023-03-15 17:02:20 -03:00
Roman Rizzi aa2fca6086
DEV: DiscourseAI -> DiscourseAi rename to have consistent folders and files (#9) 2023-03-14 16:03:50 -03:00
Rafael dos Santos Silva 510c6487e3
DEV: Preparation work for multiple inference providers (#5) 2023-03-07 16:14:39 -03:00
Roman Rizzi a838116cd5
FEATURE: Use dedicated reviewables for AI flags. (#4)
This change adds two new reviewable types: ReviewableAIPost and ReviewableAIChatMessage. They have the same actions as their existing counterparts: ReviewableFlaggedPost and ReviewableChatMessage.

We'll display the model used and their accuracy when showing these flags in the review queue and adjust the latter after staff performs an action, tracking a global accuracy per existing model in a separate table.
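
One way to picture the per-model accuracy tracking is the sketch below. It assumes accuracy is simply the percentage of flags that staff agreed with; the class and its in-memory storage are hypothetical stand-ins for the separate table mentioned above.

```python
from collections import defaultdict


class ModelAccuracy:
    def __init__(self) -> None:
        # model name -> counts of staff decisions on its flags
        self._counts = defaultdict(lambda: {"agreed": 0, "disagreed": 0})

    def record(self, model: str, staff_agreed: bool) -> None:
        # Called after staff acts on a reviewable flagged by `model`.
        key = "agreed" if staff_agreed else "disagreed"
        self._counts[model][key] += 1

    def accuracy(self, model: str) -> int:
        # Percentage of flags staff agreed with; 0 when nothing was reviewed.
        counts = self._counts.get(model)
        total = counts["agreed"] + counts["disagreed"] if counts else 0
        return round(100 * counts["agreed"] / total) if total else 0
```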


* FEATURE: Dedicated reviewables for AI flags

* Store and adjust model accuracy

* Display accuracy in reviewable templates
2023-03-07 15:39:28 -03:00
Roman Rizzi 676d3ce6b2
DEV: Rename XClassification --> XClassificator to make it more obvious (#3) 2023-02-28 11:17:03 -03:00
Roman Rizzi b9a650fde4
DEV: Dedicated table for saving classification results (#1) 2023-02-27 16:21:40 -03:00
Roman Rizzi 5f9597474c
REFACTOR: Streamline flag and classification process 2023-02-24 13:25:02 -03:00
Roman Rizzi 85768cfb1c
FEATURE: Classify posts looking for NSFW images 2023-02-24 09:11:58 -03:00