Commit Graph

110 Commits

Author SHA1 Message Date
Sam d75e3ca82b
FEATURE: include tag and category context in search (#217)
Previously we only included the title and body. Tags and category
structure can be critical for decision making.
2023-09-12 16:09:28 +10:00
Sam b0310f90d3
FEATURE: add tags and categories to read context (#215)
Note: we perform permission checks on the tag list against anon
to ensure we do not disclose information about private tags
to the LLM, where it could be extracted.
2023-09-12 11:06:55 +10:00
Roman Rizzi 0828254d61
FIX: Generate embeddings job was broken (#211)
* FIX: Use correct methods to generate embeddings

* FIX: Generate embeddings job was broken
2023-09-07 11:54:43 -03:00
Sam 615eb8b440
FEATURE: add semantic search with hyde bot (#210)
In specific scenarios (no special filters or limits) we will also
always include 5 semantic results (at least) with every query.

This effectively means that all very wide queries will always return
20 results, regardless of how complex they are.

Also: 

FIX: embedding backfill rake task not working
We renamed internals; this corrects the implementation.
2023-09-07 13:25:26 +10:00
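One way to read the "always include 5 semantic results" behavior from 615eb8b440 is as a blend of two ranked lists. The sketch below is an assumption about the mechanism, not the plugin's code: `blend_results`, `MAX_RESULTS` and `MIN_SEMANTIC` are hypothetical names, and the page size of 20 is taken from the commit message.

```ruby
MAX_RESULTS = 20 # page size mentioned in the commit message
MIN_SEMANTIC = 5 # slots reserved for semantic results (assumed mechanism)

# keyword_ids and semantic_ids are topic ids ordered by relevance.
def blend_results(keyword_ids, semantic_ids)
  semantic_only = semantic_ids - keyword_ids
  reserved = [MIN_SEMANTIC, semantic_only.length].min
  keyword_part = keyword_ids.first(MAX_RESULTS - reserved)
  (keyword_part + semantic_only).first(MAX_RESULTS)
end

semantic = (100..140).to_a
narrow = blend_results([1, 2, 3], semantic)   # few keyword hits, padded with semantic ones
wide = blend_results((1..30).to_a, semantic)  # many keyword hits, 5 slots kept for semantic
puts narrow.length
puts wide.count { |id| semantic.include?(id) }
```

With this reading, a query with few keyword matches is still padded to a full page, and a very wide query keeps room for semantic matches.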
Keegan George abe96d5533
DEV: Strip out old modal based AI helper (#209)
2023-09-06 13:28:47 -07:00
Keegan George 0733ff7e67
UX: Show suggestion buttons only if sufficient content is present (#204)
2023-09-06 12:20:08 -07:00
Keegan George 8d674c451a
FIX: Flaky spec in AI Helper modal (#208)
2023-09-06 10:15:11 -07:00
Keegan George aa08f2d2a0
FIX: Flaky Spec (#207)
2023-09-06 09:46:03 -07:00
Roman Rizzi 13d63f1f30
FIX: filter allowed categories from semantic search results (#206)
2023-09-06 10:00:20 -03:00
Alan Guo Xiang Tan 920d4d8c0c
DEV: Skip broken test on CI (#205)
2023-09-06 09:33:43 +08:00
Rafael dos Santos Silva 4b42c09814
FEATURE: Tweak HyDE prompts for better grounding in forum subject and limit response size (#200)
* FEATURE: Tweak HyDE prompts for better grounding in forum subject and limit response size

* fix test

* lint
2023-09-05 16:11:07 -03:00
Keegan George ae0238c616
FIX: Flaky spec (#197)
2023-09-05 09:56:12 -07:00
Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added in discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords. That document is transformed into a vector and used in an asymmetric similarity topic search.

This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.

Completions and vectors created by HyDE will remain cached in Redis for now, but we could later use Postgres instead.

* Missing translation and rate limiting

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00
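The HyDE flow described in 2c0f535bab can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the plugin's code: `generate_hypothetical_document` stands in for an LLM completion, `embed` is a bag-of-words counter rather than a real embedding model, and a plain Hash stands in for the Redis cache. All names are hypothetical.

```ruby
VOCABULARY = %w[backup restore database email login password].freeze

# Stand-in for an LLM call: expand keywords into a hypothetical forum post.
def generate_hypothetical_document(keywords)
  "A forum post explaining #{keywords} step by step, including #{keywords} tips."
end

# Toy embedding: count how often each vocabulary word appears.
def embed(text)
  words = text.downcase.scan(/[a-z]+/)
  VOCABULARY.map { |term| words.count(term).to_f }
end

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

HYDE_CACHE = {} # stand-in for Redis; caches the vector per keyword string

# The query side embeds a generated document; the topic side embeds real content.
def hyde_search(keywords, topics, limit: 1)
  query_vector = HYDE_CACHE[keywords] ||= embed(generate_hypothetical_document(keywords))
  topics
    .map { |title, body| [title, cosine_similarity(query_vector, embed(body))] }
    .sort_by { |_, score| -score }
    .first(limit)
    .map(&:first)
end

TOPICS = {
  "Restoring a backup" => "How to restore a database backup after a failed upgrade",
  "Login problems" => "Users cannot login after a password reset email",
  "Theme colors" => "Changing the colors of the default theme",
}.freeze

puts hyde_search("database backup", TOPICS).inspect # => ["Restoring a backup"]
```

The point of the indirection is that a short keyword query is compared against topics through a longer generated text, which tends to sit closer to real posts in vector space than the bare keywords would.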
Sam 38af2ca63e
FIX: cut completion short after function call is found (#182)
Previously we would keep completing and throw away the
result.
2023-09-05 10:37:58 +10:00
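The idea behind 38af2ca63e, stopping as soon as a function call shows up in a streamed completion, might look like the sketch below. The `Chunk` struct and `read_until_function_call` are hypothetical; real streaming payloads have a different shape.

```ruby
# Hypothetical streamed chunk: either some text or a function call.
Chunk = Struct.new(:text, :function_call, keyword_init: true)

# Consume chunks until a function call appears, then stop reading.
# Returns the text gathered so far and the function call (or nil).
def read_until_function_call(stream)
  text = +""
  stream.each do |chunk|
    return [text, chunk.function_call] if chunk.function_call
    text << chunk.text.to_s
  end
  [text, nil]
end

chunks = [
  Chunk.new(text: "Let me search. "),
  Chunk.new(function_call: { name: "search", arguments: { query: "backups" } }),
  Chunk.new(text: "this text would previously have been generated and discarded"),
]

text, call = read_until_function_call(chunks)
puts "text=#{text.inspect} call=#{call[:name]}"
```

Returning from inside the loop means the remaining chunks are never requested, which is where the time and token savings would come from.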
Rafael dos Santos Silva 3c4a53b2cb
FEATURE: Better link in Claude summaries (#183)
* FEATURE: Better link in Claude summaries

* lint
2023-09-04 12:04:47 -03:00
Sam e3abbd9f46
FEATURE: add researcher persona (#181)
The researcher persona has access to Google and can perform
various internet research tasks. At the moment it cannot read
web pages, but that is under consideration.
2023-09-04 12:05:27 +10:00
Sam 3f9973586e
FIX: ai_bot_allowed_groups now works with restricted visibility (#180)
Previous to this change we relied on client side settings to
determine if an end user has access to the ai bot.

This meant that if a user was not aware they are a member of a
group (as it is with restricted visibility ones) they would not
see the bot button.

All checking has now moved to the server side, and tests were
added to cover this.
2023-09-04 11:52:44 +10:00
Rafael dos Santos Silva 43e485cbd9
FEATURE: Additional AI suggestion options (#176)
2023-09-01 17:10:58 -07:00
Sam 181113159b
FIX: setting explorer was exceeding token budget
This refactor changes it so we only include minimal data in the
system prompt, which leaves us lots of tokens for specific searches.

The new search command allows us to pull in settings on demand.

Descriptions are included in short search results, and names only
in longer results.

Also: 

* In dev it is important to tell when calls are made to OpenAI;
this adds a console log to increase awareness around token usage

* PERF: stop counting tokens so often

This changes it so we only count tokens once per response.

Previously, each time we heard back from OpenAI we would count
tokens, leading to unneeded delays.

* bug fix, commands may reach in for tokenizer

* add logging to console for Anthropic calls as well

* Update lib/shared/inference/openai_completions.rb

Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
2023-09-01 11:48:51 +10:00
Loïc Guitaut 65091690eb
DEV: Don’t use `Chat::MessageCreator` in specs
As message creation is being rewritten in
https://github.com/discourse/discourse/pull/22390, a new way of using
the underlying service to create chat messages has been implemented in
https://github.com/discourse/discourse/pull/23222.

This patch uses the new fabricator option, which will prevent breaking
specs from this plugin when the main PR is merged.
2023-08-31 11:30:07 +02:00
Sam 00d69b463e
FEATURE: new site setting explorer persona (#178)
Also adds ai_bot_enabled_personas so admins can tweak which stock
personas are enabled.

The new persona has a full listing of all site settings and is
able to get context for each setting.

This means you can ask it to search through settings for something
relevant.

Security-wise, there is no access to the actual configuration of settings,
just to the names, descriptions and implementation.

Previously this was part of the forum helper persona; however, it
clashed too much with other behaviors. Isolating it makes
it far more powerful.

* sneaking this one in: user_emails is a non-obvious table in our
structure.

Usually one would assume the users table has emails, so this clarifies
things a bit better. Plus, it is a very common table to hit.
2023-08-31 17:02:03 +10:00
Sam db19e37748
FEATURE: add initial support for personas (#172)
This splits out a bunch of code that used to live inside bots
into a dedicated concept called a Persona.

This allows us to start playing with multiple personas for the bot

Ships with:

artist - for making images
sql helper - for helping with data explorer
general - for everything and anything
 
Also includes a few fixes that make the generic LLM function implementation more robust
2023-08-30 16:15:03 +10:00
Keegan George 7457feced8
FEATURE: Show suggested title prompt in new location (#171)
2023-08-29 09:45:53 -07:00
Sam 8fdb88604f
FIX: trim first space when getting a reply from anthropic (#164)
Anthropic loves sending a pointless leading space with completions;
this throws off the command framework.
2023-08-29 10:57:36 +10:00
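A minimal sketch of the fix in 8fdb88604f, assuming the reply arrives as a plain string. `trim_leading_space` is a hypothetical helper name; the plugin may do this differently.

```ruby
# Remove at most one leading space; leave all other whitespace untouched.
def trim_leading_space(completion)
  completion.sub(/\A /, "")
end

puts trim_leading_space(" !search query").inspect # => "!search query"
```

Anchoring the pattern with `\A` means only the very first character is considered, so spaces elsewhere in the completion are left alone.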
Sam b14cb864dc
FEATURE: add setting_context experimental command (#160)
This command can be used to extract information about a discourse
site setting directly from source.

To operate it needs the rg binary in the container.
2023-08-29 10:43:58 +10:00
Keegan George fba419f864
UX: Clicking outside editor should close context menu (#170)
2023-08-28 15:08:51 -07:00
Keegan George 7790313b1b
DEV: Add review menu state (#159)
2023-08-24 17:49:24 -07:00
Keegan George 65c6b5e16c
DEV: Add keybindings (#157)
- Ability to Esc to close context menu
- Ability to Ctrl/Cmd + Z to undo results
2023-08-24 08:35:53 +10:00
Keegan George 78558b9cf5
DEV: Remove context menu timeout (#156)
2023-08-23 15:12:07 -07:00
Sam 7d943be7b2
FIX: automatic bot titles missing sometimes (#151)
This fixes 2 big issues:

1. No matter how hard you try, grounding the Anthropic title prompt
is just too hard. This works around it by only looking at the last
sentence it returns and treating it as the title.

2. Non-English locales would be stuck with a "generic" title. This
ensures every bot message gets a title, using a custom field to
track it.

Also, slightly tunes some Anthropic prompts.
2023-08-24 07:20:24 +10:00
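The "last sentence as title" workaround from 7d943be7b2 could be approximated as below. This is a sketch under assumptions: `extract_title` is a hypothetical name, and the sentence splitting here is a simple regex rather than whatever the plugin uses.

```ruby
# Treat the last sentence (or line) of a chatty reply as the title.
def extract_title(reply)
  parts = reply.strip.split(/\n+|(?<=[.!?:])\s+/).reject(&:empty?)
  # Drop surrounding quotes and a trailing period, if any.
  parts.last.to_s.gsub(/\A["']+|["'.]+\z/, "")
end

reply = "Sure! Here is a title for the conversation:\n\n\"Restoring backups on a new server\""
puts extract_title(reply) # prints: Restoring backups on a new server
```

This sidesteps the preamble a model tends to add before the title, at the cost of occasionally cutting a title that itself contains sentence punctuation.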
Keegan George 6df850d473
FEATURE: AI Helper Context Menu (#148)
2023-08-23 10:35:40 -07:00
Sam f0e1c72aa7
FEATURE: implement command framework for non Open AI (#147)
OpenAI supports function calling, which has a very specific shape
that other LLMs have not quite adopted.

This simulates a command framework using system prompts on LLMs
that are not OpenAI.

Features include:

- Smart system prompt to steer the LLM
- Parameter validation (we ensure all the params are specified correctly)

This is being tested on Anthropic at the moment and initial results
are promising.
2023-08-23 07:49:36 +10:00
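A prompt-driven command framework with parameter validation, as described in f0e1c72aa7, might be sketched like this. The `!command key=value` reply syntax, the `COMMANDS` table and every method name are assumptions made for illustration; the commit does not specify the actual format.

```ruby
# Hypothetical command registry: required and optional parameter names.
COMMANDS = {
  "search" => { required: %w[query], optional: %w[limit] },
  "read" => { required: %w[topic_id], optional: %w[post_number] },
}.freeze

# System prompt that tells the LLM which commands it may emit.
def system_prompt
  lines = COMMANDS.map do |name, spec|
    params = spec[:required] + spec[:optional].map { |param| "[#{param}]" }
    "!#{name} #{params.join(" ")}"
  end
  "You can run a command by replying with a single line:\n#{lines.join("\n")}"
end

# Parse a reply such as: !search query="site backups" limit=5
def parse_command(reply)
  line = reply.lines.map(&:strip).find { |candidate| candidate.start_with?("!") }
  return nil unless line

  name, rest = line[1..].split(" ", 2)
  args = rest.to_s.scan(/(\w+)=("[^"]*"|\S+)/).to_h do |key, value|
    [key, value.delete_prefix('"').delete_suffix('"')]
  end
  { name: name, args: args }
end

# Returns a list of problems; an empty list means the command is valid.
def validate(command)
  spec = COMMANDS[command[:name]]
  return ["unknown command #{command[:name]}"] unless spec

  missing = spec[:required] - command[:args].keys
  unknown = command[:args].keys - spec[:required] - spec[:optional]
  missing.map { |param| "missing parameter #{param}" } +
    unknown.map { |param| "unknown parameter #{param}" }
end

command = parse_command("Sure.\n!search query=\"site backups\" limit=5")
puts command.inspect
puts validate(command).inspect
```

Validation errors could be fed back to the model as a follow-up message so it can retry with corrected parameters.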
Rafael dos Santos Silva ea5a443588
FEATURE: Try to generate OpenAI Summaries in current language (#146)
* FEATURE: Try to generate OpenAI Summaries in current language

* lint
2023-08-21 15:40:32 -03:00
Sam b4477ecdcd
FEATURE: support 16k and 32k variants for Azure GPT (#140)
Azure requires a single HTTP endpoint per type of completion.

The settings: `ai_openai_gpt35_16k_url` and `ai_openai_gpt4_32k_url` can be
used now to configure the extra endpoints

This also amends the token limit, which was slightly off due to function calls,
and fixes a minor JS issue where we were not testing for a property.
2023-08-17 11:00:11 +10:00
Rafael dos Santos Silva 49f2453c2d
FEATURE: Tweaks to Anthropic Summarization (#138)
* FEATURE: Tweaks to Anthropic Summarization

* fix specs
2023-08-16 15:09:52 -03:00
Rafael dos Santos Silva 0738f67fa4
FIX: Fix embeddings truncation strategy (#139)
2023-08-16 15:09:41 -03:00
Sam 20c1f2d788
FEATURE: basic progress for image generation (#133)
Previously you would have to wait quite a while to see the prompt. This adds
a very basic implementation of progress so you can see the API is working.

Also: 

- Fix google progress.
- Handle the incredibly rare, zero results from google.
- Simplify command so it is less error prone
- replace invoke and attach results with an invoke
- ensure invoke can only ever be run once
- pass in all the information a command needs in constructor
- use new pattern throughout
- test invocation in isolation
2023-08-14 16:30:12 +10:00
Sam 7eedbf29e0
FIX: refine image and read command (#131)
- Attempt to hint reading is done by sending complete:true
- Do not include post_number in result unless it was sent in
- Rush visual feedback when a command is run (ensure we always revise)
- Include hyperlink in read command description
- Stop round tripping to GPT after image generation (speeds up images by a lot)
- Add a test for image command
2023-08-09 16:01:48 +10:00
Sam 958dfc360e
FEATURE: experimental read command for bot (#129)
This command is useful for reading a topic's content. It allows us to perform
critical analysis or suggest answers.

Given the 8k token limit in GPT-4, I hardcoded reading to 1500 tokens, but we can
follow up and allow larger windows on models that support more tokens.

In local testing, even in this limited form, this can be very useful.
2023-08-09 07:19:56 +10:00
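The 1500-token cap mentioned in 958dfc360e amounts to truncating topic content before handing it to the model. A minimal sketch, assuming a whitespace split as a stand-in for a real tokenizer (`truncate_to_tokens` and `MAX_READ_TOKENS` are hypothetical names):

```ruby
MAX_READ_TOKENS = 1500 # limit quoted in the commit message

# Whitespace "tokenizer" used only for illustration; real token counts differ.
def truncate_to_tokens(text, max_tokens = MAX_READ_TOKENS)
  tokens = text.split
  return text if tokens.length <= max_tokens

  tokens.first(max_tokens).join(" ")
end

long_topic = (["word"] * 2000).join(" ")
puts truncate_to_tokens(long_topic).split.length # => 1500
```

Making the limit a parameter leaves room for the follow-up the commit anticipates, where models with larger context windows get a larger budget.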
Rafael dos Santos Silva 8318c4374c
FIX: Remove muted from Similar list (#127)
* FIX: Remove muted from Similar list
2023-08-08 15:44:10 -03:00
Sam 03e689deb7
FIX: Google command was including full payload (#128)
* FIX: Google command was including full payload

Additionally, no truncation was happening, meaning you could easily blow the
token budget on a single search.

This made Google search mostly useless, and it would mean that after using
Google we would revert to a clean slate, which is very confusing.

* no need for nil there
2023-08-08 15:41:57 +10:00
Sam 7edb57c005
DEV: simplify command framework (#125)
The command framework had some confusing dispatching where it would dispatch
JSON blobs. This meant there was lots of parsing required in every command.

The refactor handles transforming the args prior to dispatch, which makes
consuming far simpler.

This is also general prep for supporting some basic command framework in other
LLMs.
2023-08-04 09:37:58 +10:00
Roman Rizzi 58b96eda6c
REFACTOR: Build related topics using TopicQuery. (#124)
TopicQuery already provides a lot of safeguards and options for filtering topics and enforcing permissions. It makes sense to rely on it, as other plugins like discourse-assign do.

As a bonus, we now have access to the current_user while serializing these topics, so users will see things like unread posts count just like we do for the lists.
2023-08-02 16:58:09 -03:00
Sam 602bb843ea
FEATURE: add support for final stable diffusion xl model (#122)
2023-08-02 16:53:28 -03:00
Rafael dos Santos Silva 3e7c99de89
FEATURE: Support for locally inferred embeddings in 100 languages (#115)
* FEATURE: Support for locally inferred embeddings in 100 languages

* add table
2023-07-27 15:50:03 -03:00
Rafael dos Santos Silva b25daed60b
FEATURE: Llama2 for summarization (#116)
2023-07-27 13:55:32 -03:00
Sam 4b0c077ce5
FEATURE: port to use claude-2 for chat bot (#114)
Claude 1 costs the same and is not as good as Claude 2. Make use of Claude
2 in all spots ...

This also fixes streaming so it uses the far more efficient streaming protocol.
2023-07-27 11:24:44 +10:00
Rafael dos Santos Silva b82074850e
DEV: Add tests to allmpnet tokenizer (#107)
* DEV: Add tests to allmpnet tokenizer

* lint
2023-07-14 11:37:21 -03:00
Roman Rizzi 5f0c617880
REFACTOR: Cohesive narrative for single-chunk summaries. (#103)
Single and multi-chunk summaries end up using different prompts for the last summary. This change detects when the summarized content fits in a single chunk and uses a slightly different prompt, which leads to more consistent summary formats.

This PR also moves the chunk-splitting step to the `FoldContent` strategy as preparation for implementing streamed summaries.
2023-07-13 17:05:41 -03:00
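The single-chunk versus multi-chunk distinction in 5f0c617880 can be illustrated with a small fold-style summarizer. This is a sketch, not the `FoldContent` strategy itself: the prompts, the word-count chunking and the `complete` callable (a stand-in for an LLM call) are all assumptions.

```ruby
# Group posts into chunks of at most max_words words (a post is never split).
def split_into_chunks(posts, max_words)
  chunks = [[]]
  posts.each do |post|
    size = chunks.last.sum { |existing| existing.split.length } + post.split.length
    chunks << [] if size > max_words && !chunks.last.empty?
    chunks.last << post
  end
  chunks.reject(&:empty?)
end

# complete is any callable that takes a prompt and returns text.
def summarize(posts, max_words:, complete:)
  chunks = split_into_chunks(posts, max_words)
  if chunks.length == 1
    # Everything fits: one prompt, one cohesive summary.
    complete.call("Summarize the following posts:\n#{chunks.first.join("\n")}")
  else
    # Summarize each chunk, then fold the partial summaries together.
    partials = chunks.map do |chunk|
      complete.call("Summarize this part of a discussion:\n#{chunk.join("\n")}")
    end
    complete.call("Combine these partial summaries into one narrative:\n#{partials.join("\n")}")
  end
end

prompts = []
fake_llm = ->(prompt) { prompts << prompt; "summary" }
summarize(["a b c", "d e f"], max_words: 4, complete: fake_llm)
puts prompts.length # => 3 (two chunk summaries plus the final fold)
```

Keeping the chunk-splitting inside the strategy, as the commit describes, means callers only pass content and never need to know how many model calls were made.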
Rafael dos Santos Silva 5e3f4e1b78
FEATURE: Embeddings to main db (#99)
* FEATURE: Embeddings to main db

This commit moves our embeddings store from an external configurable PostgreSQL
instance back into the main database. This is done to simplify the setup.

There is a migration that will try to import the external embeddings into
the main DB if it is configured and there are rows.

It removes support for embedding models that aren't all_mpnet_base_v2 or OpenAI
text_embedding_ada_002. However, it will now be easier to add new models.

It also now takes into account:
  - topic title
  - topic category
  - topic tags
  - replies (as much as the model allows)

We introduce an interface so we can eventually support multiple strategies
for handling long topics.

This PR severely damages semantic search performance, but this is
temporary until we can adapt HyDE so that semantic search uses the same
embeddings we have for semantic related topics, with good performance.

Here we also have some ground work to add post level embeddings, but this
will be added in a future PR.

Please note that this PR will also block Discourse from booting / updating if 
this plugin is installed and the pgvector extension isn't available on the 
PostgreSQL instance Discourse uses.
2023-07-13 12:41:36 -03:00