Commit Graph

84 Commits

Author SHA1 Message Date
Sam 6ddc17fd61
DEV: port directory structure to Zeitwerk (#319)
Previous to this change we relied on explicit loading for all files in Discourse AI.

This had a few downsides:

- Busywork whenever you add a file (an extra `require_relative`)
- We were not keeping to conventions internally ... some places used OpenAI, others OpenAi
- Autoloader did not work, which led to lots of broken full-application reloads when developing.

This moves all of DiscourseAI into a Zeitwerk compatible structure.

It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)

To avoid needing /lib/discourse_ai/... we mount a namespace, so we are able to keep /lib pointed at ::DiscourseAi

Various files were renamed to get around Zeitwerk rules and minimize usage of custom inflections

Though we can get custom inflections to work, it is not worth it: it would require a Discourse core patch, which means we would create a hard dependency.
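
The naming inconsistency mentioned above (OpenAI versus OpenAi) matters because an autoloader derives constant names from file names. A rough sketch of that mapping, assuming a simplified camelize rule that only approximates a conventional inflector; `camelize` and the file names are illustrative and not taken from the plugin:

```ruby
# Simplified stand-in for an autoloader's default inflection rule
# (an approximation for illustration, not Zeitwerk's actual source).
def camelize(basename)
  basename.split("_").map(&:capitalize).join
end

# File names below are hypothetical examples.
puts camelize("open_ai")    # OpenAi
puts camelize("ai_persona") # AiPersona
```

Under a rule like this, a constant spelled `OpenAI` would need a custom inflection, which is why renaming files to fit the default rule is the simpler path.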
2023-11-29 15:17:46 +11:00
Keegan George c7665e891b
A11Y: Add title attribute to sparkles icon for AI search results (#317) 2023-11-27 14:25:33 -08:00
Sam 5a4598a7b4
FEATURE: Azure OpenAI support for DALL*E 3 (#313)
* FEATURE: Azure OpenAI support for DALL*E 3

Previous to this there was no way to add an inference endpoint for
DALL*E on Azure because it requires custom URLs

Also:

- On save, when editing a persona it would revert priority and enabled
- More forgiving parsing in command framework for array function calls
- By default generate HD images - they tend to be a bit better
- Improve DALL*E prompt which was getting very annoying and always echoing what it is about to do
- Add a bit of a sleep between retries on image generation
- Fix error handling in image_command
2023-11-27 13:01:05 +11:00
Sam dff9f33a97
FEATURE: DALL-E-3 persona for image generation (#311)
* FIX: no selected persona should pick first prioritized one

Previously we were looking at `.personaId` but there is only an
id attribute so it failed

* FEATURE: new DALL-E-3 persona

This persona generates images using DALL-E-3 API and is enabled
by default

Keep in mind that we are still waiting on seeds/gen_id so we can
not retain style consistently between turns.

This will change as soon as a new OpenAI API provides the missing
parameters

Co-authored-by: Martin Brennan <martin@discourse.org>
2023-11-24 18:08:08 +11:00
Keegan George df8804afcd
DEV: Only allow semantic search on "Relevance" sort mode (#306) 2023-11-23 11:30:17 -08:00
Discourse Translator Bot 493b48477a
Update translations (#300) 2023-11-21 14:36:22 +01:00
Sam 5b5edb22c6
FEATURE: UI to update ai personas on admin page (#290)
Introduces a UI to manage customizable personas (admin only feature)

Part of the change was some extensive internal refactoring:

- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model through the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs

---------

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2023-11-21 16:56:43 +11:00
Keegan George f7277d244e
DEV: Mix semantic search results with normal results (#278) 2023-11-17 12:46:59 -08:00
Keegan George d1f21c78f1
UX: Add copy button to generated suggestion (#296) 2023-11-17 09:25:41 -08:00
Discourse Translator Bot c6902c40ce
Update translations (#294) 2023-11-14 14:30:22 +01:00
Roman Rizzi d0198c5c5b
FIX: Changes to the sentiment reports. (#289)
This PR aims to clarify sentiment reports by replacing averages with a count of posts that have one of their values above a threshold (60), meaning we have some level of confidence they are, in fact, positive or negative.

The same thing happens with post emotions, with the difference that a post can have multiple values above the threshold (30). Additionally, we dropped the "Neutral" axis.

We also reworded the tooltip next to each report title, and added an early return to signal we have no data available instead of displaying an empty chart.
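
The threshold counting described above can be sketched as follows. This is a minimal illustration, assuming scores on a 0-100 scale and a strict "greater than" comparison; the post data and the `count_above` helper are hypothetical, not the plugin's report code:

```ruby
# Hypothetical per-post sentiment scores on a 0-100 scale.
posts = [
  { id: 1, positive: 82, negative: 5 },
  { id: 2, positive: 40, negative: 45 },
  { id: 3, positive: 10, negative: 71 },
]

SENTIMENT_THRESHOLD = 60 # threshold named in the commit message

# Count posts whose score clears the threshold, rather than averaging scores.
def count_above(posts, key, threshold)
  posts.count { |post| post[key] > threshold }
end

positive_count = count_above(posts, :positive, SENTIMENT_THRESHOLD)
negative_count = count_above(posts, :negative, SENTIMENT_THRESHOLD)
puts "positive: #{positive_count}, negative: #{negative_count}"
```

Post 2 contributes to neither count, which is the point of the change: a low-confidence post no longer drags an average up or down.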
2023-11-09 17:23:25 -03:00
Roman Rizzi b172ef11c4
FEATURE: Expose sentiment classifications via the admin dashboard. (#284)
This PR adds new reports for displaying information about post sentiments grouped by date and emotions grouped by TL.

Depends on discourse/discourse#24274
2023-11-08 10:50:37 -03:00
Discourse Translator Bot e30082dd20
Update translations (#277) 2023-11-01 12:27:29 -03:00
Rafael dos Santos Silva 818b20fb6f
FEATURE: Make embeddings turn-key (#261)
To ease the administrative burden of enabling the embeddings model, this change introduces automatic backfill when the setting is enabled. It also moves the topic visit embedding creation to a lower priority queue in sidekiq and adds an option to skip embedding computation and persistence when we match on the digest.
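
The digest check mentioned above might look roughly like this. It is a sketch under stated assumptions: the in-memory `EMBEDDING_STORE`, the `fake_embedding` placeholder and the choice of SHA1 are all illustrative, not the plugin's actual storage or hashing:

```ruby
require "digest"

# In-memory stand-in for persisted embeddings: topic id => { digest:, vector: }.
EMBEDDING_STORE = {}

# Placeholder for a real embedding call (hypothetical).
def fake_embedding(text)
  text.bytes.first(3)
end

# Returns :skipped when the content digest is unchanged, :computed otherwise.
def upsert_embedding(topic_id, text)
  digest = Digest::SHA1.hexdigest(text)
  existing = EMBEDDING_STORE[topic_id]
  return :skipped if existing && existing[:digest] == digest

  EMBEDDING_STORE[topic_id] = { digest: digest, vector: fake_embedding(text) }
  :computed
end

puts upsert_embedding(1, "hello world") # computed
puts upsert_embedding(1, "hello world") # skipped
puts upsert_embedding(1, "hello again") # computed
```

Skipping on a matching digest avoids both the inference call and the write when a topic is visited but its content has not changed.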
2023-10-26 12:07:37 -03:00
Discourse Translator Bot 06c1356d86
Update translations (#263) 2023-10-24 15:53:44 +02:00
Rafael dos Santos Silva 0e5764617a
FEATURE: AI helper on posts (#244)
Adds an AI Helper function when selecting text while viewing a topic.

---------

Co-authored-by: Keegan George <kgeorge13@gmail.com>
Co-authored-by: Roman Rizzi <roman@discourse.org>
2023-10-23 11:41:36 -03:00
Discourse Translator Bot 3bced1c6f5
Update translations (#249) 2023-10-11 11:18:14 +02:00
Sam 9242da545e
FEATURE: support OpenAI-Organization header (#245)
Per: https://platform.openai.com/docs/api-reference/authentication

There is an organization option which is useful for large orgs

> For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota.
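
A minimal sketch of attaching the header with Ruby's standard library. The request is built but never sent, and the API key and organization id are placeholders; `build_request` is a hypothetical helper, not the plugin's code:

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a chat completion request.
# The API key and organization id passed in are placeholders.
def build_request(api_key:, organization: nil)
  uri = URI("https://api.openai.com/v1/chat/completions")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  request["Content-Type"] = "application/json"
  # Only add the header when an organization is configured.
  request["OpenAI-Organization"] = organization if organization
  request
end

request = build_request(api_key: "sk-placeholder", organization: "org-placeholder")
puts request["OpenAI-Organization"]
```

Leaving the header out when no organization is configured keeps requests unchanged for sites that do not use the option.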
2023-10-06 10:23:18 +11:00
Rafael dos Santos Silva 84cc369552
FEATURE: Bge-large-en embeddings via Cloudflare Workers AI API (#241)
* FEATURE: Bge-large-en embeddings via Cloudflare Workers AI API

* forgot a file

* lint
2023-10-04 13:47:51 -03:00
Discourse Translator Bot 05c256f65b
Update translations (#239) 2023-10-04 09:54:32 +02:00
Sam 0cbf14e343
FEATURE: automation rule for triaging posts using LLM (#236)
The new automation rule can be used to perform LLM-based classification and categorization of topics.

You specify a system prompt (which has %%POST%% as an input); if it returns a particular piece of text then we will apply rules such as tagging, hiding, replying or categorizing.

This can be used as a spam filter, an "oops, you are in the wrong place" filter and so on.
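
The flow described above (substitute the post into the prompt, ask the model, act when the reply contains the expected text) can be sketched as follows. `FAKE_LLM`, `triage` and the prompt template are hypothetical stand-ins; a real rule would call an actual model and apply tagging, hiding and so on:

```ruby
# Hypothetical stand-in for an LLM call; a real rule would call a model.
FAKE_LLM = ->(prompt) { prompt.include?("buy cheap") ? "SPAM" : "OK" }

# Substitute the post into the system prompt, then act if the reply
# contains the configured trigger text.
def triage(post_text, system_prompt:, search_for:, llm: FAKE_LLM)
  # Block form avoids treating backslashes in the post as backreferences.
  prompt = system_prompt.gsub("%%POST%%") { post_text }
  reply = llm.call(prompt)
  reply.include?(search_for) ? :apply_rules : :no_action
end

template = "Classify the following post as spam or not:\n%%POST%%"
puts triage("buy cheap watches now", system_prompt: template, search_for: "SPAM")
puts triage("How do I reset my password?", system_prompt: template, search_for: "SPAM")
```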

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
2023-10-03 08:55:30 +11:00
Discourse Translator Bot 782600e64f
Update translations (#229) 2023-09-27 11:03:11 +02:00
Sam aa463d64f1
FEATURE: Add creative persona (#231)
This adds a new creative persona that has access to the underlying
model and no external integrations.

It allows people to use Claude/GPT models in a Discourse agnostic
way.
2023-09-27 10:48:38 +10:00
Keegan George 2e5a39360a
FEATURE: Create custom prompts with composer AI helper (#214)
* DEV: Add icon support

* DEV: Add basic setup for custom prompt menu

* FEATURE: custom prompt backend

* fix custom prompt param check

* fix custom prompt replace

* WIP

* fix custom prompt usage

* fixes

* DEV: Update front-end

* DEV: No more custom prompt state

* DEV: Add specs

* FIX: Title/Category/Tag suggestions

Suggestion dropdowns broke because `messages_with_user_input(user_input)` expects a hash now.

* DEV: Apply syntax tree

* DEV: Restrict custom prompts to configured groups

* oops

* fix tests

* lint

* I love tests

* lint is cool tho

---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2023-09-25 15:12:54 -03:00
Sam 9e94457154
FIX: Made bot more robust (#226)
* FIX: Made bot more robust

This is a collection of small fixes

- Display "Searching for: ..." while searching instead of showing found 0 results.
- Only allow 5 commands in lang chain - 6 feels like too much
- On the 5th command stop informing the engine about functions, so it is forced to complete
- Add another 30 tokens of buffer and explain why
- Typo in command prompt
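
One reading of the command limit above, sketched with a fake engine: functions are offered until five commands have run, after which they are withheld so the engine has to produce a completion. `FAKE_ENGINE` and `run_chain` are hypothetical and only model the control flow:

```ruby
MAX_COMMANDS = 5 # limit named in the commit message

# Hypothetical engine: asks for another command whenever functions are offered.
FAKE_ENGINE = ->(functions_offered) { functions_offered ? :command : :completion }

# Runs the chain and returns the sequence of engine responses.
def run_chain(engine)
  responses = []
  commands_run = 0
  loop do
    # Once the limit is reached, functions are withheld so the engine must complete.
    offer_functions = commands_run < MAX_COMMANDS
    response = engine.call(offer_functions)
    responses << response
    break if response == :completion
    commands_run += 1
  end
  responses
end

p run_chain(FAKE_ENGINE)
```

Even an engine that always wants another command terminates here, because the final call leaves it no functions to request.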


Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2023-09-14 16:46:56 +10:00
Rafael dos Santos Silva d1642533fb
FIX: Use "Related Topics" label consistently (#221) 2023-09-12 16:23:24 -03:00
Discourse Translator Bot 0d761f4305
Update translations (#218) 2023-09-12 15:27:58 +02:00
Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added on discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords, which gets transformed into a vector and used in an asymmetric similarity topic search.

This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.

Completions and vectors created by HyDE will remain cached on Redis for now, but we could later use Postgres instead.

* Missing translation and rate limiting
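
The HyDE flow described above (keywords, then a hypothetical document, then a vector, then a similarity search) can be illustrated with a toy example. Everything here is an assumption made for illustration: the word-count `embed`, the canned `HYPOTHETICAL_DOC` generator and the two sample topics stand in for a real LLM, a real embedding model and a real vector index:

```ruby
# Toy embedding: counts of a few fixed vocabulary words (illustrative only).
VOCAB = %w[backup restore email login theme].freeze

def embed(text)
  words = text.downcase.scan(/[a-z]+/)
  VOCAB.map { |term| words.count(term).to_f }
end

def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Hypothetical stand-in for the LLM that writes the hypothetical document.
HYPOTHETICAL_DOC = lambda do |keywords|
  "To #{keywords}, open the backup page and restore the latest backup."
end

# Sample topics: title => body (made up for this sketch).
TOPICS = {
  "Restoring a site from backup" => "restore your site from a backup file",
  "Customizing your theme" => "change theme colors and theme components",
}.freeze

# Embed the generated document (not the raw keywords) and pick the closest topic.
def hyde_search(keywords)
  query_vector = embed(HYPOTHETICAL_DOC.call(keywords))
  TOPICS.max_by { |_title, body| cosine(query_vector, embed(body)) }.first
end

puts hyde_search("restore backup")
```

The search is asymmetric in the sense that a short query is expanded into a document-like text before embedding, so it is compared with topics in a similar form.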

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00
Discourse Translator Bot 3d83d062a1
Update translations (#186) 2023-09-05 15:42:46 +02:00
Sam e3abbd9f46
FEATURE: add researcher persona (#181)
The researcher persona has access to Google and can perform
various internet research tasks. At the moment it can not read
web pages, but that is under consideration
2023-09-04 12:05:27 +10:00
Rafael dos Santos Silva 43e485cbd9
FEATURE: Additional AI suggestion options (#176) 2023-09-01 17:10:58 -07:00
Sam 181113159b
FIX: setting explorer was exceeding token budget
This refactor changes it so we only include minimal data in the
system prompt which leaves us lots of tokens for specific searches

The new search command allows us to pull in settings on demand

Descriptions are included in short search results, and names only
in longer results

Also: 

* In dev it is important to tell when calls are made to OpenAI;
this adds a console log to increase awareness around token usage

* PERF: stop counting tokens so often

This changes it so we only count tokens once per response

Previously each time we heard back from OpenAI we would count
tokens, leading to unneeded delays

* bug fix, commands may reach in for tokenizer

* add logging to console for anthropic calls as well

* Update lib/shared/inference/openai_completions.rb

Co-authored-by: Martin Brennan <mjrbrennan@gmail.com>
2023-09-01 11:48:51 +10:00
Sam 00d69b463e
FEATURE: new site setting explorer persona (#178)
Also adds ai_bot_enabled_personas so admins can tweak which stock
personas are enabled.

The new persona has a full listing of all site settings and is
able to get context for each setting.

This means you can ask it to search through settings for something
relevant.

Security-wise, there is no access to the actual configuration of settings,
just to the names, descriptions and implementation.

Previously this was part of the forum helper persona however it
just clashes too much with other behaviors, isolating it makes
it far more powerful.

* sneaking this one in, user_emails is a non-obvious table in our
structure.

Usually one would assume users has emails, so this clarifies it a bit
better. Plus it is a very common table to hit.
2023-08-31 17:02:03 +10:00
Sam 8e4347acba
DEV: rename ai_helper_add_ai_pm_to_header -> ai_bot_add_to_header (#177)
Old name was very unclear, this setting is only used for the bot
so now it follows the same convention others do
2023-08-31 14:42:28 +10:00
Sam db19e37748
FEATURE: add initial support for personas (#172)
This splits out a bunch of code that used to live inside bots
into a dedicated concept called a Persona.

This allows us to start playing with multiple personas for the bot

Ships with:

artist - for making images
sql helper - for helping with data explorer
general - for everything and anything
 
Also includes a few fixes that make the generic LLM function implementation more robust
2023-08-30 16:15:03 +10:00
Keegan George 4da4b5609f
FIX: Show warning when trying to generate suggestions without content (#175) 2023-08-29 11:58:45 -07:00
Keegan George 7457feced8
FEATURE: Show suggested title prompt in new location (#171) 2023-08-29 09:45:53 -07:00
Discourse Translator Bot 345bfed19f
Update translations (#173) 2023-08-29 15:51:02 +02:00
Sam b14cb864dc
FEATURE: add setting_context experimental command (#160)
This command can be used to extract information about a Discourse
site setting directly from source.

To operate it needs the rg binary in the container.
2023-08-29 10:43:58 +10:00
Keegan George 7790313b1b
DEV: Add review menu state (#159) 2023-08-24 17:49:24 -07:00
Keegan George 6df850d473
FEATURE: AI Helper Context Menu (#148) 2023-08-23 10:35:40 -07:00
Discourse Translator Bot 95881fce74
Update translations (#149) 2023-08-22 14:34:48 -03:00
Martin Brennan 486a130c25
DEV: Categorize plugin settings into discourse_ai (#144)
Moving the plugin settings into a more specific category
makes them easier to find in the plugin UI and removes
them from the generic "Plugins" tab.
2023-08-21 14:46:34 -03:00
Sam b4477ecdcd
FEATURE: support 16k and 32k variants for Azure GPT (#140)
Azure requires a single HTTP endpoint per type of completion.

The settings: `ai_openai_gpt35_16k_url` and `ai_openai_gpt4_32k_url` can be
used now to configure the extra endpoints

This amends token limit which was off a bit due to function calls and fixes
a minor JS issue where we were not testing for a property
2023-08-17 11:00:11 +10:00
Sam 01f833f86e
FEATURE: optional warning attached to all AI bot conversations (#137)
* FEATURE: optional warning attached to all AI bot conversations

This commit introduces `ai_bot_enable_chat_warning` which can be used
to warn people prior to starting a chat with the bot.

In particular this is useful if moderators are regularly reading chat
transcripts as it sets expectations early.

By default this is disabled.

Also:

- Stops making ajax call prior to opening composer
- Hides PM title when starting a bot PM

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2023-08-17 06:29:58 +10:00
Discourse Translator Bot 525c8b0913
Update translations (#135) 2023-08-15 21:25:07 +02:00
Régis Hanol 7077c31ab8
Typo in site setting's description (#132) 2023-08-10 14:07:13 -03:00
Sam 7eedbf29e0
FIX: refine image and read command (#131)
- Attempt to hint reading is done by sending complete:true
- Do not include post_number in result unless it was sent in
- Rush visual feedback when a command is run (ensure we always revise)
- Include hyperlink in read command description
- Stop round tripping to GPT after image generation (speeds up images by a lot)
- Add a test for image command
2023-08-09 16:01:48 +10:00
Sam 958dfc360e
FEATURE: experimental read command for bot (#129)
This command is useful for reading a topic's content. It allows us to perform
critical analysis or suggest answers.

Given the 8k token limit in GPT-4 I hardcoded reading to 1500 tokens, but we can
follow up and allow larger windows on models that support more tokens.

In local testing, even in this limited form, this can be very useful.
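
The 1500-token cap mentioned above amounts to truncating what is read before it is handed to the model. A minimal sketch, assuming whitespace splitting as a crude stand-in for a real tokenizer; `truncate_tokens` is a hypothetical helper, not the plugin's implementation:

```ruby
MAX_READ_TOKENS = 1500 # cap named in the commit message

# Whitespace splitting is a crude stand-in for a real tokenizer.
def truncate_tokens(text, limit = MAX_READ_TOKENS)
  tokens = text.split
  return text if tokens.length <= limit

  tokens.first(limit).join(" ")
end

long_text = (["word"] * 2000).join(" ")
puts truncate_tokens(long_text).split.length # 1500
```

Passing the limit as a parameter leaves room for the larger windows the message anticipates for models that support more tokens.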
2023-08-09 07:19:56 +10:00
Discourse Translator Bot b1987f279d
Update translations (#130) 2023-08-08 15:42:39 +02:00