Commit Graph

283 Commits

Author SHA1 Message Date
Sam 03e689deb7
FIX: Google command was including full payload (#128)
* FIX: Google command was including full payload

Additionally there was no truncating happening meaning you could blow token
budget easily on a single search.

This made Google search mostly useless and it would mean that after using
Google we would revert to a clean slate which is very confusing.

* no need for nil there
2023-08-08 15:41:57 +10:00
Sam 7edb57c005
DEV: simplify command framework (#125)
The command framework had some confusing dispatching where it would dispatch
JSON blobs, this meant there was lots of parsing required in every command

The refactor handles transforming the args prior to dispatch which makes
consuming far simpler

This is also general prep to supporting some basic command framework in other
llms.
2023-08-04 09:37:58 +10:00
Rafael dos Santos Silva eb7fff3a55
FEATURE: Add support for StableBeluga and Upstage Llama2 instruct (#126)
* FEATURE: Add support for StableBeluga and Upstage Llama2 instruct

This means we support all models in the top3 of the Open LLM Leaderboard

Since some of those models have RoPE, we now have a setting so you can
customize the token limit depending which model you use.
2023-08-03 15:29:30 -03:00
Rafael dos Santos Silva 8b157feea5
FEATURE: Compatibility with protected Hugging Face Endpoints (#123)
* FEATURE: Compatibility with protected Hugging Face Endpoints
2023-08-02 17:00:00 -03:00
Roman Rizzi 58b96eda6c
REFACTOR: Build related topics using TopicQuery. (#124)
TopicQuery already provides a lot of safeguards and options for filtering topic, and enforcing permissions. It makes sense to rely on it as other plugins like discourse-assign do.

As a bonus, we now have access to the current_user while serializing these topics, so users will see things like unread posts count just like we do for the lists.
2023-08-02 16:58:09 -03:00
Sam 602bb843ea
FEATURE: add support for final stable diffusion xl model (#122) 2023-08-02 16:53:28 -03:00
Roman Rizzi 51fdf21143
DEV: Pin plugin for v3.1 (#121)
* DEV: Pin plugin for v3.1

Changes to the topic recommendations list added by discourse/discourse#22896 were reverted from stable.

* Update .discourse-compatibility

Co-authored-by: David Taylor <david@taylorhq.com>

---------

Co-authored-by: David Taylor <david@taylorhq.com>
2023-08-01 12:13:00 -03:00
Discourse Translator Bot c26d48e3b1
Update translations (#119) 2023-08-01 16:05:55 +02:00
Roman Rizzi c8de9495c8
UX: Update related-topics to follow <MoreTopics/> conventions (#118) 2023-07-31 18:33:37 -03:00
Rafael dos Santos Silva 3e7c99de89
FEATURE: Support for locally infered embeddings in 100 languages (#115)
* FEATURE: Support for locally infered embeddings in 100 languages

* add table
2023-07-27 15:50:03 -03:00
Rafael dos Santos Silva b25daed60b
FEATURE: Llama2 for summarization (#116) 2023-07-27 13:55:32 -03:00
Sam 4b0c077ce5
FEATURE: port to use claude-2 for chat bot (#114)
Claude 1 costs the same and is less good than Claude 2. Make use of Claude
2 in all spots ...

This also fixes streaming so it uses the far more efficient streaming protocol.
2023-07-27 11:24:44 +10:00
Discourse Translator Bot 2031388f9c
Update translations (#109) 2023-07-25 17:57:58 +02:00
Roman Rizzi 79289ba231
FIX: Use base 10 when gettings allowed group IDs from settings. (#113)
A missing parameter on the `parseInt` function was causing unexpected UI behavior for the AI helper since it turned an allowed group ID into NaN. We should always use base10 when parsing these IDs.
2023-07-24 11:29:49 -03:00
dependabot[bot] 3f05d3a732
Build(deps): Bump word-wrap from 1.2.3 to 1.2.4 (#112)
Bumps [word-wrap](https://github.com/jonschlinkert/word-wrap) from 1.2.3 to 1.2.4.
- [Release notes](https://github.com/jonschlinkert/word-wrap/releases)
- [Commits](https://github.com/jonschlinkert/word-wrap/compare/1.2.3...1.2.4)

---
updated-dependencies:
- dependency-name: word-wrap
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-20 10:19:23 +02:00
Rafael dos Santos Silva f34094f6cc
FIX: Show related topics when scrolling long topics (#111)
* FIX: Show related topics when scrolling long topics

* Update assets/javascripts/initializers/related-topics.js

Co-authored-by: Roman Rizzi <roman@discourse.org>

---------

Co-authored-by: Roman Rizzi <roman@discourse.org>
2023-07-19 19:19:38 -03:00
Rafael dos Santos Silva e3b4a73267
FEATURE: Cache Related Topics for longer (#110) 2023-07-18 11:27:06 -03:00
Discourse Translator Bot 7ea468708c
DEV: Add Crowdin support (#108) 2023-07-15 00:56:15 +02:00
Rafael dos Santos Silva b82074850e
DEV: Add tests to allmpnet tokenizer (#107)
* DEV: Add tests to allmpnet tokenizer

* lint
2023-07-14 11:37:21 -03:00
Roman Rizzi 473732c18a
FIX: Return base prompt instead of nil (#106) 2023-07-13 21:48:25 -03:00
Rafael dos Santos Silva d692ecc7de
FIX: Disable truncation and padding in all-mpnet-base-v2 tokenizer (#105)
The tokenizer was truncating and padding to 128 tokens, and we try append
new post content until we hit 384 tokens. This was causing the tokenizer
to accept all posts in a topic, wasting CPU and memory.
2023-07-13 21:09:46 -03:00
Rafael dos Santos Silva 703762a7a9
PERF: .find_each instead of .find to save us from memory allocation peaks
also Fix embeddings rake task for new db structure
2023-07-13 18:59:25 -03:00
Roman Rizzi 5f0c617880
REFACTOR: Cohesive narrative for single-chunk summaries. (#103)
Single and multi-chunk summaries end using different prompts for the last summary. This change detects when the summarized content fits in a single chunk and uses a slightly different prompt, which leads to more consistent summary formats.

This PR also moves the chunk-splitting step to the `FoldContent` strategy as preparation for implementing streamed summaries.
2023-07-13 17:05:41 -03:00
David Taylor 48d880d3c8
FIX: Rerender related topics correctly when topic changes (#100)
* FIX: Rerender related topics correctly when topic changes


Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2023-07-13 16:34:02 -03:00
Rafael dos Santos Silva 5e3f4e1b78
FEATURE: Embeddings to main db (#99)
* FEATURE: Embeddings to main db

This commit moves our embeddings store from an external configurable PostgreSQL
instance back into the main database. This is done to simplify the setup.

There is a migration that will try to import the external embeddings into
the main DB if it is configured and there are rows.

It removes support from embeddings models that aren't all_mpnet_base_v2 or OpenAI
text_embedding_ada_002. However it will now be easier to add new models.

It also now takes into account:
  - topic title
  - topic category
  - topic tags
  - replies (as much as the model allows)

We introduce an interface so we can eventually support multiple strategies
for handling long topics.

This PR severely damages the semantic search performance, but this is a
temporary until we can get adapt HyDE to make semantic search use the same
embeddings we have for semantic related with good performance.

Here we also have some ground work to add post level embeddings, but this
will be added in a future PR.

Please note that this PR will also block Discourse from booting / updating if 
this plugin is installed and the pgvector extension isn't available on the 
PostgreSQL instance Discourse uses.
2023-07-13 12:41:36 -03:00
Rafael dos Santos Silva 9d10a152b9
FEATURE: Claude 2 for summarization and AIHelper (#101) 2023-07-13 12:32:08 -03:00
dependabot[bot] 1c017c22b0
Build(deps): Bump semver from 6.3.0 to 6.3.1 (#102)
Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1.
- [Release notes](https://github.com/npm/node-semver/releases)
- [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md)
- [Commits](https://github.com/npm/node-semver/compare/v6.3.0...v6.3.1)

---
updated-dependencies:
- dependency-name: semver
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-13 09:33:34 +02:00
Roman Rizzi fbe1bab980
FIX: typo while updating a section (#98) 2023-06-27 17:57:58 -03:00
Roman Rizzi 1b568f2391
FIX: Claude's max_tookens_to_sample is a required field (#97) 2023-06-27 14:42:33 -03:00
Roman Rizzi 9a79afcdbf
DEV: Better strategies for summarization (#88)
* DEV: Better strategies for summarization

The strategy responsibility needs to be "Given a collection of texts, I know how to summarize them most efficiently, using the minimum amount of requests and maximizing token usage".

There are different token limits for each model, so it all boils down to two different strategies:

Fold all these texts into a single one, doing the summarization in chunks, and then build a summary from those.
Build it by combining texts in a single prompt, and truncate it according to your token limits.

While the latter is less than ideal, we need it for "bart-large-cnn-samsum" and "flan-t5-base-samsum", both with low limits. The rest will rely on folding.

* Expose summarized chunks to users
2023-06-27 12:26:33 -03:00
Sam 9390fba768
FIX: adjust token limits to account for functions (#96)
Reduce maximum replies to 2500 tokens and make them even for both GPT-3.5
and 4

Account for 400+ tokens in function definitions (this was unaccounted for)
2023-06-23 10:02:04 +10:00
Sam f8cabfad6b
FEATURE: Try to hone search so it reduces search terms in subsequent rounds (#95)
Teach via system message that you can reduce search terms to get more
results
2023-06-21 20:07:55 +10:00
Sam a028309cbd
FEATURE: add ai_bot_enabled_chat commands and tune search (#94)
* FEATURE: add ai_bot_enabled_chat commands and tune search

This allows admins to disable/enable GPT command integrations.

Also hones search results which were looping cause the result did not denote
the failure properly (it lost context)

* include more context for google command
include more context for time command

* type
2023-06-21 17:10:30 +10:00
Sam d1ab79e82f
FEATURE: Add Azure cognitive service support (#93)
The new site settings:

ai_openai_gpt35_url : distribution for GPT 16k
ai_openai_gpt4_url: distribution for GPT 4
ai_openai_embeddings_url: distribution for ada2

If untouched we will simply use OpenAI endpoints.

Azure requires 1 URL per model, OpenAI allows a single URL to serve multiple models. Hence the new settings.
2023-06-21 10:39:51 +10:00
Sam 30778d8af8
FIX: avoid storing corrupt prompts (#92)
```
prompt << build_message(bot_user.username, reply)
```

Would store a "cooked" prompt which is invalid, instead just store the raw
values which are later passed to build_message

Additionally:

1. Disable summary command which needs honing
2. Stop storing decorations (searched for X) in prompt which leads to straying
3. Ship username directly to model, avoiding "user: content" in prompts. This
 was causing GPT to stray
2023-06-20 15:44:03 +10:00
Sam 70c158cae1
FEATURE: add full bot support for GPT 3.5 (#87)
Given latest GPT 3.5 16k which is both better steered and supports functions
we can now support rich bot integration.

Clunky system message based steering is removed and instead we use the
function framework provided by Open AI
2023-06-20 08:45:31 +10:00
Rafael dos Santos Silva e457c687ca
FIX: OpenAI Tokenizer was failing to truncate mid emojis (#91)
* FIX: OpenAI Tokenizer was failing to truncate mid emojis

* Update spec/shared/tokenizer.rb

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>

---------

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
2023-06-16 15:15:36 -03:00
Roman Rizzi 9e901dbfbf
FIX: Serialize channel title for DMs (#90) 2023-06-16 14:37:16 -03:00
David Taylor 68bc80e19d
DEV: Fix Zeitwerk reloading error (#89)
Using `.append` can lead to a 'two routes with the same name' error
2023-06-16 11:32:54 +01:00
Rafael dos Santos Silva 8742535024
FEATURE: Allow using large context OpenAI models for summarization (#86) 2023-06-13 15:23:48 -03:00
Roman Rizzi 3364fec425
DEV: Remove the summarization feature (#83)
* DEV: Remove the summarization feature

Instead, we'll register summarization implementations for OpenAI, Anthropic, and Discourse AI using the API defined in discourse/discourse#21813.

Core and chat will implement features on top of these implementations instead of this plugin extending them.

* Register instances that contain the model, requiring less site settings
2023-06-13 14:32:26 -03:00
Sam 081231a6eb
FIX: support multiple command executions (#85)
Previous to this change we were chaining stuff too late and would execute
commands serially leading to very unexpected results

This corrects this and allows us to run stuff like:

> Search google 3/4 times on various permutations of
QUERY and answer this question.

We limit at 5 commands to ensure there are not pathological user cases
where you lean on the LLM to flood us with results.
2023-06-06 07:09:33 +10:00
Sam 840968630e
FEATURE: disable smart commands on Claude and GPT 3.5 (#84)
For the time being smart commands only work consistently on GPT 4.
Avoid using any smart commands on the earlier models.

Additionally adds better error handling to Claude which sometimes streams
partial json and slightly tunes the search command.
2023-06-01 09:10:33 +10:00
Sam 96d521198b
FIX: missing localization (#81)
blog.start_gpt_chat -> was on my blog

This also slightly tunes the search prompt to support filtering by oldest
and try a tiny bit harder to guide GPT 3.5 which is a bit of a losing battle

Co-authored-by: Krzysztof Kotlarek <kotlarek.krzysztof@gmail.com>
2023-05-25 11:05:02 +10:00
Rafael dos Santos Silva cfc6e388df
FIX: Ensure embeddings database outages are handled gracefully (#80)
The rails_failover middleware will intercept all `PG::ConnectionBad` errors and put the cluster into readonly mode. It does not have any handling for multiple databases. Therefore, an issue with the embeddings database was taking the whole cluster into readonly.

This commit fixes the issue by rescuing `PG::Error` from all AI database accesses, and re-raises errors with a different class. It also adds a spec to ensure that an embeddings database outage does not affect the functionality of the topics/show route.

Co-authored-by: David Taylor <david@taylorhq.com>
2023-05-23 22:57:52 +01:00
Rafael dos Santos Silva b213fe7f94
FIX: Give up trying to reuse the DB connection and rely on pgbouncer (#79) 2023-05-23 15:12:59 -03:00
Roman Rizzi c582e3b848
DEV: Fix toxicity test (#78) 2023-05-23 11:02:11 -03:00
Sam d85b503ed4
FIX: guide GPT 3.5 better (#77)
* FIX: guide GPT 3.5 better

This limits search results to 10 cause we were blowing the whole token
budget on search results, additionally it includes a quick exchange at
the start of a session to try and guide GPT 3.5 to follow instructions

Sadly GPT 3.5 drifts off very quickly but this does improve stuff a bit.

It also attempts to correct some issues with anthropic, though it still is
surprisingly hard to ground

* add status:public, this is a bit of a hack but ensures that we can search
for any filter provided

* fix specs
2023-05-23 23:08:17 +10:00
Sam b82fc1e692
FIX: ensure we only attempt embedding once every 15 minutes (#76)
This also heavily reduced log noise and ensures our exception handling is
more surgical.
2023-05-23 10:43:24 +10:00
Sam 074d00ca32
FEATURE: improve search prompt (#75)
- We only support searching public topics - make it clear
- Stop using bug/feature, cause is poisons system - these may not exist
- Add after: and before: which are very handy for bounding search results
2023-05-23 07:52:14 +10:00