29 Commits

Author SHA1 Message Date
Rafael dos Santos Silva 23193ee6f2
FEATURE: Calculate gists from non hot topics too (#958)
Also renames some settings to remove 'hot' references.
2024-11-26 13:44:12 -03:00
Roman Rizzi e54f2da1a5
FIX: Unnecessarily complex preloading accidentally filters some topics. (#945)
The `topic_query_create_list_topics` modifier we append was always meant to avoid an N+1 situation when serializing gists. However, I tried to be too smart and only preload these, which resulted in some topics with *only* regular summaries getting removed from the list. This issue became apparent now that we are adding gists to other lists besides hot.

Let's simplify the preloading, which still solves the N+1 issue, and let the serializer get the needed summary.
2024-11-22 12:07:27 -03:00
Joffrey JAFFEUX 2cc8115b48
FIX: disables temporarily ai_summaries filtering (#943) 2024-11-22 08:34:54 +01:00
Roman Rizzi 530a795d43
FIX: Instruct AR that we want to use ai_summaries for filtering. (#927)
We use `includes` instead of `joins` because we want to eager-load summaries, avoiding an extra query when summarizing. However, Rails will complain unless you explicitly tell it that you plan to use that association inside a `WHERE` clause.
2024-11-19 17:32:13 -03:00
Roman Rizzi fb80d776d8
FEATURE: Enable gists on all topic lists (#922) 2024-11-19 11:04:34 -03:00
Roman Rizzi 9505a8976c
FEATURE: Automatically backfill regular summaries. (#892)
This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify.

We'll prioritize topics without summary over outdated ones.
2024-11-04 17:48:11 -03:00
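
The selection rule described in #892 (a per-hour cap, a minimum word count, and topics without a summary ahead of outdated ones) can be sketched roughly as below. This is only an illustration: `TopicInfo`, `backfill_candidates`, `max_per_hour`, and `min_word_count` are hypothetical names, not the plugin's actual job or settings.

```ruby
# Hypothetical record; the real plugin works with database-backed models.
TopicInfo = Struct.new(:id, :word_count, :summarized_at, :last_posted_at, keyword_init: true) do
  def missing_summary?
    summarized_at.nil?
  end

  # A summary is outdated when a post arrived after it was generated.
  def outdated_summary?
    !summarized_at.nil? && last_posted_at > summarized_at
  end
end

# Picks up to `max_per_hour` topics that meet the word-count threshold,
# placing topics without any summary ahead of topics with an outdated one.
def backfill_candidates(topics, max_per_hour:, min_word_count:)
  eligible = topics.select { |topic| topic.word_count >= min_word_count }
  missing = eligible.select(&:missing_summary?)
  outdated = eligible.select(&:outdated_summary?)
  (missing + outdated).first(max_per_hour)
end

now = Time.now
topics = [
  TopicInfo.new(id: 1, word_count: 500, summarized_at: now - 3600, last_posted_at: now),
  TopicInfo.new(id: 2, word_count: 50, summarized_at: nil, last_posted_at: now),
  TopicInfo.new(id: 3, word_count: 800, summarized_at: nil, last_posted_at: now),
  TopicInfo.new(id: 4, word_count: 900, summarized_at: now, last_posted_at: now - 60),
]

picked = backfill_candidates(topics, max_per_hour: 2, min_word_count: 100)
puts picked.map(&:id).inspect # [3, 1]
```

Topic 2 is skipped for being too short, and topic 4 is skipped because its summary is still current.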
Roman Rizzi ec97996905
FIX/REFACTOR: FoldContent revamp (#866)
* FIX/REFACTOR: FoldContent revamp

We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we cannot send the original post separately. This was important for letting the model focus on what's new in the topic.

The algorithm doesn’t give us full control over how prompts are written, and figuring out how to format the content isn't straightforward. This means we're having to use more complicated workarounds, like regex.

To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize.

Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.

* Fix fold docs

* Use #shift instead of #pop to get the first elem, not the last
2024-10-25 11:51:17 -03:00
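
The approach proposed in #866 (summarize as much as fits upfront, then keep adding new content until nothing is left) can be sketched as below. This is not the actual `FoldContent` code: `count_tokens`, `take_fitting`, `fold`, and the `summarize` callable are hypothetical stand-ins, and tokens are approximated as whitespace-separated words.

```ruby
# Hypothetical token counter: one token per whitespace-separated word.
def count_tokens(text)
  text.split.size
end

# Takes items from the FRONT of the queue (Array#shift) while they fit the budget.
# Using Array#pop here would consume the newest content first instead.
def take_fitting(queue, budget)
  taken = []
  used = 0
  while (item = queue.first)
    cost = count_tokens(item)
    # Always take at least one item so the loop makes progress.
    break if !taken.empty? && used + cost > budget
    taken << queue.shift
    used += cost
  end
  taken
end

# `summarize` stands in for a model call: it receives the previous summary
# (nil on the first round) plus the new items, and returns an updated summary.
def fold(items, budget:, summarize:)
  queue = items.dup
  summary = nil
  until queue.empty?
    remaining = summary ? budget - count_tokens(summary) : budget
    summary = summarize.call(summary, take_fitting(queue, remaining))
  end
  summary
end

rounds = []
stub = lambda do |_previous, new_items|
  rounds << new_items
  "S#{rounds.size}"
end

puts fold(["a b c", "d e", "f g h i", "j"], budget: 5, summarize: stub) # S3
puts rounds.inspect # [["a b c", "d e"], ["f g h i"], ["j"]]
```

After the first round, the previous summary takes up part of the budget, so later rounds fit less new content.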
Roman Rizzi 6d504ab80d
FEATURE: Make hot topic gists opt-in. (#846)
This change restricts gists to members of specific groups. It also fixes a bug where other lists could display the gist if available.
2024-10-21 15:15:25 -03:00
Roman Rizzi 27b5542357
FEATURE: Generate topic gists for the hot topics list. (#837)
* Display gists in the hot topics list

* Adjust hot topics gist strategy and add a job to generate gists

* Replace setting with a configurable batch size

* Avoid loading summaries for other topic lists

* Tweak gist prompt to focus on latest posts in the context of the OP

* Remove serializer hack and rely on core change from discourse/discourse#29291

* Update lib/summarization/strategies/hot_topic_gists.rb

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>

---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2024-10-18 18:01:39 -03:00
Roman Rizzi c7acb4a6a0
REFACTOR: Support of different summarization targets/prompts. (#835)
* DEV: Add summary types

* Refactor for different summary types

* Use enum for summary types

* Update lib/summarization/strategies/topic_summary.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/topic_gist.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/chat_messages.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Fix chat_messages single prompt

* Small tweak to the chat summarization prompt

---------

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-10-15 13:53:26 -03:00
Sam 1320eed9b2
FEATURE: move summary to use llm_model (#699)
This allows summary to use the new LLM models and migrates off API-key-based model selection.

Claude 3.5 etc... all work now.

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-07-04 10:48:18 +10:00
Roman Rizzi fc081d9da6
FIX: Restore ability to fold summaries, which was accidentally removed (#700) 2024-07-03 18:10:31 -03:00
Keegan George 1b0ba9197c
DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
Roman Rizzi 0c4069ab3f
DEV: Remove non-LLM-based summarization strategies. (#589)
We removed these services from our hosting two weeks ago. It's safe to assume everyone has moved to other LLM-based options.
2024-04-23 12:11:04 -03:00
Roman Rizzi 0634b85a81
UX: Validations to LLM-backed features (except AI Bot) (#436)
* UX: Validations to Llm-backed features (except AI Bot)

This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to make explicit which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs.

Validations are:

* You must choose a model before enabling the feature.
* You must turn off the feature before setting the model to blank.
* You must configure each model's settings before being able to select it.

* Add provider name to summarization options

* vLLM can technically support same models as HF

* Check we can talk to the selected model

* Check for Bedrock instead of anthropic as a site could have both creds setup
2024-01-29 16:04:25 -03:00
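
The three validation rules listed in #436 can be sketched as a single check run before a setting changes. This is only an illustration: `SummarySettings`, `validate_change`, `blank_value?`, the model names, and the error messages are all hypothetical and do not reflect the plugin's validator classes.

```ruby
# Hypothetical snapshot of the relevant settings.
SummarySettings = Struct.new(:feature_enabled, :model, :configured_models, keyword_init: true)

def blank_value?(value)
  value.nil? || value.to_s.strip.empty?
end

# Returns an error message when the change must be rejected, or nil when it is allowed.
def validate_change(settings, key, value)
  case key
  when :feature_enabled
    # Rule 1: a model must be chosen before the feature is enabled.
    return "Choose a model before enabling the feature." if value && blank_value?(settings.model)
  when :model
    if blank_value?(value)
      # Rule 2: the feature must be off before the model is cleared.
      return "Turn off the feature before clearing the model." if settings.feature_enabled
    elsif !settings.configured_models.include?(value)
      # Rule 3: a model can only be selected once its own settings are configured.
      return "Configure #{value} before selecting it."
    end
  end
  nil
end

settings = SummarySettings.new(feature_enabled: false, model: "", configured_models: ["model-a"])
puts validate_change(settings, :feature_enabled, true)   # Choose a model before enabling the feature.
puts validate_change(settings, :model, "model-b")        # Configure model-b before selecting it.
puts validate_change(settings, :model, "model-a").inspect # nil
```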
Roman Rizzi 450ec915d8
FIX: Make FoldContent strategy more resilient when using models with low token count. (#341)
We'll recursively summarize the content into smaller chunks until we are sure we can concatenate them without going over the token limit.
2023-12-06 19:00:24 -03:00
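
The recursion described in #341 can be sketched as below, assuming tokens are approximated as words and that the summarizer returns something shorter than its input. `count_tokens`, `chunk`, and `fold_until_fits` are hypothetical names, not the plugin's API.

```ruby
# Hypothetical token counter: one token per whitespace-separated word.
def count_tokens(text)
  text.split.size
end

# Groups consecutive texts into chunks whose combined token count stays within `limit`.
# A single text larger than the limit still gets a chunk of its own.
def chunk(texts, limit)
  texts.each_with_object([[]]) do |text, chunks|
    current = chunks.last
    if !current.empty? && count_tokens(current.join(" ")) + count_tokens(text) > limit
      chunks << [text]
    else
      current << text
    end
  end
end

# Summarizes chunk by chunk, then recurses on the summaries until their
# concatenation fits within `limit`.
def fold_until_fits(texts, limit:, summarize:)
  joined = texts.join(" ")
  return joined if count_tokens(joined) <= limit

  summaries = chunk(texts, limit).map { |group| summarize.call(group.join(" ")) }
  # Guard against a summarizer that never shrinks its input (assumption stated above).
  unless count_tokens(summaries.join(" ")) < count_tokens(joined)
    raise ArgumentError, "summaries did not shrink"
  end

  fold_until_fits(summaries, limit: limit, summarize: summarize)
end

# Stand-in for a model call: keeps only the first two words.
first_two_words = ->(text) { text.split.first(2).join(" ") }

texts = ["one two three four", "five six seven", "eight nine ten eleven twelve"]
puts fold_until_fits(texts, limit: 4, summarize: first_two_words) # one two eight nine
```

With a limit of 4, the first pass yields three two-word summaries (6 tokens in total), which still do not fit, so a second pass folds them again.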
Roman Rizzi 3bc010b686
FIX: call the right method to summarize with truncation (#328) 2023-12-01 10:17:24 -03:00
Sam 6ddc17fd61
DEV: port directory structure to Zeitwerk (#319)
Prior to this change we relied on explicit loading for files in Discourse AI.

This had a few downsides:

- Busywork whenever you add a file (an extra `require_relative`)
- We were not keeping to conventions internally ... some places used OpenAI, others OpenAi
- The autoloader did not work, which led to lots of broken full-application reloads when developing.

This moves all of DiscourseAI into a Zeitwerk compatible structure.

It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)

To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi

Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections

Though we can get custom inflections to work, it is not worth it: doing so would require a Discourse core patch, which would create a hard dependency.
2023-11-29 15:17:46 +11:00
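
The OpenAI versus OpenAi inconsistency mentioned in #319 comes from convention-based naming: a file name is camelized segment by segment, so acronyms lose their casing unless an override is registered. The sketch below approximates that behaviour in plain Ruby. `default_camelize` and `camelize_with_overrides` are hypothetical helpers rather than Zeitwerk's API, and the override table only stands in for a custom inflection.

```ruby
# Approximation of convention-based camelization: each underscore-separated
# segment is capitalized, so "ai" becomes "Ai" rather than "AI".
def default_camelize(basename)
  basename.split("_").map(&:capitalize).join
end

# Hypothetical override table standing in for a custom inflection.
def camelize_with_overrides(basename, overrides = {})
  overrides.fetch(basename) { default_camelize(basename) }
end

puts default_camelize("open_ai")                                # OpenAi
puts camelize_with_overrides("open_ai", "open_ai" => "OpenAI")  # OpenAI
puts camelize_with_overrides("fold_content")                    # FoldContent
```

Renaming constants to match the default output avoids the need for overrides, which is the trade-off the commit message describes.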
Roman Rizzi 3064d4c288
REFACTOR: Summarization and HyDE now use an LLM abstraction. (#297)
* DEV: One LLM abstraction to rule them all

* REFACTOR: HyDE search uses new LLM abstraction

* REFACTOR: Summarization uses the LLM abstraction

* Updated documentation and made small fixes. Remove Bedrock claude-2 restriction
2023-11-23 12:58:54 -03:00
Roman Rizzi e0691e70e8
DEV: Updates to the summarization strategy API (#301)
Introduced by discourse/discourse#24489

In the future, this change will let us log who requested the summary in the `AiApiAuditLog`.
2023-11-21 13:27:35 -03:00
Sam 5b5edb22c6
FEATURE: UI to update ai personas on admin page (#290)
Introduces a UI to manage customizable personas (admin only feature)

Part of the change was some extensive internal refactoring:

- AIBot now has a persona set in the constructor, once set it never changes
- Command now takes in bot as a constructor param, so it has the correct persona and is not generating AIBot objects on the fly
- Added a .prettierignore file, due to the way ALE is configured in nvim it is a pre-req for prettier to work
- Adds a bunch of validations on the AIPersona model, system personas (artist/creative etc...) are all seeded. We now ensure name uniqueness, and only allow certain properties to be touched for system personas.
- (JS note) the client side design takes advantage of nested routes, the parent route for personas gets all the personas via this.store.findAll("ai-persona") then child routes simply reach into this model to find a particular persona.
- (JS note) data is sideloaded into the ai-persona model via the meta property supplied from the controller, resultSetMeta
- This removes ai_bot_enabled_personas and ai_bot_enabled_chat_commands, both should be controlled from the UI on a per persona basis
- Fixes a long standing bug in token accounting ... we were doing to_json.length instead of to_json.to_s.length
- Amended it so {commands} are always inserted at the end unconditionally, no need to add it to the template of the system message as it just confuses things
- Adds a concept of required_commands to stock personas, these are commands that must be configured for this stock persona to show up.
- Refactored tests so we stop requiring inference_stubs, it was very confusing to need it, added to plugin.rb for now which at least is clearer
- Migrates the persona selector to gjs

---------

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Martin Brennan <martin@discourse.org>
2023-11-21 16:56:43 +11:00
Rafael dos Santos Silva 3c4a53b2cb
FEATURE: Better link in Claude summaries (#183)
* FEATURE: Better link in Claude summaries

* lint
2023-09-04 12:04:47 -03:00
Rafael dos Santos Silva ea5a443588
FEATURE: Try to generate OpenAI Summaries in current language (#146)
* FEATURE: Try to generate OpenAI Summaries in current language

* lint
2023-08-21 15:40:32 -03:00
Rafael dos Santos Silva 49f2453c2d
FEATURE: Tweaks to Anthropic Summarization (#138)
* FEATURE: Tweaks to Anthropic Summarization

* fix specs
2023-08-16 15:09:52 -03:00
Sam 4b0c077ce5
FEATURE: port to use claude-2 for chat bot (#114)
Claude 1 costs the same and is less good than Claude 2. Make use of Claude 2 in all spots ...

This also fixes streaming so it uses the far more efficient streaming protocol.
2023-07-27 11:24:44 +10:00
Roman Rizzi 5f0c617880
REFACTOR: Cohesive narrative for single-chunk summaries. (#103)
Single and multi-chunk summaries end up using different prompts for the last summary. This change detects when the summarized content fits in a single chunk and uses a slightly different prompt, which leads to more consistent summary formats.

This PR also moves the chunk-splitting step to the `FoldContent` strategy as preparation for implementing streamed summaries.
2023-07-13 17:05:41 -03:00
Roman Rizzi 1b568f2391
FIX: Claude's max_tokens_to_sample is a required field (#97) 2023-06-27 14:42:33 -03:00
Roman Rizzi 9a79afcdbf
DEV: Better strategies for summarization (#88)
* DEV: Better strategies for summarization

The strategy responsibility needs to be "Given a collection of texts, I know how to summarize them most efficiently, using the minimum amount of requests and maximizing token usage".

There are different token limits for each model, so it all boils down to two different strategies:

- Fold all these texts into a single one, doing the summarization in chunks, and then build a summary from those.
- Build it by combining texts in a single prompt, and truncate it according to your token limits.

While the latter is less than ideal, we need it for "bart-large-cnn-samsum" and "flan-t5-base-samsum", both with low limits. The rest will rely on folding.

* Expose summarized chunks to users
2023-06-27 12:26:33 -03:00
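
The two strategies described in #88 can be contrasted with a rough sketch. The two model names come from the commit message. Everything else (`strategy_for`, `truncate_summary`, `fold_summary`, the word-based token count, and the stub summarizer) is a hypothetical illustration, not the plugin's implementation.

```ruby
# Models named in the commit message as having low token limits.
LOW_LIMIT_MODELS = %w[bart-large-cnn-samsum flan-t5-base-samsum].freeze

def strategy_for(model_name)
  LOW_LIMIT_MODELS.include?(model_name) ? :truncate : :fold
end

# Truncate: a single request over as many words as the limit allows.
# Anything past the limit is never seen by the model.
def truncate_summary(texts, limit:, summarize:)
  summarize.call(texts.join(" ").split.first(limit).join(" "))
end

# Fold: summarize each chunk, then build a final summary from the partial ones.
def fold_summary(texts, limit:, summarize:)
  chunks = texts.join(" ").split.each_slice(limit).map { |words| words.join(" ") }
  partials = chunks.map { |chunk| summarize.call(chunk) }
  partials.size == 1 ? partials.first : summarize.call(partials.join(" "))
end

# Stand-in for a model call: keeps only the first and last word it saw.
summarize = lambda do |text|
  words = text.split
  "#{words.first}..#{words.last}"
end

texts = ["a b c", "d e f", "g h"]
puts truncate_summary(texts, limit: 4, summarize: summarize) # a..d
puts fold_summary(texts, limit: 4, summarize: summarize)     # a..d..e..h
puts strategy_for("flan-t5-base-samsum")                     # truncate
```

The stub output shows the trade-off: truncation costs one request but drops everything past the limit, while folding covers the whole input at the cost of extra requests.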
Roman Rizzi 3364fec425
DEV: Remove the summarization feature (#83)
* DEV: Remove the summarization feature

Instead, we'll register summarization implementations for OpenAI, Anthropic, and Discourse AI using the API defined in discourse/discourse#21813.

Core and chat will implement features on top of these implementations instead of this plugin extending them.

* Register instances that contain the model, requiring less site settings
2023-06-13 14:32:26 -03:00