Commit Graph

23 Commits

Author SHA1 Message Date
Roman Rizzi 7e3a543f6f
FEATURE: Double gist length to 40 words (#888) 2024-11-01 13:09:03 -03:00
Roman Rizzi e8f0633141
DEV: Extend truncation to all summarizable content (#884) 2024-10-31 12:17:42 -03:00
Roman Rizzi e8eed710e0
FIX: Truncate OP for gists to help the model focus on the latest posts (#883) 2024-10-31 10:54:56 -03:00
Roman Rizzi dd404c924a
DEV: Use different feature_names for summarization strategies (#875) 2024-10-29 08:45:14 -03:00
Rafael dos Santos Silva 33da27e231
FIX: Change hot gist prompt to avoid title repeating #859 (#859)
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-10-25 12:12:33 -03:00
Roman Rizzi ec97996905
FIX/REFACTOR: FoldContent revamp (#866)
* FIX/REFACTOR: FoldContent revamp

We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we couldn't send the original post separately. That separation was important for letting the model focus on what's new in the topic.

The algorithm doesn't give us full control over how prompts are written, and figuring out how to format the content isn't straightforward, which forces us into complicated workarounds like regex.

To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize.

Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.

* Fix fold docs

* Use #shift instead of #pop to get the first elem, not the last
2024-10-25 11:51:17 -03:00
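
As a rough illustration of the simplified approach this commit describes (summarize what fits upfront, then keep extending with leftover content), the fold loop might look like the sketch below. Names such as first_pass_prompt and extend_prompt are made up for illustration, not the plugin's actual API.

    # Sketch only: fold content by summarizing what fits, then extending.
    class SimpleFold
      def initialize(llm, strategy)
        @llm = llm
        @strategy = strategy
      end

      def summarize(contents)
        chunks = contents.dup
        summary = @llm.summarize(@strategy.first_pass_prompt(next_batch(chunks)))

        # Gradually fold the remaining content into the existing summary
        # until there is nothing left to summarize.
        until chunks.empty?
          batch = next_batch(chunks, reserved: estimate_tokens(summary))
          summary = @llm.summarize(@strategy.extend_prompt(summary, batch))
        end

        summary
      end

      private

      # Use #shift so we consume the oldest posts first, not the newest.
      def next_batch(chunks, reserved: 0)
        batch = [chunks.shift]
        while chunks.any? &&
              estimate_tokens((batch + [chunks.first]).join) + reserved < @llm.max_prompt_tokens
          batch << chunks.shift
        end
        batch
      end

      def estimate_tokens(text)
        text.to_s.length / 4 # crude approximation for the sketch
      end
    end
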
Roman Rizzi 3533814870
UX: Avoid introductory phrases and summarize topics without replies (#848) 2024-10-21 17:53:48 -03:00
Roman Rizzi 27b5542357
FEATURE: Generate topic gists for the hot topics list. (#837)
* Display gists in the hot topics list

* Adjust hot topics gist strategy and add a job to generate gists

* Replace setting with a configurable batch size

* Avoid loading summaries for other topic lists

* Tweak gist prompt to focus on latest posts in the context of the OP

* Remove serializer hack and rely on core change from discourse/discourse#29291

* Update lib/summarization/strategies/hot_topic_gists.rb

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>

---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2024-10-18 18:01:39 -03:00
Roman Rizzi c7acb4a6a0
REFACTOR: Support of different summarization targets/prompts. (#835)
* DEV: Add summary types

* Refactor for different summary types

* Use enum for summary types

* Update lib/summarization/strategies/topic_summary.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/topic_gist.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/chat_messages.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Fix chat_messages single prompt

* Small tweak to the chat summarization prompt

---------

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-10-15 13:53:26 -03:00
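
A hedged sketch of the shape this refactor suggests: one strategy class per summarization target, each owning its own prompt. The class bodies and prompt wording below are illustrative assumptions, not the plugin's exact code.

    # Sketch: each summarization target supplies its own prompt.
    module Summarization
      module Strategies
        class Base
          def initialize(target)
            @target = target
          end

          # Subclasses build a prompt suited to their content type.
          def prompt
            raise NotImplementedError
          end
        end

        class TopicSummary < Base
          def prompt
            "Summarize the following forum topic:\n#{@target}"
          end
        end

        class TopicGist < Base
          def prompt
            "In one short sentence, state what is being discussed:\n#{@target}"
          end
        end

        class ChatMessages < Base
          def prompt
            "Summarize this chat transcript, attributing points to usernames:\n#{@target}"
          end
        end
      end
    end
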
Sam 1320eed9b2
FEATURE: move summary to use llm_model (#699)
This allows summary to use the new LLM models and migrates off API-key-based model selection.

Claude 3.5, etc. all work now.

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-07-04 10:48:18 +10:00
Roman Rizzi fc081d9da6
FIX: Restore ability to fold summaries, which was accidentally removed (#700) 2024-07-03 18:10:31 -03:00
Keegan George 1b0ba9197c
DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
Sam 8eee6893d6
FEATURE: GPT4o support and better auditing (#618)
- Introduce new support for GPT4o (automation / bot / summary / helper)
- Properly account for token counts on OpenAI models
- Track feature that was used when generating AI completions
- Remove custom LLM support for summarization, as we need better interfaces to control registration and de-registration
2024-05-14 13:28:46 +10:00
Roman Rizzi 0c4069ab3f
DEV: Remove non-LLM-based summarization strategies. (#589)
We removed these services from our hosting two weeks ago. It's safe to assume everyone has moved to other LLM-based options.
2024-04-23 12:11:04 -03:00
Sam e8b2a200c1
FIX: prompt engineering for summary prompt (#539)
The prompt was incorrectly steering generation into the wrong language.

The new prompt attempts to be more concise and clear, and provides
better guidance about the size of the summary and how to format it.
2024-03-20 16:33:05 +11:00
Roman Rizzi 0634b85a81
UX: Validations to LLM-backed features (except AI Bot) (#436)
* UX: Validations to Llm-backed features (except AI Bot)

This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to make explicit which provider we are going to use. For example, Claude models are available through both AWS Bedrock and Anthropic, but the configuration differs.

Validations are:

* You must choose a model before enabling the feature.
* You must turn off the feature before setting the model to blank.
* You must configure each model's settings before you can select it.

* Add provider name to summarization options

* vLLM can technically support same models as HF

* Check we can talk to the selected model

* Check for Bedrock instead of Anthropic, as a site could have both sets of credentials set up
2024-01-29 16:04:25 -03:00
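
Those checks could be modeled as a setting validator along these lines; this is a sketch with assumed names, not the actual validator.

    # Sketch: refuse to enable an LLM-backed feature without a working model.
    class LlmFeatureValidator
      def initialize(opts = {})
        @feature = opts[:feature]
      end

      def valid_value?(new_value)
        return true if new_value == "f" # disabling is always allowed

        model = SiteSetting.public_send("#{@feature}_model")
        return false if model.blank? # a model must be chosen first
        return false unless configured?(model) # and fully configured

        can_talk_to?(model) # finally, check we can reach the endpoint
      end

      private

      def configured?(model)
        # e.g. check Bedrock credentials before Anthropic ones, since a
        # site could have both sets of credentials set up.
        true # elided in this sketch
      end

      def can_talk_to?(model)
        true # elided in this sketch
      end
    end
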
Roman Rizzi 04eae76f68
REFACTOR: Represent generic prompts with an Object. (#416)
* REFACTOR: Represent generic prompts with an Object.

* Adds a bit more validation for clarity

* Rewrite bot title prompt and fix quirk handling

---------

Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
2024-01-12 14:36:44 -03:00
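
A rough sketch of what "a generic prompt as an Object" might look like: a system message plus validated conversation turns. The exact shape here is an assumption for illustration, not a copy of the actual class.

    # Sketch: a prompt object that validates its own structure.
    class Prompt
      VALID_ROLES = %w[system user model].freeze

      attr_reader :messages

      def initialize(system_message, messages: [])
        @messages = [{ role: "system", content: system_message }]
        messages.each { |m| push(**m) }
      end

      def push(role:, content:)
        raise ArgumentError, "invalid role: #{role}" unless VALID_ROLES.include?(role)
        raise ArgumentError, "content must be a string" unless content.is_a?(String)

        @messages << { role: role, content: content }
      end
    end
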
Rafael dos Santos Silva 8fcba12fae
FEATURE: Support for SRV records for Discourse services (#414)
This allows admins to configure services with multiple backends using DNS SRV records. This PR also adds support for shared-secret auth via headers for TEI and vLLM endpoints, so they are in line with the other ones.
2024-01-10 19:23:07 -03:00
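
Resolving an SRV record to a list of backends can be done with Ruby's stdlib Resolv; the service name below is made up for illustration.

    require "resolv"

    # Sketch: look up SRV records and order targets by priority, then weight.
    def srv_targets(record)
      Resolv::DNS.open do |dns|
        dns.getresources(record, Resolv::DNS::Resource::IN::SRV)
           .sort_by { |r| [r.priority, -r.weight] }
           .map { |r| "#{r.target}:#{r.port}" }
      end
    end

    srv_targets("_vllm._tcp.example.com") # => ["backend1.example.com:443", ...]
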
Sam 03fc94684b
FIX: AI helper not working correctly with mixtral (#399)
* FIX: AI helper not working correctly with mixtral

This PR introduces a new method on the generic LLM called #generate

This will replace the implementation of completion!

#generate introduces a new way to pass temperature, max_tokens and stop_sequences

LLM implementers then need to implement #normalize_model_params to
map the generic names onto the LLM-specific endpoint parameters

This also adds temperature and stop_sequences to completion_prompts,
which allows for much more robust completion prompts

* port everything over to #generate

* Fix translation

- On Anthropic this no longer emits a random "This is your translation:" preamble
- On mixtral this actually works

* fix markdown table generation as well
2024-01-04 09:53:47 -03:00
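
A sketch of how #normalize_model_params might translate the generic names onto an endpoint's native ones; the exact mappings below are illustrative assumptions.

    # Sketch: each endpoint translates generic params into its native ones.
    class AnthropicEndpoint
      def normalize_model_params(params)
        params = params.dup
        # Anthropic's completion API expects max_tokens_to_sample;
        # temperature and stop_sequences pass through unchanged.
        if (max_tokens = params.delete(:max_tokens))
          params[:max_tokens_to_sample] = max_tokens
        end
        params
      end
    end

    class OpenAiEndpoint
      def normalize_model_params(params)
        params = params.dup
        # OpenAI calls the stop sequences parameter "stop".
        if (stop = params.delete(:stop_sequences))
          params[:stop] = stop
        end
        params
      end
    end
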
Rafael dos Santos Silva 83744bf192
FEATURE: Support for Gemini in AiHelper / Search / Summarization (#358) 2023-12-15 14:32:01 -03:00
Roman Rizzi 450ec915d8
FIX: Make FoldContent strategy more resilient when using models with low token count. (#341)
We'll recursively summarize the content into smaller chunks until we are sure we can concatenate
them without going over the token limit.
2023-12-06 19:00:24 -03:00
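
In sketch form, the recursive scheme might look like this (names are illustrative, not the actual implementation):

    # Sketch: recursively shrink chunk summaries until they fit together.
    def fold(chunks, llm)
      summaries = chunks.map { |chunk| llm.summarize(chunk) }
      combined = summaries.join("\n")

      if estimate_tokens(combined) > llm.max_prompt_tokens
        # Still too long: pair up the summaries and fold again.
        fold(summaries.each_slice(2).map { |pair| pair.join("\n") }, llm)
      else
        llm.summarize(combined)
      end
    end

    def estimate_tokens(text)
      text.length / 4 # crude approximation for the sketch
    end
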
Roman Rizzi 3bc010b686
FIX: call the right method to summarize with truncation (#328) 2023-12-01 10:17:24 -03:00
Sam 6ddc17fd61
DEV: port directory structure to Zeitwerk (#319)
Prior to this change we relied on explicit loading of files in Discourse AI.

This had a few downsides:

- Busywork whenever you add a file (an extra require_relative)
- We were not keeping to conventions internally ... some places were OpenAI, others OpenAi
- The autoloader did not work, which led to lots of broken full-application reloads when developing.

This moves all of DiscourseAI into a Zeitwerk compatible structure.

It also leaves a minimal amount of manual loading (automation, which loads into an existing namespace that may or may not be there)

To avoid needing /lib/discourse_ai/... we mount a namespace, which lets us keep /lib pointed at ::DiscourseAi

Various files were renamed to get around Zeitwerk rules and minimize usage of custom inflections

Though we could get custom inflections to work, it is not worth it: it would require a Discourse core patch, which would create a hard dependency.
2023-11-29 15:17:46 +11:00
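
The namespace mount described above corresponds to Zeitwerk's push_dir with an explicit namespace; a minimal sketch, with the path abbreviated:

    # Sketch: point lib/ at an existing namespace instead of the root.
    require "zeitwerk"

    module ::DiscourseAi
    end

    loader = Zeitwerk::Loader.new
    # With the namespace mounted, lib/summarization/fold_content.rb maps to
    # DiscourseAi::Summarization::FoldContent without a lib/discourse_ai/ level.
    loader.push_dir(File.expand_path("lib", __dir__), namespace: ::DiscourseAi)
    loader.setup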