discourse-ai

Commit Graph

Author	SHA1	Message	Date
Roman Rizzi	7e3a543f6f	FEATURE: Double gist length to 40 words (#888)	2024-11-01 13:09:03 -03:00
Roman Rizzi	e8f0633141	DEV: Extend truncation to all summarizable content (#884)	2024-10-31 12:17:42 -03:00
Roman Rizzi	e8eed710e0	FIX: Truncate OP for gists to help the model focus on the latest posts (#883)	2024-10-31 10:54:56 -03:00
Roman Rizzi	dd404c924a	DEV: Use different feature_names for summarization strategies (#875)	2024-10-29 08:45:14 -03:00
Rafael dos Santos Silva	33da27e231	FIX: Change hot gist prompt to avoid title repeating #859 (#859) Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-10-25 12:12:33 -03:00
Roman Rizzi	ec97996905	FIX/REFACTOR: FoldContent revamp (#866) * FIX/REFACTOR: FoldContent revamp We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we cannot send the original post separately. This was important for letting the model focus on what's new in the topic. The algorithm doesn’t give us full control over how prompts are written, and figuring out how to format the content isn't straightforward. This means we're having to use more complicated workarounds, like regex. To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize. Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with. * Fix fold docs * Use #shift instead of #pop to get the first elem, not the last	2024-10-25 11:51:17 -03:00
Roman Rizzi	3533814870	UX: Avoid introductory phrases and summarize topics without replies (#848)	2024-10-21 17:53:48 -03:00
Roman Rizzi	27b5542357	FEATURE: Generate topic gists for the hot topics list. (#837) * Display gists in the hot topics list * Adjust hot topics gist strategy and add a job to generate gists * Replace setting with a configurable batch size * Avoid loading summaries for other topic lists * Tweak gist prompt to focus on latest posts in the context of the OP * Remove serializer hack and rely on core change from discourse/discourse#29291 * Update lib/summarization/strategies/hot_topic_gists.rb Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com> --------- Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>	2024-10-18 18:01:39 -03:00
Roman Rizzi	c7acb4a6a0	REFACTOR: Support of different summarization targets/prompts. (#835) * DEV: Add summary types * Refactor for different summary types * Use enum for summary types * Update lib/summarization/strategies/topic_summary.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/topic_gist.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/chat_messages.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Fix chat_messages single prompt * Small tweak to the chat summarization prompt --------- Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>	2024-10-15 13:53:26 -03:00
Sam	1320eed9b2	FEATURE: move summary to use llm_model (#699) This allows summary to use the new LLM models and migrates off API key based model selection. Claude 3.5 etc. all work now. --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-07-04 10:48:18 +10:00
Roman Rizzi	fc081d9da6	FIX: Restore ability to fold summaries, which was accidentally removed (#700)	2024-07-03 18:10:31 -03:00
Keegan George	1b0ba9197c	DEV: Add summarization logic from core (#658)	2024-07-02 08:51:59 -07:00
Sam	8eee6893d6	FEATURE: GPT4o support and better auditing (#618) - Introduce new support for GPT4o (automation / bot / summary / helper) - Properly account for token counts on OpenAI models - Track feature that was used when generating AI completions - Remove custom llm support for summarization as we need better interfaces to control registration and de-registration	2024-05-14 13:28:46 +10:00
Roman Rizzi	0c4069ab3f	DEV: Remove non-LLM-based summarization strategies. (#589) We removed these services from our hosting two weeks ago. It's safe to assume everyone has moved to other LLM-based options.	2024-04-23 12:11:04 -03:00
Sam	e8b2a200c1	FIX: prompt engineering for summary prompt (#539) Prompt was steering incorrectly into the wrong language. New prompt attempts to be more concise and clear and provides better guidance about size of summary and how to format it.	2024-03-20 16:33:05 +11:00
Roman Rizzi	0634b85a81	UX: Validations to LLM-backed features (except AI Bot) (#436) * UX: Validations to Llm-backed features (except AI Bot) This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to be explicit about which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs. Validations are: * You must choose a model before enabling the feature. * You must turn off the feature before setting the model to blank. * You must configure each model's settings before being able to select it. * Add provider name to summarization options * vLLM can technically support same models as HF * Check we can talk to the selected model * Check for Bedrock instead of anthropic as a site could have both creds setup	2024-01-29 16:04:25 -03:00
Roman Rizzi	04eae76f68	REFACTOR: Represent generic prompts with an Object. (#416) * REFACTOR: Represent generic prompts with an Object. * Adds a bit more validation for clarity * Rewrite bot title prompt and fix quirk handling --------- Co-authored-by: Sam Saffron <sam.saffron@gmail.com>	2024-01-12 14:36:44 -03:00
Rafael dos Santos Silva	8fcba12fae	FEATURE: Support for SRV records for Discourse services (#414) This allows admins to configure services with multiple backends using DNS SRV records. This PR also adds support for shared secret auth via headers for TEI and vLLM endpoints, so they are in line with the other ones.	2024-01-10 19:23:07 -03:00
Sam	03fc94684b	FIX: AI helper not working correctly with mixtral (#399) * FIX: AI helper not working correctly with mixtral This PR introduces a new function on the generic llm called #generate This will replace the implementation of completion! #generate introduces a new way to pass temperature, max_tokens and stop_sequences Then LLM implementers need to implement #normalize_model_params to ensure the generic names match the LLM specific endpoint This also adds temperature and stop_sequences to completion_prompts this allows for much more robust completion prompts * port everything over to #generate * Fix translation - On anthropic this no longer throws random "This is your translation:" - On mixtral this actually works * fix markdown table generation as well	2024-01-04 09:53:47 -03:00
Rafael dos Santos Silva	83744bf192	FEATURE: Support for Gemini in AiHelper / Search / Summarization (#358)	2023-12-15 14:32:01 -03:00
Roman Rizzi	450ec915d8	FIX: Make FoldContent strategy more resilient when using models with low token count. (#341) We'll recursively summarize the content into smaller chunks until we are sure we can concatenate them without going over the token limit.	2023-12-06 19:00:24 -03:00
Roman Rizzi	3bc010b686	FIX: call the right method to summarize with truncation (#328)	2023-12-01 10:17:24 -03:00
Sam	6ddc17fd61	DEV: port directory structure to Zeitwerk (#319) Previous to this change we relied on explicit loading for files in Discourse AI. This had a few downsides: - Busywork whenever you add a file (an extra require relative) - We were not keeping to conventions internally ... some places were OpenAI others are OpenAi - Autoloader did not work which led to lots of full application broken reloads when developing. This moves all of DiscourseAI into a Zeitwerk compatible structure. It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there) To avoid needing /lib/discourse_ai/... we mount a namespace thus we are able to keep /lib pointed at ::DiscourseAi Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.	2023-11-29 15:17:46 +11:00

23 Commits