discourse-ai

mirror of https://github.com/discourse/discourse-ai.git synced 2025-02-19 18:04:51 +00:00

Author	SHA1	Message	Date
Roman Rizzi	c7acb4a6a0	REFACTOR: Support of different summarization targets/prompts. (#835) * DEV: Add summary types * Refactor for different summary types * Use enum for summary types * Update lib/summarization/strategies/topic_summary.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/topic_gist.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Update lib/summarization/strategies/chat_messages.rb Co-authored-by: Penar Musaraj <pmusaraj@gmail.com> * Fix chat_messages single prompt * Small tweak to the chat summarization prompt --------- Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>	2024-10-15 13:53:26 -03:00
Sam	1320eed9b2	FEATURE: move summary to use llm_model (#699) This allows summary to use the new LLM models and migrates off API key based model selection. Claude 3.5 etc... all work now. --------- Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2024-07-04 10:48:18 +10:00
Roman Rizzi	fc081d9da6	FIX: Restore ability to fold summaries, which was accidentally removed (#700)	2024-07-03 18:10:31 -03:00
Keegan George	1b0ba9197c	DEV: Add summarization logic from core (#658)	2024-07-02 08:51:59 -07:00
Sam	d5c23f01ff	FIX: correct gemini streaming implementation (#632) This also implements image support and gemini-flash support	2024-05-22 16:35:29 +10:00
Sam	8eee6893d6	FEATURE: GPT4o support and better auditing (#618) - Introduce new support for GPT4o (automation / bot / summary / helper) - Properly account for token counts on OpenAI models - Track feature that was used when generating AI completions - Remove custom llm support for summarization as we need better interfaces to control registration and de-registration	2024-05-14 13:28:46 +10:00
Roman Rizzi	e22194f321	HACK: Llama3 support for summarization/AI helper. (#616) There are still some limitations to which models we can support with the `LlmModel` class. This will enable support for Llama3 while we sort those out.	2024-05-13 15:54:42 -03:00
Roman Rizzi	4f1a3effe0	REFACTOR: Migrate Vllm/TGI-served models to the OpenAI format. (#588) Both endpoints provide OpenAI-compatible servers. The only difference is that Vllm doesn't support passing tools as a separate parameter. Even if the tool param is supported, it ultimately relies on the model's ability to handle native functions, which is not the case with the models we have today. As a part of this change, we are dropping support for StableBeluga/Llama2 models. They don't have a chat_template, meaning the new API can't translate them. These changes let us remove some of our existing dialects and are a first step in our plan to support any LLM by defining them as data-driven concepts. I rewrote the "translate" method to use a template method and extracted the tool support strategies into their own classes to simplify the code. Finally, these changes bring support for Ollama when running in dev mode. It only works with Mistral for now, but it will change soon.	2024-05-07 10:02:16 -03:00
Roman Rizzi	0c4069ab3f	DEV: Remove non-LLM-based summarization strategies. (#589) We removed these services from our hosting two weeks ago. It's safe to assume everyone has moved to other LLM-based options.	2024-04-23 12:11:04 -03:00
Sam	50be66ee63	FEATURE: Gemini 1.5 pro support and Claude Opus bedrock support (#580) - Updated AI Bot to only support Gemini 1.5 (used to support 1.0) - 1.0 was removed cause it is not appropriate for Bot usage - Summaries and automation can now lean on Gemini 1.5 pro - Amazon added support for Claude 3 Opus, added internal support for it on bedrock	2024-04-17 15:37:19 +10:00
Sam	6de9c53a71	FEATURE: remove gpt-4-turbo-0125 preview, swap with gpt-4-turbo (#568) OpenAI just released gpt-4-turbo (with vision). This change stops using the old preview model and swaps in the officially released gpt-4-turbo. To come is an implementation of vision.	2024-04-10 09:53:20 -03:00
Sam	e8b2a200c1	FIX: prompt engineering for summary prompt (#539) Prompt was steering incorrectly into the wrong language. New prompt attempts to be more concise and clear and provides better guidance about size of summary and how to format it.	2024-03-20 16:33:05 +11:00
Sam	41f1530078	FIX: mention suppression was not working right (#538) We were only suppressing non mentions, ones that become spans. @sam in the test was not resolving to a mention cause the user did not exist. Depends on https://github.com/discourse/discourse/pull/26253 for tests to pass.	2024-03-20 13:00:39 +11:00
Sam	cc0369dd39	FEATURE: friendlier reply behavior in bot PMs (#535) - Stop replying as bot, when human replies to another human - Reply as correct persona when replying directly to a persona - Fix paper cut where suppressing notifications was not doing so	2024-03-19 20:15:12 +11:00
Roman Rizzi	0634b85a81	UX: Validations to LLM-backed features (except AI Bot) (#436) * UX: Validations to Llm-backed features (except AI Bot) This change is part of an ongoing effort to prevent enabling a broken feature due to lack of configuration. We also want to be explicit about which provider we are going to use. For example, Claude models are available through AWS Bedrock and Anthropic, but the configuration differs. Validations are: * You must choose a model before enabling the feature. * You must turn off the feature before setting the model to blank. * You must configure each model's settings before being able to select it. * Add provider name to summarization options * vLLM can technically support same models as HF * Check we can talk to the selected model * Check for Bedrock instead of anthropic as a site could have both creds setup	2024-01-29 16:04:25 -03:00
Sam	092da860e2	FEATURE: support gpt-4-0125 which was just released (#443) The new model has better performance and is always preferable to the old one which has unicode issues during function calls.	2024-01-26 09:08:02 +11:00
Jarek Radosz	5802cd1a0c	DEV: Fix various typos (#434)	2024-01-19 12:51:26 +01:00
Roman Rizzi	04eae76f68	REFACTOR: Represent generic prompts with an Object. (#416) * REFACTOR: Represent generic prompts with an Object. * Adds a bit more validation for clarity * Rewrite bot title prompt and fix quirk handling --------- Co-authored-by: Sam Saffron <sam.saffron@gmail.com>	2024-01-12 14:36:44 -03:00
Rafael dos Santos Silva	8fcba12fae	FEATURE: Support for SRV records for Discourse services (#414) This allows admins to configure services with multiple backends using DNS SRV records. This PR also adds support for shared secret auth via headers for TEI and vLLM endpoints, so they are in line with the other ones.	2024-01-10 19:23:07 -03:00
Roman Rizzi	abde82c1f3	FIX: Use claude-2.1 to enable system prompts (#411)	2024-01-09 14:10:20 -03:00
Sam	03fc94684b	FIX: AI helper not working correctly with mixtral (#399) * FIX: AI helper not working correctly with mixtral This PR introduces a new function on the generic llm called #generate. This will replace the implementation of completion! #generate introduces a new way to pass temperature, max_tokens and stop_sequences. Then LLM implementers need to implement #normalize_model_params to ensure the generic names match the LLM specific endpoint. This also adds temperature and stop_sequences to completion_prompts; this allows for much more robust completion prompts * port everything over to #generate * Fix translation - On anthropic this no longer throws random "This is your translation:" - On mixtral this actually works * fix markdown table generation as well	2024-01-04 09:53:47 -03:00
Rafael dos Santos Silva	20cb15ab5f	FEATURE: Mixtral for summarization (#381)	2023-12-26 17:50:02 -03:00
Rafael dos Santos Silva	5db7bf6e68	Mixtral (#376) Add both Mistral and Mixtral support. Also includes vLLM-openAI inference support. Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>	2023-12-26 14:49:55 -03:00
Sam	d0f54443ae	FEATURE: LLM based periodical summary report (#357) Introduce a Discourse Automation based periodical report. Depends on Discourse Automation. Report works best with very large context language models such as GPT-4-Turbo and Claude 2. - Introduces final_insts to generic llm format, for claude to work best it is better to guide the last assistant message (we should add this to other spots as well) - Adds GPT-4 turbo support to generic llm interface	2023-12-19 12:04:15 +11:00
Rafael dos Santos Silva	83744bf192	FEATURE: Support for Gemini in AiHelper / Search / Summarization (#358)	2023-12-15 14:32:01 -03:00
Roman Rizzi	450ec915d8	FIX: Make FoldContent strategy more resilient when using models with low token count. (#341) We'll recursively summarize the content into smaller chunks until we are sure we can concatenate them without going over the token limit.	2023-12-06 19:00:24 -03:00
Roman Rizzi	3bc010b686	FIX: call the right method to summarize with truncation (#328)	2023-12-01 10:17:24 -03:00
Sam	6ddc17fd61	DEV: port directory structure to Zeitwerk (#319) Previous to this change we relied on explicit loading for files in Discourse AI. This had a few downsides: - Busywork whenever you add a file (an extra require_relative) - We were not keeping to conventions internally ... some places were OpenAI others are OpenAi - Autoloader did not work which led to lots of full application broken reloads when developing. This moves all of DiscourseAI into a Zeitwerk compatible structure. It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there). To avoid needing /lib/discourse_ai/... we mount a namespace, thus we are able to keep /lib pointed at ::DiscourseAi. Various files were renamed to get around zeitwerk rules and minimize usage of custom inflections. Though we can get custom inflections to work it is not worth it, will require a Discourse core patch which means we create a hard dependency.	2023-11-29 15:17:46 +11:00

28 Commits