0a8195242b
RAG fragments can make system messages very large, which becomes a problem for models with a smaller context window. Before this change, we only looked at the rest of the conversation to make sure we didn't exceed the limit. With large system messages, that could lead to two unwanted scenarios: (1) all other messages are excluded due to size, or (2) the system message alone already exceeds the limit. As a result, I'm putting a hard limit of 60% of the available tokens on the system message. We don't want to truncate aggressively because, when RAG fragments are included, the system message contains a lot of context that improves the model's response, but we also want to leave room for the recent messages in the conversation. |
||
---|---|---|
.github/workflows | ||
admin/assets/javascripts/discourse | ||
app | ||
assets | ||
config | ||
db | ||
discourse_automation | ||
lib | ||
public/ai-share | ||
spec | ||
test/javascripts | ||
tokenizers | ||
.discourse-compatibility | ||
.eslintrc.cjs | ||
.gitignore | ||
.prettierignore | ||
.prettierrc.cjs | ||
.rubocop.yml | ||
.streerc | ||
.template-lintrc.cjs | ||
Gemfile | ||
Gemfile.lock | ||
LICENSE | ||
README.md | ||
package.json | ||
plugin.rb | ||
translator.yml | ||
yarn.lock |
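
The 60% cap described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `WordTokenizer`, `trim_prompt`, and `SYSTEM_MESSAGE_RATIO` are hypothetical names, and the word-based tokenizer is a stand-in for a real model tokenizer. It assumes the system message is truncated first and that whatever budget is left goes to the most recent messages.

```ruby
# Hypothetical stand-in tokenizer: one token per whitespace-separated word.
# A real tokenizer would count model-specific tokens instead.
module WordTokenizer
  def self.size(text)
    text.split.size
  end

  # Keeps at most max_tokens words (whitespace is normalized to single spaces).
  def self.truncate(text, max_tokens)
    text.split.first(max_tokens).join(" ")
  end
end

# The 60% hard limit from the commit message (the constant name is an assumption).
SYSTEM_MESSAGE_RATIO = 0.6

# Returns [system_message, kept_messages] that together fit within max_tokens.
def trim_prompt(system_message, messages, max_tokens, tokenizer: WordTokenizer)
  # Cap the system message at 60% of the available tokens.
  system_budget = (max_tokens * SYSTEM_MESSAGE_RATIO).floor
  system_message = tokenizer.truncate(system_message, system_budget)

  # Whatever the system message did not use is left for the conversation.
  remaining = max_tokens - tokenizer.size(system_message)

  # Walk from the newest message backwards, keeping messages while they fit.
  kept = []
  messages.reverse_each do |message|
    cost = tokenizer.size(message)
    break if cost > remaining
    kept.unshift(message)
    remaining -= cost
  end

  [system_message, kept]
end
```

With this shape, a system message that is already small is left untouched and the conversation gets the unused budget, while an oversized one can never crowd out every recent message.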
README.md
Discourse AI Plugin
Plugin Summary
For more information, please see: https://meta.discourse.org/t/discourse-ai/259214?u=falco