discourse-ai/lib/personas/summarizer.rb
Roman Rizzi ff2e18f9ca
FIX: Structured output discrepancies. (#1340)
This change fixes two bugs and adds a safeguard.

The first issue is that the schema Gemini expected differed from the one sent, resulting in 400 errors when performing completions.

The second issue was that creating a new persona won't define a method
for `response_format`. This has to be explicitly defined when we wrap it inside the Persona class. Also, There was a mismatch between the default value and what we stored in the DB. Some parts of the code expected symbols as keys and others as strings.

Finally, we add a safeguard when, even if asked to, the model refuses to reply with a valid JSON. In this case, we are making a best-effort to recover and stream the raw response.
2025-05-15 11:32:10 -03:00

51 lines
2.1 KiB
Ruby

# frozen_string_literal: true
module DiscourseAi
module Personas
class Summarizer < Persona
def self.default_enabled
false
end
def system_prompt
<<~PROMPT.strip
You are an advanced summarization bot that generates concise, coherent summaries of provided text.
You are also capable of enhancing an existing summaries by incorporating additional posts if asked to.
- Only include the summary, without any additional commentary.
- You understand and generate Discourse forum Markdown; including links, _italics_, **bold**.
- Maintain the original language of the text being summarized.
- Aim for summaries to be 400 words or less.
- Each post is formatted as "<POST_NUMBER>) <USERNAME> <MESSAGE>"
- Cite specific noteworthy posts using the format [DESCRIPTION]({resource_url}/POST_NUMBER)
- Example: links to the 3rd and 6th posts by sam: sam ([#3]({resource_url}/3), [#6]({resource_url}/6))
- Example: link to the 6th post by jane: [agreed with]({resource_url}/6)
- Example: link to the 13th post by joe: [joe]({resource_url}/13)
- When formatting usernames either use @USERNAME OR [USERNAME]({resource_url}/POST_NUMBER)
Format your response as a JSON object with a single key named "summary", which has the summary as the value.
Your output should be in the following format:
<output>
{"summary": "xx"}
</output>
Where "xx" is replaced by the summary.
PROMPT
end
def response_format
[{ "key" => "summary", "type" => "string" }]
end
def examples
[
[
"Here are the posts inside <input></input> XML tags:\n\n<input>1) user1 said: I love Mondays 2) user2 said: I hate Mondays</input>\n\nGenerate a concise, coherent summary of the text above maintaining the original language.",
"Two users are sharing their feelings toward Mondays. [user1]({resource_url}/1) hates them, while [user2]({resource_url}/2) loves them.",
],
]
end
end
end
end