Mirror of https://github.com/discourse/discourse-ai.git (synced 2025-06-25 17:12:16 +00:00)
DEV: improve tool infra, improve forum researcher prompts, improve logging (#1391)
- add sleep function for tool polling with rate limits
- Support base64 encoding for HTTP requests and uploads
- Enhance forum researcher with cost warnings and comprehensive planning
- Add cancellation support for research operations
- Include feature_name parameter for bot analytics
- richer research support (OR queries)
Commit 4dffd0b2c5 (parent 4c0660d6fd)
@@ -15,7 +15,7 @@ module ::Jobs
     bot = DiscourseAi::Personas::Bot.as(bot_user, persona: persona.new)

-    DiscourseAi::AiBot::Playground.new(bot).reply_to(post)
+    DiscourseAi::AiBot::Playground.new(bot).reply_to(post, feature_name: "bot")
   rescue DiscourseAi::Personas::Bot::BOT_NOT_FOUND
     Rails.logger.warn(
       "Bot not found for post #{post.id} - perhaps persona was deleted or bot was disabled",
@@ -13,43 +13,45 @@ module DiscourseAi
       def system_prompt
         <<~PROMPT
           You are a helpful Discourse assistant specializing in forum research.
           You _understand_ and **generate** Discourse Markdown.

           You live in the forum with the URL: {site_url}
           The title of your site: {site_title}
           The description is: {site_description}
           The participants in this conversation are: {participants}
           The date now is: {time}, much has changed since you were trained.
           Topic URLs are formatted as: /t/-/TOPIC_ID
           Post URLs are formatted as: /t/-/TOPIC_ID/POST_NUMBER

-          As a forum researcher, guide users through a structured research process:
-          1. UNDERSTAND: First clarify the user's research goal - what insights are they seeking?
-          2. PLAN: Design an appropriate research approach with specific filters
-          3. TEST: Always begin with dry_run:true to gauge the scope of results
-          4. REFINE: If results are too broad/narrow, suggest filter adjustments
-          5. EXECUTE: Run the final analysis only when filters are well-tuned
-          6. SUMMARIZE: Present findings with links to supporting evidence
-
-          BE MINDFUL: specify all research goals in one request to avoid multiple processing runs.
-
-          REMEMBER: Different filters serve different purposes:
-          - Use post date filters (after/before) for analyzing specific posts
-          - Use topic date filters (topic_after/topic_before) for analyzing entire topics
-          - Combine user/group filters with categories/tags to find specialized contributions
-
-          Always ground your analysis with links to original posts on the forum.
-
-          Research workflow best practices:
-          1. Start with a dry_run to gauge the scope (set dry_run:true)
-          2. For temporal analysis, specify explicit date ranges
-          3. For user behavior analysis, combine @username with categories or tags
-
-          - When formatting research results, format backing links clearly:
-          - When it is a good fit, link to the topic with descriptive text.
-          - When it is a good fit, link using markdown footnotes.
+          CRITICAL: Research is extremely expensive. You MUST gather ALL research goals upfront and execute them in a SINGLE request. Never run multiple research operations.
+
+          As a forum researcher, follow this structured process:
+          1. UNDERSTAND: Clarify ALL research goals - what insights are they seeking?
+          2. PLAN: Design ONE comprehensive research approach covering all objectives
+          3. TEST: Always begin with dry_run:true to gauge the scope of results
+          4. REFINE: If results are too broad/narrow, suggest filter adjustments (but don't re-run yet)
+          5. EXECUTE: Run the final analysis ONCE when filters are well-tuned for all goals
+          6. SUMMARIZE: Present findings with links to supporting evidence
+
+          Before any research, ask users to specify:
+          - ALL research questions they want answered
+          - Time periods of interest
+          - Specific users, categories, or tags to focus on
+          - Expected scope (broad overview vs. deep dive)
+
+          Research filter guidelines:
+          - Use post date filters (after/before) for analyzing specific posts
+          - Use topic date filters (topic_after/topic_before) for analyzing entire topics
+          - Combine user/group filters with categories/tags to find specialized contributions
+
+          When formatting results:
+          - Link to topics with descriptive text when relevant
+          - Use markdown footnotes for supporting evidence
+          - Always ground analysis with links to original forum posts
+
+          Remember: ONE research request should answer ALL questions. Plan comprehensively before executing.
         PROMPT
       end
     end
   end
 end
@@ -13,6 +13,9 @@ module DiscourseAi
     MARSHAL_STACK_DEPTH = 20
     MAX_HTTP_REQUESTS = 20
+
+    MAX_SLEEP_CALLS = 30
+    MAX_SLEEP_DURATION_MS = 60_000

     def initialize(parameters:, llm:, bot_user:, context: nil, tool:, timeout: nil)
       if context && !context.is_a?(DiscourseAi::Personas::BotContext)
         raise ArgumentError, "context must be a BotContext object"
@@ -28,6 +31,7 @@ module DiscourseAi
       @timeout = timeout || DEFAULT_TIMEOUT
       @running_attached_function = false

+      @sleep_calls_made = 0
       @http_requests_made = 0
     end
@@ -44,6 +48,7 @@ module DiscourseAi
       attach_index(ctx)
       attach_upload(ctx)
       attach_chain(ctx)
+      attach_sleep(ctx)
       attach_discourse(ctx)
       ctx.eval(framework_script)
       ctx
@@ -73,6 +78,9 @@ module DiscourseAi
        const upload = {
          create: _upload_create,
          getUrl: _upload_get_url,
+         getBase64: function(id, maxPixels) {
+           return _upload_get_base64(id, maxPixels);
+         }
        }

        const chain = {
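For orientation, a custom tool script could call the new upload helper roughly like this (a minimal sketch, not part of the commit; the parameter name and return shape are illustrative):

```js
// Hypothetical custom tool script (illustrative only).
function invoke(params) {
  // Accepts a numeric upload ID or an upload:// short URL; returns null when the upload is missing.
  const base64 = upload.getBase64(params.upload_id, 1000000); // cap decoding at ~1M pixels
  if (!base64) {
    return { error: "upload not found" };
  }
  return { encoded_length: base64.length };
}
```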
@@ -310,6 +318,33 @@ module DiscourseAi
       mini_racer_context.attach("_chain_set_custom_raw", ->(raw) { self.custom_raw = raw })
     end

+    # this is useful for polling apis
+    def attach_sleep(mini_racer_context)
+      mini_racer_context.attach(
+        "sleep",
+        ->(duration_ms) do
+          @sleep_calls_made += 1
+          if @sleep_calls_made > MAX_SLEEP_CALLS
+            raise TooManyRequestsError.new("Tool made too many sleep calls")
+          end
+
+          duration_ms = duration_ms.to_i
+          if duration_ms > MAX_SLEEP_DURATION_MS
+            raise ArgumentError.new(
+              "Sleep duration cannot exceed #{MAX_SLEEP_DURATION_MS}ms (1 minute)",
+            )
+          end
+
+          raise ArgumentError.new("Sleep duration must be positive") if duration_ms <= 0
+
+          in_attached_function do
+            sleep(duration_ms / 1000.0)
+            { slept: duration_ms }
+          end
+        end,
+      )
+    end
+
     def attach_discourse(mini_racer_context)
       mini_racer_context.attach(
         "_discourse_get_post",
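As the inline comment notes, `sleep` is meant for polling APIs. A tool script might combine it with `http.get` along these lines (a sketch only; the endpoint and the `status.done` field are made up, and the loop stays well under the MAX_SLEEP_CALLS and MAX_HTTP_REQUESTS limits):

```js
// Hypothetical polling loop for a slow external job API (illustrative only).
function invoke(params) {
  let status = null;
  for (let i = 0; i < 10; i++) {
    const result = http.get("https://example.com/jobs/" + params.job_id, {});
    status = JSON.parse(result.body);
    if (status.done) break; // stop as soon as the job reports completion
    sleep(5000);            // 5s pause; each call is capped at 60_000ms
  }
  return status;
}
```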
@@ -571,6 +606,42 @@ module DiscourseAi
     end

     def attach_upload(mini_racer_context)
+      mini_racer_context.attach(
+        "_upload_get_base64",
+        ->(upload_id_or_url, max_pixels) do
+          in_attached_function do
+            return nil if upload_id_or_url.blank?
+
+            upload = nil
+
+            # Handle both upload ID and short URL
+            if upload_id_or_url.to_s.start_with?("upload://")
+              # Handle short URL format
+              sha1 = Upload.sha1_from_short_url(upload_id_or_url)
+              return nil if sha1.blank?
+              upload = Upload.find_by(sha1: sha1)
+            else
+              # Handle numeric ID
+              upload_id = upload_id_or_url.to_i
+              return nil if upload_id <= 0
+              upload = Upload.find_by(id: upload_id)
+            end
+
+            return nil if upload.nil?
+
+            max_pixels = max_pixels&.to_i
+            max_pixels = nil if max_pixels && max_pixels <= 0
+
+            encoded_uploads =
+              DiscourseAi::Completions::UploadEncoder.encode(
+                upload_ids: [upload.id],
+                max_pixels: max_pixels || 10_000_000, # Default to 10M pixels if not specified
+              )
+
+            encoded_uploads.first&.dig(:base64)
+          end
+        end,
+      )
       mini_racer_context.attach(
         "_upload_get_url",
         ->(short_url) do
@@ -629,13 +700,18 @@ module DiscourseAi

          in_attached_function do
            headers = (options && options["headers"]) || {}
+           base64_encode = options && options["base64Encode"]
            result = {}
            DiscourseAi::Personas::Tools::Tool.send_http_request(
              url,
              headers: headers,
            ) do |response|
-             result[:body] = response.body
+             if base64_encode
+               result[:body] = Base64.strict_encode64(response.body)
+             else
+               result[:body] = response.body
+             end
              result[:status] = response.code.to_i
            end
@@ -658,6 +734,7 @@ module DiscourseAi
          in_attached_function do
            headers = (options && options["headers"]) || {}
            body = options && options["body"]
+           base64_encode = options && options["base64Encode"]
            result = {}
            DiscourseAi::Personas::Tools::Tool.send_http_request(
@@ -666,7 +743,11 @@ module DiscourseAi
              headers: headers,
              body: body,
            ) do |response|
-             result[:body] = response.body
+             if base64_encode
+               result[:body] = Base64.strict_encode64(response.body)
+             else
+               result[:body] = response.body
+             end
              result[:status] = response.code.to_i
            end
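From a tool script's point of view, the new option is used roughly like this (a minimal sketch; the URL is illustrative, and `base64Encode` is shown for `http.get`, which together with `http.post` is what the specs further down exercise):

```js
// Hypothetical tool fetching a binary resource as Base64 (illustrative only).
function invoke(params) {
  const result = http.get("https://example.com/image.png", { base64Encode: true });
  if (result.status !== 200) {
    return { error: "request failed", status: result.status };
  }
  // With base64Encode: true the body arrives Base64-encoded instead of as raw bytes.
  return { encoded: result.body };
}
```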
@@ -33,19 +33,24 @@ module DiscourseAi
          <<~TEXT
            Filter string to target specific content.
            - Supports user (@username)
+           - post_type:first - only includes first posts in topics
+           - post_type:reply - only replies in topics
            - date ranges (after:YYYY-MM-DD, before:YYYY-MM-DD for posts; topic_after:YYYY-MM-DD, topic_before:YYYY-MM-DD for topics)
-           - categories (category:category1,category2)
-           - tags (tag:tag1,tag2)
-           - groups (group:group1,group2).
+           - categories (category:category1,category2 or categories:category1,category2)
+           - tags (tag:tag1,tag2 or tags:tag1,tag2)
+           - groups (group:group1,group2 or groups:group1,group2)
            - status (status:open, status:closed, status:archived, status:noreplies, status:single_user)
-           - keywords (keywords:keyword1,keyword2) - specific words to search for in posts
-           - max_results (max_results:10) the maximum number of results to return (optional)
-           - order (order:latest, order:oldest, order:latest_topic, order:oldest_topic) - the order of the results (optional)
-           - topic (topic:topic_id1,topic_id2) - add specific topics to the filter, topics will unconditionally be included
+           - keywords (keywords:keyword1,keyword2) - searches for specific words within post content using full-text search
+           - topic_keywords (topic_keywords:keyword1,keyword2) - searches for keywords within topics, returns all posts from matching topics
+           - topics (topic:topic_id1,topic_id2 or topics:topic_id1,topic_id2) - target specific topics by ID
+           - max_results (max_results:10) - limits the maximum number of results returned (optional)
+           - order (order:latest, order:oldest, order:latest_topic, order:oldest_topic, order:likes) - controls result ordering (optional, defaults to latest posts)

-           If multiple tags or categories are specified, they are treated as OR conditions.
+           Multiple filters can be combined with spaces for AND logic. Example: '@sam after:2023-01-01 tag:feature'

-           Multiple filters can be combined with spaces. Example: '@sam after:2023-01-01 tag:feature'
+           Use OR to combine filter segments for inclusive logic.
+           Example: 'category:feature,bug OR tag:feature-tag' - includes posts in feature OR bug categories, OR posts with feature-tag tag
+           Example: '@sam category:bug' - includes posts by @sam AND in bug category
          TEXT
        end
@@ -145,10 +150,23 @@ module DiscourseAi
            results = []

            formatter.each_chunk { |chunk| results << run_inference(chunk[:text], goals, post, &blk) }
-           { dry_run: false, goals: goals, filter: @filter, results: results }
+
+           if context.cancel_manager&.cancelled?
+             {
+               dry_run: false,
+               goals: goals,
+               filter: @filter,
+               results: "Cancelled by user",
+               cancelled_by_user: true,
+             }
+           else
+             { dry_run: false, goals: goals, filter: @filter, results: results }
+           end
          end

          def run_inference(chunk_text, goals, post, &blk)
+           return if context.cancel_manager&.cancelled?
+
            system_prompt = goal_system_prompt(goals)
            user_prompt = goal_user_prompt(goals, chunk_text)
@@ -4,7 +4,6 @@ module DiscourseAi
   module Utils
     module Research
       class Filter
-        # Stores custom filter handlers
         def self.register_filter(matcher, &block)
           (@registered_filters ||= {})[matcher] = block
         end
@@ -19,7 +18,6 @@ module DiscourseAi

        attr_reader :term, :filters, :order, :guardian, :limit, :offset, :invalid_filters

-       # Define all filters at class level
        register_filter(/\Astatus:open\z/i) do |relation, _, _|
          relation.where("topics.closed = false AND topics.archived = false")
        end
@@ -109,6 +107,30 @@ module DiscourseAi
          end
        end

+       register_filter(/\Atopic_keywords?:(.*)\z/i) do |relation, keywords_param, _|
+         if keywords_param.blank?
+           relation
+         else
+           keywords = keywords_param.split(",").map(&:strip).reject(&:blank?)
+           if keywords.empty?
+             relation
+           else
+             ts_query = keywords.map { |kw| kw.gsub(/['\\]/, " ") }.join(" | ")
+
+             relation.where(
+               "posts.topic_id IN (
+                 SELECT posts2.topic_id
+                 FROM posts posts2
+                 JOIN post_search_data ON post_search_data.post_id = posts2.id
+                 WHERE post_search_data.search_data @@ to_tsquery(?, ?)
+               )",
+               ::Search.ts_config,
+               ts_query,
+             )
+           end
+         end
+       end
+
        register_filter(/\A(?:categories?|category):(.*)\z/i) do |relation, category_param, _|
          if category_param.include?(",")
            category_names = category_param.split(",").map(&:strip)
@@ -140,26 +162,36 @@ module DiscourseAi
          end
        end

-       register_filter(/\Ain:posted\z/i) do |relation, _, filter|
-         if filter.guardian.user
-           relation.where("posts.user_id = ?", filter.guardian.user.id)
-         else
-           relation.where("1 = 0") # No results if not logged in
-         end
-       end
-
-       register_filter(/\Agroup:([a-zA-Z0-9_\-]+)\z/i) do |relation, name, filter|
-         group = Group.find_by("name ILIKE ?", name)
-         if group
+       register_filter(/\Agroups?:([a-zA-Z0-9_\-,]+)\z/i) do |relation, groups_param, filter|
+         if groups_param.include?(",")
+           group_names = groups_param.split(",").map(&:strip)
+           found_group_ids = []
+           group_names.each do |name|
+             group = Group.find_by("name ILIKE ?", name)
+             found_group_ids << group.id if group
+           end
+
+           return relation.where("1 = 0") if found_group_ids.empty?
+
            relation.where(
              "posts.user_id IN (
                SELECT gu.user_id FROM group_users gu
-               WHERE gu.group_id = ?
+               WHERE gu.group_id IN (?)
              )",
-             group.id,
+             found_group_ids,
            )
          else
-           relation.where("1 = 0") # No results if group doesn't exist
+           group = Group.find_by("name ILIKE ?", groups_param)
+           if group
+             relation.where(
+               "posts.user_id IN (
+                 SELECT gu.user_id FROM group_users gu
+                 WHERE gu.group_id = ?
+               )",
+               group.id,
+             )
+           else
+             relation.where("1 = 0") # No results if group doesn't exist
+           end
          end
        end
@@ -188,23 +220,36 @@ module DiscourseAi
          relation
        end

+       register_filter(/\Aorder:likes\z/i) do |relation, order_str, filter|
+         filter.set_order!(:likes)
+         relation
+       end
+
        register_filter(/\Atopics?:(.*)\z/i) do |relation, topic_param, filter|
          if topic_param.include?(",")
            topic_ids = topic_param.split(",").map(&:strip).map(&:to_i).reject(&:zero?)
            return relation.where("1 = 0") if topic_ids.empty?
-           filter.always_return_topic_ids!(topic_ids)
-           relation
+           relation.where("posts.topic_id IN (?)", topic_ids)
          else
            topic_id = topic_param.to_i
            if topic_id > 0
-             filter.always_return_topic_ids!([topic_id])
-             relation
+             relation.where("posts.topic_id = ?", topic_id)
            else
              relation.where("1 = 0") # No results if topic_id is invalid
            end
          end
        end

+       register_filter(/\Apost_type:(first|reply)\z/i) do |relation, post_type, _|
+         if post_type.downcase == "first"
+           relation.where("posts.post_number = 1")
+         elsif post_type.downcase == "reply"
+           relation.where("posts.post_number > 1")
+         else
+           relation
+         end
+       end
+
        def initialize(term, guardian: nil, limit: nil, offset: nil)
          @guardian = guardian || Guardian.new
          @limit = limit
@@ -212,9 +257,9 @@ module DiscourseAi
          @filters = []
          @valid = true
          @order = :latest_post
-         @topic_ids = nil
          @invalid_filters = []
          @term = term.to_s.strip
+         @or_groups = []

          process_filters(@term)
        end
@@ -223,42 +268,38 @@ module DiscourseAi
          @order = order
        end

-       def always_return_topic_ids!(topic_ids)
-         if @topic_ids
-           @topic_ids = @topic_ids + topic_ids
-         else
-           @topic_ids = topic_ids
-         end
-       end
-
        def limit_by_user!(limit)
          @limit = limit if limit.to_i < @limit.to_i || @limit.nil?
        end

        def search
-         filtered =
+         base_relation =
            Post
              .secured(@guardian)
              .joins(:topic)
              .merge(Topic.secured(@guardian))
              .where("topics.archetype = 'regular'")
-         original_filtered = filtered

-         @filters.each do |filter_block, match_data|
-           filtered = filter_block.call(filtered, match_data, self)
+         # Handle OR groups
+         if @or_groups.any?
+           or_relations =
+             @or_groups.map do |or_group|
+               group_relation = base_relation
+               or_group.each do |filter_block, match_data|
+                 group_relation = filter_block.call(group_relation, match_data, self)
+               end
+               group_relation
+             end
+
+           # Combine OR groups
+           filtered = or_relations.reduce { |combined, current| combined.or(current) }
+         else
+           filtered = base_relation
          end

-         if @topic_ids.present?
-           if original_filtered == filtered
-             filtered = original_filtered.where("posts.topic_id IN (?)", @topic_ids)
-           else
-             filtered =
-               original_filtered.where(
-                 "posts.topic_id IN (?) OR posts.id IN (?)",
-                 @topic_ids,
-                 filtered.select("posts.id"),
-               )
-           end
+         # Apply regular AND filters
+         @filters.each do |filter_block, match_data|
+           filtered = filter_block.call(filtered, match_data, self)
          end

          filtered = filtered.limit(@limit) if @limit.to_i > 0
@@ -272,17 +313,36 @@ module DiscourseAi
            filtered = filtered.order("topics.created_at DESC, posts.post_number DESC")
          elsif @order == :oldest_topic
            filtered = filtered.order("topics.created_at ASC, posts.post_number ASC")
+         elsif @order == :likes
+           filtered = filtered.order("posts.like_count DESC, posts.created_at DESC")
          end

          filtered
        end

-       private
-
        def process_filters(term)
          return if term.blank?

-         term
+         # Split by OR first, then process each group
+         or_parts = term.split(/\s+OR\s+/i)
+
+         if or_parts.size > 1
+           # Multiple OR groups
+           or_parts.each do |or_part|
+             group_filters = []
+             process_filter_group(or_part.strip, group_filters)
+             @or_groups << group_filters if group_filters.any?
+           end
+         else
+           # Single group (AND logic)
+           process_filter_group(term, @filters)
+         end
+       end
+
+       private
+
+       def process_filter_group(term_part, filter_collection)
+         term_part
            .to_s
            .scan(/(([^" \t\n\x0B\f\r]+)?(("[^"]+")?))/)
            .to_a
@@ -292,7 +352,7 @@ module DiscourseAi
            found = false
            self.class.registered_filters.each do |matcher, block|
              if word =~ matcher
-               @filters << [block, $1]
+               filter_collection << [block, $1]
                found = true
                break
              end
|
@ -176,7 +176,7 @@ RSpec.describe DiscourseAi::AiBot::Playground do
|
|||||||
|
|
||||||
reply_post = nil
|
reply_post = nil
|
||||||
|
|
||||||
DiscourseAi::Completions::Llm.with_prepared_responses(responses) do |_, _, _prompt|
|
DiscourseAi::Completions::Llm.with_prepared_responses(responses) do
|
||||||
new_post = Fabricate(:post, raw: "Can you use the custom tool?")
|
new_post = Fabricate(:post, raw: "Can you use the custom tool?")
|
||||||
reply_post = playground.reply_to(new_post)
|
reply_post = playground.reply_to(new_post)
|
||||||
end
|
end
|
||||||
@@ -255,14 +255,18 @@ RSpec.describe DiscourseAi::AiBot::Playground do
      body = "Hey @#{persona.user.username}, can you help me with this image? #{image}"

      prompts = nil
+     options = nil
      DiscourseAi::Completions::Llm.with_prepared_responses(
        ["I understood image"],
-     ) do |_, _, inner_prompts|
+     ) do |_, _, inner_prompts, inner_options|
+       options = inner_options
        post = create_post(title: "some new topic I created", raw: body)

        prompts = inner_prompts
      end

+     expect(options[0][:feature_name]).to eq("bot")
+
      content = prompts[0].messages[1][:content]

      expect(content).to include({ upload_id: upload.id })
|
@ -8,6 +8,7 @@ describe DiscourseAi::Utils::Research::Filter do
|
|||||||
end
|
end
|
||||||
|
|
||||||
fab!(:user)
|
fab!(:user)
|
||||||
|
fab!(:user2) { Fabricate(:user) }
|
||||||
|
|
||||||
fab!(:feature_tag) { Fabricate(:tag, name: "feature") }
|
fab!(:feature_tag) { Fabricate(:tag, name: "feature") }
|
||||||
fab!(:bug_tag) { Fabricate(:tag, name: "bug") }
|
fab!(:bug_tag) { Fabricate(:tag, name: "bug") }
|
||||||
@@ -15,6 +16,9 @@ describe DiscourseAi::Utils::Research::Filter do
  fab!(:announcement_category) { Fabricate(:category, name: "Announcements") }
  fab!(:feedback_category) { Fabricate(:category, name: "Feedback") }

+ fab!(:group1) { Fabricate(:group, name: "group1") }
+ fab!(:group2) { Fabricate(:group, name: "group2") }
+
  fab!(:feature_topic) do
    Fabricate(
      :topic,
@@ -54,6 +58,32 @@ describe DiscourseAi::Utils::Research::Filter do
  fab!(:feature_bug_post) { Fabricate(:post, topic: feature_bug_topic, user: user) }
  fab!(:no_tag_post) { Fabricate(:post, topic: no_tag_topic, user: user) }

+ describe "group filtering" do
+   before do
+     group1.add(user)
+     group2.add(user2)
+   end
+
+   it "supports filtering by groups" do
+     no_tag_post.update!(user_id: user2.id)
+
+     filter = described_class.new("group:group1")
+     expect(filter.search.pluck(:id)).to contain_exactly(
+       feature_post.id,
+       bug_post.id,
+       feature_bug_post.id,
+     )
+
+     filter = described_class.new("groups:group1,group2")
+     expect(filter.search.pluck(:id)).to contain_exactly(
+       feature_post.id,
+       bug_post.id,
+       feature_bug_post.id,
+       no_tag_post.id,
+     )
+   end
+ end
+
  describe "security filtering" do
    fab!(:secure_group) { Fabricate(:group) }
    fab!(:secure_category) { Fabricate(:category, name: "Secure") }
@@ -122,7 +152,7 @@ describe DiscourseAi::Utils::Research::Filter do
      # it can tack on topics
      filter =
        described_class.new(
-         "category:Announcements topic:#{feature_bug_post.topic.id},#{no_tag_post.topic.id}",
+         "category:Announcements OR topic:#{feature_bug_post.topic.id},#{no_tag_post.topic.id}",
        )
      expect(filter.search.pluck(:id)).to contain_exactly(
        feature_post.id,
@@ -175,6 +205,25 @@ describe DiscourseAi::Utils::Research::Filter do
      Fabricate(:post, raw: "No fruits here", topic: no_tag_topic, user: user)
    end

+   fab!(:reply_on_bananas) do
+     Fabricate(:post, raw: "Just a reply", topic: post_with_bananas.topic, user: user)
+   end
+
+   it "correctly filters posts by topic_keywords" do
+     topic1 = post_with_bananas.topic
+     topic2 = post_with_both.topic
+
+     filter = described_class.new("topic_keywords:banana")
+     expected = topic1.posts.pluck(:id) + topic2.posts.pluck(:id)
+     expect(filter.search.pluck(:id)).to contain_exactly(*expected)
+
+     filter = described_class.new("topic_keywords:banana post_type:first")
+     expect(filter.search.pluck(:id)).to contain_exactly(
+       topic1.posts.order(:post_number).first.id,
+       topic2.posts.order(:post_number).first.id,
+     )
+   end
+
    it "correctly filters posts by full text keywords" do
      filter = described_class.new("keywords:apples")
      expect(filter.search.pluck(:id)).to contain_exactly(post_with_apples.id, post_with_both.id)
|
@ -43,6 +43,69 @@ RSpec.describe AiTool do
|
|||||||
expect(runner.invoke).to eq("query" => "test")
|
expect(runner.invoke).to eq("query" => "test")
|
||||||
end
|
end
|
||||||
|
|
||||||
|
it "can base64 encode binary HTTP responses" do
|
||||||
|
# Create binary data with all possible byte values (0-255)
|
||||||
|
binary_data = (0..255).map(&:chr).join
|
||||||
|
expected_base64 = Base64.strict_encode64(binary_data)
|
||||||
|
|
||||||
|
script = <<~JS
|
||||||
|
function invoke(params) {
|
||||||
|
const result = http.post("https://example.com/binary", {
|
||||||
|
body: "test",
|
||||||
|
base64Encode: true
|
||||||
|
});
|
||||||
|
return result.body;
|
||||||
|
}
|
||||||
|
JS
|
||||||
|
|
||||||
|
tool = create_tool(script: script)
|
||||||
|
runner = tool.runner({}, llm: nil, bot_user: nil)
|
||||||
|
|
||||||
|
stub_request(:post, "https://example.com/binary").to_return(
|
||||||
|
status: 200,
|
||||||
|
body: binary_data,
|
||||||
|
headers: {
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
result = runner.invoke
|
||||||
|
|
||||||
|
expect(result).to eq(expected_base64)
|
||||||
|
# Verify we can decode back to original binary data
|
||||||
|
expect(Base64.strict_decode64(result).bytes).to eq((0..255).to_a)
|
||||||
|
end
|
||||||
|
|
||||||
|
it "can base64 encode binary GET responses" do
|
||||||
|
# Create binary data with all possible byte values (0-255)
|
||||||
|
binary_data = (0..255).map(&:chr).join
|
||||||
|
expected_base64 = Base64.strict_encode64(binary_data)
|
||||||
|
|
||||||
|
script = <<~JS
|
||||||
|
function invoke(params) {
|
||||||
|
const result = http.get("https://example.com/binary", {
|
||||||
|
base64Encode: true
|
||||||
|
});
|
||||||
|
return result.body;
|
||||||
|
}
|
||||||
|
JS
|
||||||
|
|
||||||
|
tool = create_tool(script: script)
|
||||||
|
runner = tool.runner({}, llm: nil, bot_user: nil)
|
||||||
|
|
||||||
|
stub_request(:get, "https://example.com/binary").to_return(
|
||||||
|
status: 200,
|
||||||
|
body: binary_data,
|
||||||
|
headers: {
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
result = runner.invoke
|
||||||
|
|
||||||
|
expect(result).to eq(expected_base64)
|
||||||
|
# Verify we can decode back to original binary data
|
||||||
|
expect(Base64.strict_decode64(result).bytes).to eq((0..255).to_a)
|
||||||
|
end
|
||||||
|
|
||||||
it "can perform HTTP requests with various verbs" do
|
it "can perform HTTP requests with various verbs" do
|
||||||
%i[post put delete patch].each do |verb|
|
%i[post put delete patch].each do |verb|
|
||||||
script = <<~JS
|
script = <<~JS
|
||||||
@@ -676,6 +739,87 @@ RSpec.describe AiTool do
    end
  end

+ it "can use sleep function with limits" do
+   script = <<~JS
+     function invoke(params) {
+       let results = [];
+       for (let i = 0; i < 3; i++) {
+         let result = sleep(1); // 1ms sleep
+         results.push(result);
+       }
+       return results;
+     }
+   JS
+
+   tool = create_tool(script: script)
+   runner = tool.runner({}, llm: nil, bot_user: nil)
+
+   result = runner.invoke
+
+   expect(result).to eq([{ "slept" => 1 }, { "slept" => 1 }, { "slept" => 1 }])
+ end
+
+ let(:jpg) { plugin_file_from_fixtures("1x1.jpg") }
+
+ describe "upload base64 encoding" do
+   it "can get base64 data from upload ID and short URL" do
+     upload = UploadCreator.new(jpg, "1x1.jpg").create_for(Discourse.system_user.id)
+
+     # Test with upload ID
+     script_id = <<~JS
+       function invoke(params) {
+         return upload.getBase64(params.upload_id, params.max_pixels);
+       }
+     JS
+
+     tool = create_tool(script: script_id)
+     runner =
+       tool.runner(
+         { "upload_id" => upload.id, "max_pixels" => 1_000_000 },
+         llm: nil,
+         bot_user: nil,
+       )
+     result_id = runner.invoke
+
+     expect(result_id).to be_present
+     expect(result_id).to be_a(String)
+     expect(result_id.length).to be > 0
+
+     # Test with short URL
+     script_url = <<~JS
+       function invoke(params) {
+         return upload.getBase64(params.short_url, params.max_pixels);
+       }
+     JS
+
+     tool = create_tool(script: script_url)
+     runner =
+       tool.runner(
+         { "short_url" => upload.short_url, "max_pixels" => 1_000_000 },
+         llm: nil,
+         bot_user: nil,
+       )
+     result_url = runner.invoke
+
+     expect(result_url).to be_present
+     expect(result_url).to be_a(String)
+     expect(result_url).to eq(result_id) # Should return same base64 data
+
+     # Test with invalid upload ID
+     script_invalid = <<~JS
+       function invoke(params) {
+         return upload.getBase64(99999);
+       }
+     JS
+
+     tool = create_tool(script: script_invalid)
+     runner = tool.runner({}, llm: nil, bot_user: nil)
+     result_invalid = runner.invoke
+
+     expect(result_invalid).to be_nil
+   end
+ end
+
  describe "upload URL resolution" do
    it "can resolve upload short URLs to public URLs" do
      upload =