DEV: improve tool infra, improve forum researcher prompts, improve logging (#1391)

- Add sleep function for tool polling with rate limits
- Support base64 encoding for HTTP requests and uploads
- Enhance forum researcher with cost warnings and comprehensive planning
- Add cancellation support for research operations
- Include feature_name parameter for bot analytics
- Add richer research support (OR queries)
Sam 2025-06-03 15:17:55 +10:00 committed by GitHub
parent 4c0660d6fd
commit 4dffd0b2c5
8 changed files with 453 additions and 95 deletions

View File

@ -15,7 +15,7 @@ module ::Jobs
bot = DiscourseAi::Personas::Bot.as(bot_user, persona: persona.new) bot = DiscourseAi::Personas::Bot.as(bot_user, persona: persona.new)
DiscourseAi::AiBot::Playground.new(bot).reply_to(post) DiscourseAi::AiBot::Playground.new(bot).reply_to(post, feature_name: "bot")
rescue DiscourseAi::Personas::Bot::BOT_NOT_FOUND rescue DiscourseAi::Personas::Bot::BOT_NOT_FOUND
Rails.logger.warn( Rails.logger.warn(
"Bot not found for post #{post.id} - perhaps persona was deleted or bot was disabled", "Bot not found for post #{post.id} - perhaps persona was deleted or bot was disabled",

View File

@ -13,43 +13,45 @@ module DiscourseAi
def system_prompt def system_prompt
<<~PROMPT <<~PROMPT
You are a helpful Discourse assistant specializing in forum research. You are a helpful Discourse assistant specializing in forum research.
You _understand_ and **generate** Discourse Markdown. You _understand_ and **generate** Discourse Markdown.
You live in the forum with the URL: {site_url} You live in the forum with the URL: {site_url}
The title of your site: {site_title} The title of your site: {site_title}
The description is: {site_description} The description is: {site_description}
The participants in this conversation are: {participants} The participants in this conversation are: {participants}
The date now is: {time}, much has changed since you were trained. The date now is: {time}, much has changed since you were trained.
Topic URLs are formatted as: /t/-/TOPIC_ID Topic URLs are formatted as: /t/-/TOPIC_ID
Post URLs are formatted as: /t/-/TOPIC_ID/POST_NUMBER Post URLs are formatted as: /t/-/TOPIC_ID/POST_NUMBER
As a forum researcher, guide users through a structured research process: CRITICAL: Research is extremely expensive. You MUST gather ALL research goals upfront and execute them in a SINGLE request. Never run multiple research operations.
1. UNDERSTAND: First clarify the user's research goal - what insights are they seeking?
2. PLAN: Design an appropriate research approach with specific filters
3. TEST: Always begin with dry_run:true to gauge the scope of results
4. REFINE: If results are too broad/narrow, suggest filter adjustments
5. EXECUTE: Run the final analysis only when filters are well-tuned
6. SUMMARIZE: Present findings with links to supporting evidence
BE MINDFUL: specify all research goals in one request to avoid multiple processing runs. As a forum researcher, follow this structured process:
1. UNDERSTAND: Clarify ALL research goals - what insights are they seeking?
2. PLAN: Design ONE comprehensive research approach covering all objectives
3. TEST: Always begin with dry_run:true to gauge the scope of results
4. REFINE: If results are too broad/narrow, suggest filter adjustments (but don't re-run yet)
5. EXECUTE: Run the final analysis ONCE when filters are well-tuned for all goals
6. SUMMARIZE: Present findings with links to supporting evidence
REMEMBER: Different filters serve different purposes: Before any research, ask users to specify:
- Use post date filters (after/before) for analyzing specific posts - ALL research questions they want answered
- Use topic date filters (topic_after/topic_before) for analyzing entire topics - Time periods of interest
- Combine user/group filters with categories/tags to find specialized contributions - Specific users, categories, or tags to focus on
- Expected scope (broad overview vs. deep dive)
Always ground your analysis with links to original posts on the forum. Research filter guidelines:
- Use post date filters (after/before) for analyzing specific posts
- Use topic date filters (topic_after/topic_before) for analyzing entire topics
- Combine user/group filters with categories/tags to find specialized contributions
Research workflow best practices: When formatting results:
1. Start with a dry_run to gauge the scope (set dry_run:true) - Link to topics with descriptive text when relevant
2. For temporal analysis, specify explicit date ranges - Use markdown footnotes for supporting evidence
3. For user behavior analysis, combine @username with categories or tags - Always ground analysis with links to original forum posts
- When formatting research results, format backing links clearly: Remember: ONE research request should answer ALL questions. Plan comprehensively before executing.
- When it is a good fit, link to the topic with descriptive text. PROMPT
- When it is a good fit, link using markdown footnotes.
PROMPT
end end
end end
end end

View File

@ -13,6 +13,9 @@ module DiscourseAi
MARSHAL_STACK_DEPTH = 20 MARSHAL_STACK_DEPTH = 20
MAX_HTTP_REQUESTS = 20 MAX_HTTP_REQUESTS = 20
MAX_SLEEP_CALLS = 30
MAX_SLEEP_DURATION_MS = 60_000
def initialize(parameters:, llm:, bot_user:, context: nil, tool:, timeout: nil) def initialize(parameters:, llm:, bot_user:, context: nil, tool:, timeout: nil)
if context && !context.is_a?(DiscourseAi::Personas::BotContext) if context && !context.is_a?(DiscourseAi::Personas::BotContext)
raise ArgumentError, "context must be a BotContext object" raise ArgumentError, "context must be a BotContext object"
@ -28,6 +31,7 @@ module DiscourseAi
@timeout = timeout || DEFAULT_TIMEOUT @timeout = timeout || DEFAULT_TIMEOUT
@running_attached_function = false @running_attached_function = false
@sleep_calls_made = 0
@http_requests_made = 0 @http_requests_made = 0
end end
@ -44,6 +48,7 @@ module DiscourseAi
attach_index(ctx) attach_index(ctx)
attach_upload(ctx) attach_upload(ctx)
attach_chain(ctx) attach_chain(ctx)
attach_sleep(ctx)
attach_discourse(ctx) attach_discourse(ctx)
ctx.eval(framework_script) ctx.eval(framework_script)
ctx ctx
@ -73,6 +78,9 @@ module DiscourseAi
const upload = { const upload = {
create: _upload_create, create: _upload_create,
getUrl: _upload_get_url, getUrl: _upload_get_url,
getBase64: function(id, maxPixels) {
return _upload_get_base64(id, maxPixels);
}
} }
const chain = { const chain = {
@ -310,6 +318,33 @@ module DiscourseAi
mini_racer_context.attach("_chain_set_custom_raw", ->(raw) { self.custom_raw = raw }) mini_racer_context.attach("_chain_set_custom_raw", ->(raw) { self.custom_raw = raw })
end end
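# Lets a tool script pause between requests (useful when polling APIs); limited to MAX_SLEEP_CALLS calls of at most MAX_SLEEP_DURATION_MS each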
def attach_sleep(mini_racer_context)
mini_racer_context.attach(
"sleep",
->(duration_ms) do
@sleep_calls_made += 1
if @sleep_calls_made > MAX_SLEEP_CALLS
raise TooManyRequestsError.new("Tool made too many sleep calls")
end
duration_ms = duration_ms.to_i
if duration_ms > MAX_SLEEP_DURATION_MS
raise ArgumentError.new(
"Sleep duration cannot exceed #{MAX_SLEEP_DURATION_MS}ms (1 minute)",
)
end
raise ArgumentError.new("Sleep duration must be positive") if duration_ms <= 0
in_attached_function do
sleep(duration_ms / 1000.0)
{ slept: duration_ms }
end
end,
)
end
def attach_discourse(mini_racer_context) def attach_discourse(mini_racer_context)
mini_racer_context.attach( mini_racer_context.attach(
"_discourse_get_post", "_discourse_get_post",
@ -571,6 +606,42 @@ module DiscourseAi
end end
def attach_upload(mini_racer_context) def attach_upload(mini_racer_context)
mini_racer_context.attach(
"_upload_get_base64",
->(upload_id_or_url, max_pixels) do
in_attached_function do
return nil if upload_id_or_url.blank?
upload = nil
# Handle both upload ID and short URL
if upload_id_or_url.to_s.start_with?("upload://")
# Handle short URL format
sha1 = Upload.sha1_from_short_url(upload_id_or_url)
return nil if sha1.blank?
upload = Upload.find_by(sha1: sha1)
else
# Handle numeric ID
upload_id = upload_id_or_url.to_i
return nil if upload_id <= 0
upload = Upload.find_by(id: upload_id)
end
return nil if upload.nil?
max_pixels = max_pixels&.to_i
max_pixels = nil if max_pixels && max_pixels <= 0
encoded_uploads =
DiscourseAi::Completions::UploadEncoder.encode(
upload_ids: [upload.id],
max_pixels: max_pixels || 10_000_000, # Default to 10M pixels if not specified
)
encoded_uploads.first&.dig(:base64)
end
end,
)
mini_racer_context.attach( mini_racer_context.attach(
"_upload_get_url", "_upload_get_url",
->(short_url) do ->(short_url) do
@ -629,13 +700,18 @@ module DiscourseAi
in_attached_function do in_attached_function do
headers = (options && options["headers"]) || {} headers = (options && options["headers"]) || {}
base64_encode = options && options["base64Encode"]
result = {} result = {}
DiscourseAi::Personas::Tools::Tool.send_http_request( DiscourseAi::Personas::Tools::Tool.send_http_request(
url, url,
headers: headers, headers: headers,
) do |response| ) do |response|
result[:body] = response.body if base64_encode
result[:body] = Base64.strict_encode64(response.body)
else
result[:body] = response.body
end
result[:status] = response.code.to_i result[:status] = response.code.to_i
end end
@ -658,6 +734,7 @@ module DiscourseAi
in_attached_function do in_attached_function do
headers = (options && options["headers"]) || {} headers = (options && options["headers"]) || {}
body = options && options["body"] body = options && options["body"]
base64_encode = options && options["base64Encode"]
result = {} result = {}
DiscourseAi::Personas::Tools::Tool.send_http_request( DiscourseAi::Personas::Tools::Tool.send_http_request(
@ -666,7 +743,11 @@ module DiscourseAi
headers: headers, headers: headers,
body: body, body: body,
) do |response| ) do |response|
result[:body] = response.body if base64_encode
result[:body] = Base64.strict_encode64(response.body)
else
result[:body] = response.body
end
result[:status] = response.code.to_i result[:status] = response.code.to_i
end end

View File

@ -33,19 +33,24 @@ module DiscourseAi
<<~TEXT <<~TEXT
Filter string to target specific content. Filter string to target specific content.
- Supports user (@username) - Supports user (@username)
- post_type:first - only includes first posts in topics
- post_type:reply - only replies in topics
- date ranges (after:YYYY-MM-DD, before:YYYY-MM-DD for posts; topic_after:YYYY-MM-DD, topic_before:YYYY-MM-DD for topics) - date ranges (after:YYYY-MM-DD, before:YYYY-MM-DD for posts; topic_after:YYYY-MM-DD, topic_before:YYYY-MM-DD for topics)
- categories (category:category1,category2) - categories (category:category1,category2 or categories:category1,category2)
- tags (tag:tag1,tag2) - tags (tag:tag1,tag2 or tags:tag1,tag2)
- groups (group:group1,group2). - groups (group:group1,group2 or groups:group1,group2)
- status (status:open, status:closed, status:archived, status:noreplies, status:single_user) - status (status:open, status:closed, status:archived, status:noreplies, status:single_user)
- keywords (keywords:keyword1,keyword2) - specific words to search for in posts - keywords (keywords:keyword1,keyword2) - searches for specific words within post content using full-text search
- max_results (max_results:10) the maximum number of results to return (optional) - topic_keywords (topic_keywords:keyword1,keyword2) - searches for keywords within topics, returns all posts from matching topics
- order (order:latest, order:oldest, order:latest_topic, order:oldest_topic) - the order of the results (optional) - topics (topic:topic_id1,topic_id2 or topics:topic_id1,topic_id2) - target specific topics by ID
- topic (topic:topic_id1,topic_id2) - add specific topics to the filter, topics will unconditionally be included - max_results (max_results:10) - limits the maximum number of results returned (optional)
- order (order:latest, order:oldest, order:latest_topic, order:oldest_topic, order:likes) - controls result ordering (optional, defaults to latest posts)
If multiple tags or categories are specified, they are treated as OR conditions. Multiple filters can be combined with spaces for AND logic. Example: '@sam after:2023-01-01 tag:feature'
Multiple filters can be combined with spaces. Example: '@sam after:2023-01-01 tag:feature' Use OR to combine filter segments for inclusive logic.
Example: 'category:feature,bug OR tag:feature-tag' - includes posts in feature OR bug categories, OR posts with feature-tag tag
Example: '@sam category:bug' - includes posts by @sam AND in bug category
TEXT TEXT
end end
@ -145,10 +150,23 @@ module DiscourseAi
results = [] results = []
formatter.each_chunk { |chunk| results << run_inference(chunk[:text], goals, post, &blk) } formatter.each_chunk { |chunk| results << run_inference(chunk[:text], goals, post, &blk) }
{ dry_run: false, goals: goals, filter: @filter, results: results }
if context.cancel_manager&.cancelled?
{
dry_run: false,
goals: goals,
filter: @filter,
results: "Cancelled by user",
cancelled_by_user: true,
}
else
{ dry_run: false, goals: goals, filter: @filter, results: results }
end
end end
def run_inference(chunk_text, goals, post, &blk) def run_inference(chunk_text, goals, post, &blk)
return if context.cancel_manager&.cancelled?
system_prompt = goal_system_prompt(goals) system_prompt = goal_system_prompt(goals)
user_prompt = goal_user_prompt(goals, chunk_text) user_prompt = goal_user_prompt(goals, chunk_text)
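Review note: the cancellation change in this file follows a cooperative pattern. Each chunk checks the flag before doing any work, and the final result is swapped for a "cancelled" payload if the flag was set at any point. A minimal sketch of that pattern, assuming only that the cancel object responds to `cancelled?`; `CancelFlag` and `run_chunks` are hypothetical names, not part of the plugin.

```ruby
# Hypothetical cancel flag; the real cancel_manager is only assumed
# to respond to #cancelled?.
class CancelFlag
  def initialize
    @cancelled = false
  end

  def cancel!
    @cancelled = true
  end

  def cancelled?
    @cancelled
  end
end

# Processes chunks in order, skipping any chunk that starts after cancellation.
def run_chunks(chunks, cancel_flag)
  results =
    chunks.map do |chunk|
      next if cancel_flag&.cancelled? # cheap check before expensive work
      yield chunk
    end

  if cancel_flag&.cancelled?
    # Partial results are discarded, mirroring the hash returned in the diff.
    { results: "Cancelled by user", cancelled_by_user: true }
  else
    { results: results }
  end
end
```

A chunk that is already running when the flag is set still completes; only chunks that have not started yet are skipped.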

View File

@ -4,7 +4,6 @@ module DiscourseAi
module Utils module Utils
module Research module Research
class Filter class Filter
# Stores custom filter handlers
def self.register_filter(matcher, &block) def self.register_filter(matcher, &block)
(@registered_filters ||= {})[matcher] = block (@registered_filters ||= {})[matcher] = block
end end
@ -19,7 +18,6 @@ module DiscourseAi
attr_reader :term, :filters, :order, :guardian, :limit, :offset, :invalid_filters attr_reader :term, :filters, :order, :guardian, :limit, :offset, :invalid_filters
# Define all filters at class level
register_filter(/\Astatus:open\z/i) do |relation, _, _| register_filter(/\Astatus:open\z/i) do |relation, _, _|
relation.where("topics.closed = false AND topics.archived = false") relation.where("topics.closed = false AND topics.archived = false")
end end
@ -109,6 +107,30 @@ module DiscourseAi
end end
end end
register_filter(/\Atopic_keywords?:(.*)\z/i) do |relation, keywords_param, _|
if keywords_param.blank?
relation
else
keywords = keywords_param.split(",").map(&:strip).reject(&:blank?)
if keywords.empty?
relation
else
ts_query = keywords.map { |kw| kw.gsub(/['\\]/, " ") }.join(" | ")
relation.where(
"posts.topic_id IN (
SELECT posts2.topic_id
FROM posts posts2
JOIN post_search_data ON post_search_data.post_id = posts2.id
WHERE post_search_data.search_data @@ to_tsquery(?, ?)
)",
::Search.ts_config,
ts_query,
)
end
end
end
register_filter(/\A(?:categories?|category):(.*)\z/i) do |relation, category_param, _| register_filter(/\A(?:categories?|category):(.*)\z/i) do |relation, category_param, _|
if category_param.include?(",") if category_param.include?(",")
category_names = category_param.split(",").map(&:strip) category_names = category_param.split(",").map(&:strip)
@ -140,26 +162,36 @@ module DiscourseAi
end end
end end
register_filter(/\Ain:posted\z/i) do |relation, _, filter| register_filter(/\Agroups?:([a-zA-Z0-9_\-,]+)\z/i) do |relation, groups_param, filter|
if filter.guardian.user if groups_param.include?(",")
relation.where("posts.user_id = ?", filter.guardian.user.id) group_names = groups_param.split(",").map(&:strip)
else found_group_ids = []
relation.where("1 = 0") # No results if not logged in group_names.each do |name|
end group = Group.find_by("name ILIKE ?", name)
end found_group_ids << group.id if group
end
register_filter(/\Agroup:([a-zA-Z0-9_\-]+)\z/i) do |relation, name, filter| next relation.where("1 = 0") if found_group_ids.empty?
group = Group.find_by("name ILIKE ?", name)
if group
relation.where( relation.where(
"posts.user_id IN ( "posts.user_id IN (
SELECT gu.user_id FROM group_users gu SELECT gu.user_id FROM group_users gu
WHERE gu.group_id = ? WHERE gu.group_id IN (?)
)", )",
group.id, found_group_ids,
) )
else else
relation.where("1 = 0") # No results if group doesn't exist group = Group.find_by("name ILIKE ?", groups_param)
if group
relation.where(
"posts.user_id IN (
SELECT gu.user_id FROM group_users gu
WHERE gu.group_id = ?
)",
group.id,
)
else
relation.where("1 = 0") # No results if group doesn't exist
end
end end
end end
@ -188,23 +220,36 @@ module DiscourseAi
relation relation
end end
register_filter(/\Aorder:likes\z/i) do |relation, order_str, filter|
filter.set_order!(:likes)
relation
end
register_filter(/\Atopics?:(.*)\z/i) do |relation, topic_param, filter| register_filter(/\Atopics?:(.*)\z/i) do |relation, topic_param, filter|
if topic_param.include?(",") if topic_param.include?(",")
topic_ids = topic_param.split(",").map(&:strip).map(&:to_i).reject(&:zero?) topic_ids = topic_param.split(",").map(&:strip).map(&:to_i).reject(&:zero?)
next relation.where("1 = 0") if topic_ids.empty? next relation.where("1 = 0") if topic_ids.empty?
filter.always_return_topic_ids!(topic_ids) relation.where("posts.topic_id IN (?)", topic_ids)
relation
else else
topic_id = topic_param.to_i topic_id = topic_param.to_i
if topic_id > 0 if topic_id > 0
filter.always_return_topic_ids!([topic_id]) relation.where("posts.topic_id = ?", topic_id)
relation
else else
relation.where("1 = 0") # No results if topic_id is invalid relation.where("1 = 0") # No results if topic_id is invalid
end end
end end
end end
register_filter(/\Apost_type:(first|reply)\z/i) do |relation, post_type, _|
if post_type.downcase == "first"
relation.where("posts.post_number = 1")
elsif post_type.downcase == "reply"
relation.where("posts.post_number > 1")
else
relation
end
end
def initialize(term, guardian: nil, limit: nil, offset: nil) def initialize(term, guardian: nil, limit: nil, offset: nil)
@guardian = guardian || Guardian.new @guardian = guardian || Guardian.new
@limit = limit @limit = limit
@ -212,9 +257,9 @@ module DiscourseAi
@filters = [] @filters = []
@valid = true @valid = true
@order = :latest_post @order = :latest_post
@topic_ids = nil
@invalid_filters = [] @invalid_filters = []
@term = term.to_s.strip @term = term.to_s.strip
@or_groups = []
process_filters(@term) process_filters(@term)
end end
@ -223,42 +268,38 @@ module DiscourseAi
@order = order @order = order
end end
def always_return_topic_ids!(topic_ids)
if @topic_ids
@topic_ids = @topic_ids + topic_ids
else
@topic_ids = topic_ids
end
end
def limit_by_user!(limit) def limit_by_user!(limit)
@limit = limit if limit.to_i < @limit.to_i || @limit.nil? @limit = limit if limit.to_i < @limit.to_i || @limit.nil?
end end
def search def search
filtered = base_relation =
Post Post
.secured(@guardian) .secured(@guardian)
.joins(:topic) .joins(:topic)
.merge(Topic.secured(@guardian)) .merge(Topic.secured(@guardian))
.where("topics.archetype = 'regular'") .where("topics.archetype = 'regular'")
original_filtered = filtered
@filters.each do |filter_block, match_data| # Handle OR groups
filtered = filter_block.call(filtered, match_data, self) if @or_groups.any?
or_relations =
@or_groups.map do |or_group|
group_relation = base_relation
or_group.each do |filter_block, match_data|
group_relation = filter_block.call(group_relation, match_data, self)
end
group_relation
end
# Combine OR groups
filtered = or_relations.reduce { |combined, current| combined.or(current) }
else
filtered = base_relation
end end
if @topic_ids.present? # Apply regular AND filters
if original_filtered == filtered @filters.each do |filter_block, match_data|
filtered = original_filtered.where("posts.topic_id IN (?)", @topic_ids) filtered = filter_block.call(filtered, match_data, self)
else
filtered =
original_filtered.where(
"posts.topic_id IN (?) OR posts.id IN (?)",
@topic_ids,
filtered.select("posts.id"),
)
end
end end
filtered = filtered.limit(@limit) if @limit.to_i > 0 filtered = filtered.limit(@limit) if @limit.to_i > 0
@ -272,17 +313,36 @@ module DiscourseAi
filtered = filtered.order("topics.created_at DESC, posts.post_number DESC") filtered = filtered.order("topics.created_at DESC, posts.post_number DESC")
elsif @order == :oldest_topic elsif @order == :oldest_topic
filtered = filtered.order("topics.created_at ASC, posts.post_number ASC") filtered = filtered.order("topics.created_at ASC, posts.post_number ASC")
elsif @order == :likes
filtered = filtered.order("posts.like_count DESC, posts.created_at DESC")
end end
filtered filtered
end end
private
def process_filters(term) def process_filters(term)
return if term.blank? return if term.blank?
term # Split by OR first, then process each group
or_parts = term.split(/\s+OR\s+/i)
if or_parts.size > 1
# Multiple OR groups
or_parts.each do |or_part|
group_filters = []
process_filter_group(or_part.strip, group_filters)
@or_groups << group_filters if group_filters.any?
end
else
# Single group (AND logic)
process_filter_group(term, @filters)
end
end
private
def process_filter_group(term_part, filter_collection)
term_part
.to_s .to_s
.scan(/(([^" \t\n\x0B\f\r]+)?(("[^"]+")?))/) .scan(/(([^" \t\n\x0B\f\r]+)?(("[^"]+")?))/)
.to_a .to_a
@ -292,7 +352,7 @@ module DiscourseAi
found = false found = false
self.class.registered_filters.each do |matcher, block| self.class.registered_filters.each do |matcher, block|
if word =~ matcher if word =~ matcher
@filters << [block, $1] filter_collection << [block, $1]
found = true found = true
break break
end end
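Review note: the OR support in this file splits the term on `OR` first, then treats the space-separated tokens inside each segment as an AND group. The sketch below shows the same semantics over in-memory data rather than ActiveRecord relations. `SamplePost`, `predicate_for`, `parse_filter` and `search` are hypothetical helpers, and only three token types are handled; the split regex is the one used in the diff.

```ruby
# Hypothetical in-memory record standing in for a forum post.
SamplePost = Struct.new(:id, :user, :category, :tags, keyword_init: true)

# Turns one token such as "@sam" or "tag:bug,feature" into a predicate.
def predicate_for(token)
  case token
  when /\A@(\w+)\z/
    name = $1
    ->(post) { post.user == name }
  when /\Acategor(?:y|ies):(.+)\z/i
    names = $1.split(",").map(&:strip)
    ->(post) { names.include?(post.category) } # comma list is OR within the token
  when /\Atags?:(.+)\z/i
    names = $1.split(",").map(&:strip)
    ->(post) { (post.tags & names).any? }
  else
    raise ArgumentError, "unsupported filter: #{token}"
  end
end

# Splits on OR first, then builds one list of predicates per segment.
def parse_filter(term)
  term.split(/\s+OR\s+/i).map { |segment| segment.split.map { |token| predicate_for(token) } }
end

# A post matches if every predicate of at least one OR group accepts it.
def search(posts, term)
  groups = parse_filter(term)
  return posts if groups.empty?
  posts.select { |post| groups.any? { |group| group.all? { |pred| pred.call(post) } } }
end
```

Because the split is case-insensitive, a lowercase `or` surrounded by spaces is also treated as a separator.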

View File

@ -176,7 +176,7 @@ RSpec.describe DiscourseAi::AiBot::Playground do
reply_post = nil reply_post = nil
DiscourseAi::Completions::Llm.with_prepared_responses(responses) do |_, _, _prompt| DiscourseAi::Completions::Llm.with_prepared_responses(responses) do
new_post = Fabricate(:post, raw: "Can you use the custom tool?") new_post = Fabricate(:post, raw: "Can you use the custom tool?")
reply_post = playground.reply_to(new_post) reply_post = playground.reply_to(new_post)
end end
@ -255,14 +255,18 @@ RSpec.describe DiscourseAi::AiBot::Playground do
body = "Hey @#{persona.user.username}, can you help me with this image? #{image}" body = "Hey @#{persona.user.username}, can you help me with this image? #{image}"
prompts = nil prompts = nil
options = nil
DiscourseAi::Completions::Llm.with_prepared_responses( DiscourseAi::Completions::Llm.with_prepared_responses(
["I understood image"], ["I understood image"],
) do |_, _, inner_prompts| ) do |_, _, inner_prompts, inner_options|
options = inner_options
post = create_post(title: "some new topic I created", raw: body) post = create_post(title: "some new topic I created", raw: body)
prompts = inner_prompts prompts = inner_prompts
end end
expect(options[0][:feature_name]).to eq("bot")
content = prompts[0].messages[1][:content] content = prompts[0].messages[1][:content]
expect(content).to include({ upload_id: upload.id }) expect(content).to include({ upload_id: upload.id })

View File

@ -8,6 +8,7 @@ describe DiscourseAi::Utils::Research::Filter do
end end
fab!(:user) fab!(:user)
fab!(:user2) { Fabricate(:user) }
fab!(:feature_tag) { Fabricate(:tag, name: "feature") } fab!(:feature_tag) { Fabricate(:tag, name: "feature") }
fab!(:bug_tag) { Fabricate(:tag, name: "bug") } fab!(:bug_tag) { Fabricate(:tag, name: "bug") }
@ -15,6 +16,9 @@ describe DiscourseAi::Utils::Research::Filter do
fab!(:announcement_category) { Fabricate(:category, name: "Announcements") } fab!(:announcement_category) { Fabricate(:category, name: "Announcements") }
fab!(:feedback_category) { Fabricate(:category, name: "Feedback") } fab!(:feedback_category) { Fabricate(:category, name: "Feedback") }
fab!(:group1) { Fabricate(:group, name: "group1") }
fab!(:group2) { Fabricate(:group, name: "group2") }
fab!(:feature_topic) do fab!(:feature_topic) do
Fabricate( Fabricate(
:topic, :topic,
@ -54,6 +58,32 @@ describe DiscourseAi::Utils::Research::Filter do
fab!(:feature_bug_post) { Fabricate(:post, topic: feature_bug_topic, user: user) } fab!(:feature_bug_post) { Fabricate(:post, topic: feature_bug_topic, user: user) }
fab!(:no_tag_post) { Fabricate(:post, topic: no_tag_topic, user: user) } fab!(:no_tag_post) { Fabricate(:post, topic: no_tag_topic, user: user) }
describe "group filtering" do
before do
group1.add(user)
group2.add(user2)
end
it "supports filtering by groups" do
no_tag_post.update!(user_id: user2.id)
filter = described_class.new("group:group1")
expect(filter.search.pluck(:id)).to contain_exactly(
feature_post.id,
bug_post.id,
feature_bug_post.id,
)
filter = described_class.new("groups:group1,group2")
expect(filter.search.pluck(:id)).to contain_exactly(
feature_post.id,
bug_post.id,
feature_bug_post.id,
no_tag_post.id,
)
end
end
describe "security filtering" do describe "security filtering" do
fab!(:secure_group) { Fabricate(:group) } fab!(:secure_group) { Fabricate(:group) }
fab!(:secure_category) { Fabricate(:category, name: "Secure") } fab!(:secure_category) { Fabricate(:category, name: "Secure") }
@ -122,7 +152,7 @@ describe DiscourseAi::Utils::Research::Filter do
# it can tack on topics # it can tack on topics
filter = filter =
described_class.new( described_class.new(
"category:Announcements topic:#{feature_bug_post.topic.id},#{no_tag_post.topic.id}", "category:Announcements OR topic:#{feature_bug_post.topic.id},#{no_tag_post.topic.id}",
) )
expect(filter.search.pluck(:id)).to contain_exactly( expect(filter.search.pluck(:id)).to contain_exactly(
feature_post.id, feature_post.id,
@ -175,6 +205,25 @@ describe DiscourseAi::Utils::Research::Filter do
Fabricate(:post, raw: "No fruits here", topic: no_tag_topic, user: user) Fabricate(:post, raw: "No fruits here", topic: no_tag_topic, user: user)
end end
fab!(:reply_on_bananas) do
Fabricate(:post, raw: "Just a reply", topic: post_with_bananas.topic, user: user)
end
it "correctly filters posts by topic_keywords" do
topic1 = post_with_bananas.topic
topic2 = post_with_both.topic
filter = described_class.new("topic_keywords:banana")
expected = topic1.posts.pluck(:id) + topic2.posts.pluck(:id)
expect(filter.search.pluck(:id)).to contain_exactly(*expected)
filter = described_class.new("topic_keywords:banana post_type:first")
expect(filter.search.pluck(:id)).to contain_exactly(
topic1.posts.order(:post_number).first.id,
topic2.posts.order(:post_number).first.id,
)
end
it "correctly filters posts by full text keywords" do it "correctly filters posts by full text keywords" do
filter = described_class.new("keywords:apples") filter = described_class.new("keywords:apples")
expect(filter.search.pluck(:id)).to contain_exactly(post_with_apples.id, post_with_both.id) expect(filter.search.pluck(:id)).to contain_exactly(post_with_apples.id, post_with_both.id)
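Review note: the `topic_keywords` filter exercised by these specs builds its full-text query by joining the comma-separated keywords with `|`. The sketch below isolates that string-building step so its edge cases can be checked without a database. `build_ts_query` is a hypothetical helper; the split, strip and `gsub` calls mirror the filter code in the diff.

```ruby
# Hypothetical helper: turns "banana, apple" into a tsquery-style OR expression.
def build_ts_query(keywords_param)
  keywords = keywords_param.to_s.split(",").map(&:strip).reject(&:empty?)
  return nil if keywords.empty? # nothing to search for

  # Quotes and backslashes are replaced with spaces, as in the diff.
  keywords.map { |kw| kw.gsub(/['\\]/, " ") }.join(" | ")
end
```

One thing that may be worth checking: a keyword that contains a space, or gains one after sanitization, produces adjacent words with no operator between them, which PostgreSQL's `to_tsquery` is expected to reject as a syntax error.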

View File

@ -43,6 +43,69 @@ RSpec.describe AiTool do
expect(runner.invoke).to eq("query" => "test") expect(runner.invoke).to eq("query" => "test")
end end
it "can base64 encode binary HTTP responses" do
# Create binary data with all possible byte values (0-255)
binary_data = (0..255).map(&:chr).join
expected_base64 = Base64.strict_encode64(binary_data)
script = <<~JS
function invoke(params) {
const result = http.post("https://example.com/binary", {
body: "test",
base64Encode: true
});
return result.body;
}
JS
tool = create_tool(script: script)
runner = tool.runner({}, llm: nil, bot_user: nil)
stub_request(:post, "https://example.com/binary").to_return(
status: 200,
body: binary_data,
headers: {
},
)
result = runner.invoke
expect(result).to eq(expected_base64)
# Verify we can decode back to original binary data
expect(Base64.strict_decode64(result).bytes).to eq((0..255).to_a)
end
it "can base64 encode binary GET responses" do
# Create binary data with all possible byte values (0-255)
binary_data = (0..255).map(&:chr).join
expected_base64 = Base64.strict_encode64(binary_data)
script = <<~JS
function invoke(params) {
const result = http.get("https://example.com/binary", {
base64Encode: true
});
return result.body;
}
JS
tool = create_tool(script: script)
runner = tool.runner({}, llm: nil, bot_user: nil)
stub_request(:get, "https://example.com/binary").to_return(
status: 200,
body: binary_data,
headers: {
},
)
result = runner.invoke
expect(result).to eq(expected_base64)
# Verify we can decode back to original binary data
expect(Base64.strict_decode64(result).bytes).to eq((0..255).to_a)
end
it "can perform HTTP requests with various verbs" do it "can perform HTTP requests with various verbs" do
%i[post put delete patch].each do |verb| %i[post put delete patch].each do |verb|
script = <<~JS script = <<~JS
@ -676,6 +739,87 @@ RSpec.describe AiTool do
end end
end end
it "can use sleep function with limits" do
script = <<~JS
function invoke(params) {
let results = [];
for (let i = 0; i < 3; i++) {
let result = sleep(1); // 1ms sleep
results.push(result);
}
return results;
}
JS
tool = create_tool(script: script)
runner = tool.runner({}, llm: nil, bot_user: nil)
result = runner.invoke
expect(result).to eq([{ "slept" => 1 }, { "slept" => 1 }, { "slept" => 1 }])
end
let(:jpg) { plugin_file_from_fixtures("1x1.jpg") }
describe "upload base64 encoding" do
it "can get base64 data from upload ID and short URL" do
upload = UploadCreator.new(jpg, "1x1.jpg").create_for(Discourse.system_user.id)
# Test with upload ID
script_id = <<~JS
function invoke(params) {
return upload.getBase64(params.upload_id, params.max_pixels);
}
JS
tool = create_tool(script: script_id)
runner =
tool.runner(
{ "upload_id" => upload.id, "max_pixels" => 1_000_000 },
llm: nil,
bot_user: nil,
)
result_id = runner.invoke
expect(result_id).to be_present
expect(result_id).to be_a(String)
expect(result_id.length).to be > 0
# Test with short URL
script_url = <<~JS
function invoke(params) {
return upload.getBase64(params.short_url, params.max_pixels);
}
JS
tool = create_tool(script: script_url)
runner =
tool.runner(
{ "short_url" => upload.short_url, "max_pixels" => 1_000_000 },
llm: nil,
bot_user: nil,
)
result_url = runner.invoke
expect(result_url).to be_present
expect(result_url).to be_a(String)
expect(result_url).to eq(result_id) # Should return same base64 data
# Test with invalid upload ID
script_invalid = <<~JS
function invoke(params) {
return upload.getBase64(99999);
}
JS
tool = create_tool(script: script_invalid)
runner = tool.runner({}, llm: nil, bot_user: nil)
result_invalid = runner.invoke
expect(result_invalid).to be_nil
end
end
describe "upload URL resolution" do describe "upload URL resolution" do
it "can resolve upload short URLs to public URLs" do it "can resolve upload short URLs to public URLs" do
upload = upload =