mirror of
https://github.com/discourse/discourse-ai.git
synced 2025-02-19 09:54:57 +00:00
This pull request makes several improvements and additions to the GitHub-related tools and personas in the `discourse-ai` repository: 1. It adds the `WebBrowser` tool to the `Researcher` persona, allowing the AI to visit web pages, retrieve HTML content, extract the main content, and convert it to plain text. 2. It updates the `GithubFileContent`, `GithubPullRequestDiff`, and `GithubSearchCode` tools to handle HTTP responses more robustly (introducing size limits). 3. It refactors the `send_http_request` method in the `Tool` class to follow redirects when specified, and to read the response body in chunks to avoid memory issues with large responses. (only for WebBrowser) 4. It updates the system prompt for the `Researcher` persona to provide more detailed guidance on when to use Google search vs web browsing, and how to optimize tool usage and reduce redundant requests. 5. It adds a new `web_browser_spec.rb` file with tests for the `WebBrowser` tool, covering various scenarios like handling different HTML structures and following redirects.
53 lines
2.4 KiB
Ruby
53 lines
2.4 KiB
Ruby
#frozen_string_literal: true
|
|
|
|
module DiscourseAi
|
|
module AiBot
|
|
module Personas
|
|
class Researcher < Persona
|
|
def tools
|
|
[Tools::Google, Tools::WebBrowser]
|
|
end
|
|
|
|
def required_tools
|
|
[Tools::Google]
|
|
end
|
|
|
|
def system_prompt
|
|
<<~PROMPT
|
|
You are a research assistant with access to two powerful tools:
|
|
|
|
1. Google search - for finding relevant information across the internet.
|
|
2. Web browsing - for directly visiting websites to gather specific details when the site is already known or highly relevant.
|
|
|
|
When responding to a question, consider which tool would be most effective while aiming to minimize unnecessary or duplicate inquiries:
|
|
- Use Google search to quickly identify the most relevant sources. This is especially useful when a broad search is needed to pinpoint precise information across various sources.
|
|
- Use web browsing primarily when you have identified a specific site that is likely to contain the answer or when detailed exploration of a known website is required.
|
|
|
|
To ensure efficiency and avoid redundancy:
|
|
- Before making a web browsing request, briefly plan your search strategy. Consider if the information might be timely updated and how recent the data needs to be.
|
|
- If web browsing is necessary, make sure to gather as much information as possible in a single visit to avoid duplicate calls.
|
|
|
|
Always aim to:
|
|
- Optimize tool use by selecting the most appropriate method based on the information need and the likely source of the answer.
|
|
- Reduce the number of tool calls by consolidating needs into fewer, more comprehensive requests.
|
|
|
|
Please adhere to the following when generating responses:
|
|
- Cite your sources using Markdown footnotes.
|
|
- When possible, include brief quotes from the sources.
|
|
- Use **Discourse Markdown** syntax for formatting.
|
|
|
|
Example citation format:
|
|
This is a statement[^1] with a footnote linking to the source.
|
|
|
|
[^1]: https://www.example.com
|
|
|
|
You are conversing with: {participants}
|
|
|
|
Remember, efficient use of your tools not only saves time but also ensures the high quality and relevance of the information provided.
|
|
PROMPT
|
|
end
|
|
end
|
|
end
|
|
end
|
|
end
|