Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added in discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search: we generate a hypothetical document from the given keywords, transform it into a vector, and use that vector in an asymmetric similarity search over topics.
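
To make the flow above concrete, here is a toy sketch in plain Ruby. Every name in it is hypothetical: the "LLM" is a string template and the "embedding model" is a bag-of-words counter, both stand-ins chosen only so the example runs without external services. It is not the plugin's implementation.

```ruby
# Toy sketch of the HyDE flow. All names are hypothetical, and the "LLM" and
# "embedding model" below are trivial stand-ins, not the plugin's code.

# Stand-in for the LLM call: expands keywords into a longer pseudo-post.
def hypothetical_post_for(keywords)
  "A forum post about #{keywords}: how do I set up #{keywords} " \
    "and fix common #{keywords} problems?"
end

# Stand-in for an embedding model: word counts over a fixed vocabulary.
def embed(text, vocabulary)
  words = text.downcase.scan(/[a-z]+/)
  vocabulary.map { |term| words.count(term).to_f }
end

# Cosine similarity between two equal-length vectors (0.0 if either is all zeros).
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Embeds the hypothetical post (not the raw keywords) and ranks topics by
# similarity to it, best match first. Returns [[topic_id, score], ...].
def hyde_search(keywords, topics)
  vocabulary = topics.values.join(" ").downcase.scan(/[a-z]+/).uniq
  query_vector = embed(hypothetical_post_for(keywords), vocabulary)

  topics
    .map { |id, body| [id, cosine_similarity(query_vector, embed(body, vocabulary))] }
    .sort_by { |_, score| -score }
end

topics = {
  1 => "How to set up email notifications and fix delivery problems",
  2 => "Favorite recipes for weekend baking",
}
ranked = hyde_search("email", topics)
puts ranked.first.first # id of the best-matching topic
```

The point of the detour through a generated post is that the query vector is built from document-like text, which tends to sit closer to real topics in embedding space than a few bare keywords would.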

This PR also reorganizes the internals to have fewer moving parts, maintaining a single hierarchy of DAOish classes for vector-related operations such as transformations and querying.

Completions and vectors created by HyDE will remain cached in Redis for now, but we could later use Postgres instead.
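
As a rough illustration of the caching idea, the sketch below memoizes an expensive result per search term. It is an assumption-laden stand-in: `HydeCache`, its key format, and the in-memory Hash used in place of Redis are all hypothetical and not taken from the plugin.

```ruby
require "digest"

# Hypothetical cache wrapper. A plain Hash stands in for Redis here so the
# example runs on its own; the key format is an assumption as well.
class HydeCache
  def initialize(store = {})
    @store = store
  end

  # Returns the cached value for this term, or runs the block once on a
  # miss and stores its result under a digest-based key.
  def fetch(namespace, search_term)
    key = "#{namespace}-#{Digest::SHA1.hexdigest(search_term)}"
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

cache = HydeCache.new
calls = 0
2.times { cache.fetch("hyde-post", "email setup") { calls += 1; "generated post" } }
puts calls # the block ran only on the first lookup
```

Hashing the search term keeps keys at a fixed length regardless of what users type. Swapping the Hash for Redis or a Postgres table would only change what sits behind `@store`.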

* Missing translation and rate limiting

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00


# frozen_string_literal: true

module DiscourseAi
  module Embeddings
    module HydeGenerators
      class Anthropic < DiscourseAi::Embeddings::HydeGenerators::Base
        # Builds the prompt asking the model to write a forum post about the
        # search term and to wrap its answer in <ai> tags.
        def prompt(search_term)
          <<~TEXT
            Given a search term between <input> tags, generate a forum post about the search term.
            Respond with the generated post between <ai> tags.
            <input>#{search_term}</input>
          TEXT
        end

        # Model names handled by this generator.
        def models
          %w[claude-instant-1 claude-2]
        end

        # Requests a completion from the configured model and returns the
        # text found inside the <ai> tags of the response.
        def hypothetical_post_from(query)
          response =
            ::DiscourseAi::Inference::AnthropicCompletions.perform!(
              prompt(query),
              SiteSetting.ai_embeddings_semantic_search_hyde_model,
            ).dig(:completion)

          Nokogiri::HTML5.fragment(response).at("ai").text
        end
      end
    end
  end
end
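
The last step of `hypothetical_post_from` pulls the generated post out of the `<ai>` tags. If the model replies without that tag, `at("ai")` would return nil and the `.text` call would raise. The sketch below shows one way a caller might handle that case. `extract_ai_post` is a hypothetical helper, and it deliberately swaps Nokogiri for a plain regular expression so that it runs without gems; unlike an HTML parser, it only matches a properly closed tag pair.

```ruby
# Hypothetical helper, not part of the plugin: returns the text between
# <ai> and </ai>, or nil when the completion has no such closed tag pair.
def extract_ai_post(completion)
  match = completion.to_s.match(%r{<ai>(.*?)</ai>}m)
  match && match[1].strip
end

completion = "Sure, here is a post.\n<ai>\nHow do I configure email?\n</ai>"
puts extract_ai_post(completion).inspect
puts extract_ai_post("The model ignored the format.").inspect
```

Returning nil rather than raising lets the caller decide whether to fall back to the raw keywords or skip semantic results for that query.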