Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added in discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords. That document is then transformed into a vector and used in an asymmetric similarity topic search.
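
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: `generate_hypothetical_post`, `embed`, `cosine_similarity`, and `hyde_search` are hypothetical names, the "LLM" is a string template, and the "embedding" is a toy word count over a fixed vocabulary.

```ruby
# All names below are hypothetical stand-ins for illustration only.
VOCAB = %w[backup restore theme plugin].freeze

# Stand-in for the LLM call that writes a hypothetical forum post.
def generate_hypothetical_post(keywords)
  "A forum post discussing #{keywords} in detail."
end

# Toy embedding: how often each vocabulary word appears in the text.
def embed(text)
  words = text.downcase.scan(/[a-z]+/)
  VOCAB.map { |term| words.count(term).to_f }
end

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Embed the hypothetical post (not the raw keywords) and rank topics
# by similarity to it, best match first.
def hyde_search(keywords, topics)
  query_vector = embed(generate_hypothetical_post(keywords))
  topics
    .map { |title, body| [title, cosine_similarity(query_vector, embed(body))] }
    .sort_by { |_, score| -score }
end
```

The point of the sketch is the ordering of steps: the keywords are never embedded directly, only the generated document is.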

This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.

Completions and vectors created by HyDE will remain cached in Redis for now, but we could later use Postgres instead.
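
A rough sketch of what caching a completion per model and search term could look like. `FakeCache` is a hypothetical in-memory stand-in for Redis, and the key format is an assumption made for illustration; neither is taken from the plugin.

```ruby
require "digest"

# Hypothetical in-memory stand-in for Redis (illustration only).
class FakeCache
  def initialize
    @store = {}
  end

  # Returns the cached value, or stores and returns the block's result.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

# The key format here is an assumption, not the plugin's actual scheme.
def cached_hypothetical_post(cache, model, search_term)
  key = "hyde-post-#{model}-#{Digest::SHA1.hexdigest(search_term)}"
  cache.fetch(key) { yield search_term }
end
```

With this shape, the block that would call the LLM only runs on a cache miss, so repeated searches for the same term and model reuse the stored completion.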

* Missing translation and rate limiting

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00

31 lines
832 B
Ruby

# frozen_string_literal: true

module DiscourseAi
  module Embeddings
    module HydeGenerators
      class OpenAi < DiscourseAi::Embeddings::HydeGenerators::Base
        # Builds the chat messages asking the model to write a forum post
        # about the search term.
        def prompt(search_term)
          [
            {
              role: "system",
              content: "You are a helpful bot. You create forum posts about a given topic.",
            },
            { role: "user", content: "Create a forum post about the topic: #{search_term}" },
          ]
        end

        # Model identifiers handled by this generator.
        def models
          %w[gpt-3.5-turbo gpt-4]
        end

        # Requests a completion and returns the text of the first choice.
        def hypothetical_post_from(query)
          ::DiscourseAi::Inference::OpenAiCompletions.perform!(
            prompt(query),
            SiteSetting.ai_embeddings_semantic_search_hyde_model,
          ).dig(:choices, 0, :message, :content)
        end
      end
    end
  end
end
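
To make the `dig` call in `hypothetical_post_from` concrete, here is a small standalone sketch. `extract_content` and the sample hashes are hypothetical; they only mirror the `choices -> message -> content` nesting that the code above reads, with symbolized keys assumed.

```ruby
# Hypothetical helper mirroring the dig call used above.
def extract_content(response)
  response.dig(:choices, 0, :message, :content)
end

# Sample data shaped like a chat completion response (illustration only).
sample_response = {
  choices: [{ message: { role: "assistant", content: "Here is a forum post." } }],
}

extract_content(sample_response) # the text of the first choice
extract_content({ choices: [] }) # nil, since there is no first choice
```

Because `dig` returns `nil` rather than raising when a level is missing, a caller of `hypothetical_post_from` may want to handle a `nil` result.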