discourse-ai/lib/modules/embeddings/semantic_topic_query.rb
Rafael dos Santos Silva 2c0f535bab
FEATURE: HyDE-powered semantic search. (#136)
* FEATURE: HyDE-powered semantic search.

It relies on the new outlet added in discourse/discourse#23390 to display semantic search results in an unobtrusive way.

We'll use a HyDE-backed approach for semantic search, which consists of generating a hypothetical document from the given keywords, transforming that document into a vector, and using the vector in an asymmetric similarity topic search.
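
To make the flow above concrete, here is a minimal, self-contained sketch of the three HyDE steps (expand the keywords, embed the result, rank topics by similarity). Every name in it is hypothetical and none of it is taken from this PR: `hypothetical_document` stands in for an LLM completion, and `embed` is a toy term-frequency vector over a fixed vocabulary rather than a real embedding model.

```ruby
# Toy HyDE flow. All method names are illustrative assumptions, not discourse-ai APIs.

# Stand-in for an LLM completion: expands the keywords into a hypothetical post.
def hypothetical_document(keywords)
  "A forum post that discusses #{keywords} and explains how to solve problems with #{keywords}."
end

# Stand-in for an embedding model: term frequencies over a fixed vocabulary.
def embed(text, vocabulary)
  words = text.downcase.scan(/[a-z]+/)
  vocabulary.map { |term| words.count(term).to_f }
end

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Embeds the hypothetical document (not the raw keywords) and ranks topics
# by similarity to it, most similar first. `topics` maps topic id => text.
def hyde_search(keywords, topics, vocabulary)
  query_vector = embed(hypothetical_document(keywords), vocabulary)

  topics
    .map { |id, text| [id, cosine_similarity(query_vector, embed(text, vocabulary))] }
    .sort_by { |_, score| -score }
    .map(&:first)
end
```

The point of the expansion step is that a short keyword query gets compared to topics in document form, which is what makes the asymmetric search workable.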

This PR also reorganizes the internals to have fewer moving parts, maintaining one hierarchy of DAOish classes for vector-related operations like transformations and querying.

Completions and vectors created by HyDE will remain cached in Redis for now, but we could later use Postgres instead.
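
The caching described above follows a fetch-or-compute pattern. The sketch below illustrates that pattern only: the `HydeCache` class, its key format, and the Hash-backed store are assumptions made for illustration, not the PR's implementation, which keeps these values in Redis.

```ruby
require "digest"

# Hypothetical cache wrapper. A plain Hash stands in for Redis so the
# example runs without external services.
class HydeCache
  def initialize(store = {})
    @store = store
  end

  # Returns the cached value for (namespace, input), computing and storing
  # it with the given block on a miss.
  def fetch(namespace, input)
    key = "#{namespace}-#{Digest::SHA256.hexdigest(input)}"
    @store.fetch(key) { @store[key] = yield }
  end
end
```

Hashing the input keeps keys short and uniform regardless of how long the search terms or the generated document are.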

* Missing translation and rate limiting

---------

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-09-05 11:08:23 -03:00

29 lines
883 B
Ruby

# frozen_string_literal: true

class DiscourseAi::Embeddings::SemanticTopicQuery < TopicQuery
  def list_semantic_related_topics(topic)
    query_opts = {
      skip_ordering: true,
      per_page: SiteSetting.ai_embeddings_semantic_related_topics,
      unordered: true,
    }

    if !SiteSetting.ai_embeddings_semantic_related_include_closed_topics
      query_opts[:status] = "open"
    end

    list =
      create_list(:semantic_related, query_opts) do |topics|
        candidate_ids = DiscourseAi::Embeddings::SemanticRelated.new.related_topic_ids_for(topic)

        list =
          topics
            .where.not(id: topic.id)
            .where(id: candidate_ids)
            .order("array_position(ARRAY#{candidate_ids}, topics.id)") # array_position forces the order of the topics to be preserved

        list = remove_muted(list, @user, query_opts)
      end
  end
end
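
The `array_position` ordering above is easier to see in isolation. The sketch below is illustrative only and uses hypothetical helper names (`order_clause`, `order_like_candidates`) that are not part of discourse-ai. It shows what the interpolated SQL fragment looks like for a given id list, and an in-memory equivalent of the ordering it produces.

```ruby
# Builds the same ORDER BY fragment as the query above. Ruby's Array#to_s
# renders [42, 7, 19] as "[42, 7, 19]", which yields a valid ARRAY[...] literal.
def order_clause(candidate_ids)
  "array_position(ARRAY#{candidate_ids}, topics.id)"
end

# In-memory equivalent: keep only rows whose id is a candidate and sort them
# by that id's index in candidate_ids, so the similarity ranking is preserved.
def order_like_candidates(rows, candidate_ids)
  position = candidate_ids.each_with_index.to_h
  rows.select { |row| position.key?(row[:id]) }.sort_by { |row| position[row[:id]] }
end

rows = [{ id: 7 }, { id: 19 }, { id: 42 }, { id: 99 }]
ordered_ids = order_like_candidates(rows, [42, 7, 19]).map { |row| row[:id] }
# ordered_ids follows the candidate order, not the row order
```

One caveat worth checking: an empty candidate list interpolates to `ARRAY[]`, which PostgreSQL cannot assign a type without an explicit cast, so it may be worth guarding that case before building the clause.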