discourse-ai/db/post_migrate
Rafael dos Santos Silva 791fad1e6a
FEATURE: Index embeddings using bit vectors (#824)
On very large sites, the rare cache misses for Related Topics can take around 200ms, which affects our p99 metric on the topic page. In order to mitigate this impact, we now have several tools at our disposal.

The first is to migrate the embedding index type from halfvec to bit and change the Related Topics query to use the new bit index, switching the search metric from inner product to Hamming distance. This will reduce our index sizes by 90%, greatly reducing the impact of embeddings on our storage. By making the related query a bit smarter, we can have zero impact on recall: we use the index to over-capture N*2 results, then re-order those N*2 using the full halfvec vectors and take the top N. The expected impact is to go from 200ms to <20ms for cache misses and from a 2.5GB index to a 250MB index on a large site.
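
The over-capture and re-order step described above can be sketched in plain Ruby. This is only an illustration of the technique, not the plugin's actual query (which runs in PostgreSQL against the bit index). The method names `binary_quantize`, `hamming_distance`, `inner_product`, and `related`, the in-memory hash of embeddings, and the sign-based quantization rule are all assumptions made for the example.

```ruby
# Hypothetical helper: quantize a float vector to bits
# (1 where the component is positive, 0 otherwise).
def binary_quantize(vec)
  vec.map { |x| x > 0 ? 1 : 0 }
end

# Number of positions where two bit vectors differ.
def hamming_distance(a, b)
  a.zip(b).count { |x, y| x != y }
end

# Full-precision similarity used for the re-ordering pass.
def inner_product(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# embeddings: Hash of id => full-precision vector (stands in for the table).
# Stage 1: over-capture limit * 2 candidates by Hamming distance on the bits.
# Stage 2: re-order those candidates by inner product and keep the top `limit`.
def related(query, embeddings, limit)
  query_bits = binary_quantize(query)

  candidates = embeddings.min_by(limit * 2) do |_id, vec|
    hamming_distance(query_bits, binary_quantize(vec))
  end

  candidates
    .max_by(limit) { |_id, vec| inner_product(query, vec) }
    .map(&:first)
end
```

In this sketch the cheap bit comparison only narrows the candidate set; the final ordering still comes from the full vectors, which is why the commit expects recall to be preserved.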

Another tool is migrating our index type from IVFFLAT to HNSW, which can improve cache-miss performance even further, eventually putting us in under-5ms territory.

Co-authored-by: Roman Rizzi <roman@discourse.org>
2024-10-14 13:26:03 -03:00
20240119152348_explicit_provider_backwards_compat.rb FEATURE: GPT4o support and better auditing (#618) 2024-05-14 13:28:46 +10:00
20240207144910_fix_llm_backed_setting_defaults.rb FIX: Explicit check for empty string in compat migration (#463) 2024-02-07 14:51:51 -03:00
20240528144216_seed_open_ai_models.rb DEV: Rewire AI bot internals to use LlmModel (#638) 2024-06-18 14:32:14 -03:00
20240531205234_seed_claude_models.rb DEV: Rewire AI bot internals to use LlmModel (#638) 2024-06-18 14:32:14 -03:00
20240603133432_seed_other_propietary_models.rb DEV: Rewire AI bot internals to use LlmModel (#638) 2024-06-18 14:32:14 -03:00
20240603143158_seed_oss_models.rb FEATURE: allow access to private topics if tool permits (#673) 2024-06-19 15:49:36 +10:00
20240606152117_copy_summary_sections_to_ai_summaries.rb DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
20240609232736_drop_commands_from_ai_personas.rb FEATURE: optional tool detail blocks (#662) 2024-06-11 18:14:14 +10:00
20240610232040_copy_summarization_strategy_to_ai_summarization_strategy.rb DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
20240610232546_copy_custom_summarization_allowed_groups_to_ai_custom_summarization_allowed_groups.rb DEV: Add summarization logic from core (#658) 2024-07-02 08:51:59 -07:00
20240611170906_drop_old_embeddings_tables.rb DEV: Move to single table per embeddings type (#561) 2024-08-08 11:55:20 -03:00
20240619193057_choose_llm_model_setting_migration.rb DEV: Transition "Select model" settings to only use LlmModels (#675) 2024-06-19 18:01:35 -03:00
20240619211337_update_automation_script_models.rb DEV: Use LlmModels as options in automation rules (#676) 2024-06-21 08:07:17 +10:00
20240624202602_add_provider_specific_params_to_llm_models.rb DEV: Use Rails 7.0 instead of 7.1 in post-migrations 2024-06-26 18:41:38 +02:00
20240703135444_llm_models_for_summarization.rb FEATURE: move summary to use llm_model (#699) 2024-07-04 10:48:18 +10:00
20240708193243_fix_vllm_model_name.rb FIX: Flaky SRV-backed model seeding. (#708) 2024-07-08 18:47:10 -03:00
20240724174343_migrate_vision_llms.rb DEV: Remove old code now that features rely on LlmModels. (#729) 2024-07-30 13:44:57 -03:00
20240729202857_migrate_persona_llm_override.rb DEV: Remove old code now that features rely on LlmModels. (#729) 2024-07-30 13:44:57 -03:00
20240809162837_rename_ai_helper_enabled_setting.rb DEV: Clearly separate post/composer helper settings (#747) 2024-08-12 15:40:23 -07:00
20240809163303_rename_ai_helper_allowed_groups_setting.rb DEV: Clearly separate post/composer helper settings (#747) 2024-08-12 15:40:23 -07:00
20240912055831_drop_persona_id_from_rag_document_fragments.rb FEATURE: Make tool support polymorphic (#798) 2024-09-16 08:17:17 +10:00
20241008055831_drop_old_embeddings_indexes.rb FEATURE: Index embeddings using bit vectors (#824) 2024-10-14 13:26:03 -03:00