Commit Graph

907 Commits

Author SHA1 Message Date
Rafael dos Santos Silva 791fad1e6a
FEATURE: Index embeddings using bit vectors (#824)
On very large sites, the rare cache misses for Related Topics can take around 200ms, which affects our p99 metric on the topic page. In order to mitigate this impact, we now have several tools at our disposal.

The first is to migrate the index embedding type from halfvec to bit and change the related topic query to leverage the new bit index by changing the search algorithm from inner product to Hamming distance. This will reduce our index sizes by 90%, greatly reducing the impact of embeddings on our storage. By making the related query a bit smarter, we can have zero impact on recall by using the index to over-capture N*2 results, then re-ordering those N*2 using the full halfvec vectors and taking the top N. The expected impact is to go from 200ms to <20ms for cache misses and from a 2.5GB index to a 250MB index on a large site.
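
The over-capture and re-rank step described above can be sketched in memory as follows. This is a minimal illustration, not the plugin's actual query: the function names, the topic shape, and the choice to quantize each component by its sign are all assumptions made for the example.

```javascript
// Sketch only: an in-memory stand-in for the two-stage related-topics query.
// toBits, hammingDistance, innerProduct and relatedTopics are hypothetical names.

// Quantize a float vector to bits: 1 where the component is positive, else 0 (assumed scheme).
function toBits(vector) {
  return vector.map((x) => (x > 0 ? 1 : 0));
}

// Number of positions where two equal-length bit arrays differ.
function hammingDistance(a, b) {
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}

function innerProduct(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Stage 1: take the 2N nearest candidates by Hamming distance on the bit vectors.
// Stage 2: re-order those candidates by inner product on the full vectors, keep the top N.
function relatedTopics(query, topics, n) {
  const queryBits = toBits(query);
  const candidates = topics
    .map((t) => ({ topic: t, distance: hammingDistance(queryBits, toBits(t.embedding)) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, n * 2)
    .map((c) => c.topic);

  return candidates
    .map((t) => ({ id: t.id, score: innerProduct(query, t.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, n)
    .map((r) => r.id);
}

// Made-up data: topic 1 is closest in Hamming distance, topic 2 has the best inner product.
const sampleTopics = [
  { id: 1, embedding: [0.1, 0.1, -0.1, 0.1] },
  { id: 2, embedding: [0.9, -0.05, -0.2, 0.3] },
  { id: 3, embedding: [-0.5, -0.4, 0.9, -0.3] },
  { id: 4, embedding: [0.1, -0.1, 0.2, 0.1] },
];
const related = relatedTopics([1.0, 0.2, -0.3, 0.2], sampleTopics, 1);
```

With N = 1, the bit stage alone would pick topic 1, while the re-rank over the 2N candidates returns topic 2, which is why the over-capture protects recall.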

Another tool is migrating our index type from IVFFLAT to HNSW, which can improve cache-miss performance even further, eventually putting us in under-5ms territory.

Co-authored-by: Roman Rizzi <roman@discourse.org>
2024-10-14 13:26:03 -03:00
Kelv 6615104389
DEV: Switch to use pnpm (#833) 2024-10-14 13:37:20 +02:00
Hoa Nguyen 94010a5f78
FEATURE: Tools for models from Ollama provider (#819)
Adds support for Ollama function calling
2024-10-11 07:25:53 +11:00
Sam 6c4c96e83c
FEATURE: allow persona to only force tool calls on limited replies (#827)
This introduces another configuration that allows operators to
limit the number of interactions with forced tool usage.

Forced tools are very handy in initial LLM interactions, but as the
conversation progresses they can get in the way by slowing responses
and adding confusion.
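
A minimal sketch of that limit, assuming the persona carries a list of forced tools and a reply count. The names `forcedTools` and `forcedToolCount`, and the rule that a negative count means "always force", are hypothetical and only serve the illustration.

```javascript
// Hypothetical sketch: force tool use only during the first few bot replies.
// `forcedTools` and `forcedToolCount` are assumed names, not the plugin's settings.
function shouldForceTools(persona, priorBotReplies) {
  if (persona.forcedTools.length === 0) return false;
  // Assumption: a negative count means tools are forced on every reply.
  if (persona.forcedToolCount < 0) return true;
  return priorBotReplies < persona.forcedToolCount;
}

const searchPersona = { forcedTools: ["search"], forcedToolCount: 1 };
const forceOnFirstReply = shouldForceTools(searchPersona, 0);
const forceOnSecondReply = shouldForceTools(searchPersona, 1);
```

Here the first reply is forced to call the tool and later replies are left to the model.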
2024-10-11 07:23:42 +11:00
Mark VanLandingham 52d90cf1bc
DEV: Add apply_modifier for SemanticTopicQuery topics list (#830) 2024-10-10 12:13:16 -05:00
Bianca Nenciu c5b323fc07
DEV: Fix mismatched column types in tests (#826)
The primary key is usually a bigint column, but the foreign key columns
usually are of integer type. This can lead to issues when joining these
columns due to mismatched types and different value ranges.

In a recent core change, all bigint sequences will start at a very high
value in the test environment to surface this type of error. The same
change also added a temporary API that changes the column type to bigint
in order to allow for the tests to run.
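
The range mismatch can be illustrated with a small check. PostgreSQL's `integer` is a signed 32-bit type and `bigint` is signed 64-bit; the starting value used below is hypothetical, since the message only says "very high".

```javascript
// PostgreSQL `integer` holds signed 32-bit values; `bigint` holds signed 64-bit values.
const INTEGER_MAX = 2n ** 31n - 1n; // 2147483647
const INTEGER_MIN = -(2n ** 31n);

// True when an id produced by a bigint sequence still fits in an integer column.
function fitsInIntegerColumn(id) {
  return id >= INTEGER_MIN && id <= INTEGER_MAX;
}

// Hypothetical high starting value for a test-environment sequence.
const highSequenceStart = 2n ** 40n;

const lowIdFits = fitsInIntegerColumn(42n);
const highIdFits = fitsInIntegerColumn(highSequenceStart);
```

An id of 42 fits in either column type, while an id drawn from the high-starting sequence does not fit an `integer` foreign key, which is the kind of failure the core change is meant to surface.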

The plugin API is only temporary and it is important for these plugins
to migrate their columns to bigint to avoid issues in the future.
2024-10-10 18:39:36 +03:00
Rafael dos Santos Silva 95e70474fd
DEV: Skip flaky test (#829) 2024-10-10 12:02:31 -03:00
Martin Brennan 4f6b36147b
UX: Remove AdminPageSubheader style override (#828)
In this core commit https://github.com/discourse/discourse/pull/29149 we
are changing the subheader title to H2 and making the size smaller,
this style override is no longer needed.
2024-10-10 17:18:32 +10:00
Mark VanLandingham 51494db236
REVERT: "DEV: Convert related-topics to gjs (#822)" (#825)
This reverts commit a3c6938cb3.
2024-10-09 10:10:03 -05:00
Sam e1a0eb6131
FEATURE: support chain halting and upload creation support (#821)
This adds chain halting (ability to terminate llm chain in a tool)
and the ability to create uploads in a tool

Together this lets us integrate custom image generators into a
custom tool.
2024-10-09 08:17:45 +11:00
Discourse Translator Bot 3170e14acb
Update translations (#823) 2024-10-08 20:21:52 +02:00
Jarek Radosz a3c6938cb3
DEV: Convert related-topics to gjs (#822) 2024-10-08 14:16:08 +02:00
Sam 545500b329
FEATURE: allows forced LLM tool use (#818)
* FEATURE: allows forced LLM tool use

Sometimes we need to force LLMs to use tools, for example in RAG
like use cases we may want to force an unconditional search.

The new framework allows the backend to force tool usage.

Front end commit to follow

* UI for forcing tools now works, but it does not react right

* fix bugs

* fix tests, this is now ready for review
2024-10-05 09:46:57 +10:00
Sam c294b6d394
FEATURE: allow llm triage to automatically hide posts (#820)
Prior to this change we could flag, but there was no way
to hide content and treat the flag as spam.

We had the option to hide topics, but this is not desirable for
a spam reply.

The new option allows triage to hide a post if it is a reply. If the
post happens to be the first post in the topic, the topic will
be hidden.
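
The decision above can be sketched as follows. The post shape (`postNumber`, `topicId`, `id`) and the returned action object are assumptions for illustration, not the plugin's API.

```javascript
// Hypothetical sketch of the hide decision; field names and return shape are assumed.
function hideActionFor(post) {
  // The first post stands for the whole topic, so the topic is hidden.
  if (post.postNumber === 1) {
    return { hide: "topic", topicId: post.topicId };
  }
  // Any later post is a reply, so only that post is hidden.
  return { hide: "post", postId: post.id };
}

const firstPostAction = hideActionFor({ id: 10, topicId: 7, postNumber: 1 });
const replyAction = hideActionFor({ id: 11, topicId: 7, postNumber: 2 });
```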
2024-10-04 16:11:30 +10:00
Keegan George 110a1629aa
DEV: Update rate limits for image captioning (#816)
This PR updates the rate limits for AI helper so that image captioning follows its own rate limit of 20 requests per minute. This should help when uploading multiple files that need to be captioned. This PR also updates the UI so that it shows a toast message with the extracted error message instead of a blocking `popupAjaxError` error dialog.
---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-10-02 10:36:35 -07:00
Discourse Translator Bot 7ae6c17236
Update translations (#814) 2024-10-02 08:51:12 +02:00
Martin Brennan 7325fb21ab
DEV: Use section landing components for LLMs templates (#817)
Relies on https://github.com/discourse/discourse/pull/28477,
uses AdminSectionLandingWrapper and AdminSectionLandingItem
for the section items on the LLM page which are used to create
a new LLM config from a template.
2024-10-02 15:31:48 +10:00
Kris 62ba2fa4d7
UX: update icon and text for copying message (#815) 2024-10-01 18:38:57 -04:00
Hoa Nguyen 2063b3854f
FEATURE: Add Ollama provider (#812)
This allows our users to add the Ollama provider and use it to serve our AI bot (completion/dialect).

In this PR, we introduce:

 - DiscourseAi::Completions::Dialects::Ollama which would help us translate by utilizing Completions::Endpoint::Ollama
 - Correct extract_completion_from and partials_from in Endpoints::Ollama

Also

 - Add tests for Endpoints::Ollama
 - Introduce ollama_model fabricator
2024-10-01 10:45:03 +10:00
Discourse Translator Bot c7eaea48f5
Update translations (#811) 2024-09-30 17:31:26 +10:00
Sam 5cbc9190eb
FEATURE: RAG search within tools (#802)
This allows custom tools access to uploads and sophisticated searches using embedding.

It introduces:

 - A shared front end for listing and uploading files (shared with personas)
 - Backend implementation of index.search function within a custom tool.

Custom tools may now search through uploaded files:

function invoke(params) {
   return index.search(params.query)
}

This means that RAG implementers now may preload tools with knowledge and have high fidelity over
the search.

The search function supports:

 - specifying max results
 - specifying a subset of files to search (from uploads)

Also

 - Improved documentation for tools (when creating a tool, a preamble explains all the functionality)
 - Uploads were a bit finicky; fixed an edge case where the UI would not show them as updated
2024-09-30 17:27:50 +10:00
Kris 18ecc843e5
UX: move templates to main LLM config tab, restyle (#813)
Restructures LLM config page so it is far clearer. 

Also corrects bugs around adding LLMs and having LLMs not editable post addition 
---------

Co-authored-by: Sam Saffron <sam.saffron@gmail.com>
2024-09-30 17:15:11 +10:00
Hoa Nguyen 1002dc877d
DEV: remove ignore column syntax for the removed provider column in completion prompt model (#810) 2024-09-30 08:57:23 +10:00
chapoi 8cf1798afe
UX: AI composer helper z-index issue (#809) 2024-09-23 17:01:04 -04:00
Keegan George 95f80325e5
DEV: Prevent close of summary from outside clicks (#808)
Often it is helpful to have the summary box open while composing a reply to the topic. However, the summary box currently gets closed each time you click outside the box. In this PR we add `closeOnClickOutside: false` attribute to the `DMenu` options for summary box to prevent that from occurring.
2024-09-18 10:36:42 -07:00
Keegan George e666266473
DEV: Make indicator wave a reusable component (#807)
Previously we had some hardcoded markup with scss making a loading indicator wave. This code was being duplicated and used in both semantic search and summarization. We want to add the indicator wave to the AI helper diff modal as well and have the text flashing instead of the loading spinner. To ensure we do not repeat ourselves, in this PR we turn the summary indicator wave into a reusable template only component called: `AiIndicatorWave`. We then apply the usage of that component to semantic search, summarization, and the composer helper modal.
2024-09-18 09:53:54 -07:00
chapoi 1e155942bb
UX: take composer height into account when calculating the max-height for topic summary (#806)
* remove unused import

* UX: take composer height into account when calculating the max-height for the topic summary
2024-09-18 14:54:41 +10:00
Discourse Translator Bot 101f1e9512
Update translations (#799) 2024-09-18 09:53:01 +10:00
Keegan George 513510d6d0
FIX: AI Helper not visible on iPads (#805)
This commit fixes an issue where the composer AI helper was not visible on iPad in DiscourseHub. This was due to the z-index being different for `reply-control` when Discourse Hub inserts its `footer-nav`
2024-09-17 16:43:15 -07:00
Sam 4b21eb7974
FEATURE: basic support for GPT-o models (#804)
Caveats

- No streaming, by design
- No tool support (including no XML tools)
- No vision

OpenAI will revamp the model and more of these features may
become available.

This solution is a bit hacky for now
2024-09-17 09:41:00 +10:00
Keegan George 493d65af1f
FIX: Diff modal closing along with composer menu on mobile (#803)
The `DiffModal` is triggered after selecting an option in the composer helper menu. After selecting an option, we should close the composer helper menu and only show the diff modal. On mobile, there was an edge case where `this.args.close()` was closing both the `DiffModal` and the `AiComposerHelperMenu`. This PR resolves that by ensuring the menu is closed _first_ asynchronously, followed by opening the relevant modal.
2024-09-16 14:00:41 -07:00
Sam 03eccbe392
FEATURE: Make tool support polymorphic (#798)
Polymorphic RAG means that we will be able to access RAG fragments both from AiPersona and AiCustomTool

In turn this gives us support for richer RAG implementations.
2024-09-16 08:17:17 +10:00
Keegan George b16390ae2a
UX: Improve toast message location (#800) 2024-09-14 09:19:13 +10:00
Keegan George 9374cd7ac1
FIX: Keyboard shortcut should be platform specific (#801) 2024-09-14 09:18:07 +10:00
Keegan George 9cd14b0003
DEV: Move composer AI helper to toolbar (#796)
Previously we had moved the AI helper from the options menu to a selection menu that appears when selecting text in the composer. This had the benefit of making the AI helper a more discoverable feature. Now that some time has passed and the AI helper is more recognized, we will be moving it back to the composer toolbar.

This is better because:
- It is consistent with other behavior and ways of accessing tools in the composer
- It has an improved mobile experience
- It reduces unnecessary code and keeps things easier to migrate when we have composer V2.
- It allows for easily triggering AI helper for all content by clicking the button instead of having to select everything.
2024-09-13 11:59:30 -07:00
Sam 5b9add0ac8
FEATURE: add a SambaNova LLM provider (#797)
Note: at the moment the context window is quite small, so it is
mainly useful as a helper backend or HyDE generator.
2024-09-12 11:28:08 +10:00
chapoi 22d1e71dc9
UX: AI post helper DMenu styling (#770) 2024-09-11 05:45:48 +02:00
Sam 36ce88f356
FIX: support case insensitive setting lookup (#795) 2024-09-10 15:21:03 +10:00
Sam a5b5c3bebe
PERF: speed up spec (#794)
~500ms -> ~100ms

It is still not a super fast spec given search is not free, but
it is a bit faster and clearer
2024-09-04 16:14:32 +10:00
Sam cabecb801e
FEATURE: disable rate limiting when skipping hyde (#793)
Embedding search is rate limited due to the potentially expensive
HyDE operation (which requires LLM access).

Embedding is generally very cheap by comparison (usually 100x cheaper).

This raises the limit to 100 per minute for embedding searches,
while keeping the old 4 per minute for HyDE powered search.
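
A minimal sketch of the two limits, using a simple fixed-window counter. The class, its method, and the per-user keying are hypothetical; only the 100 and 4 per-minute figures come from the message above.

```javascript
// Per-minute limits from the commit message: 4 with HyDE, 100 without.
const SEARCH_LIMITS = { hyde: 4, plain: 100 };

// Hypothetical fixed-window limiter; not the plugin's rate limiting code.
// Old windows are never pruned here, which a real limiter would need to handle.
class MinuteLimiter {
  constructor() {
    this.counts = new Map();
  }

  // Returns true and records the request if the caller is under the limit.
  allow(userId, useHyde, nowMs) {
    const kind = useHyde ? "hyde" : "plain";
    const windowIndex = Math.floor(nowMs / 60000);
    const key = `${userId}:${kind}:${windowIndex}`;
    const used = this.counts.get(key) || 0;
    if (used >= SEARCH_LIMITS[kind]) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}

const limiter = new MinuteLimiter();
const hydeResults = [];
for (let i = 0; i < 5; i++) {
  hydeResults.push(limiter.allow(1, true, 0));
}
const plainStillAllowed = limiter.allow(1, false, 0);
```

In this sketch the fifth HyDE search within the same minute is refused, while a plain embedding search by the same user is still allowed because it is counted separately.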
2024-09-04 15:51:01 +10:00
Roman Rizzi c4c9dc2034
FIX: Display cached summaries with our new streamer. (#792)
Make sure the summary box is in the DOM before attempting to
display a cached summary.
2024-09-03 18:45:28 -03:00
Sam a48acc894a
FEATURE: more accurate and faster titles (#791)
Previously we waited 1 minute before automatically titling PMs

The new change adds a title immediately after the
LLM replies.

The prompt was also modified to include the LLM reply in the title suggestion.

This helps in situations like:

user: tell me a joke
llm: a very funny joke about horses

Then the title would be "A Funny Horse Joke"

Specs already covered some auto title logic, amended to also
catch the new message bus message we have been sending.
2024-09-03 15:52:20 +10:00
Discourse Translator Bot b0ae2138af
Update translations (#774) 2024-09-02 18:00:14 +02:00
dependabot[bot] b0d1eee0ce
Build(deps): Bump micromatch from 4.0.5 to 4.0.8 (#790)
Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8.
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8)

---
updated-dependencies:
- dependency-name: micromatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-01 12:45:48 +02:00
dependabot[bot] 8ea26e8cfc
Build(deps-dev): Bump rexml from 3.3.3 to 3.3.6 (#768)
Bumps [rexml](https://github.com/ruby/rexml) from 3.3.3 to 3.3.6.
- [Release notes](https://github.com/ruby/rexml/releases)
- [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md)
- [Commits](https://github.com/ruby/rexml/compare/v3.3.3...v3.3.6)

---
updated-dependencies:
- dependency-name: rexml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-01 12:23:09 +02:00
Roman Rizzi db5cbfb148
FIX: Bail earlier when a chat thread has no messages (#789) 2024-08-30 17:17:14 -03:00
Roman Rizzi ed97827f49
FIX: Correctly display errors when parent module needs to be disabled first (#788)
* FIX: Correctly display errors when parent module needs to be disabled first

* Update spec/configuration/llm_validator_spec.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

---------

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-08-30 17:16:11 -03:00
Roman Rizzi e408cd080c
FIX: coerce value before downcasing the hyde param (#787) 2024-08-30 12:13:29 -03:00
Kris 4a1874265e
UX: replace "share" with "share-alt" icon (#784) 2024-08-30 12:13:09 -03:00
Sam 584753cf60
FIX: we were never reindexing old content (#786)
* FIX: we were never reindexing old content

Embedding backfill contains logic that searches for changes to old
content and then backfills them.

Unfortunately it was unconditionally excluding all topics that already
had an embedding, so no backfill ever happened.


This change adds a test and ensures we backfill.

* Over-select results; this makes it more likely that we find
AI results when filtered
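
The backfill bug in the first item can be sketched as a selection predicate. The field names and the timestamp comparison are assumptions; the message does not say how the plugin detects changed content.

```javascript
// Behaviour as described before the fix: any topic that already had an embedding was skipped.
function needsBackfillBefore(topic) {
  return topic.embeddingUpdatedAt === null;
}

// Sketch of the intended behaviour: also pick up topics whose content changed after indexing.
// Comparing timestamps is an assumption made for this example.
function needsBackfill(topic) {
  if (topic.embeddingUpdatedAt === null) return true; // never indexed
  return topic.embeddingUpdatedAt < topic.contentUpdatedAt; // stale embedding
}

// Made-up data; timestamps are arbitrary integers.
const backfillTopics = [
  { id: 1, contentUpdatedAt: 100, embeddingUpdatedAt: null }, // never indexed
  { id: 2, contentUpdatedAt: 200, embeddingUpdatedAt: 150 }, // edited after indexing
  { id: 3, contentUpdatedAt: 100, embeddingUpdatedAt: 150 }, // up to date
];

const selectedBefore = backfillTopics.filter(needsBackfillBefore).map((t) => t.id);
const selectedAfter = backfillTopics.filter(needsBackfill).map((t) => t.id);
```

With the old predicate only topic 1 is selected, so the edited topic 2 is never reindexed; the second predicate selects both.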
2024-08-30 14:37:55 +10:00