Commit Graph

875 Commits

Author SHA1 Message Date
Sam e817b7dc11
FEATURE: improve tool support (#904)
This re-implements tool support in DiscourseAi::Completions::Llm #generate

Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML

New implementation has the endpoints return ToolCall objects.

Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement

decode, decode_chunk (for streaming)

It is the implementers responsibility to figure out how to decode chunks, base no longer implements. To make this easy we ship a flexible json decoder which is easy to wire up.

Also (new)

    Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM
    Token accounting is fixed for vllm (we were not correctly counting tokens)
2024-11-12 08:14:30 +11:00
Keegan George 644141ff08
FIX: Regenerate summary button still shows cached summary (#903)
This PR fixes an issue where clicking to regenerate a summary was still showing the cached summary. To resolve this we call resetSummary() to reset all the summarization related properties before creating a new request.
2024-11-07 16:01:18 -08:00
Roman Rizzi fbc74c7467
FEATURE: Extend summary backfill to also generate gists (#896)
Updates default batch size to 0 and max to 10000
2024-11-07 13:40:18 -03:00
Keegan George c421f713a3
DEV: Handle streaming animation within `AiSummaryBox` (#901)
This PR further decouples the streaming animation by completely handling the streaming animation directly in the `AiSummaryBox` component. Previously, handling the streaming animation by calling methods in the `ai-streamer` API was leading to timing issues making things out-of-sync. This results in some issues such as the last update of streamed text not being shown. Handling streaming directly in the component should simplify things drastically and prevent any issues.
2024-11-07 08:08:32 -08:00
Rafael dos Santos Silva 021e09607d
FIX: Unhide gemini api key setting for embeddings (#902) 2024-11-07 11:55:18 -03:00
Kris 893fa624e4
UX: increase gist size, and adjust surrounding elements to accommodate (#900) 2024-11-07 08:35:05 -05:00
Kris 1ad5321c09
UX: add sparkle icon to related topics for anons (#897) 2024-11-05 17:15:20 -05:00
Osama Sayegh da7c97d294
FIX: Specify type for Rag upload (#895)
Specifying `type` when using `UppyUpload` will be required as of https://github.com/discourse/discourse/pull/29600.
2024-11-05 22:10:54 +03:00
Discourse Translator Bot ab4678275d
Update translations (#894) 2024-11-05 16:55:54 +01:00
Keegan George 99282612a9
DEV: Prefer ENV key for seeded models (#893)
This PR ensures we prefer getting the API key from environment variables when it is a seeded model.
2024-11-05 06:19:13 -08:00
Roman Rizzi 9505a8976c
FEATURE: Automatically backfill regular summaries. (#892)
This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify.

We'll prioritize topics without summary over outdated ones.
2024-11-04 17:48:11 -03:00
Sam 98022d7d96
FEATURE: support custom instructions for persona streaming (#890)
This allows us to inject information into the system prompt
which can help shape replies without repeating over and over
in messages.
2024-11-05 07:43:26 +11:00
Jarek Radosz fa7ca8bc31
DEV: Use the new more-topics API (#885)
See: https://github.com/discourse/discourse/pull/29143
2024-11-04 17:42:50 +01:00
Rafael dos Santos Silva 772ee934ab
Migrate sentiment to a TEI backend (#886) 2024-11-04 09:14:34 -03:00
Sam bffe9dfa07
FIX: we must properly encode objects prior to escaping (#891)
in cases of arrays escapeHTML will not work)

*
2024-11-04 16:16:25 +11:00
Sam c352054d4e
FIX: encode parameters returned from LLMs correctly (#889)
Fixes encoding of params on LLM function calls.

Previously we would improperly return results if a function parameter returned an HTML tag.

Additionally adds some missing HTTP verbs to tool calls.
2024-11-04 10:07:17 +11:00
Roman Rizzi 7e3a543f6f
FEATURE: Double gist length to 40 words (#888) 2024-11-01 13:09:03 -03:00
Kris 32ea421408
UX: in share, use native image dimensions and hide filename (#880) 2024-10-31 13:51:10 -04:00
Roman Rizzi e8f0633141
DEV: Extend truncation to all summarizable content (#884) 2024-10-31 12:17:42 -03:00
Roman Rizzi e8eed710e0
FIX: Truncate OP for gists to help the model focus on the latest posts (#883) 2024-10-31 10:54:56 -03:00
Roman Rizzi 32fb023357
COPY: Include model names in sentiment report descriptions (#882) 2024-10-30 15:50:28 -03:00
Roman Rizzi 00e4a84305
COPY: Update sentiment report descriptions to clarify how it works (#881) 2024-10-30 11:48:32 -03:00
Sam 34a59b623e
FIX: ensure replies are never double streamed (#879)
The custom field "discourse_ai_bypass_ai_reply" was added so
we can signal the post created hook to bypass replying even
if it thinks it should.

Otherwise there are cases where we double answer user questions
leading to much confusion.

This also slightly refactors code making the controller smaller
2024-10-30 20:24:39 +11:00
Sam be0b78cacd
FEATURE: new endpoint for directly accessing a persona (#876)
The new `/admin/plugins/discourse-ai/ai-personas/stream-reply.json` was added.

This endpoint streams data direct from a persona and can be used
to access a persona from remote systems leaving a paper trail in
PMs about the conversation that happened

This endpoint is only accessible to admins.

---------

Co-authored-by: Gabriel Grubba <70247653+Grubba27@users.noreply.github.com>
Co-authored-by: Keegan George <kgeorge13@gmail.com>
2024-10-30 10:28:20 +11:00
Kris 05790a6a40
UX: convert AI gist disclosure to a toggle (#878) 2024-10-29 11:59:41 -04:00
Discourse Translator Bot e7a66b0789
Update translations (#877) 2024-10-29 15:31:51 +01:00
Roman Rizzi dd404c924a
DEV: Use different feature_names for summarization strategies (#875) 2024-10-29 08:45:14 -03:00
dependabot[bot] 0f0f2a247a
Build(deps-dev): Bump rexml from 3.3.6 to 3.3.9 (#874)
Bumps [rexml](https://github.com/ruby/rexml) from 3.3.6 to 3.3.9.
- [Release notes](https://github.com/ruby/rexml/releases)
- [Changelog](https://github.com/ruby/rexml/blob/master/NEWS.md)
- [Commits](https://github.com/ruby/rexml/compare/v3.3.6...v3.3.9)

---
updated-dependencies:
- dependency-name: rexml
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-28 20:44:39 +01:00
Roman Rizzi 37b6461d68
FIX: Make sure that topic gists are displayed ONLY on the hot list. (#873) 2024-10-28 15:15:53 -03:00
Rafael dos Santos Silva 820b506910
DEV: Hide soon to be deprecated modules settings (#872) 2024-10-28 14:27:25 -03:00
David Taylor 945f04b089
DEV: Update plugin annotations (#871) 2024-10-28 14:07:09 +00:00
Bianca Nenciu 294c364a75
DEV: Fix mismatched column types (#868)
The primary key is usually a bigint column, but the foreign key columns
are usually of integer type. This can lead to issues when joining these
columns due to mismatched types and different value ranges.

This was using a temporary plugin / test API to make tests pass, but it
is safe to alter "ai_document_fragment_embeddings" and
"rag_document_fragments" tables because they usually have less than 1M
rows and migration is going to be fast.

Depending on the size of the community, "classification_results" table
may have more than 1M rows and the migration will lock the table for a
longer time. However, classification runs in background jobs and they
will be automatically retried if they fail due to the lock, which makes
it acceptable.
2024-10-28 15:36:42 +02:00
jbrw c479b177a7
DEV: Check for presence of currentRoute.attributes (#870) 2024-10-25 13:30:29 -04:00
Rafael dos Santos Silva 8ded4b2e58
FIX: Use present? instead of invalid exists? (#869) 2024-10-25 13:04:42 -03:00
Roman Rizzi a2b1ea3c63
FEATURE: Fast-track gist regeneration when a hot topic gets a new post (#860)
* FEATURE: Fast-track gist regeneration when a hot topic gets a new post

* DEV: Introduce an upsert-like summarize

* FIX: Only enqueue fast-track gist for hot hot hot topics

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-10-25 12:38:49 -03:00
Rafael dos Santos Silva 33da27e231
FIX: Change hot gist prompt to avoid title repeating #859 (#859)
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2024-10-25 12:12:33 -03:00
Roman Rizzi ec97996905
FIX/REFACTOR: FoldContent revamp (#866)
* FIX/REFACTOR: FoldContent revamp

We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we cannot send the original post separately. This was important for letting the model focus on what's new in the topic.

The algorithm doesn’t give us full control over how prompts are written, and figuring out how to format the content isn't straightforward. This means we're having to use more complicated workarounds, like regex.

To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize.

Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.

* Fix fold docs

* Use #shift instead of #pop to get the first elem, not the last
2024-10-25 11:51:17 -03:00
Sam 12869f2146
FIX: testing tool was not showing rag results (#867)
This changeset contains 4 fixes:

1. We were allowing running tests on unsaved tools,
this is problematic cause uploads are not yet associated or indexed
leading to confusing results. We now only show the test button when
tool is saved.


2. We were not properly scoping rag document fragements, this
meant that personas and ai tools could get results from other
unrelated tools, just to be filtered out later


3. index.search showed options as "optional" but implementation
required the second option

4. When testing tools searching through document fragments was
not working at all cause we did not properly load the tool
2024-10-25 16:01:25 +11:00
Sam 4923837165
FIX: Llm selector / forced tools / search tool (#862)
* FIX: Llm selector / forced tools / search tool


This fixes a few issues:

1. When search was not finding any semantic results we would break the tool
2. Gemin / Anthropic models did not implement forced tools previously despite it being an API option
3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly.
4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate cause this feature is really rare to need now, people who had it set probably did not need it.
5. Updates anthropic model names to latest release

* linting

* fix a couple of tests I missed

* clean up conditional
2024-10-25 06:24:53 +11:00
Rafael dos Santos Silva 3022d34613
FEATURE: Support srv records for OpenAI compatible LLMs (#865) 2024-10-24 15:47:12 -03:00
David Taylor c1fa84ad29
DEV: Update form-template-field/upload uppy usage (#863) 2024-10-24 15:21:13 +01:00
Kris fc6f0a6560
UX: minor gist optimizations for readability (#864) 2024-10-24 09:31:52 -04:00
Keegan George 9e8608b070
UX: Hide AI bot in seeded LLM (#858)
AI bot won't be turned on for seeded LLMs so it makes no sense to expose it here. This will cleanup the template and avoid the double `{{#unless}}` check.
2024-10-23 16:36:17 -07:00
Kris 72111a10ae
UX: add disclosure for topic list gists (#861) 2024-10-23 19:32:22 -04:00
Rafael dos Santos Silva 96f5f8cbd0
FIX: Basic cleanup of AI Caption to remove line breaks and pipes (#857) 2024-10-23 18:38:29 -03:00
Keegan George 9af0c2e719
UX: Improve seeded LLM edit page (#856) 2024-10-23 13:58:27 -07:00
Kris 0aa2789437
UX: switch gist outlet to avoid badges (#855) 2024-10-23 15:10:41 -04:00
Kris 657d103919
UX: adjust gist position, darken unread color (#854) 2024-10-23 13:00:52 -04:00
Sam f1283e156d
FEATURE: allow scoping of google tool queries (#852)
This allows to simply scope search results to specific domains and prepend arbitrary snippets to searches made
2024-10-23 16:55:10 +11:00
Sam 059d3b6fd2
FEATURE: better logging for automation reports (#853)
A new feature_context json column was added to ai_api_audit_logs

This allows us to store rich json like context on any LLM request
made.

This new field now stores automation id and name.

Additionally allows llm_triage to specify maximum number of tokens

This means that you can limit the cost of llm triage by scanning only
first N tokens of a post.
2024-10-23 16:49:56 +11:00