Commit Graph

453 Commits

Author SHA1 Message Date
Roman Rizzi e54f2da1a5
FIX: Unnecessary complex preloading accidentally filters some topics. (#945)
The `topic_query_create_list_topics` modifier we append was always meant to avoid an N+1 situation when serializing gists. However, I tried to be too smart and only preload these, which resulted in some topics with *only* regular summaries getting removed from the list. This issue became apparent now we are adding gists to other lists besides hot.

Let's simplify the preloading, which still solves the N+1 issue, and let the serializer get the needed summary.
2024-11-22 12:07:27 -03:00
Joffrey JAFFEUX 2cc8115b48
FIX: disables temporarily ai_summaries filtering (#943) 2024-11-22 08:34:54 +01:00
Sam 86cf4ccba7
FIX: automatically bust cache for share ai assets (#942)
* FIX: automatically bust cache for share ai assets

CDNs can be configured to strip query params in Discourse
hosting. This is generally safe, but in this case we had
no way of busting the cache using the path.

New design properly caches and properly breaks busts the
cache if asset changes so we don't need to worry about versions

* one day I will set up conditional lint on save :)
2024-11-22 11:23:15 +11:00
Sam d56ed53eb1
FIX: cancel functionality regressed (#938)
The cancel messaging was not floating correctly to the HTTP call leading to impossible to cancel completions 

This is now fully tested as well.
2024-11-21 17:51:45 +11:00
Sam 52c644798d
DEV: improve artifact presentation (#932)
1. Keep source in a "details" block after rendered so it does
not overwhelm users

2. Ensure artifacts are never indexed by robots

3. Cache break our CSS that changed recently
2024-11-20 18:53:19 +11:00
Sam a0aec48606
FIX: gists are not html safe (#931)
Also allow "Everyone" in ai_hot_topic_gists_allowed_groups
2024-11-20 10:54:49 +11:00
Roman Rizzi 530a795d43
FIX: Instruct AR that we want to use ai_summaries for filtering. (#927)
We use `includes` instead of `joins` because we want to eager-load summaries, avoiding an extra query when summarizing. However, Rails will complain unless you explicitly inform them you plan to use that inside a `WHERE` clause.
2024-11-19 17:32:13 -03:00
Roman Rizzi fb80d776d8
FEATURE: Enable gists on all topic lists (#922) 2024-11-19 11:04:34 -03:00
Rafael dos Santos Silva 48d08dedd4
FEATURE: Emotion activity metrics table (#916) 2024-11-19 10:01:10 -03:00
David Taylor b10be23533
FIX: Ensure artifacts are sandboxed, even when visited directly (#921)
It's important that artifacts are never given 'same origin' access to the forum domain, so that they cannot access cookies, or make authenticated HTTP requests. So even when visiting the URL directly, we need to wrap them in a sandboxed iframe.
2024-11-19 11:44:17 +00:00
Sam 3ae1e4eaf0
FIX: properly bypass CSP for artifacts (#920)
Was meant to be bypassed but was not implemented correctly
2024-11-19 20:25:07 +11:00
Sam 755b63f31f
FEATURE: Add support for Mistral models (#919)
Adds support for mistral models (pixtral and mistral large now have presets)

Also corrects token accounting in AWS bedrock models
2024-11-19 17:28:09 +11:00
Sam 0d7f353284
FEATURE: AI artifacts (#898)
This is a significant PR that introduces AI Artifacts functionality to the discourse-ai plugin along with several other improvements. Here are the key changes:

1. AI Artifacts System:
   - Adds a new `AiArtifact` model and database migration
   - Allows creation of web artifacts with HTML, CSS, and JavaScript content
   - Introduces security settings (`strict`, `lax`, `disabled`) for controlling artifact execution
   - Implements artifact rendering in iframes with sandbox protection
   - New `CreateArtifact` tool for AI to generate interactive content

2. Tool System Improvements:
   - Adds support for partial tool calls, allowing incremental updates during generation
   - Better handling of tool call states and progress tracking
   - Improved XML tool processing with CDATA support
   - Fixes for tool parameter handling and duplicate invocations

3. LLM Provider Updates:
   - Updates for Anthropic Claude models with correct token limits
   - Adds support for native/XML tool modes in Gemini integration
   - Adds new model configurations including Llama 3.1 models
   - Improvements to streaming response handling

4. UI Enhancements:
   - New artifact viewer component with expand/collapse functionality
   - Security controls for artifact execution (click-to-run in strict mode)
   - Improved dialog and response handling
   - Better error management for tool execution

5. Security Improvements:
   - Sandbox controls for artifact execution
   - Public/private artifact sharing controls
   - Security settings to control artifact behavior
   - CSP and frame-options handling for artifacts

6. Technical Improvements:
   - Better post streaming implementation
   - Improved error handling in completions
   - Better memory management for partial tool calls
   - Enhanced testing coverage

7. Configuration:
   - New site settings for artifact security
   - Extended LLM model configurations
   - Additional tool configuration options

This PR significantly enhances the plugin's capabilities for generating and displaying interactive content while maintaining security and providing flexible configuration options for administrators.
2024-11-19 09:22:39 +11:00
Rafael dos Santos Silva 4fb686a548
FIX: Move emotion /filter logic into a CTE to keep cardinality sane (#915) 2024-11-14 17:16:48 -03:00
Rafael dos Santos Silva 5026ab52d0
FEATURE: Order by emotion on /filter (#913) 2024-11-14 12:45:40 -03:00
Sam 823e8ef490
FEATURE: partial tool call support for OpenAI and Anthropic (#908)
Implement streaming tool call implementation for Anthropic and Open AI.

When calling:

llm.generate(..., partial_tool_calls: true) do ...
Partials may contain ToolCall instances with partial: true, These tool calls are partially populated with json partially parsed.

So for example when performing a search you may get:

ToolCall(..., {search: "hello" })
ToolCall(..., {search: "hello world" })

The library used to parse json is:

https://github.com/dgraham/json-stream

We use a fork cause we need access to the internal buffer.

This prepares internals to perform partial tool calls, but does not implement it yet.
2024-11-14 06:58:24 +11:00
Sam 9551b1a4d1
FIX: do not strip empty string during stream processing (#911)
Fixes issue in Open AI provider eating newlines and spaces
2024-11-13 07:12:00 +11:00
Rafael dos Santos Silva aef9a03d4c
FEATURE: Truncate AI Captions to a reasonable max size (#907) 2024-11-12 15:52:46 -03:00
Sérgio Saquetim 9583964676
DEV: Added compatibility with the Glimmer Post Menu (#887) 2024-11-12 15:46:17 -03:00
Sam e817b7dc11
FEATURE: improve tool support (#904)
This re-implements tool support in DiscourseAi::Completions::Llm #generate

Previously tool support was always returned via XML and it would be the responsibility of the caller to parse XML

New implementation has the endpoints return ToolCall objects.

Additionally this simplifies the Llm endpoint interface and gives it more clarity. Llms must implement

decode, decode_chunk (for streaming)

It is the implementers responsibility to figure out how to decode chunks, base no longer implements. To make this easy we ship a flexible json decoder which is easy to wire up.

Also (new)

    Better debugging for PMs, we now have a next / previous button to see all the Llm messages associated with a PM
    Token accounting is fixed for vllm (we were not correctly counting tokens)
2024-11-12 08:14:30 +11:00
Keegan George 644141ff08
FIX: Regenerate summary button still shows cached summary (#903)
This PR fixes an issue where clicking to regenerate a summary was still showing the cached summary. To resolve this we call resetSummary() to reset all the summarization related properties before creating a new request.
2024-11-07 16:01:18 -08:00
Roman Rizzi fbc74c7467
FEATURE: Extend summary backfill to also generate gists (#896)
Updates default batch size to 0 and max to 10000
2024-11-07 13:40:18 -03:00
Keegan George 99282612a9
DEV: Prefer ENV key for seeded models (#893)
This PR ensures we prefer getting the API key from environment variables when it is a seeded model.
2024-11-05 06:19:13 -08:00
Roman Rizzi 9505a8976c
FEATURE: Automatically backfill regular summaries. (#892)
This change introduces a job to summarize topics and cache the results automatically. We provide a setting to control how many topics we'll backfill per hour and what the topic's minimum word count is to qualify.

We'll prioritize topics without summary over outdated ones.
2024-11-04 17:48:11 -03:00
Sam 98022d7d96
FEATURE: support custom instructions for persona streaming (#890)
This allows us to inject information into the system prompt
which can help shape replies without repeating over and over
in messages.
2024-11-05 07:43:26 +11:00
Rafael dos Santos Silva 772ee934ab
Migrate sentiment to a TEI backend (#886) 2024-11-04 09:14:34 -03:00
Sam c352054d4e
FIX: encode parameters returned from LLMs correctly (#889)
Fixes encoding of params on LLM function calls.

Previously we would improperly return results if a function parameter returned an HTML tag.

Additionally adds some missing HTTP verbs to tool calls.
2024-11-04 10:07:17 +11:00
Sam 34a59b623e
FIX: ensure replies are never double streamed (#879)
The custom field "discourse_ai_bypass_ai_reply" was added so
we can signal the post created hook to bypass replying even
if it thinks it should.

Otherwise there are cases where we double answer user questions
leading to much confusion.

This also slightly refactors code making the controller smaller
2024-10-30 20:24:39 +11:00
Sam be0b78cacd
FEATURE: new endpoint for directly accessing a persona (#876)
The new `/admin/plugins/discourse-ai/ai-personas/stream-reply.json` was added.

This endpoint streams data direct from a persona and can be used
to access a persona from remote systems leaving a paper trail in
PMs about the conversation that happened

This endpoint is only accessible to admins.

---------

Co-authored-by: Gabriel Grubba <70247653+Grubba27@users.noreply.github.com>
Co-authored-by: Keegan George <kgeorge13@gmail.com>
2024-10-30 10:28:20 +11:00
Rafael dos Santos Silva 820b506910
DEV: Hide soon to be deprecated modules settings (#872) 2024-10-28 14:27:25 -03:00
Bianca Nenciu 294c364a75
DEV: Fix mismatched column types (#868)
The primary key is usually a bigint column, but the foreign key columns
are usually of integer type. This can lead to issues when joining these
columns due to mismatched types and different value ranges.

This was using a temporary plugin / test API to make tests pass, but it
is safe to alter "ai_document_fragment_embeddings" and
"rag_document_fragments" tables because they usually have less than 1M
rows and migration is going to be fast.

Depending on the size of the community, "classification_results" table
may have more than 1M rows and the migration will lock the table for a
longer time. However, classification runs in background jobs and they
will be automatically retried if they fail due to the lock, which makes
it acceptable.
2024-10-28 15:36:42 +02:00
Roman Rizzi a2b1ea3c63
FEATURE: Fast-track gist regeneration when a hot topic gets a new post (#860)
* FEATURE: Fast-track gist regeneration when a hot topic gets a new post

* DEV: Introduce an upsert-like summarize

* FIX: Only enqueue fast-track gist for hot hot hot topics

---------

Co-authored-by: Rafael Silva <xfalcox@gmail.com>
2024-10-25 12:38:49 -03:00
Roman Rizzi ec97996905
FIX/REFACTOR: FoldContent revamp (#866)
* FIX/REFACTOR: FoldContent revamp

We hit a snag with our hot topic gist strategy: the regex we used to split the content didn't work, so we cannot send the original post separately. This was important for letting the model focus on what's new in the topic.

The algorithm doesn’t give us full control over how prompts are written, and figuring out how to format the content isn't straightforward. This means we're having to use more complicated workarounds, like regex.

To tackle this, I'm suggesting we simplify the approach a bit. Let's focus on summarizing as much as we can upfront, then gradually add new content until there's nothing left to summarize.

Also, the "extend" part is mostly for models with small context windows, which shouldn't pose a problem 99% of the time with the content volume we're dealing with.

* Fix fold docs

* Use #shift instead of #pop to get the first elem, not the last
2024-10-25 11:51:17 -03:00
Sam 12869f2146
FIX: testing tool was not showing rag results (#867)
This changeset contains 4 fixes:

1. We were allowing running tests on unsaved tools,
this is problematic cause uploads are not yet associated or indexed
leading to confusing results. We now only show the test button when
tool is saved.


2. We were not properly scoping rag document fragements, this
meant that personas and ai tools could get results from other
unrelated tools, just to be filtered out later


3. index.search showed options as "optional" but implementation
required the second option

4. When testing tools searching through document fragments was
not working at all cause we did not properly load the tool
2024-10-25 16:01:25 +11:00
Sam 4923837165
FIX: Llm selector / forced tools / search tool (#862)
* FIX: Llm selector / forced tools / search tool


This fixes a few issues:

1. When search was not finding any semantic results we would break the tool
2. Gemin / Anthropic models did not implement forced tools previously despite it being an API option
3. Mechanics around displaying llm selector were not right. If you disabled LLM selector server side persona PM did not work correctly.
4. Disabling native tools for anthropic model moved out of a site setting. This deliberately does not migrate cause this feature is really rare to need now, people who had it set probably did not need it.
5. Updates anthropic model names to latest release

* linting

* fix a couple of tests I missed

* clean up conditional
2024-10-25 06:24:53 +11:00
Rafael dos Santos Silva 96f5f8cbd0
FIX: Basic cleanup of AI Caption to remove line breaks and pipes (#857) 2024-10-23 18:38:29 -03:00
Keegan George 9af0c2e719
UX: Improve seeded LLM edit page (#856) 2024-10-23 13:58:27 -07:00
Sam f1283e156d
FEATURE: allow scoping of google tool queries (#852)
This allows to simply scope search results to specific domains and prepend arbitrary snippets to searches made
2024-10-23 16:55:10 +11:00
Sam 059d3b6fd2
FEATURE: better logging for automation reports (#853)
A new feature_context json column was added to ai_api_audit_logs

This allows us to store rich json like context on any LLM request
made.

This new field now stores automation id and name.

Additionally allows llm_triage to specify maximum number of tokens

This means that you can limit the cost of llm triage by scanning only
first N tokens of a post.
2024-10-23 16:49:56 +11:00
Sam a1f859a415
FEATURE: improve visibility of AI usage in LLM page (#845)
This changeset: 

1. Corrects some issues with "force_default_llm" not applying
2. Expands the LLM list page to show LLM usage
3. Clarifies better what "enabling a bot" on an llm means (you get it in the selector)
2024-10-22 11:16:02 +11:00
Roman Rizzi 6d504ab80d
FEATURE: Make hot topic gists opt-in. (#846)
This change restricts gists to members of specific groups. It also fixes a bug where other lists could display the gist if available.
2024-10-21 15:15:25 -03:00
Roman Rizzi e768fa877e
FIX: Don't regenerate up to date gists (#843) 2024-10-18 18:49:01 -03:00
Roman Rizzi 27b5542357
FEATURE: Generate topic gists for the hot topics list. (#837)
* Display gists in the hot topics list

* Adjust hot topics gist strategy and add a job to generate gists

* Replace setting with a configurable batch size

* Avoid loading summaries for other topic lists

* Tweak gist prompt to focus on latest posts in the context of the OP

* Remove serializer hack and rely on core change from discourse/discourse#29291

* Update lib/summarization/strategies/hot_topic_gists.rb

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>

---------

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2024-10-18 18:01:39 -03:00
Loïc Guitaut 7919173cba DEV: Use `ChatSDK.create` instead of service in specs
This patch will allow upcoming changes to services
(https://github.com/discourse/discourse/pull/29129) without breaking the
`discourse-ai` specs.
2024-10-16 18:18:32 +02:00
Rafael dos Santos Silva 792703c942
FEATURE: Discord Bot integration (#831)
This adds support for the a Discord bot that can search in a Discourse instance when invoked via slash commands in Discord Guild channel.
2024-10-16 12:41:18 -03:00
Sam bdf3b6268b
FEATURE: smarter persona tethering (#832)
Splits persona permissions so you can allow a persona on:

- chat dms
- personal messages
- topic mentions
- chat channels

(any combination is allowed)

Previously we did not have this flexibility.

Additionally, adds the ability to "tether" a language model to a persona so it will always be used by the persona. This allows people to use a cheaper language model for one group of people and more expensive one for other people
2024-10-16 07:20:31 +11:00
Roman Rizzi c7acb4a6a0
REFACTOR: Support of different summarization targets/prompts. (#835)
* DEV: Add summary types

* Refactor for different summary types

* Use enum for summary types

* Update lib/summarization/strategies/topic_summary.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/topic_gist.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Update lib/summarization/strategies/chat_messages.rb

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>

* Fix chat_messages single prompt

* Small tweak to the chat summarization prompt

---------

Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2024-10-15 13:53:26 -03:00
Hoa Nguyen 94010a5f78
FEATURE: Tools for models from Ollama provider (#819)
Adds support for Ollama function calling
2024-10-11 07:25:53 +11:00
Sam 6c4c96e83c
FEATURE: allow persona to only force tool calls on limited replies (#827)
This introduces another configuration that allows operators to
limit the amount of interactions with forced tool usage.

Forced tools are very handy in initial llm interactions, but as
conversation progresses they can hinder by slowing down stuff
and adding confusion.
2024-10-11 07:23:42 +11:00
Mark VanLandingham 52d90cf1bc
DEV: Add apply_modifier for SemanticTopicQuery topics list (#830) 2024-10-10 12:13:16 -05:00