9 Commits

Author SHA1 Message Date
Rafael dos Santos Silva 5db7bf6e68
Mixtral (#376)
Add both Mistral and Mixtral support. Also includes vLLM-openAI inference support.

Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
2023-12-26 14:49:55 -03:00
Sam af2e692761
FIX: under certain conditions we would get duplicate data from llm (#373)
Previously, endpoint/base would `+=` decoded_chunk to leftover.

This could lead to cases where the leftover buffer contained duplicates of
previously processed data.

Fix ensures we properly skip previously decoded data.
2023-12-20 14:28:05 -03:00
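
The failure mode described in #373 (af2e692761) can be sketched in a few lines of Ruby. This is an illustration only, not the code from endpoint/base: the helper names `split_complete`, `read_buggy`, and `read_fixed` are hypothetical, and the sketch assumes a stream of newline-delimited messages in which a chunk may end mid-message.

```ruby
# Hypothetical helper: split a buffer into complete lines and an unfinished tail.
def split_complete(buffer)
  lines = buffer.split("\n", -1)
  tail = lines.pop || "" # text after the last newline (may be empty)
  [lines, tail]
end

# Buggy variant: keeps appending chunks to the buffer and never drops
# the part that was already emitted, so earlier lines are emitted again.
def read_buggy(chunks)
  leftover = ""
  out = []
  chunks.each do |chunk|
    leftover += chunk
    complete, _tail = split_complete(leftover)
    out.concat(complete)
  end
  out
end

# Fixed variant: only the unfinished tail is carried over to the next chunk.
def read_fixed(chunks)
  leftover = ""
  out = []
  chunks.each do |chunk|
    complete, leftover = split_complete(leftover + chunk)
    out.concat(complete)
  end
  out
end

chunks = ["a\nb", "c\nd\n"]
p read_buggy(chunks) # "a" appears twice
p read_fixed(chunks) # each line appears once
```

The only difference between the two variants is what survives between iterations: the whole buffer in the buggy one, the unprocessed tail in the fixed one.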
Roman Rizzi e0bf6adb5b
DEV: Tool support for the LLM service. (#366)
This PR adds tool support to available LLMs. We'll buffer tool invocations and return them instead of making users of this service parse the response.

It also adds support for conversation context in the generic prompt. The context includes bot messages, user messages, and tool invocations, which we'll trim to make sure they don't exceed the prompt limit, then translate to the correct dialect.

Finally, it adds some buffering when reading chunks to handle cases when streaming is extremely slow.
2023-12-18 18:06:01 -03:00
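
One way to picture the context trimming mentioned in #366 (e0bf6adb5b) is the Ruby sketch below. It rests on assumptions the commit message does not confirm: that the oldest messages are dropped first, that messages are hashes with `:type` and `:content` keys, and that tokens can be approximated by counting words. `count_tokens` and `trim_context` are hypothetical names.

```ruby
# Hypothetical token counter: approximates tokens as whitespace-separated words.
def count_tokens(text)
  text.split.size
end

# Keep the most recent messages whose combined size fits within max_tokens.
# Walks the conversation from newest to oldest and stops at the first
# message that would exceed the budget.
def trim_context(messages, max_tokens)
  kept = []
  used = 0
  messages.reverse_each do |message|
    cost = count_tokens(message[:content])
    break if used + cost > max_tokens
    kept.unshift(message) # preserve chronological order
    used += cost
  end
  kept
end

context = [
  { type: "user", content: "what is the weather in Sydney" },
  { type: "tool", content: "sunny 25C" },
  { type: "assistant", content: "It is sunny and 25C in Sydney" },
]

p trim_context(context, 10).map { |m| m[:type] }
```

With a budget of 10 word-tokens, the oldest user message no longer fits and is dropped, while the tool result and the assistant reply are kept in their original order.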
Rafael dos Santos Silva 83744bf192
FEATURE: Support for Gemini in AiHelper / Search / Summarization (#358)
2023-12-15 14:32:01 -03:00
Rafael dos Santos Silva d8267d8da0
FIX: Many fixes for huggingface and llama2 inference (#335)
2023-12-06 11:22:42 -03:00
Sam 6ddc17fd61
DEV: port directory structure to Zeitwerk (#319)
Prior to this change we relied on explicit loading for files in Discourse AI.

This had a few downsides:

- Busywork whenever you add a file (an extra `require_relative`)
- We were not keeping to conventions internally: some places used OpenAI, others OpenAi
- The autoloader did not work, which led to lots of broken full-application reloads when developing.

This moves all of DiscourseAI into a Zeitwerk compatible structure.

It also leaves some minimal amount of manual loading (automation - which is loading into an existing namespace that may or may not be there)

To avoid needing /lib/discourse_ai/..., we mount a namespace, so we are able to keep /lib pointed at ::DiscourseAi

Various files were renamed to get around Zeitwerk rules and minimize usage of custom inflections

Though we can get custom inflections to work, it is not worth it: it would require a Discourse core patch, which means we would create a hard dependency.
2023-11-29 15:17:46 +11:00
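
The naming point in #319 (6ddc17fd61) is easier to see with a small Ruby sketch of how a file name is mapped to a constant name. It does not call Zeitwerk itself; `default_camelize` imitates the simple snake_case to CamelCase conversion that Zeitwerk's basic inflector is documented to perform, and `camelize_with_overrides` is a hypothetical stand-in for a custom inflection. Which inflector Discourse actually configures is an assumption here.

```ruby
# Imitation of a simple snake_case -> CamelCase conversion:
# split on underscores and capitalize each part.
def default_camelize(basename)
  basename.split("_").map(&:capitalize).join
end

# Hypothetical custom-inflection variant: explicit overrides win,
# everything else falls back to the default rule.
def camelize_with_overrides(basename, overrides)
  overrides.fetch(basename) { default_camelize(basename) }
end

p default_camelize("open_ai")      # file open_ai.rb maps to OpenAi
p default_camelize("discourse_ai") # directory discourse_ai maps to DiscourseAi
p camelize_with_overrides("open_ai", { "open_ai" => "OpenAI" })
```

Under the default rule a constant spelled OpenAI cannot be reached from a file named open_ai.rb, so the choice is between renaming to match the convention, as this commit does, or maintaining an override table.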
Roman Rizzi f26adf2cf6
FIX: Use XML tags in generate_titles prompt. (#322)
We must ensure we can isolate titles, and the models sometimes ignore the example we give them.

Additionally, anons can generate HyDE posts, so we need to check if user is nil when attempting to log requests.
2023-11-28 12:52:22 -03:00
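
Both points in #322 (f26adf2cf6) can be illustrated with a short Ruby sketch. The tag name `item` and the helpers `extract_titles` and `log_user_id` are hypothetical; the commit message does not say which tag the prompt uses or how the request log is written.

```ruby
# Pull the contents of every <tag>...</tag> pair out of a model response,
# ignoring any chatter the model adds around the tags.
def extract_titles(response, tag: "item")
  escaped = Regexp.escape(tag)
  response.scan(%r{<#{escaped}>(.*?)</#{escaped}>}m).flatten.map(&:strip)
end

# Anonymous requests have no user, so use safe navigation
# instead of calling id on nil.
def log_user_id(user)
  user&.id
end

response = "Sure! Here are some titles:\n<item>First title</item>\n<item> Second title </item>"
p extract_titles(response)
p log_user_id(nil)
```

Wrapping each title in a tag lets the caller isolate titles even when the model adds an introduction or ignores the example format in the prompt.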
Roman Rizzi 2e7c5f047d
DEV: Don't attempt to update log if completion request fails. (#321)
We already log the request failure when we raise the exception.
2023-11-28 11:15:12 -03:00
Roman Rizzi 3064d4c288
REFACTOR: Summarization and HyDE now use an LLM abstraction. (#297)
* DEV: One LLM abstraction to rule them all

* REFACTOR: HyDE search uses new LLM abstraction

* REFACTOR: Summarization uses the LLM abstraction

* Updated documentation and made small fixes. Remove Bedrock claude-2 restriction
2023-11-23 12:58:54 -03:00