Claude 1 costs the same and is less good than Claude 2. Make use of Claude
2 in all spots ...
This also fixes streaming so it uses the far more efficient streaming protocol.
* DEV: Better strategies for summarization
The strategy responsibility needs to be "Given a collection of texts, I know how to summarize them most efficiently, using the minimum amount of requests and maximizing token usage".
There are different token limits for each model, so it all boils down to two different strategies:
Fold all these texts into a single one, doing the summarization in chunks, and then build a summary from those.
Build it by combining texts in a single prompt, and truncate it according to your token limits.
While the latter is less than ideal, we need it for "bart-large-cnn-samsum" and "flan-t5-base-samsum", both with low limits. The rest will rely on folding.
* Expose summarized chunks to users
The new site settings:
ai_openai_gpt35_url : distribution for GPT 16k
ai_openai_gpt4_url: distribution for GPT 4
ai_openai_embeddings_url: distribution for ada2
If untouched we will simply use OpenAI endpoints.
Azure requires 1 URL per model, OpenAI allows a single URL to serve multiple models. Hence the new settings.
Given latest GPT 3.5 16k which is both better steered and supports functions
we can now support rich bot integration.
Clunky system message based steering is removed and instead we use the
function framework provided by Open AI
For the time being smart commands only work consistently on GPT 4.
Avoid using any smart commands on the earlier models.
Additionally adds better error handling to Claude which sometimes streams
partial json and slightly tunes the search command.
This change-set connects GPT based chat with the forum it runs on. Allowing it to perform search, lookup tags and categories and summarize topics.
The integration is currently restricted to public portions of the forum.
Changes made:
- Do not run ai reply job for small actions
- Improved composable system prompt
- Trivial summarizer for topics
- Image generator
- Google command for searching via Google
- Corrected trimming of posts raw (was replacing with numbers)
- Bypass of problem specs
The feature works best with GPT-4
---------
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
We'll create one bot user for each available model. When listed in the `ai_bot_enabled_chat_bots` setting, they will reply.
This PR lets us use Claude-v1 in stream mode.
This module lets you chat with our GPT bot inside a PM. The bot only replies to members of the groups listed on the ai_bot_allowed_groups setting and only if you invite it to participate in the PM.
Also adds some tests around completions and supports additional params
such as top_p, temperature and max_tokens
This also migrates off Faraday to using Net::HTTP directly