usage.md
May 11, 2025
Usage
This document covers how to use the bot in a room.
The Features page also includes details about how each feature works and how it can be configured.
Text Generation
This is related to the Text Generation feature.
If there's a text-generation handler agent configured, the bot may respond to messages sent in the room.
Some models also support vision, so you may be able to mix text and images in the same conversation.
See screenshots of:
- the default Text Generation flow in 1:1 rooms
- the Text Generation flow in multi-user rooms (where the Prefix Requirement setting is auto-configured to "required")
- the on-demand involvement feature
Whether the bot responds depends on:
- (access) whether you're a whitelisted bot user
- (configuration) whether there's a configured `text-generation` handler agent (or a `catch-all` handler agent). See Mixing & matching models
- (agent capabilities) whether the configured `text-generation` (or `catch-all`) handler agent actually supports text generation. The provider may lack support for this feature, or it may be disabled in the agents configuration
- (the Prefix Requirement setting) whether a prefix (e.g. `!bai`) or user mention (e.g. `@baibot`) is required for messages sent to the room. For multi-user rooms, this setting defaults to "required". See Features / Text Generation / On-demand involvement for details.
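The checks above can be read as a chain of conditions. The following is a minimal Python sketch of that reading, not the bot's actual implementation: `RoomState`, its fields, and `should_respond` are hypothetical names introduced for illustration, and the prefix and mention strings are simply the examples from the text.

```python
from dataclasses import dataclass


@dataclass
class RoomState:
    """Hypothetical snapshot of the settings that matter for one room."""
    sender_is_whitelisted: bool   # access
    handler_configured: bool      # a text-generation or catch-all handler exists
    handler_supports_text: bool   # agent capabilities
    prefix_required: bool         # Prefix Requirement setting


def should_respond(state: RoomState, message: str,
                   prefix: str = "!bai", mention: str = "@baibot") -> bool:
    """Return True if a room message would be answered, per the checks above."""
    if not state.sender_is_whitelisted:
        return False
    if not (state.handler_configured and state.handler_supports_text):
        return False
    if state.prefix_required:
        # Illustrative matching only: a leading prefix or a mention anywhere.
        return message.startswith(prefix) or mention in message
    return True
```

In a multi-user room (prefix required), `should_respond` would accept `!bai hello` or a message mentioning `@baibot`, and ignore a plain `hello`.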
Room messages start a threaded conversation where you can continue back-and-forth communication with the bot. Using on-demand involvement, you can also mention the bot to prompt it to join any conversation thread or reply chain.
Unless you've enabled the Context Management feature, all messages in the conversation are sent to the agent's API each time. If the context management feature is enabled, older messages may be dropped.
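To illustrate the difference, here is a rough Python sketch. It assumes a character budget as a stand-in for whatever limit the real feature applies; `build_context` and `max_chars` are hypothetical names, and the actual bot may measure and trim context differently.

```python
from typing import List, Optional


def build_context(messages: List[str], max_chars: Optional[int] = None) -> List[str]:
    """Return the messages that would be sent to the agent's API.

    With max_chars=None (context management disabled in this sketch),
    the whole conversation is sent. Otherwise the oldest messages are
    dropped until the remainder fits the budget.
    """
    if max_chars is None:
        return list(messages)

    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent messages survive.
    for text in reversed(messages):
        if used + len(text) > max_chars:
            break
        kept.append(text)
        used += len(text)
    kept.reverse()
    return kept
```

The practical consequence is that long threads cost more per request when nothing is trimmed, while trimming trades cost for the bot possibly forgetting early parts of the conversation.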
Text-to-Speech
This is related to the Text-to-Speech feature.
If there's a text-to-speech handler agent configured, the bot may convert text messages sent to the room to audio (voice).
See:
- a screenshot of the bot's Text-to-Speech-only mode
- a screenshot of the bot's Seamless voice interaction mode
By default, the bot:
- offers text-to-speech for its own messages when they respond to a voice message from you, as part of the Seamless voice interaction feature. This can be adjusted via the Text-to-Speech / Bot Messages Flow Type setting.
- does not turn your own text messages into audio (voice). If you'd like the bot to operate in such a mode, use the Text-to-Speech / User Messages Flow Type setting (see Text-to-Speech-only mode).
Speech-to-Text
This is related to the Speech-to-Text feature.
If there's a speech-to-text handler agent configured, the bot may transcribe voice messages sent to the room to text.
See a screenshot of the default flow for Speech-to-Text and Text-Generation.
The speech-to-text feature triggers automatically by default, but can be adjusted via the Speech-to-Text / Flow Type setting.
If all your messages are in the same language, you can improve accuracy and latency by configuring the language (see Speech-to-Text / Language).
Image Generation
This feature is not configurable at the moment. The configuration (size, quality, style) specified at the agent level will be used.
Capabilities depend on the provider and model used.
Creating images
Simply send a command like `!bai image create A beautiful sunset over the ocean` and the bot will start a threaded conversation and post an image based on your prompt.
See a screenshot of the Image Creation feature.
You can then respond in the same message thread with:
- more messages, to add more criteria to your prompt.
- a message saying `again`, to generate one more image with the current prompt.
Editing images
Simply send a command like `!bai image edit Turn the following image into an anime-style drawing` and the bot will start a threaded conversation asking for more details.
See a screenshot of the Image Editing feature (manipulating a single image) and a screenshot of the Image Editing feature (manipulating multiple images).
You can then respond in the same message thread with:
- more messages, to add more criteria to your prompt.
- one or more images, to provide the images that the bot will operate on.
- a message saying `go`, to start the image generation process.
- a message saying `again`, to prompt the bot to generate one more image edit with the current prompt.
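The thread-based editing flow above behaves like a small state machine: text and images accumulate until `go` triggers generation, and `again` repeats it. The Python sketch below models that under stated assumptions. `EditSession` and its methods are hypothetical, the commands are assumed to match exactly after trimming whitespace, and the prompt is assumed to be built by joining the text messages with spaces, none of which the text above specifies.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class EditSession:
    """Hypothetical model of one image-editing thread."""
    prompt_parts: List[str] = field(default_factory=list)
    images: List[str] = field(default_factory=list)  # e.g. attachment identifiers

    def on_text(self, text: str) -> Optional[Tuple[str, List[str]]]:
        """Handle a text reply; return (prompt, images) when an edit should run."""
        command = text.strip()
        if command in ("go", "again"):
            # Assumption: prompt parts are joined with spaces.
            return " ".join(self.prompt_parts), list(self.images)
        # Any other text adds more criteria to the prompt.
        self.prompt_parts.append(command)
        return None

    def on_image(self, image_id: str) -> None:
        """Handle an image reply by adding it to the set to operate on."""
        self.images.append(image_id)
```

A session would start with the prompt from the initial command, collect one or more images via `on_image`, and only produce work for the image model once `on_text("go")` is called.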
Creating stickers
A variation of creating images is creating "sticker images".
See a screenshot of the Sticker Creation feature.
To create a sticker, send a command like `!bai sticker A huge ramen bowl with lots of chashu and a mountain of beansprouts on top`.
The difference from creating images is that the bot will:
- generate a smaller-resolution image (currently hardcoded to `256x256`), which is smaller and quicker to produce but still good enough for a sticker
- potentially switch to a different (cheaper or otherwise more suitable) model, if available
- post the image directly to the room (as a reply to your message), without starting a threaded conversation
Some models (like OpenAI's Dall-E-3) can only generate larger images (1024x1024, etc., for a higher charge), so switching to a smaller/cheaper model (like Dall-E-2) is a way to generate a sticker cheaply.
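The three commands in this section share a common shape: the `!bai` prefix, a command, and a free-form prompt. As a rough illustration, the Python sketch below parses them. `parse_image_command`, `COMMANDS`, and the returned dictionary layout are hypothetical and are not the bot's real parser; the 256x256 sticker size is the value stated above.

```python
from typing import Dict, Optional

STICKER_SIZE = "256x256"  # value stated in the text for stickers

# Hypothetical mapping from command words to an action name.
COMMANDS = {
    "image create": "create",
    "image edit": "edit",
    "sticker": "sticker",
}


def parse_image_command(message: str, prefix: str = "!bai") -> Optional[Dict[str, object]]:
    """Parse an image-related command, or return None if it is not one."""
    if not message.startswith(prefix + " "):
        return None
    rest = message[len(prefix):].strip()
    for words, action in COMMANDS.items():
        if rest == words or rest.startswith(words + " "):
            prompt = rest[len(words):].strip()
            return {
                "action": action,
                "prompt": prompt,
                # Stickers are posted as a direct reply; the others start a thread.
                "threaded": action != "sticker",
                # None stands for "use the size configured at the agent level".
                "size": STICKER_SIZE if action == "sticker" else None,
            }
    return None
```

For example, `parse_image_command("!bai sticker A huge ramen bowl")` yields a non-threaded `sticker` action with the fixed size, while the `image create` and `image edit` commands yield threaded actions that defer to the agent-level configuration.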