OBS Module

February 18, 2026


Overview

The OBS module connects to OBS Studio via WebSocket and controls two types of sources in real time:

  1. Avatar source: swaps the image/video file to reflect Bea's current mood and speaking state.
  2. Text source: animates text with a typewriter effect for the speech bubble overlay.

Module layout:

src/modules/obs/
└── obs_websocket.py    OBSController implementing OBSInterface

Connection

obs = OBSController(host="localhost", port=4455, password="...", source_name="BeaPNG")
obs.connect()

If OBS is not running, the connection fails gracefully with a warning. The rest of the engine continues normally without OBS output.
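
One way to get this kind of graceful failure is to wrap the connection attempt and downgrade errors to a warning. The sketch below is not the module's actual code: `try_connect`, `connect_fn`, and `refuse` are hypothetical names, and it assumes the underlying client raises an `OSError` subclass (for example `ConnectionRefusedError`) when OBS is unreachable.

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("obs")


def try_connect(connect_fn: Callable[[], T]) -> Optional[T]:
    """Call connect_fn; return its result, or None if OBS is unreachable."""
    try:
        return connect_fn()
    except OSError as exc:  # covers ConnectionRefusedError, TimeoutError, ...
        logger.warning("OBS not available, continuing without it: %s", exc)
        return None


def refuse() -> object:
    # Stand-in for a real client: simulates OBS not running.
    raise ConnectionRefusedError("connection refused on localhost:4455")


result = try_connect(refuse)  # logs a warning and yields None
```

Callers can then check for `None` and skip OBS output while the rest of the engine keeps running.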


Source Types

Set obs_source_type in config:

| Value | OBS Source Type | Suitable for |
|---|---|---|
| "image" | Image Source | Static PNGs |
| "media" | Media Source (ffmpeg) | MP4, GIF, WebM |

Image switch:

obs.set_image("data/pngs/angry/talking.png")

Media switch:

obs.set_media("data/pngs/angry/talking.mp4")
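
To illustrate how a caller might choose between the two methods, here is a minimal sketch that routes on the `obs_source_type` value. `swap_avatar` and `FakeOBS` are hypothetical helpers written for this example; only `set_image` and `set_media` come from the module.

```python
from typing import List, Tuple


class FakeOBS:
    """Stand-in that records calls; the real OBSController talks to OBS."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str]] = []

    def set_image(self, path: str) -> None:
        self.calls.append(("set_image", path))

    def set_media(self, path: str) -> None:
        self.calls.append(("set_media", path))


def swap_avatar(obs, source_type: str, path: str) -> None:
    """Route to set_image or set_media based on obs_source_type."""
    if source_type == "image":
        obs.set_image(path)
    elif source_type == "media":
        obs.set_media(path)
    else:
        raise ValueError(f"unsupported obs_source_type: {source_type!r}")


fake = FakeOBS()
swap_avatar(fake, "media", "data/pngs/angry/talking.mp4")
swap_avatar(fake, "image", "data/pngs/angry/talking.png")
```

Rejecting unknown values early is a design choice of this sketch; the module itself may handle them differently.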

Typing Animation

type_text() writes a message character-by-character into the OBS text source, paginating if the message exceeds the visible area.

Parameters:

| Parameter | Description |
|---|---|
| text | The full message to type |
| source_name | OBS text source name |
| line_width | Characters per line before wrapping |
| max_lines | Max visible lines |
| base_font_size | Starting font size |
| min_font_size | Minimum font size (shrinks for long text) |
| font_step | Font size decrement step |
| typing_delay | Seconds between each character |
| min_page_duration | Minimum seconds a page stays visible |
| speaking_rate | Characters per second used to estimate reading time per page (default: 12.0). Controls the post-typing wait so that longer pages stay visible longer. |

type_text() returns the final font size used. The brain stores this value so it can clear the source with the same font size afterward.
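
The pagination and per-page wait can be sketched roughly as follows. This is an illustration under assumptions, not the module's implementation: `paginate` and `page_hold_seconds` are hypothetical helpers, word wrapping is delegated to the standard `textwrap` module, and the hold-time formula (reading time floored at `min_page_duration`) is a guess based on the parameter descriptions.

```python
import textwrap
from typing import List


def paginate(text: str, line_width: int, max_lines: int) -> List[str]:
    """Wrap text to line_width, then group lines into pages of max_lines."""
    lines = textwrap.wrap(text, width=line_width)
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]


def page_hold_seconds(page: str, min_page_duration: float,
                      speaking_rate: float = 12.0) -> float:
    """Assumed formula: estimated reading time, never below the minimum."""
    return max(min_page_duration, len(page) / speaking_rate)


pages = paginate("one two three four five six", line_width=9, max_lines=2)
# pages == ["one two\nthree", "four five\nsix"]
```

With `speaking_rate=12.0`, a 24-character page would be held for 2 seconds under this assumed formula, while a very short page is held for `min_page_duration`.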

The typing task and the TTS playback task run concurrently: both are launched as asyncio tasks and are cancelled together if an interrupt arrives.
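
The concurrent launch and joint cancellation can be sketched with plain asyncio. Everything named here (`worker`, `speak`, `demo`, the `interrupt` event) is hypothetical and stands in for the real typing and TTS coroutines; the sketch only shows one way to cancel both tasks when an interrupt fires.

```python
import asyncio
from typing import List


async def worker(name: str, log: List[str]) -> None:
    try:
        await asyncio.sleep(10)  # stands in for typing or TTS playback
        log.append(f"{name} finished")
    except asyncio.CancelledError:
        log.append(f"{name} cancelled")
        raise


async def speak(interrupt: asyncio.Event, log: List[str]) -> None:
    typing = asyncio.create_task(worker("typing", log))
    tts = asyncio.create_task(worker("tts", log))
    stop = asyncio.create_task(interrupt.wait())
    both = asyncio.gather(typing, tts, return_exceptions=True)
    # Wake up when both workers are done or the interrupt fires.
    await asyncio.wait({both, stop}, return_when=asyncio.FIRST_COMPLETED)
    if not both.done():
        typing.cancel()
        tts.cancel()
    await both      # wait until both tasks have actually stopped
    stop.cancel()   # no-op if the interrupt already fired


async def demo() -> List[str]:
    log: List[str] = []
    interrupt = asyncio.Event()
    job = asyncio.create_task(speak(interrupt, log))
    await asyncio.sleep(0.01)  # let both workers start
    interrupt.set()            # simulate an interrupt
    await job
    return log


log = asyncio.run(demo())
```

Awaiting the gathered tasks after cancelling them ensures neither is still touching OBS or the audio device when the caller moves on.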


Font Management

OBS itself stores the font settings of each text source. The first time the controller types to a given source, it reads those settings and caches them; after that it applies only font-size changes per page. This avoids resetting the user-configured font family and style.
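
A read-once cache of this kind might look like the sketch below. `FontCache`, `fetch`, and `fake_fetch` are hypothetical, and the dictionary keys (`face`, `style`, `size`) are illustrative; the actual settings structure depends on the OBS text source type.

```python
from typing import Callable, Dict, List


class FontCache:
    """Read each source's font settings once, then vary only the size."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch  # hypothetical: asks OBS for the font dict
        self._cache: Dict[str, dict] = {}

    def font_for(self, source_name: str, size: int) -> dict:
        if source_name not in self._cache:
            self._cache[source_name] = dict(self._fetch(source_name))
        # Build a new dict so the cached family/style are never mutated.
        return {**self._cache[source_name], "size": size}


fetch_calls: List[str] = []


def fake_fetch(name: str) -> dict:
    # Stand-in for a round trip to OBS.
    fetch_calls.append(name)
    return {"face": "Comic Neue", "style": "Bold", "size": 75}


cache = FontCache(fake_fetch)
first = cache.font_for("AIText", 60)
second = cache.font_for("AIText", 45)  # served from the cache
```

Only the first call reaches `fake_fetch`; later pages reuse the cached family and style with a new size.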


Avatar Swap Logic

When a response is generated with a given mood:

  1. Brain looks up (idle_path, talking_path) from png_map for that mood.
  2. Before speaking: set_media(talking_path) (or set_image).
  3. After speaking: set_media(idle_path).

If the mood is not found in png_map, the lookup falls back to "normal".
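
The lookup with its fallback can be sketched in a few lines. `paths_for_mood` is a hypothetical helper, and the `idle.png` and `normal` paths are made up for illustration; the sketch assumes `png_map` maps each mood to an `(idle_path, talking_path)` tuple and always contains a "normal" entry.

```python
from typing import Dict, Tuple

PngMap = Dict[str, Tuple[str, str]]  # mood -> (idle_path, talking_path)


def paths_for_mood(png_map: PngMap, mood: str) -> Tuple[str, str]:
    """Return (idle_path, talking_path), falling back to "normal"."""
    return png_map.get(mood, png_map["normal"])


png_map: PngMap = {
    "normal": ("data/pngs/normal/idle.png", "data/pngs/normal/talking.png"),
    "angry": ("data/pngs/angry/idle.png", "data/pngs/angry/talking.png"),
}

# "sleepy" is not in the map, so the "normal" pair is used.
idle_path, talking_path = paths_for_mood(png_map, "sleepy")
```

The brain would then pass `talking_path` to set_media (or set_image) before speaking and `idle_path` afterward.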


clear_text()

A convenience method on OBSController that sets the text source to an empty string, preserving the given font size:

obs.clear_text(source_name="AIText", font_size=75)

It is equivalent to set_text("", source_name, font_size=font_size). The brain calls set_text("", ...) directly in most places; clear_text() is provided as a helper and can be used interchangeably.


Hot Reload

reload_config() updates host/port/password. If any of those changed and a client was already connected, it disconnects and reconnects automatically.
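
The reconnect-on-change behaviour can be sketched as below. `ConnSettings` and `ReloadableClient` are hypothetical stand-ins, and the boolean return value is an addition for this example; the real `reload_config()` signature may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnSettings:
    host: str
    port: int
    password: str


class ReloadableClient:
    def __init__(self, settings: ConnSettings) -> None:
        self.settings = settings
        self.connected = False
        self.connect_count = 0

    def connect(self) -> None:
        self.connected = True
        self.connect_count += 1

    def disconnect(self) -> None:
        self.connected = False

    def reload_config(self, new: ConnSettings) -> bool:
        """Apply new settings; reconnect only if they changed while connected."""
        changed = new != self.settings
        self.settings = new
        if changed and self.connected:
            self.disconnect()
            self.connect()
            return True
        return False


demo_client = ReloadableClient(ConnSettings("localhost", 4455, "secret"))
demo_client.connect()
same = demo_client.reload_config(ConnSettings("localhost", 4455, "secret"))
moved = demo_client.reload_config(ConnSettings("localhost", 4456, "secret"))
```

Comparing the whole settings object keeps the check in one place, so an unchanged reload never drops a working connection.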