API Reference
April 22, 2026
Complete documentation for all MCP tools exposed by the sanzaru server.
Video Generation Tools
create_video
Generate videos using OpenAI's Sora API.
Parameters:
- prompt (string, required): Text description of the video to generate
- model (string, optional): Model to use - "sora-2" (default) or "sora-2-pro"
- seconds (string, optional): Duration as string - "4", "8", or "12" (NOTE: Must be string, not integer)
- size (string, optional): Resolution - "720x1280", "1280x720", "1024x1792", or "1792x1024"
- input_reference_filename (string, optional): Filename (not path) of reference image from IMAGE_PATH
Returns: Video object with id, status, progress, model, seconds, size
Example:
video = create_video(
    prompt="A serene mountain landscape at sunrise",
    model="sora-2",
    seconds="8",
    size="1280x720"
)
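Because seconds must be a string and size is limited to four values, it can help to check arguments before calling the tool. The following is a minimal sketch, not part of the sanzaru server: the helper name check_create_video_args is hypothetical, and the allowed values are copied from the parameter list above.

```python
# Allowed values copied from the create_video parameter list.
ALLOWED_MODELS = {"sora-2", "sora-2-pro"}
ALLOWED_SECONDS = {"4", "8", "12"}
ALLOWED_SIZES = {"720x1280", "1280x720", "1024x1792", "1792x1024"}


def check_create_video_args(model, seconds, size):
    """Hypothetical helper: return a list of problems (empty means the arguments look valid)."""
    problems = []
    if model not in ALLOWED_MODELS:
        problems.append(f"unsupported model: {model!r}")
    if not isinstance(seconds, str):
        # The most common mistake: passing 8 instead of "8".
        problems.append(f"seconds must be a string such as '8', got {type(seconds).__name__}")
    elif seconds not in ALLOWED_SECONDS:
        problems.append(f"unsupported seconds value: {seconds!r}")
    if size not in ALLOWED_SIZES:
        problems.append(f"unsupported size: {size!r}")
    return problems


print(check_create_video_args("sora-2", 8, "1280x720"))
```

Returning a list rather than raising on the first problem lets a caller report every issue in one message.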
get_video_status
Check the status of a video generation job.
Parameters:
- video_id (string, required): ID returned from create_video
Returns: Video object with updated status and progress
Status values:
- "queued": Job is queued
- "in_progress": Currently generating (check progress field for 0-100%)
- "completed": Ready to download
- "failed": Generation failed
Example:
status = get_video_status(video.id)
# Poll until status.status == "completed"
download_video
Download a completed video to VIDEO_PATH.
Parameters:
- video_id (string, required): ID of completed video
- filename (string, optional): Custom filename (defaults to {video_id}.{extension})
- variant (string, optional): What to download - "video" (default), "thumbnail", or "spritesheet"
Variant formats:
- "video" → MP4 file
- "thumbnail" → WEBP image
- "spritesheet" → JPG image
Returns: DownloadResult with filename, variant
Example:
result = download_video(video.id, filename="my_video.mp4")
# File saved to: {VIDEO_PATH}/my_video.mp4
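The default filename depends on the variant. The sketch below shows one way the {video_id}.{extension} default could be derived from the variant formats listed above. The function name default_download_filename and the lowercase extensions ("mp4", "webp", "jpg") are assumptions for illustration; the server's actual naming logic is not shown in this reference.

```python
# Assumed extensions, inferred from the "Variant formats" list.
VARIANT_EXTENSIONS = {"video": "mp4", "thumbnail": "webp", "spritesheet": "jpg"}


def default_download_filename(video_id, variant="video"):
    """Hypothetical helper: build the {video_id}.{extension} default for a variant."""
    try:
        extension = VARIANT_EXTENSIONS[variant]
    except KeyError:
        raise ValueError(f"unknown variant: {variant!r}") from None
    return f"{video_id}.{extension}"


print(default_download_filename("video_123", variant="thumbnail"))
```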
list_videos
List all video generation jobs with pagination.
Parameters:
- limit (integer, optional): Max results to return (default: 20, max: 100)
- after (string, optional): Cursor for pagination (use last from previous response)
- order (string, optional): Sort order - "desc" (default, newest first) or "asc"
Returns: Object with data (array of video summaries), has_more (boolean), last (cursor)
Example:
page1 = list_videos(limit=20)
if page1.has_more:
    page2 = list_videos(limit=20, after=page1.last)
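To walk every page rather than just the first two, the cursor can be followed in a loop. This is a minimal sketch: iter_all_videos is a hypothetical helper, the response is assumed to expose data, has_more, and last as attributes (as in the example above), and fake_list_videos is a stand-in so the snippet runs without the server.

```python
from types import SimpleNamespace


def iter_all_videos(list_videos, page_size=20):
    """Yield every item from list_videos, following the `last` cursor while has_more is true."""
    after = None
    while True:
        kwargs = {"limit": page_size}
        if after is not None:
            kwargs["after"] = after
        page = list_videos(**kwargs)
        yield from page.data
        if not page.has_more:
            break
        after = page.last


def fake_list_videos(limit=20, after=None):
    """Stand-in for the real tool: pages through five fake video IDs."""
    ids = ["v1", "v2", "v3", "v4", "v5"]
    start = 0 if after is None else ids.index(after) + 1
    chunk = ids[start:start + limit]
    return SimpleNamespace(
        data=chunk,
        has_more=start + limit < len(ids),
        last=chunk[-1] if chunk else None,
    )


all_ids = list(iter_all_videos(fake_list_videos, page_size=2))
print(all_ids)
```

Passing the tool in as a callable keeps the loop testable; in real use the items in data would be video summaries rather than bare IDs.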
delete_video
Permanently delete a video from OpenAI's storage.
Parameters:
- video_id (string, required): ID of video to delete
Returns: Confirmation with deleted video ID
Warning: This is permanent and cannot be undone!
remix_video
Create a new video by remixing an existing completed video.
Parameters:
- previous_video_id (string, required): ID of completed video to remix
- prompt (string, required): New prompt to guide the remix
Returns: NEW Video object with different video_id
Note: This creates a brand new job. Poll the NEW video_id for completion.
list_local_videos
List locally downloaded video files in VIDEO_PATH.
Parameters:
- pattern (string, optional): Glob pattern to filter filenames (e.g., "*.mp4", "sora*")
- file_type (string, optional): Filter by type - "mp4", "webm", "mov", or "all" (default)
- sort_by (string, optional): Sort by "name", "size", or "modified" (default)
- order (string, optional): "desc" (default) or "asc"
- limit (integer, optional): Max results (default: 50)
Returns: Object with data (array of VideoFile objects with filename, size_bytes, modified_timestamp, file_type)
Example:
# List all local videos
videos = list_local_videos()
# Find MP4 files matching a pattern
videos = list_local_videos(pattern="sora*", file_type="mp4")
# Get recently modified
recent = list_local_videos(sort_by="modified", order="desc", limit=10)
Image Generation Tools
Two APIs are available for image generation:
| Tool | API | Best For |
|---|---|---|
| generate_image | Images API | New generation with gpt-image-2 (RECOMMENDED) |
| edit_image | Images API | Editing existing images |
| create_image | Responses API | Iterative refinement with previous_response_id |
Images API (gpt-image-2 default): Synchronous, returns immediately, no polling required, up to 4K output
Responses API (GPT-5.2): Async polling pattern, supports iterative refinement chains + action field, gpt-image-2 via tool_config
generate_image
Generate images using OpenAI's Images API with gpt-image-2 (default). RECOMMENDED for new image generation.
Key advantages:
- Synchronous - returns immediately (no polling)
- gpt-image-2 - state-of-the-art quality, ~99% text accuracy, up to 4K output
- Token usage tracking for cost monitoring
- Accepts thousands of valid resolutions (not just the documented presets)
Parameters:
- prompt (string, required): Text description of the image (max 32k chars)
- model (string, optional): Model - "gpt-image-2" (default, recommended), "gpt-image-1.5", "gpt-image-1", "gpt-image-1-mini", "dall-e-3", "dall-e-2"
- size (string, optional): Dimensions - "auto" (default), "1024x1024", "1536x1024", "1024x1536", plus gpt-image-2 sizes "2048x2048", "2048x1152", "3840x2160", "2160x3840"
- quality (string, optional): Quality - "auto" (default), "low", "medium", "high"
- background (string, optional): Background - "auto" (default), "transparent" (NOT supported on gpt-image-2; use gpt-image-1.5), "opaque"
- output_format (string, optional): Format - "png" (default), "jpeg", "webp"
- moderation (string, optional): Content moderation - "auto" (default), "low"
- filename (string, optional): Custom output filename (auto-generated if omitted)
Returns: ImageGenerateResult with filename, size, format, model, usage
Usage tracking: Returns token counts for cost monitoring:
result.usage.input_tokens # Text tokens
result.usage.output_tokens # Image tokens
result.usage.total_tokens # Combined total
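When several images are generated in one session, the per-call counts can be summed for a running total. A minimal sketch follows, assuming each result exposes the three usage fields above as attributes. The helper total_usage is hypothetical, and the SimpleNamespace objects (with made-up token counts) stand in for real ImageGenerateResult values.

```python
from types import SimpleNamespace


def total_usage(results):
    """Hypothetical helper: sum the usage fields across several image results."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for result in results:
        for field in totals:
            totals[field] += getattr(result.usage, field)
    return totals


# Stand-ins for ImageGenerateResult objects; the numbers are illustrative only.
fake_results = [
    SimpleNamespace(usage=SimpleNamespace(input_tokens=10, output_tokens=100, total_tokens=110)),
    SimpleNamespace(usage=SimpleNamespace(input_tokens=20, output_tokens=200, total_tokens=220)),
]
print(total_usage(fake_results))
```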
Examples:
# Basic generation (recommended path)
result = generate_image(prompt="a sunset over mountains")
# File is available immediately; its name is in result.filename
# High quality portrait
result = generate_image(
    prompt="professional headshot, studio lighting",
    size="1024x1536",
    quality="high"
)
# Transparent background for icons (requires gpt-image-1.5; not supported on gpt-image-2)
result = generate_image(
    prompt="product icon, clean design",
    model="gpt-image-1.5",
    background="transparent",
    output_format="png"
)
# Fast generation with mini model
result = generate_image(
    prompt="quick sketch of a cat",
    model="gpt-image-1-mini"
)
edit_image
Edit existing images using OpenAI's Images API with gpt-image-2 (default).
Key features:
- Synchronous - returns immediately (no polling)
- Supports up to 16 input images for composition
- Mask-based inpainting
- Multi-image composition and blending
Parameters:
- prompt (string, required): Description of desired edits (max 32k chars)
- input_images (array, required): List of image filenames from IMAGE_PATH (1-16 images)
- model (string, optional): Model - "gpt-image-2" (default), "gpt-image-1.5", "gpt-image-1", "gpt-image-1-mini"
- mask_filename (string, optional): PNG mask with alpha channel for inpainting (transparent = edit, opaque = keep)
- size (string, optional): Output dimensions - "auto" (default), "1024x1024", "1536x1024", "1024x1536", plus gpt-image-2 sizes "2048x2048", "2048x1152", "3840x2160", "2160x3840"
- quality (string, optional): Quality - "auto" (default), "low", "medium", "high"
- background (string, optional): Background - "auto" (default), "transparent" (NOT supported on gpt-image-2), "opaque"
- output_format (string, optional): Format - "png" (default), "jpeg", "webp"
- input_fidelity (string, optional): Fidelity to input - "high" (preserve faces/style) or "low" (more creative freedom). gpt-image-1 / gpt-image-1.5 only; silently ignored for gpt-image-2 (always high).
- filename (string, optional): Custom output filename
Returns: ImageGenerateResult with filename, size, format, model, usage
Examples:
# Simple edit
result = edit_image(
    prompt="add a hat to the person",
    input_images=["portrait.png"]
)
# Multi-image composition
result = edit_image(
    prompt="create a gift basket containing all these items",
    input_images=["lotion.png", "soap.png", "candle.png"]
)
# Inpainting with mask
result = edit_image(
    prompt="add a flamingo standing in the water",
    input_images=["pool.png"],
    mask_filename="pool_mask.png"
)
# High-fidelity face preservation on gpt-image-1.5
result = edit_image(
    prompt="change hair color to red",
    input_images=["portrait.jpg"],
    model="gpt-image-1.5",
    input_fidelity="high",
)
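Several edit_image constraints interact with the chosen model, so a quick pre-flight check can catch combinations that will fail or be silently ignored. The sketch below is illustrative only: check_edit_image_args is a hypothetical helper, and its rules are taken from the parameter list above rather than from the server's code.

```python
def check_edit_image_args(input_images, model="gpt-image-2", background="auto", input_fidelity=None):
    """Hypothetical helper: return (errors, warnings) for an edit_image call."""
    errors, warnings = [], []
    # The parameter list allows 1-16 input images.
    if not 1 <= len(input_images) <= 16:
        errors.append(f"expected 1-16 input images, got {len(input_images)}")
    # Transparent backgrounds are documented as unsupported on gpt-image-2.
    if background == "transparent" and model == "gpt-image-2":
        errors.append('background="transparent" is not supported on gpt-image-2')
    # input_fidelity is documented as silently ignored on gpt-image-2.
    if input_fidelity is not None and model == "gpt-image-2":
        warnings.append("input_fidelity is ignored on gpt-image-2 (always high)")
    return errors, warnings


print(check_edit_image_args(["portrait.jpg"], background="transparent", input_fidelity="high"))
```

Separating errors from warnings mirrors the documented behavior: an unsupported background is a hard constraint, while input_fidelity on gpt-image-2 is merely ignored.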
create_image
Generate images using OpenAI's Responses API. Use for iterative refinement with previous_response_id.
Tip: Use tool_config={"type": "image_generation", "model": "gpt-image-2"} for best quality. Use "gpt-image-1.5" when you need transparent backgrounds. You can also pass action: "generate" / "edit" to force a mode when an image is in context (default "auto").
Parameters:
- prompt (string, required): Text description of image to generate
- model (string, optional): Model to use - "gpt-5.2" (default, OpenAI's latest), "gpt-5.1", "gpt-5", "gpt-4.1"
- tool_config (object, optional): Advanced configuration (ImageGeneration type)
- previous_response_id (string, optional): Previous response ID for iterative refinement
- input_images (array, optional): Array of filenames from IMAGE_PATH for image editing
- mask_filename (string, optional): PNG mask file for inpainting
Returns: ImageResponse with id, status, created_at
Example:
# Generate from text
resp = create_image(prompt="sunset over mountains")
# Iterative refinement
resp2 = create_image(
    prompt="add more dramatic clouds",
    previous_response_id=resp.id
)
# Image editing
resp3 = create_image(
    prompt="add a flamingo to the pool",
    input_images=["pool.png"]
)
get_image_status
Check the status of an image generation job.
Parameters:
- response_id (string, required): ID returned from create_image
Returns: ImageResponse with updated status
download_image
Download a completed image to IMAGE_PATH.
Parameters:
- response_id (string, required): ID of completed image
- filename (string, optional): Custom filename (auto-generated if omitted)
Returns: ImageDownloadResult with filename, size, format
Reference Image Management Tools
list_reference_images
Search and list available reference images in IMAGE_PATH.
Parameters:
- pattern (string, optional): Glob pattern to filter filenames (e.g., "cat*.png", "*.jpg")
- file_type (string, optional): Filter by type - "jpeg", "png", "webp", or "all" (default)
- sort_by (string, optional): Sort by "name", "size", or "modified" (default)
- order (string, optional): "desc" (default) or "asc"
- limit (integer, optional): Max results (default: 50)
Returns: Array of ReferenceImage objects with filename, size_bytes, modified_timestamp, file_type
Example:
# Find all dog images
images = list_reference_images(pattern="dog*", file_type="png")
# Get recently modified
recent = list_reference_images(sort_by="modified", order="desc", limit=10)
prepare_reference_image
Resize images to match Sora's required dimensions.
Parameters:
- input_filename (string, required): Source image filename in IMAGE_PATH
- target_size (string, required): Target size - "720x1280", "1280x720", "1024x1792", or "1792x1024"
- output_filename (string, optional): Custom output name (defaults to {original}_{width}x{height}.png)
- resize_mode (string, optional): How to handle aspect ratio - "crop" (default), "pad", or "rescale"
Resize modes:
- crop: Scale to cover target, center crop excess (no distortion, may lose edges)
- pad: Scale to fit inside target, add black bars (no distortion, preserves full image)
- rescale: Stretch/squash to exact dimensions (may distort, no cropping/padding)
Returns: PrepareResult with output_filename, original_size, target_size, resize_mode
Example:
result = prepare_reference_image(
    "photo.jpg",
    "1280x720",
    resize_mode="crop"
)
# Creates: photo_1280x720.png
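The three resize modes differ only in how the scale factor is chosen. The sketch below works through the geometry described above. It is not the server's implementation: plan_resize is a hypothetical helper that returns the scaled image size and the offset at which the scaled image would sit on the target canvas (negative offsets mean that part is cropped away, positive offsets mean padding bars).

```python
def plan_resize(src_w, src_h, dst_w, dst_h, mode="crop"):
    """Return (scaled_w, scaled_h, offset_x, offset_y) for placing the source on the target canvas."""
    if mode == "rescale":
        # Stretch or squash to the exact target; aspect ratio is not preserved.
        return dst_w, dst_h, 0, 0
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    if mode == "crop":
        scale = max(scale_x, scale_y)  # cover the target, excess is cropped
    elif mode == "pad":
        scale = min(scale_x, scale_y)  # fit inside the target, remainder is padded
    else:
        raise ValueError(f"unknown resize mode: {mode!r}")
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    # Center the scaled image on the canvas.
    offset_x = (dst_w - scaled_w) // 2
    offset_y = (dst_h - scaled_h) // 2
    return scaled_w, scaled_h, offset_x, offset_y


for mode in ("crop", "pad", "rescale"):
    print(mode, plan_resize(1000, 1000, 1280, 720, mode))
```

For a square 1000x1000 source and a 1280x720 target, crop scales to 1280x1280 and trims 280 pixels from the top and bottom, while pad scales to 720x720 and adds 280-pixel bars on the left and right.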
Audio Tools
For detailed audio tool documentation, see docs/audio/README.md.
Available tools:
- list_audio_files - List and filter audio files
- get_latest_audio - Get most recent audio file
- convert_audio - Convert to mp3/wav
- compress_audio - Compress for API limits
- transcribe_audio - Whisper transcription
- chat_with_audio - GPT-4o audio analysis
- transcribe_with_enhancement - Enhanced transcription
- create_audio - Text-to-speech generation
Best Practices
Polling for Completion
Don't block - poll status periodically:
# ❌ Don't block
video = create_video(...)
while get_video_status(video.id).status != "completed":
    pass  # busy-waits and blocks the LLM session
# ✅ Do poll with messaging
video = create_video(...)
status = get_video_status(video.id)
if status.status != "completed":
    return f"Video generating... {status.progress}% complete. Check back in a moment."
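The non-blocking pattern is easier to reuse if each of the four documented status values maps to its own message, so that a failed job is not reported as still generating. A minimal sketch follows. The helper describe_status and its message wording are illustrative, and the SimpleNamespace object stands in for the Video object returned by get_video_status.

```python
from types import SimpleNamespace


def describe_status(status):
    """Hypothetical helper: turn a Video status object into a one-line message."""
    if status.status == "completed":
        return "Video is ready to download."
    if status.status == "failed":
        return "Video generation failed."
    if status.status == "queued":
        return "Video is queued. Check back in a moment."
    if status.status == "in_progress":
        return f"Video generating... {status.progress}% complete. Check back in a moment."
    return f"Unexpected status: {status.status!r}"


# Stand-in for the object returned by get_video_status.
print(describe_status(SimpleNamespace(status="in_progress", progress=42)))
```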
File Security
- All file operations are sandboxed to configured paths
- Reference images must be in IMAGE_PATH (no path traversal)
- Symlinks are rejected for security
- Downloaded content goes to VIDEO_PATH or IMAGE_PATH
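The "filename, not path" rule and the symlink rejection can be illustrated with a small client-side check. This is a sketch of the general technique, not the sanzaru server's actual validation: resolve_in_sandbox is a hypothetical helper, and the check is written for POSIX-style path separators.

```python
from pathlib import Path


def resolve_in_sandbox(base_dir, filename):
    """Hypothetical helper: return base_dir/filename, rejecting paths and symlinks."""
    base = Path(base_dir).resolve()
    # A bare filename has no directory components, so its .name equals the input.
    if filename in ("", ".", "..") or Path(filename).name != filename:
        raise ValueError(f"expected a bare filename, got {filename!r}")
    candidate = base / filename
    # Reject symlinks, which could point outside the sandbox.
    if candidate.is_symlink():
        raise ValueError(f"symlinks are not allowed: {filename!r}")
    return candidate


print(resolve_in_sandbox(".", "photo.png").name)
```

Rejecting anything that is not a bare filename is stricter than resolving the path and checking its prefix, but it matches the documented contract that tools take filenames rather than paths.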
Error Handling
All tools return structured error messages. Common errors:
- File not found in reference path
- Invalid dimensions for target size
- Video not completed yet
- API rate limits