Veo 4 API: Python Wrapper for Google DeepMind's AI Video Generator

June 2, 2026

License: MIT | Python 3.7+

A comprehensive Python wrapper for the Veo 4 API (developed by Google DeepMind), delivered via muapi.ai. Generate native 4K AI videos up to 30 seconds long, with integrated audio, character consistency, and advanced camera controls, using Google's most powerful video generation model.

Join the Veo4 subreddit for discussion.

🌊 Also explore these top AI video models:

  • Seedance 2.0 API: ByteDance's cinematic 2K video model with character sheets & omni-reference
  • HappyHorse 1.0 API: Alibaba's #1 ranked model (1392 Elo I2V) with native 1080p & integrated audio

🚀 Why Use Veo 4 API?

Veo 4 is Google DeepMind's latest state-of-the-art AI video generation model, featuring a 3x larger Transformer architecture than Veo 3, native 4K output, and advanced character anchoring technology.

  • Native 4K Output: Every pixel is generated from scratch, not upscaled.
  • Up to 30 Seconds: Longer clips than any previous Veo model.
  • Integrated Audio: Jointly generates synchronized dialogue, ambient sound, and music in one pass (building on Veo 3's audio breakthrough).
  • Character Consistency: Advanced anchoring technology keeps faces, clothing, and features consistent across all frames and camera angles.
  • Advanced Camera Controls: Pan, zoom, orbit, and tracking shots for precise cinematic control.
  • Developer-First: Simple Python SDK via MuAPI infrastructure.

🌟 Key Features of Veo 4 API

  • ✅ Veo 4 Text-to-Video (T2V): Transform descriptive prompts into native 4K video clips up to 30 seconds long.
  • ✅ Veo 4 Image-to-Video (I2V): Animate static images with precise motion and camera control using images_list.
  • ✅ Integrated Audio-Video Generation: Jointly generate synchronized audio and video in one pass; include sound cues in your prompt.
  • ✅ Character Consistency: character_video() anchors on reference photos to keep identity consistent across scenes.
  • ✅ Advanced Camera Controls: Specify camera_control for cinematic movements such as pan, zoom, orbit, and tracking shots.
  • ✅ Video Extension: Extend existing Veo 4 clips up to 30 seconds total.
  • ✅ Video Edit: Edit existing videos using natural language prompts.
  • ✅ File Upload: Upload local images and videos directly via upload_file().
  • ✅ Flexible Aspect Ratios: Optimized for 16:9, 9:16 (TikTok/Reels), and 1:1.
  • ✅ Quality Tiers: 1080p and 4k (native) output.
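The limits listed above (three aspect ratios, two quality tiers, a 30-second cap) can be checked locally before a request is submitted. The following is a minimal sketch, not part of the package: validate_generation_params is a hypothetical helper, and treating duration as a whole number of seconds is an assumption.

```python
# Values copied from the feature list; the helper itself is hypothetical.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_QUALITIES = {"1080p", "4k"}
MAX_DURATION_SECONDS = 30


def validate_generation_params(aspect_ratio, duration, quality):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if quality not in ALLOWED_QUALITIES:
        problems.append(f"unsupported quality: {quality!r}")
    # Assumption: duration is an integer number of seconds.
    if not (isinstance(duration, int) and 0 < duration <= MAX_DURATION_SECONDS):
        problems.append(
            f"duration must be 1-{MAX_DURATION_SECONDS} seconds, got {duration!r}"
        )
    return problems
```

Calling validate_generation_params("9:16", 8, "1080p") returns an empty list, so the values can be passed on to the client.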

🛠 Installation

pip install veo-4-api

From Source

git clone https://github.com/Anil-matcha/Veo-4-API.git
cd Veo-4-API
pip install -r requirements.txt

Configuration

Create a .env file in the root directory and add your MuAPI API key:

MUAPI_API_KEY=your_muapi_api_key_here
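To confirm the key is visible before running anything, a small stdlib-only check can help. This sketch shows one way a key could be resolved from the environment or a .env file; load_env_file and get_api_key are hypothetical helpers, and the package's own loading logic may differ.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("MUAPI_API_KEY")
    if not key and os.path.exists(path):
        key = load_env_file(path).get("MUAPI_API_KEY")
    if not key:
        raise RuntimeError("MUAPI_API_KEY is not set")
    return key
```

Raising early with a clear message tends to be easier to debug than an authentication error returned by the remote API.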

🤖 Veo 4 MCP Server

Use Veo 4 as an MCP (Model Context Protocol) server, allowing AI assistants like Claude Desktop or Cursor to directly invoke Veo 4 generation tools.

Running the MCP Server

  1. Ensure MUAPI_API_KEY is set in your environment.
  2. Run the server:
    python3 mcp_server.py
    
  3. To test with the MCP Inspector:
    npx -y @modelcontextprotocol/inspector python3 mcp_server.py
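MCP clients such as Claude Desktop and Cursor are typically pointed at a server through a JSON config with an mcpServers entry. The sketch below generates such an entry; the server label "veo4" and the path to mcp_server.py are placeholders chosen for illustration, so check your client's documentation for where the file belongs.

```python
import json


def build_mcp_config(server_path, api_key):
    """Build an mcpServers entry; the 'veo4' label is an arbitrary choice."""
    return {
        "mcpServers": {
            "veo4": {
                "command": "python3",
                "args": [server_path],
                "env": {"MUAPI_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path and key; replace both with your own values.
    config = build_mcp_config(
        "/path/to/Veo-4-API/mcp_server.py", "your_muapi_api_key_here"
    )
    print(json.dumps(config, indent=2))
```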
    

💻 Quick Start with Veo 4 API (Python)

from veo4_api import Veo4API

# Initialize the Veo 4 client
api = Veo4API()

# Generate Video from Text (T2V)
print("Generating AI Video using Veo 4...")
submission = api.text_to_video(
    prompt="A cinematic tracking shot through a lush rainforest, sunlight filtering through the canopy, birds calling",
    aspect_ratio="16:9",
    duration=8,
    quality="4k",
    camera_control="tracking shot"
)

# Wait for completion
result = api.wait_for_completion(submission['request_id'])
print(f"Success! View your Veo 4 video here: {result['outputs'][0]}")

🎵 Audio-Video Generation

Veo 4 jointly generates synchronized video and audio in a single pass. Include sound cues in your prompt for best results.

from veo4_api import Veo4API

api = Veo4API()

# Text-to-video with audio
submission = api.text_to_video_with_audio(
    prompt="A street musician playing violin in Paris, rain on cobblestones, distant traffic, melancholic melody",
    aspect_ratio="16:9",
    duration=15,
    quality="4k"
)
result = api.wait_for_completion(submission['request_id'])
print(f"Video with audio: {result['outputs'][0]}")

# Image-to-video with audio
submission = api.image_to_video_with_audio(
    prompt="@image1 comes alive: waves crashing, seagulls calling, ocean breeze rustling palm trees",
    images_list=["https://example.com/beach.jpg"],
    duration=10,
)
result = api.wait_for_completion(submission['request_id'])
print(f"Animated with audio: {result['outputs'][0]}")

Tip: Include explicit sound cues (e.g. "thunder rumbling", "crowd cheering", "piano melody") for richer, more accurate audio generation.
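If prompts are assembled programmatically, sound cues can be kept as a separate list and appended to the scene description. This is a minimal sketch; with_sound_cues is a hypothetical helper, and the comma-separated layout simply mirrors the example prompts above rather than any documented prompt syntax.

```python
def with_sound_cues(scene, cues):
    """Append non-empty sound cues to a scene description, comma-separated."""
    cleaned = [cue.strip() for cue in cues if cue and cue.strip()]
    if not cleaned:
        return scene.strip()
    # Drop trailing punctuation so the cues read as one continuous sentence.
    return f"{scene.strip().rstrip('.,')}, {', '.join(cleaned)}"


prompt = with_sound_cues(
    "A street musician playing violin in Paris",
    ["rain on cobblestones", "distant traffic", "melancholic melody"],
)
```

The resulting string can be passed as the prompt argument of text_to_video_with_audio.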


🎭 Character Consistency

Veo 4's character anchoring keeps faces and identity consistent across all frames.

from veo4_api import Veo4API

api = Veo4API()

# Anchor on a reference photo
submission = api.character_video(
    prompt="@image1 walks confidently through a neon-lit Tokyo street at night",
    character_images=["https://example.com/person.jpg"],
    aspect_ratio="16:9",
    duration=8,
    quality="4k",
    with_audio=True,
)
result = api.wait_for_completion(submission['request_id'])
print(f"Character video: {result['outputs'][0]}")
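The examples reference the first reference image as @image1. Assuming (this is not confirmed by the examples) that @imageN refers to the N-th entry of the image list, counting from 1, a quick local check can catch prompts that mention an image that was never supplied. check_image_refs is a hypothetical helper.

```python
import re


def check_image_refs(prompt, images):
    """Return @imageN tags in the prompt that have no matching list entry.

    Assumption: tags are 1-based positions in the images list.
    """
    referenced = {int(n) for n in re.findall(r"@image(\d+)", prompt)}
    return [f"@image{n}" for n in sorted(referenced) if not 1 <= n <= len(images)]
```

An empty result means every tag in the prompt maps to a supplied image.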

🎬 Camera Controls

Specify cinematic camera movements with the camera_control parameter.

from veo4_api import Veo4API

api = Veo4API()

# Zoom in dramatically
submission = api.text_to_video(
    prompt="A lone lighthouse on a rocky cliff at dusk, storm approaching",
    aspect_ratio="16:9",
    duration=10,
    quality="4k",
    camera_control="slow zoom in"
)

# Orbit around a subject
submission = api.text_to_video(
    prompt="A marble statue in a sunlit museum courtyard",
    aspect_ratio="16:9",
    duration=8,
    camera_control="orbit"
)

📡 API Endpoints & Reference

1. Veo 4 Text-to-Video (T2V)

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-t2v

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-t2v" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "prompt": "A majestic eagle soaring over snow-capped mountains at sunrise",
      "aspect_ratio": "16:9",
      "duration": 8,
      "quality": "4k",
      "camera_control": "pan right"
  }'
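The same request can be assembled with Python's standard library when the wrapper is not installed. This sketch mirrors the curl call above (endpoint, headers, and JSON fields are taken from it); build_t2v_request and submit are hypothetical helpers, and the shape of the response body is not assumed beyond being JSON.

```python
import json
import urllib.request

T2V_ENDPOINT = "https://api.muapi.ai/api/v1/veo-4-t2v"


def build_t2v_request(api_key, prompt, aspect_ratio="16:9", duration=8,
                      quality="4k", camera_control=None):
    """Build (but do not send) a POST request matching the curl example."""
    payload = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "quality": quality,
    }
    if camera_control:
        payload["camera_control"] = camera_control
    return urllib.request.Request(
        T2V_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def submit(request, timeout=30):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating construction from sending keeps the payload easy to inspect or log before any credits are spent.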

2. Veo 4 Image-to-Video (I2V)

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-i2v

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-i2v" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "prompt": "@image1: the clouds drift slowly, light shifts from golden to dusk",
      "images_list": ["https://example.com/landscape.jpg"],
      "aspect_ratio": "16:9",
      "duration": 8,
      "quality": "4k"
  }'

3. Veo 4 T2V with Audio

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-t2v-audio

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-t2v-audio" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "prompt": "A busy Tokyo street at night, neon signs, rain, jazz music drifting from a bar",
      "aspect_ratio": "16:9",
      "duration": 15,
      "quality": "4k"
  }'

4. Veo 4 I2V with Audio

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-i2v-audio

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-i2v-audio" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "prompt": "@image1: waves begin to crash, seagulls cry in the distance, wind howling",
      "images_list": ["https://example.com/ocean.jpg"],
      "aspect_ratio": "16:9",
      "duration": 10,
      "quality": "4k"
  }'

5. Veo 4 Character Video

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-character

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-character" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "prompt": "@image1 walks confidently through a neon-lit Tokyo street",
      "images_list": ["https://example.com/person.jpg"],
      "aspect_ratio": "16:9",
      "duration": 8,
      "quality": "4k"
  }'

6. Video Extension

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-extend

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-extend" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "request_id": "your-completed-request-id",
      "prompt": "The eagle lands on a mountain peak, surveying the valley below",
      "duration": 10,
      "quality": "4k"
  }'
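Since extended clips are capped at 30 seconds total, it can help to work out how much extension is still available before calling the endpoint. The sketch below assumes the duration field means seconds added to the existing clip, which the example above does not confirm; clamp_extension is a hypothetical helper.

```python
MAX_TOTAL_SECONDS = 30  # total length cap stated for extended clips


def clamp_extension(current_seconds, requested_seconds, max_total=MAX_TOTAL_SECONDS):
    """Return how many seconds can be added without exceeding the total cap."""
    if current_seconds < 0 or requested_seconds < 0:
        raise ValueError("durations must be non-negative")
    remaining = max(0, max_total - current_seconds)
    return min(requested_seconds, remaining)
```

For an 8-second clip, a 10-second extension fits as requested; for a 25-second clip, the same request is reduced to 5 seconds.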

7. Video Edit

Endpoint: POST https://api.muapi.ai/api/v1/veo-4-video-edit

curl --location --request POST "https://api.muapi.ai/api/v1/veo-4-video-edit" \
  --header "Content-Type: application/json" \
  --header "x-api-key: YOUR_API_KEY" \
  --data-raw '{
      "prompt": "Change the weather to a dramatic thunderstorm",
      "video_urls": ["https://example.com/video.mp4"],
      "aspect_ratio": "16:9",
      "quality": "4k"
  }'

📖 API Method Reference

| Method | Parameters | Description |
| --- | --- | --- |
| text_to_video | prompt, aspect_ratio, duration, quality, with_audio, camera_control | Generate native 4K video from text. |
| image_to_video | prompt, images_list, aspect_ratio, duration, quality, with_audio, camera_control | Animate images into 4K video. |
| text_to_video_with_audio | prompt, aspect_ratio, duration, quality, camera_control | T2V with jointly generated audio. |
| image_to_video_with_audio | prompt, images_list, aspect_ratio, duration, quality, camera_control | I2V with jointly generated audio. |
| character_video | prompt, character_images, aspect_ratio, duration, quality, with_audio | Consistent character identity across frames. |
| extend_video | request_id, prompt, duration, quality | Extend an existing Veo 4 video segment. |
| video_edit | prompt, video_urls, images_list, aspect_ratio, quality | Edit existing videos with natural language. |
| upload_file | file_path | Upload a local file (image or video) to MuAPI. |
| get_result | request_id | Check task status and retrieve outputs. |
| wait_for_completion | request_id, poll_interval, timeout | Blocking helper that polls until generation completes. |
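A blocking helper like wait_for_completion is usually a polling loop around a status check. The following is a sketch of that pattern, not the package's implementation: poll_until_done is hypothetical, the fetch callable stands in for get_result, and the "status" key with the values "completed" and "failed" is an assumption about the response format.

```python
import time


def poll_until_done(fetch, request_id, poll_interval=5.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch(request_id) until a terminal status or the timeout is hit.

    Assumption: results are dicts with a 'status' key whose terminal values
    are 'completed' and 'failed'.
    """
    deadline = clock() + timeout
    while True:
        result = fetch(request_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"generation failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(
                f"request {request_id} did not finish within {timeout} seconds"
            )
        sleep(poll_interval)
```

Passing sleep and clock as parameters keeps the loop testable without real waiting; in normal use the defaults apply.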

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

