🎨 Gemini Vision Art Studio
April 18, 2025
An MCP server that uses Google's Gemini AI for image generation and transformation. The studio offers two specialized tools, a 3D cartoon generator and an image transformer, both powered by the Gemini 2.0 Flash model.
✨ Features
1. 3D Cartoon Generator
- Generate high-quality 3D cartoon images from text descriptions
- Child-friendly designs with vibrant colors and engaging visuals
- Perfect for children's books, educational materials, and creative projects
2. Image Transformer
- Transform existing images using Gemini AI's vision capabilities
- Apply various artistic styles and modifications
- Enhance, modify, or completely reimagine your images
Additional Features
- 🖼️ Automatic preview generation
- 🌐 Browser-based image viewing
- 💾 Local storage with organized output
- 🔄 Real-time processing
- 📱 Cross-platform support
🚀 Quick Start
Installation
# Clone the repository
git clone https://github.com/falahgs/gemini-vision-art-studio.git
# Install dependencies
cd gemini-vision-art-studio
npm install
Configuration
- Project Configuration:
Create a .env file in the root directory:
GEMINI_API_KEY=your_api_key_here
# Set to true if running in a remote environment (no browser preview)
IS_REMOTE=true
- Claude Desktop Configuration:
Add the server configuration to your Claude Desktop config file at %AppData%\Claude\claude_desktop_config.json:
{
  "mcpServers": {
    "gemini-vision-art-studio": {
      "command": "node",
      "args": [
        "PATH_TO_YOUR_PROJECT\\build\\src\\index.js"
      ],
      "env": {
        "GEMINI_API_KEY": "your_gemini_api_key_here",
        "IS_REMOTE": "true"
      }
    }
  }
}
Replace:
- PATH_TO_YOUR_PROJECT with your actual project path
- your_gemini_api_key_here with your Gemini API key
💡 Note: On Windows, the config file is typically located at:
C:\Users\YourUsername\AppData\Roaming\Claude\claude_desktop_config.json
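Both the .env file and the Claude Desktop env block supply the same two variables. As a rough sketch of how a Node server might read them, the helper below parses an environment-like record. The loadConfig function and the ServerConfig shape are hypothetical names used for illustration; they are not taken from this project's source.

```typescript
// Hypothetical helper: the project's real startup code may differ.
interface ServerConfig {
  apiKey: string;
  isRemote: boolean;
}

// Accepts any env-like record so it can be exercised without touching process.env.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = (env["GEMINI_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("GEMINI_API_KEY is not set");
  }
  // Assumption: only the string "true" (case-insensitive) enables remote mode.
  const isRemote = (env["IS_REMOTE"] ?? "").trim().toLowerCase() === "true";
  return { apiKey, isRemote };
}
```

In a real server this would typically be called once at startup as loadConfig(process.env), failing fast when the API key is missing.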
Remote Usage
When running the server remotely:
- Set IS_REMOTE=true in your environment or Claude Desktop configuration
- The server will:
  - Create necessary directories automatically:
    - /app/output: For generated images and previews
    - /app/temp: For temporary processing files
  - Skip browser preview attempts
  - Save all files to the /app/output directory
  - Return absolute file paths in the response
- Directory Structure in Remote Mode:
  /app/
  ├── output/                  # Generated images and previews
  │   ├── image1.png
  │   └── image1_preview.html
  └── temp/                    # Temporary processing files
- Troubleshooting Remote Usage:
  - Ensure the /app directory exists and is writable
  - Check the console output for directory creation messages
  - Look for "Image saved to:" messages in the logs
  - File paths in the response will be absolute paths
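The remote-mode layout above can be sketched as a small path-resolution step. This is an illustration under stated assumptions, not the project's actual code: resolveDirs and outputPaths are hypothetical names, local mode is assumed to use output/ and temp/ under the project root, and the .png extension is assumed from the image1.png example.

```typescript
// Hypothetical sketch: directory layout per mode, following the paths in this README.
interface StudioDirs {
  output: string;
  temp: string;
}

function resolveDirs(isRemote: boolean, projectRoot: string): StudioDirs {
  // Remote mode uses the fixed /app tree; local mode is assumed to use
  // output/ and temp/ under the project root (trailing slashes trimmed).
  const base = isRemote ? "/app" : projectRoot.replace(/[\\\/]+$/, "");
  return { output: `${base}/output`, temp: `${base}/temp` };
}

// Derive the image path and its HTML preview path for a base file name.
function outputPaths(dirs: StudioDirs, baseName: string): { image: string; preview: string } {
  return {
    image: `${dirs.output}/${baseName}.png`,
    preview: `${dirs.output}/${baseName}_preview.html`,
  };
}
```

A server following this pattern would then create both directories before writing, for example with Node's fs.mkdirSync(dir, { recursive: true }), which does nothing if the directory already exists.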
Running the Server
- Build the project:
npm run build
- The server becomes available in Claude Desktop automatically once you:
  - Open Claude Desktop
  - Start a new conversation
  The tools then appear in the available tools list.
🛠️ Available Tools
1. Generate 3D Cartoon (generate_3d_cartoon)
Creates a 3D-style cartoon image from your text description.
{
  "name": "generate_3d_cartoon",
  "arguments": {
    "prompt": "A friendly dragon teaching math to forest animals",
    "fileName": "dragon_teacher"
  }
}
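In MCP, a client invokes a tool by sending a JSON-RPC 2.0 request whose method is tools/call, and the name/arguments object above is that request's params. Claude Desktop builds this request for you; the sketch below only shows what the envelope looks like. The buildToolCall helper is a hypothetical name introduced for illustration.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const cartoonRequest = buildToolCall(1, "generate_3d_cartoon", {
  prompt: "A friendly dragon teaching math to forest animals",
  fileName: "dragon_teacher",
});
```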
2. Process Image (process_image)
Transforms existing images according to your instructions.
{
  "name": "process_image",
  "arguments": {
    "imagePath": "input/photo.jpg",
    "prompt": "Transform this into a watercolor painting with autumn colors",
    "outputFileName": "watercolor_autumn"
  }
}
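Before an image is sent to a vision model, the arguments usually need validating and the input file needs a MIME type. The sketch below shows one way to do both for the process_image arguments. The function names and the list of accepted extensions are assumptions for illustration; the project may validate differently or accept other formats.

```typescript
interface ProcessImageArgs {
  imagePath: string;
  prompt: string;
  outputFileName: string;
}

// Assumed set of accepted input formats; not taken from the project's source.
const MIME_BY_EXTENSION: Record<string, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  webp: "image/webp",
};

// Map a file path to a MIME type based on its extension.
function mimeTypeFor(imagePath: string): string {
  const dot = imagePath.lastIndexOf(".");
  const ext = dot >= 0 ? imagePath.slice(dot + 1).toLowerCase() : "";
  if (!Object.prototype.hasOwnProperty.call(MIME_BY_EXTENSION, ext)) {
    throw new Error(`Unsupported image type: ${imagePath}`);
  }
  return MIME_BY_EXTENSION[ext] as string;
}

// Read one required, non-empty string field from an untyped object.
function requireString(obj: Record<string, unknown>, key: string): string {
  const value = obj[key];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`"${key}" must be a non-empty string`);
  }
  return value;
}

function parseProcessImageArgs(raw: unknown): ProcessImageArgs {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("arguments must be an object");
  }
  const obj = raw as Record<string, unknown>;
  return {
    imagePath: requireString(obj, "imagePath"),
    prompt: requireString(obj, "prompt"),
    outputFileName: requireString(obj, "outputFileName"),
  };
}
```

Rejecting bad input at this stage gives the caller a clear error message instead of a failure deep inside the API call.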
📁 Directory Structure
gemini-vision-art-studio/
├── src/           # Source code
├── build/         # Compiled code
├── input/         # Input images
├── output/        # Generated images and previews
├── temp/          # Temporary processing files
└── examples/      # Example usage and images
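The output/ directory holds each generated image alongside an HTML preview (for example image1.png and image1_preview.html). As an illustration only, a preview page could be generated along these lines. The buildPreviewHtml function and the page layout are hypothetical; the project's real template may look quite different.

```typescript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical preview page: references the image by file name, so the HTML
// file is expected to sit in the same directory as the image.
function buildPreviewHtml(imageFileName: string, prompt: string): string {
  return [
    "<!DOCTYPE html>",
    '<html lang="en">',
    "<head>",
    '<meta charset="utf-8">',
    `<title>${escapeHtml(imageFileName)}</title>`,
    "</head>",
    "<body>",
    `<img src="${escapeHtml(imageFileName)}" alt="${escapeHtml(prompt)}">`,
    "</body>",
    "</html>",
  ].join("\n");
}
```

Escaping the prompt matters here because it is user-supplied text placed inside an attribute value.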
🔧 Technical Details
- Runtime: Node.js v14+
- Language: TypeScript 5.8.3
- AI Model: Gemini 2.0 Flash
- Framework: Model Context Protocol (MCP) SDK
- Image Processing: Google Generative AI
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
👨‍💻 Author
Falah G. Salieh
- Copyright © 2025
- GitHub: @falahgs
🙏 Acknowledgments
- Google Gemini AI team for the powerful image generation model
- The MCP SDK team for the excellent tooling
- All contributors and users of this project
Made with ❤️ by Falah G. Salieh