A curated list of projects, tools, models, UIs, and resources for ACE-Step — the open-source music generation foundation model by ACE Studio and StepFun.
ACE-Step is a hybrid architecture combining a Language Model planner with a Diffusion Transformer to generate commercial-grade music from text prompts and lyrics. It runs locally on consumer hardware with as little as 4 GB of VRAM and generates a full song in under 2 seconds on an A100 or under 10 seconds on an RTX 3090.

| Resource | Description |
|---|---|
| GitHub Repository (v1.5) | Latest codebase with Gradio UI, REST API, CLI, LoRA training. Mac, AMD, Intel, CUDA. |
| GitHub Repository (v1.0) | Original v1.0 codebase. |
| Project Page (v1.0) | Architecture overview, demos, and benchmarks. |
| Project Page (v1.5) | Hybrid LM + DiT architecture, new capabilities. |
| HuggingFace Space | Interactive online demo on HuggingFace Zero GPU. |
| HuggingFace Models | All official model weights, LoRAs, and spaces. |
| Discord | Community chat and support. |
| acestep.vst3 | Official VST3 plugin — C++17/GGML inference + JUCE plugin + web UI for DAW integration. |

| Model | Steps | Quality | Speed | Features | Link |
|---|---|---|---|---|---|
| acestep-v15-turbo | 8 | Very High | Very Fast | text2music, cover, repaint | HF |
| acestep-v15-turbo-continuous | 8 | Very High | Very Fast | Optimized for streaming | HF |
| acestep-v15-sft | 50 | High | Medium | All features | HF |
| acestep-v15-base | 50 | Medium | Medium | All features, best for fine-tuning | HF |

| Model | Base | VRAM | Capability | Link |
|---|---|---|---|---|
| acestep-5Hz-lm-0.6B | Qwen3-0.6B | 6-8 GB | Lightweight | HF |
| acestep-5Hz-lm-1.7B | Qwen3-1.7B | 8-16 GB | Default, full features | HF |
| acestep-5Hz-lm-4B | Qwen3-4B | 16+ GB | Best quality, audio understanding | HF |
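
The VRAM column above can be read as a simple lookup. The following is a minimal sketch, not part of ACE-Step itself: `pick_lm_variant` is a hypothetical helper, the thresholds are copied from the table, and resolving the overlapping boundaries (8 GB and 16 GB) in favor of the larger model is an assumption.

```python
from typing import Optional

# Minimum VRAM (GB) for each LM variant, taken from the VRAM column of the
# table above. Ordered largest first so the first match is the biggest fit.
LM_VARIANTS = [
    (16.0, "acestep-5Hz-lm-4B"),
    (8.0, "acestep-5Hz-lm-1.7B"),
    (6.0, "acestep-5Hz-lm-0.6B"),
]


def pick_lm_variant(vram_gb: float) -> Optional[str]:
    """Return the largest LM variant whose listed minimum VRAM fits.

    Hypothetical helper for illustration only. Returns None when the
    budget is below the smallest range listed in the table.
    """
    for min_gb, name in LM_VARIANTS:
        if vram_gb >= min_gb:
            return name
    return None


if __name__ == "__main__":
    for budget in (4, 6, 12, 24):
        print(budget, "GB:", pick_lm_variant(budget))
```

A `None` result only means the table lists no LM variant for that budget; it says nothing about whether the DiT models can run on that hardware.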

| Model | Type | Description | Link |
|---|---|---|---|
| ACE-Step-v1.5-chinese-new-year-LoRA | LoRA | Chinese folk instruments (dizi, erhu), festive style. Trained on 12 songs | HF |
| Serveurperso/ACE-Step-1.5-GGUF | GGUF | Full quantization suite (Q4-Q8, BF16) for acestep.cpp | HF |

| Project | Description | Link |
|---|---|---|
| acestep.vst3 | Official VST3 plugin for ACE-Step 1.5. JUCE 8 plugin + C++17/GGML inference engine (from acestep.cpp). Runs on CPU, CUDA, Metal, Vulkan. Includes Ableton-inspired web UI and standalone ace-server | GitHub |
| acestep.cpp | Portable C++17/GGML implementation of ACE-Step 1.5. Text + lyrics in, stereo 48 kHz WAV/MP3 out. Built-in HTTP server with Svelte web UI | GitHub |
| ACE-Step-DAW | WIP DAW integration project | GitHub |
| gary4juce | VST3/AU plugin with six open-source music models. Uses ACE-Step lego mode for vocals over existing DAW audio and a modified complete mode for continuations | GitHub |
| ACE-Step-1.5-GGUF | Pre-quantized GGUF models (Q4_K_M to BF16) for acestep.cpp and acestep.vst3 | HF |

| Project | Tech Stack | Highlights | Link |
|---|---|---|---|
| ace-step-ui (fspecii) | Node.js + Python | Spotify-inspired, dark/light modes, audio editor, stem extraction, video gen | GitHub |
| ace-step-studio (roblaughter) | React + FastAPI | Suno-style studio, create/library/player workflow, OpenAI-compatible LLM for lyrics, cover art gen | GitHub |
| Tadpole Studio | Next.js + FastAPI | AI DJ, Radio, Library, Playlists, LoRA training, HeartMuLa backend, 11 themes | GitHub |
| Ace-Step-Wrangler | Python + HTML/JS | DAW-inspired dark UI for musicians. Friendly sliders (Creativity, Strictly follow lyrics) instead of raw model params | GitHub |
| ace-step-ui.pinokio | Pinokio | One-click launcher for ace-step-ui (v1.5), auto backend + frontend | GitHub |
| ACE-Step-1.5-for-windows (sdbds) | Python + Windows | 936 Suno style tags with search/select; song parameter history; 4-language UI (EN/ZH/JA/KO); LoRA/LoKR training with GPU memory optimization | GitHub |
| Codi | Desktop app | Desktop AI music generation app for ACE-Step 1.5, designed to make songwriting as simple as coding. Supports NVIDIA RTX 3060 Ti (8 GB VRAM). | GitHub |
| ProdIA-MAX (ElWalki) | Node.js + Python | Fork of ace-step-ui with AI Chat Assistant (multi-LLM), Audio Codes conditioning, Voice Recorder + Whisper, Chord Progression Editor, Windows one-click setup | GitHub |
| ACE-Step-RADIO | Python | Continuous radio-style music stream powered by ACE-Step — auto-generates and plays songs back-to-back | GitHub |
| Majik's Music Studio | Swift/SwiftUI (macOS), GTK4 (Linux) | Free native desktop app (Apache 2.0). All 7 generation modes, full MLX acceleration on Apple Silicon, ACE Music Cloud integration, publishing to majiks.online (800+ tracks). Official ACE-Step partner. | GitHub |

| Project | Description | Link |
|---|---|---|
| ComfyUI Native Support | ACE-Step 1.5 built into ComfyUI core. AIO and split model workflows | Docs |
| ComfyUI-AceMusic | 15-node full-featured integration: generation, cover, repaint, extend, edit, LoRA, HeartMuLa compatible | GitHub |
| ComfyUI_RH_ACE-Step | ComfyUI plugin for ACE-Step 1.5 generation | GitHub |
| scromfyUI-AceStep | 30+ specialized nodes: audio KSamplers with shift control, multi-API lyrics gen (Gemini/Groq/OpenAI/Claude), masking & inpainting | GitHub |
| ComfyUI-FL-AceStep-Training | LoRA training pipeline in ComfyUI: auto-label, tiled VAE, real-time loss charts | GitHub |
| Comfyui_SN_AceStepTrainer | LoRA training nodes for ACE-Step 1.5 inside ComfyUI | GitHub |
| ComfyUI-kaola-ace-step | ComfyUI custom nodes for ACE-Step music generation | GitHub |

| Project | Description | Link |
|---|---|---|
| Side-Step | Standalone LoRA/LoKR toolkit for v1.5. Auto-detects variant, 8 GB VRAM training, interactive wizard + CLI | GitHub |
| ACE-Step-1.5-for-windows (sdbds) | LoRA and LoKR training with GPU memory offloading optimizations; integrated Gradio UI with style management and 4-language support | GitHub |
| ComfyUI-FL-AceStep-Training | End-to-end LoRA training inside ComfyUI with auto-labeling and live monitoring | GitHub |
| Ace-Step-1.5-Dataset-Manager | Desktop tool (Qt/C++) for editing LoRA training datasets: per-track caption, lyrics, BPM, key, audio preview | GitHub |

| Project | Description | Link |
|---|---|---|
| acestep-captioner | 11B music captioning model (Qwen2.5-Omni). 1000+ instruments, timbre, structure analysis. Reported accuracy surpasses Gemini 2.5 Pro | HF |
| acestep-transcriber | Qwen2.5 Omni-based music transcription. Structure annotation, lyrics transcription, 50+ languages | HF |

| Project | Description | Link |
|---|---|---|
| acestep.cpp | Portable C++17 / GGML implementation of ACE-Step 1.5. CPU, CUDA, Metal, Vulkan. Stereo 48 kHz WAV/MP3 output | GitHub |
| acestep.vst3 | Official VST3 plugin for DAW integration. JUCE 8 + acestep.cpp engine. Includes minimalist web UI | GitHub |
| ace-step-1.5 Docker | Docker image with models pre-baked (~15 GB). REST API server, RunPod template, CLI generation tool | GitHub |
| Generative Radio | Fully local AI radio station. Qwen3 generates prompts, ACE-Step 1.5 generates songs. Multi-listener, Apple Silicon optimized | GitHub |
| StemForge | Local GPU-accelerated audio workstation. Stem separation (Demucs, BS-Roformer), MIDI extraction, Stable Audio generation, ACE-Step composition, RVC voice conversion, mixing, and export — all in one browser UI | GitHub |
| Boppy | Free hosted AI music generator. Describe a song in plain text, LLM writes lyrics, ACE-Step 1.5 generates full audio. Any genre, up to 5 min, shareable links, no signup | Website |
| DEMON | Streaming diffusion engine for ACE-Step v1.5 with TensorRT acceleration, hot-mutable controls, and a bundled web demo | GitHub · Website |
A comparison of notable open-source music generation projects alongside ACE-Step.

| Project | Architecture | Capability | License | Link |
|---|---|---|---|---|
| ACE-Step | LM + DiT | Text/lyrics → full song (vocal + BGM), cover, repaint, LoRA. <4 GB VRAM | Apache-2.0 | GitHub |
| YuE | LLaMA2 autoregressive | Lyrics → full song, multi-genre, multi-lingual, voice cloning, style transfer | Apache-2.0 | GitHub |
| AudioCraft / MusicGen | Autoregressive transformer | Text → music/audio, melody conditioning, style conditioning (JASCO) | MIT | GitHub |
| Amphion | Multiple (SVC, TTS, TTA) | Singing voice conversion, text-to-audio, vocoders, research toolkit | MIT | GitHub |
| Riffusion | Stable Diffusion (spectrograms) | Real-time text → music via spectrogram diffusion | MIT | GitHub |
| Stable Audio Tools | DiT + flow matching | Text → variable-length stereo audio (up to 47 s) | MIT | GitHub |
| DiffRhythm | Latent diffusion (DiT + VAE) | Lyrics → full-length song (up to 4 min 45 s) in ~10 s | Apache-2.0 | GitHub |
| HeartMuLa | LLM-based codec | Song gen, lyric recognition, audio codec, audio-text alignment | Apache-2.0 | GitHub |
| SongGeneration (LeVo) | Transformer-based | Lyrics → high-quality full song with multi-preference alignment (vocals + BGM) | Non-commercial | GitHub |

| Title | Topic | Link |
|---|---|---|
| ACE-Step Prompt Guide | Detailed prompting tips: tags, lyrics structure, genre control | Ambience AI |
| Generate AI Music with ACE-Step 1.5 | Installation, generation, LoRA customization | DigitalOcean |
| ComfyUI ACE-Step 1.5 Guide | Official ComfyUI v1.5 workflow tutorial | Comfy.org |
| AMD ACE-Step 1.5 Local Guide | Running ACE-Step on AMD GPUs | PromptGalaxy |
| Running ACE-Step 1.5 on M2 Mac | Apple Silicon setup, MPS memory workarounds | BioErrorLog |
| Install ACE-Step 1.5 with UV | Git + UV package manager setup | PandaiTech |
| ACE-Step 1.5 DeepWiki | Architecture deep-dive, code walkthrough, Gradio UI internals | DeepWiki |
| ACE Studio | Professional AI music production suite | acestudio.ai |

| Paper | Version | Key Contribution | Link |
|---|---|---|---|
| ACE-Step: A Step Towards Music Generation Foundation Model | v1.0 | DCAE + linear transformer, REPA training | arXiv |
| ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation | v1.5 | Hybrid LM + DiT, intrinsic RL, comprehensive evaluation | arXiv |
Contributions welcome! Please read the contributing guidelines first.

To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this work.