wgpu-mojo

June 13, 2026

Mojo bindings for wgpu-native, providing a lightweight WebGPU wrapper with RAII-friendly GPU objects.

GPU Fire Simulation Demo

Real-time Doom-style fire running entirely on the GPU — compute shader → render shader → GLFW window.

# from a clone of the repo: build the native bridges, then run the demo
pixi run build-callbacks && pixi run example-fire-sim

Using wgpu-mojo in your project

Step 1 — Add the package

Your pixi.toml needs the Modular and conda-forge channels, plus the pixi-build preview feature:

[workspace]
channels = ["https://conda.modular.com/max-nightly", "conda-forge"]
preview  = ["pixi-build"]

Then add the dependency:

pixi add --git https://github.com/Hundo1018/wgpu-mojo wgpu-mojo

This compiles wgpu.mojopkg and installs it into your environment. All from wgpu import … statements now resolve at compile time.

Step 2 — Install the native GPU library (once per machine)

The Mojo package needs libwgpu_native and its callback bridge at runtime. Run this once inside your project's activated pixi environment:

curl -fsSL https://raw.githubusercontent.com/Hundo1018/wgpu-mojo/main/scripts/setup-native.sh | bash

What it does:

  • Downloads libwgpu_native v29.0.0.0 from wgpu-native releases
  • Compiles the Mojo callback bridge (libwgpu_mojo_cb) from source
  • Installs both to $CONDA_PREFIX/lib/
  • If GLFW is present, also compiles the windowed rendering bridge (libglfw_input_cb)

Requires curl, unzip, and gcc on your PATH; install any that are missing with your system package manager or from conda-forge.

Step 3 — Write GPU code

from wgpu import Instance, WGPUBufferUsage

def main() raises:
    var instance = Instance()
    var adapter  = instance.request_adapter()
    var device   = adapter.request_device()

    # Allocate a GPU storage buffer
    var buf = device.create_buffer(
        UInt64(1024),
        WGPUBufferUsage.STORAGE | WGPUBufferUsage.COPY_SRC,
        label = "my_buffer",
    )
    # buf releases automatically (RAII) when it goes out of scope
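
For sizing, here is a minimal sketch under one assumption: that the first argument to create_buffer is a size in bytes, as in the underlying WebGPU API. The helper name f32_buffer_bytes is hypothetical and not part of wgpu-mojo; it only shows the arithmetic for a buffer that holds Float32 elements.

```mojo
# Hypothetical helper (not part of wgpu-mojo).
# Assumes buffer sizes are expressed in bytes; one Float32 / WGSL f32 is 4 bytes.
def f32_buffer_bytes(count: UInt64) -> UInt64:
    return count * 4


def main():
    # 256 Float32 elements need 1024 bytes, the size used in the snippet above.
    print(f32_buffer_bytes(256))
```

If that assumption holds, the 1024-byte buffer in Step 3 has room for 256 f32 values.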

Examples

Headless GPU Compute (no window)

Vector addition on the GPU, result read back to CPU:

from wgpu import Instance, GPU, Buffer, WGPUBufferUsage

comptime WGSL = """
@group(0) @binding(0) var<storage, read>       a   : array<f32>;
@group(0) @binding(1) var<storage, read>       b   : array<f32>;
@group(0) @binding(2) var<storage, read_write> out : array<f32>;
@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    let i = id.x;
    if i < arrayLength(&a) { out[i] = a[i] + b[i]; }
}
"""

def main() raises:
    var instance = Instance()
    var device   = instance.request_adapter().request_device()
    var gpu      = GPU.wgpu(instance^)

    var n: UInt64 = 1024
    var a = gpu.buffer[Float32](n, WGPUBufferUsage.STORAGE | WGPUBufferUsage.COPY_DST)
    var b = gpu.buffer[Float32](n, WGPUBufferUsage.STORAGE | WGPUBufferUsage.COPY_DST)
    var c = gpu.buffer[Float32](n, WGPUBufferUsage.STORAGE | WGPUBufferUsage.COPY_SRC)

    gpu.write(a, List[Float32](1.0, 2.0, 3.0))   # fill first 3 elements
    gpu.write(b, List[Float32](10.0, 20.0, 30.0))

    var prog = gpu.compile_compute(WGSL, entry_point="main", n_storage_buffers=3)
    gpu.dispatch(prog^, List[Buffer](a, b, c), wx=UInt32(n // 64))

    var result = gpu.read[Float32](c, count=3)
    print(result[0], result[1], result[2])  # 11.0  22.0  33.0

Full source: examples/compute_add_v2.mojo
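
The dispatch above uses n // 64 workgroups, which covers every element only when n is a multiple of the workgroup size. Below is a minimal sketch of rounding up instead, assuming wx is the workgroup count along x. The name workgroups_for is a hypothetical helper, not part of wgpu-mojo.

```mojo
# Hypothetical helper (not part of wgpu-mojo): ceiling division of the
# element count by the workgroup size, so a partial last group is included.
def workgroups_for(n: UInt64, workgroup_size: UInt64) -> UInt32:
    return UInt32((n + workgroup_size - 1) // workgroup_size)


def main():
    print(workgroups_for(1024, 64))  # 16: exact multiple, same as n // 64
    print(workgroups_for(1000, 64))  # 16: n // 64 would give 15 and skip 40 elements
    print(workgroups_for(1025, 64))  # 17: one extra group for the last element
```

The bounds check in the shader (i < arrayLength(&a)) keeps the surplus invocations in the last workgroup from writing out of range.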

Render Triangle (GLFW window)

from wgpu.instance import Instance
from wgpu._ffi.structs import WGPUColor
from wgpu.rendercanvas import RenderCanvas

comptime WGSL = """
struct V { @builtin(position) pos: vec4<f32>, @location(0) col: vec3<f32> }
@vertex fn vs(@builtin(vertex_index) i: u32) -> V {
    var pos = array<vec2<f32>,3>(vec2(0.0,0.5), vec2(-0.5,-0.5), vec2(0.5,-0.5));
    var col = array<vec3<f32>,3>(vec3(1,0,0), vec3(0,1,0), vec3(0,0,1));
    return V(vec4(pos[i],0,1), col[i]);
}
@fragment fn fs(v: V) -> @location(0) vec4<f32> { return vec4(v.col,1); }
"""

def main() raises:
    var instance = Instance()
    var adapter  = instance.request_adapter()
    var device   = adapter.request_device()
    var canvas   = RenderCanvas(adapter, device, 800, 600, "hello triangle")
    var shader   = device.create_shader_module_wgsl(WGSL, "tri")
    var layout   = device.create_pipeline_layout(List[OpaquePointer[MutExternalOrigin]](), "layout")
    var pipeline = device.create_render_pipeline(
        shader, "vs", "fs", canvas.surface_format(), layout,
        primitive_topology = UInt32(4),  # 4 = WGPUPrimitiveTopology_TriangleList in webgpu.h
    )
    while canvas.is_open():
        canvas.poll()
        var frame = canvas.next_frame()
        if not frame.is_renderable(): continue
        var enc   = device.create_command_encoder("frame")
        var rpass = enc.begin_surface_clear_pass(frame.texture, WGPUColor(0,0,0,1), "pass")
        rpass.set_pipeline(pipeline)
        rpass.draw(UInt32(3), UInt32(1), UInt32(0), UInt32(0))
        rpass^.end()
        device.queue_submit(enc^.finish())
        canvas.present()

Full source: examples/triangle_window.mojo

Texture Sampling

from wgpu import Instance, WGPUTextureUsage, WGPUTextureFormat, WGPUFilterMode

Full source: examples/texture_sample.mojo

Enumerate GPU Adapters

pixi run example-enumerate

The output lists the available GPU backends (Vulkan, Metal, DX12) and the adapter names.


Available pixi tasks (development clone)

| Task | Description |
| --- | --- |
| build-callbacks | Compile C callback bridge (required before GPU tasks) |
| hello | Hello triangle quickstart |
| example-compute | Headless vector addition |
| example-enumerate | List available GPU adapters |
| example-clear | Cornflower-blue window (smoke test) |
| example-triangle | RGB triangle in a window |
| example-texture-sample | Texture sampling demo |
| example-fire-sim | GPU fire simulation (compute + render) |
| test | Non-GPU unit tests |

Core API

| Module | What it provides |
| --- | --- |
| wgpu.instance | Instance: library entry point, adapter selection |
| wgpu.adapter | Adapter: device creation |
| wgpu.device | Device: create all GPU objects, submit work |
| wgpu.buffer | Buffer: GPU memory allocation, mapping |
| wgpu.texture | Texture, TextureView |
| wgpu.shader | ShaderModule: WGSL compilation |
| wgpu.pipeline | ComputePipeline, RenderPipeline |
| wgpu.command | CommandEncoder, CommandBuffer |
| wgpu.compute_pass | ComputePassEncoder |
| wgpu.render_pass | RenderPassEncoder |
| wgpu.bind_group | BindGroup, BindGroupLayout |
| wgpu.gpu | GPU: high-level facade (compile + dispatch in ~5 lines) |
| wgpu.rendercanvas | RenderCanvas: GLFW window + surface |
| wgpu.diagnostics | preflight(): check library load, list adapters |

All types are re-exported from wgpu so from wgpu import Instance always works.


Lifetime and Ownership

Wrappers are RAII: GPU objects release automatically when they go out of scope. Two patterns to keep in mind:

Encoder types must be explicitly finished:

var enc   = device.create_command_encoder("enc")
var cpass = enc.begin_compute_pass("pass")
# ...
cpass^.end()                  # consume the compute pass encoder
var cmd = enc^.finish("cmd")  # consume and produce CommandBuffer
device.queue_submit(cmd^)

Pin resources that must outlive the GPU call:

var pipeline = device.create_compute_pipeline(...)
var bg = device.create_bind_group(...)
device.queue_submit(cmd^)
_ = pipeline^   # prevent ASAP drop before GPU finishes
_ = bg^
device.poll(True)

The GPU facade (wgpu.gpu) handles these pins automatically.


Development Setup (cloning the repo)

git clone https://github.com/Hundo1018/wgpu-mojo
cd wgpu-mojo
pixi run build-callbacks   # download wgpu-native + compile C bridges
pixi run test              # non-GPU unit tests
pixi run hello             # GPU smoke test — RGB triangle window

pixi run build-callbacks performs the same steps as setup-native.sh but reads the native library version from ffi/wgpu-native-meta/wgpu-native-git-tag and uses sources already present in the repo.

GPU driver prerequisites

| Platform | What to install |
| --- | --- |
| Linux | Vulkan: mesa-vulkan-drivers + libvulkan1, or NVIDIA proprietary drivers |
| macOS | Metal is built in; no extra drivers needed |
| Windows | D3D12 or Vulkan; usually already present with GPU vendor drivers |

Platform support

linux-64 and osx-arm64 are fully supported via pixi. osx-64 (Intel macOS) and win-64 can be built manually; see conda.recipe/recipe.yaml for the build steps.


Diagnostics

from wgpu.diagnostics import preflight
print(preflight())

Prints library search paths, load status, wgpu-native version, and available GPU adapters.


License

Apache-2.0