April 5, 2026
Implementation Details
Resources Setup
Material & Mesh: The GsplatSettings singleton owns global rendering resources. It:
- Maintains a `GsplatMaterial` array indexed by `CompressionMode` (e.g. `Uncompressed`, `Spark`), where each `GsplatMaterial` contains a `DefaultMaterial`, a `CalcDepthShader`, and an `InitOrderShader`. Each `GsplatMaterial` lazily generates `Materials` for each combination of SH band (0-3) and render order (defined by `GsplatSettings.MaxRenderOrder`).
- Procedurally generates a `Mesh` that consists of multiple quads. The number of quads is defined by `SplatInstanceSize`. Each vertex of these quads has its z-coordinate encoded with an intra-instance index, which is used in the vertex shader to fetch the splat order.
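As a CPU-side illustration of the mesh layout described above, the following Python sketch builds the vertices and triangle indices for one instance and shows how a shader could recover the splat order from the instance ID and the z-encoded index. The function names, the corner layout, and the formula `order = instance_id * splat_instance_size + z` are assumptions made for illustration, not the package's actual code.

```python
def build_instance_vertices(splat_instance_size):
    """Return (x, y, z) vertices for `splat_instance_size` quads.

    x and y hold the quad corner in [-1, 1]; z carries the quad's
    intra-instance index (an assumed layout, for illustration only).
    """
    corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    vertices = []
    for quad_index in range(splat_instance_size):
        for x, y in corners:
            vertices.append((x, y, float(quad_index)))
    return vertices


def build_instance_indices(splat_instance_size):
    """Two triangles per quad, indexing into build_instance_vertices()."""
    indices = []
    for quad_index in range(splat_instance_size):
        base = quad_index * 4
        indices.extend([base, base + 1, base + 2, base, base + 2, base + 3])
    return indices


def splat_order(instance_id, vertex_z, splat_instance_size):
    """Combine the instance ID with the z-encoded index (assumed formula)."""
    return instance_id * splat_instance_size + int(vertex_z)


def instance_count(splat_count, splat_instance_size):
    """Number of instances needed to cover every splat (ceiling division)."""
    return (splat_count + splat_instance_size - 1) // splat_instance_size
```

With this layout the last instance may cover more slots than there are splats, so a shader following this scheme would need to reject any `order` that is not below the splat count.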
Gsplat Data: This package supports importing PLY files in two modes via GsplatAsset implementations:
- Uncompressed: `GsplatAssetUncompressed` stores per-splat arrays (`Positions`, `Colors`, `Scales`, `Rotations`, optional `SHs`) and uploads them to dedicated GPU `GraphicsBuffer`s.
- Spark (Packed): `GsplatAssetSpark` packs each splat into a fixed 16-byte layout (`uint4` per splat in `PackedSplats`) plus optional packed SH arrays (2 uints per splat for SH1, 4 uints for SH2, and 4 uints for SH3). The packing uses float16 position, log-encoded scale, RGBA8 color, and an octahedral axis+angle quaternion encoding, and is inspired by spark.js.
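To make the 16-byte budget concrete, here is a minimal Python sketch of packing one splat into a fixed record that can be read back as four 32-bit uints. The field order, the log-scale range (`LN_SCALE_MIN`, `LN_SCALE_MAX`), and all function names are hypothetical; the rotation is passed in as three already-encoded bytes because the exact octahedral encoding is not specified here.

```python
import math
import struct

# Hypothetical log-scale range; the package's real constants may differ.
LN_SCALE_MIN = -12.0
LN_SCALE_MAX = 9.0


def encode_log_scale(scale):
    """Map a positive scale to one byte, spaced uniformly in ln(scale)."""
    t = (math.log(scale) - LN_SCALE_MIN) / (LN_SCALE_MAX - LN_SCALE_MIN)
    return max(0, min(255, round(t * 255.0)))


def decode_log_scale(byte):
    """Inverse of encode_log_scale, up to quantization error."""
    return math.exp(LN_SCALE_MIN + (byte / 255.0) * (LN_SCALE_MAX - LN_SCALE_MIN))


def to_unorm8(value):
    """Quantize a value in [0, 1] to one byte."""
    return max(0, min(255, round(value * 255.0)))


def pack_splat(position, rgba, scales, quat_bytes):
    """Pack one splat into 16 bytes (assumed field order).

    6 bytes float16 position + 4 bytes RGBA8 + 3 bytes log scale
    + 3 bytes pre-encoded rotation = 16 bytes.
    """
    return struct.pack(
        "<3e4B3B3B",
        *position,
        *(to_unorm8(c) for c in rgba),
        *(encode_log_scale(s) for s in scales),
        *quat_bytes,
    )


def as_uint4(packed):
    """View the 16-byte record as four little-endian uint32 values."""
    return struct.unpack("<4I", packed)
```

Storing the scale in log space spends the 8 bits evenly across orders of magnitude, which suits splat scales that range from very small to very large.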
GPU Resources & Lifetime:
- `GsplatRendererImpl` creates a per-renderer `OrderBuffer` (which will later store the sorted indices of the splats), small buffers for cutout data (`CutoutsBuffer`, `OrderSizeBuffer`, `BoundsBuffer`) and an `ISorterResource` (sorting support buffers and key buffer).
- Per-asset GPU data buffers are allocated and cached by `GsplatResourceManager` (reference counted), so multiple renderers can share the same uploaded asset.
- Upload can be synchronous (`UploadData`) or asynchronous batched (`UploadDataAsync`), controlled by `GsplatRenderer.AsyncUpload`. The renderer can optionally draw before upload completes (`RenderBeforeUploadComplete`).
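The reference-counted sharing described above can be sketched as a small cache. This Python version is only an illustration of the pattern: the class name, the `upload` and `release` callbacks, and the string asset key are hypothetical and do not mirror the `GsplatResourceManager` API.

```python
class ResourceCache:
    """Reference-counted cache keyed by asset (hypothetical names)."""

    def __init__(self, upload, release):
        self._upload = upload    # callable: asset_key -> resource handle
        self._release = release  # callable: resource handle -> None
        self._entries = {}       # asset_key -> [resource, ref_count]

    def acquire(self, asset_key):
        """Return the shared resource, uploading it on first use."""
        entry = self._entries.get(asset_key)
        if entry is None:
            entry = [self._upload(asset_key), 0]
            self._entries[asset_key] = entry
        entry[1] += 1
        return entry[0]

    def release(self, asset_key):
        """Drop one reference and free the resource when none remain."""
        entry = self._entries[asset_key]
        entry[1] -= 1
        if entry[1] == 0:
            self._release(entry[0])
            del self._entries[asset_key]
```

Two renderers that acquire the same asset receive the same handle, and the buffers are freed only after both have released it.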
Rendering Pipeline
The following steps are performed each frame for every active camera, with two exceptions: the Sorting pass is executed only every Nth frame, and the Compute pass is executed only every Nth sort, as configured in the GsplatRenderer. Sorting and compute are also triggered when a camera moves past a certain threshold, and both can be triggered manually.
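One way to read this schedule is as a small per-camera state machine. The Python sketch below is a hedged interpretation, not the package's logic: the class, its parameter names, the distance-based movement test, and the choice to run both passes on the first frame are all assumptions.

```python
import math


class UpdateScheduler:
    """Decides per frame whether to run the compute pass and the sort.

    All names are hypothetical stand-ins for renderer settings.
    """

    def __init__(self, sort_every_n_frames, compute_every_n_sorts, move_threshold):
        self.sort_every_n_frames = sort_every_n_frames
        self.compute_every_n_sorts = compute_every_n_sorts
        self.move_threshold = move_threshold
        self.force = False               # set to True for a manual trigger
        self._frames_since_sort = 0
        self._sorts_since_compute = 0
        self._last_sort_position = None  # None until the first sort

    def tick(self, camera_position):
        """Return (run_compute, run_sort) for this frame."""
        first = self._last_sort_position is None
        moved = (not first and
                 math.dist(camera_position, self._last_sort_position)
                 > self.move_threshold)
        triggered = first or moved or self.force
        self.force = False

        self._frames_since_sort += 1
        if not (triggered or self._frames_since_sort >= self.sort_every_n_frames):
            return (False, False)

        # A sort happens this frame; compute can only run alongside a sort.
        self._frames_since_sort = 0
        self._last_sort_position = tuple(camera_position)
        self._sorts_since_compute += 1
        run_compute = (triggered or
                       self._sorts_since_compute >= self.compute_every_n_sorts)
        if run_compute:
            self._sorts_since_compute = 0
        return (run_compute, True)
```

With `sort_every_n_frames=2` and `compute_every_n_sorts=2`, a static camera sorts on every second frame and recomputes on every second sort, while a large camera move or a manual trigger runs both immediately.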
Compute Prepass
This pre-pass performs precalculations when needed. Currently, it is only used to generate the index buffer when cutouts are enabled.
- InitOrder (Optional): In `GsplatRendererImpl.DispatchInitOrder`, if cutouts are used and have changed since the last call, the prepass generates a buffer of sequential indices (`OrderBuffer`), similar to `InitPayload`. While doing so, it queries each splat's position and skips any splat culled by a cutout. The new bounds of the Gaussian are calculated at the same time. Then the remaining number of splats is extracted from the `OrderBuffer`.
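The effect of this prepass can be illustrated on the CPU. In the Python sketch below, `is_culled` is a hypothetical predicate standing in for the cutout test, and the loop is sequential, whereas a GPU kernel would process splats in parallel and might not preserve index order.

```python
def init_order(positions, is_culled):
    """Build the compacted index list, the bounds, and the surviving count.

    positions: list of (x, y, z) tuples.
    is_culled: callable taking a position and returning True if a cutout
               removes that splat (hypothetical stand-in for the cutout test).
    """
    order = []
    lo = [float("inf")] * 3
    hi = [float("-inf")] * 3
    for index, position in enumerate(positions):
        if is_culled(position):
            continue  # culled splats never enter the order buffer
        order.append(index)
        for axis in range(3):
            lo[axis] = min(lo[axis], position[axis])
            hi[axis] = max(hi[axis], position[axis])
    bounds = (tuple(lo), tuple(hi)) if order else None
    return order, bounds, len(order)
```

Because culled splats are dropped before sorting, the later sort and draw only need to cover the surviving count.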
Sorting Pass
This pass sorts the splats by their depth to the camera. The sorting is performed entirely on the GPU using Gsplat.compute. This compute shader leverages a highly optimized radix sort implementation from DeviceRadixSort.hlsl.
- Integration: The sorting is initiated by custom render pipeline hooks: `GsplatURPFeature` for URP, `GsplatHDRPPass` for HDRP, or `GsplatSorter.OnPreCullCamera` for BiRP. These hooks call `GsplatSorter.DispatchSort`.
- Sorting Steps:
  - InitPayload (Optional): If the payload buffer (`b_sortPayload`/`OrderBuffer`) has not been initialized, fill it with sequential indices (0, 1, 2, ..., `SplatCount-1`).
  - CalcDepth: `IGsplat.ComputeDepth` runs an asset-specific compute kernel (`CalcDepth` or `CalcDepthSpark`) to calculate the view-space depth of each splat, and stores the depths in `SorterResource.InputKeys`, where they serve as the sorting keys.
  - DeviceRadixSort: The `Upsweep`, `Scan`, and `Downsweep` kernels execute a device-wide radix sort. It sorts the depth values in the `b_sort` buffer. Crucially, it applies the same reordering operations to the `b_sortPayload` buffer.
- Result: After the sort, the `b_sortPayload` buffer (which is the `OrderBuffer` from `GsplatRendererImpl`) contains the original splat indices, now sorted back-to-front based on their depth to the camera.
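The three sorting steps can be mirrored on the CPU with a least-significant-digit radix sort that moves a payload alongside the keys. This Python sketch is an illustration only: it treats depth as a distance where larger means farther (an assumption about the sign convention), uses a common float-to-sortable-uint trick that the package may or may not use, and only loosely maps its histogram, prefix sum, and scatter phases to the roles of the `Upsweep`, `Scan`, and `Downsweep` kernels.

```python
import struct

RADIX_BITS = 8
RADIX = 1 << RADIX_BITS


def depth_to_key(depth):
    """Map a float depth to a uint32 whose ascending order is far-to-near."""
    bits = struct.unpack("<I", struct.pack("<f", depth))[0]
    # Float-to-sortable-uint trick: flip all bits of negative values,
    # set the sign bit of non-negative values.
    ascending = (~bits & 0xFFFFFFFF) if bits & 0x80000000 else (bits | 0x80000000)
    return 0xFFFFFFFF - ascending  # invert so larger depths sort first


def radix_sort_with_payload(keys, payload):
    """Stable LSD radix sort of keys; payload follows the same moves."""
    keys, payload = list(keys), list(payload)
    for shift in range(0, 32, RADIX_BITS):
        # Histogram of the current digit.
        counts = [0] * RADIX
        for key in keys:
            counts[(key >> shift) & (RADIX - 1)] += 1
        # Exclusive prefix sum: first output slot of each digit.
        offsets, total = [0] * RADIX, 0
        for digit in range(RADIX):
            offsets[digit] = total
            total += counts[digit]
        # Stable scatter of keys and payload together.
        out_keys, out_payload = [0] * len(keys), [0] * len(keys)
        for key, value in zip(keys, payload):
            digit = (key >> shift) & (RADIX - 1)
            out_keys[offsets[digit]] = key
            out_payload[offsets[digit]] = value
            offsets[digit] += 1
        keys, payload = out_keys, out_payload
    return keys, payload


def sort_back_to_front(depths):
    payload = range(len(depths))               # InitPayload: 0, 1, 2, ...
    keys = [depth_to_key(d) for d in depths]   # CalcDepth
    _, order = radix_sort_with_payload(keys, payload)
    return order
```

The returned list plays the role of the sorted payload: entry `i` is the original index of the splat to draw in position `i`.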
Render Pass
With the splats sorted, they can now be drawn using Gsplat.shader.
- Draw Call: The `GsplatRendererImpl.Render` method issues a single draw call via `Graphics.RenderMeshPrimitives`. It uses GPU instancing to render multiple instances of the procedurally generated quad mesh, and a material from `GsplatAsset.Material` is selected based on the desired `SHBands`. All necessary buffers and parameters (`_MATRIX_M`, `_SplatCount`, etc.) are passed to the shader via a `MaterialPropertyBlock`.
- Vertex Shader:
  - Index Calculation: It determines the final splat `order` to render by combining the `instanceID` with the intra-instance index stored in the vertex's z-component.
  - Fetch Sorted ID: It uses this `order` to look up the actual splat `id` from the `_OrderBuffer`. This `id` corresponds to the correct, depth-sorted splat.
  - Fetch Splat Data: Using this sorted `id`, it fetches the splat's position, rotation, scale, color, and SH data from their respective buffers.
  - Apply Scale factor: The splat's UV coordinates are multiplied by the splat's `_ScaleFactor`, cropping the splat to the given scale.
  - Covariance & Projection: It calculates the 2D covariance matrix of the Gaussian in screen space. This determines the shape and size of the splat on the screen. It performs frustum and small-splat culling for efficiency.
  - SH Calculation (Optional): If SHs are used, `EvalSH` is called to calculate the view-dependent color component, which is then added to the base color.
  - Vertex Output: It calculates the final clip-space position of the quad's vertex by offsetting it from the splat's projected center based on the 2D covariance. The final color and UV coordinates (representing the position within the Gaussian ellipse) are passed to the fragment shader.
- Fragment Shader:
  - It calculates the squared distance from the pixel to the center of the Gaussian ellipse using the interpolated UVs.
  - If the pixel is outside the ellipse (`A > 1.0`), it is discarded.
  - The alpha is calculated using an exponential falloff based on the distance, modulated by the splat's opacity. Pixels with very low alpha are discarded.
  - An additional falloff, based on the scaling factor, is applied to the alpha to keep the Gaussian splats smooth. This prevents harsh edges on cropped splats.
  - The final color is the vertex color multiplied by the calculated alpha. An optional `Gamma To Linear` conversion can be applied before output.
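The core of the fragment stage can be written as a few lines of arithmetic. The Python sketch below follows the steps listed above but leaves out the scale-factor falloff and the gamma conversion; the exponent scale `FALLOFF` and the cutoff `ALPHA_CUTOFF` are assumed values, since the shader's actual constants are not stated here.

```python
import math

ALPHA_CUTOFF = 1.0 / 255.0  # assumed threshold for "very low alpha"
FALLOFF = 4.0               # assumed scale of the exponential falloff


def shade_fragment(uv, color, opacity):
    """Return premultiplied (r, g, b, a), or None if the pixel is discarded."""
    a = uv[0] * uv[0] + uv[1] * uv[1]  # squared distance A from the center
    if a > 1.0:
        return None                    # outside the ellipse
    alpha = math.exp(-a * FALLOFF) * opacity
    if alpha < ALPHA_CUTOFF:
        return None                    # contribution too small to matter
    r, g, b = color
    return (r * alpha, g * alpha, b * alpha, alpha)
```

Multiplying the color by alpha yields premultiplied output, which is the form that back-to-front blending of sorted splats typically expects.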