unity-rendering-investigation

A basic performance investigation around a variety of rendering techniques within Unity.

Approaches

Unity Renderer MonoBehaviour

Every mesh gets its own GameObject, MeshFilter, and Renderer components.

Graphics.DrawMesh API

Transform matrices and materials are cached and all meshes are drawn using the Graphics.DrawMesh() function.

Details

Avoids the overhead of creating and managing Renderer and MeshFilter components.

Graphics.DrawMesh API with MaterialPropertyBlock

Same as above, but one material is used and each mesh gets its own MaterialPropertyBlock to augment that material.

Details

MaterialPropertyBlock is supposed to be more efficient, but it seems to take more time to render.

It's possible that it is only more memory efficient. It results in the same number of set pass calls and batches.

Graphics.DrawProcedural API

Each mesh is converted into two ComputeBuffers, one for indices and one for attributes, which are read in the vertex shader. A material and matrix are cached for each mesh, and each mesh is rendered using the Graphics.DrawProcedural() function, with GL.PushMatrix() used to set the transform of the draw.

Each material takes both the "points" and "attributes" compute buffers as parameters.
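The buffer lookup described above can be illustrated on the CPU. This is a minimal Python sketch, not the project's shader code. It assumes that "points" is the index buffer and that "attributes" holds one position per unique vertex. The function names are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def fetch_vertex(vertex_id: int, points: List[int], attributes: List[Vec3]) -> Vec3:
    """Mimic the vertex shader: the draw supplies only vertex_id,
    and the two buffers supply the actual vertex data."""
    index = points[vertex_id]      # assumption: "points" stores triangle indices
    return attributes[index]       # one attribute record per unique vertex


def emulate_procedural_draw(points: List[int], attributes: List[Vec3]) -> List[Vec3]:
    """Emulate a procedural draw that runs the shader once per index."""
    return [fetch_vertex(i, points, attributes) for i in range(len(points))]


# A quad: four unique vertices shared by two triangles.
quad_attributes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
quad_points = [0, 1, 2, 0, 2, 3]
quad_vertices = emulate_procedural_draw(quad_points, quad_attributes)
```

The draw call itself carries no geometry. It only says how many vertices to process, which is why the vertex count has to be known up front.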

Details

Despite many draws being issued, the Unity stats window reports only two "draw calls".

Every draw requires a new set pass call.

Because the mesh is procedural, this bypasses Unity's internal rendering logic, such as frustum culling.

Attributes can be stored in custom formats and unpacked in the vertex shader to save on memory.
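As one possible illustration of a custom format, the sketch below packs a unit normal into a single 32-bit value using 10 bits per component, then unpacks it the way a vertex shader could. The 10-10-10 layout and the function names are assumptions chosen for the example. The project may use a different encoding.

```python
def pack_normal(normal):
    """Quantize each component from [-1, 1] to 10 bits and pack x, y, z
    into one integer (x in the highest bits, z in the lowest)."""
    packed = 0
    for component in normal:
        clamped = max(-1.0, min(1.0, component))
        quantized = round((clamped * 0.5 + 0.5) * 1023)
        packed = (packed << 10) | quantized
    return packed


def unpack_normal(packed):
    """Reverse the packing; this is the work the vertex shader would do."""
    return tuple(((packed >> shift) & 0x3FF) / 1023 * 2.0 - 1.0 for shift in (20, 10, 0))


original = (0.0, -1.0, 1.0)
packed = pack_normal(original)
restored = unpack_normal(packed)

# Three 32-bit floats take 12 bytes; the packed form fits in 4 bytes.
FLOAT3_BYTES = 3 * 4
PACKED_BYTES = 4
```

The saving comes at the cost of precision: each component is only accurate to about one part in 1023 of the [-1, 1] range.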

Graphics.DrawProcedural API with Unpacked Vertex Buffer

Same as above, but the attributes for the mesh are unpacked into a single ComputeBuffer with three vertices per triangle.
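The unpacking step and its memory cost can be sketched in Python. The grid mesh, the 12-byte position-only attribute, and the 4-byte index size are assumptions made for the example.

```python
def make_grid(n):
    """Build an n x n grid of quads as an indexed triangle mesh."""
    attributes = [(float(x), float(y), 0.0) for y in range(n + 1) for x in range(n + 1)]
    indices = []
    for y in range(n):
        for x in range(n):
            a = y * (n + 1) + x      # bottom-left corner of the quad
            b = a + 1                # bottom-right
            c = a + (n + 1)          # top-left
            d = c + 1                # top-right
            indices += [a, b, d, a, d, c]
    return indices, attributes


def unpack_mesh(indices, attributes):
    """Expand to one attribute record per index (three per triangle),
    so the shader can read attributes[vertex_id] directly."""
    return [attributes[i] for i in indices]


ATTRIBUTE_BYTES = 12   # assumption: a float3 position only
INDEX_BYTES = 4        # assumption: 32-bit indices

indices, attributes = make_grid(4)
unpacked = unpack_mesh(indices, attributes)

indexed_bytes = len(indices) * INDEX_BYTES + len(attributes) * ATTRIBUTE_BYTES
unpacked_bytes = len(unpacked) * ATTRIBUTE_BYTES
```

For this 4 x 4 grid the indexed layout takes 684 bytes and the unpacked layout takes 1152 bytes. The gap widens as more vertices are shared between triangles.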

Details

The only advantage might be fewer array accesses in the vertex shader.

This approach takes more memory and transforms more vertices.

Graphics.DrawProcedural with Visible Triangles Array

Runs an occlusion pass, counts the triangles, generates an array of visible triangles, and renders only the visible triangles using DrawProcedural.

Details

Requires a few compute shader passes and prepasses to run. These passes can be spread over multiple frames:

1. Render the whole model to a RenderTexture
2. Clear the "visible" array
3. Mark triangles as "visible" in a buffer with a compute shader pass
4. Write the triangle ids needed to an append buffer in a compute shader pass
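The passes listed above can be sketched on the CPU as follows. This is a serial Python stand-in for the compute shader passes. It assumes the RenderTexture stores one triangle id per pixel, with -1 for background. The function and parameter names are hypothetical.

```python
def find_visible_triangles(id_texture, triangle_count, background=-1):
    """Return the ids of triangles that cover at least one pixel."""
    # Clear the "visible" array.
    visible = [0] * triangle_count

    # Mark pass: every pixel flags the triangle it shows.
    for row in id_texture:
        for triangle_id in row:
            if triangle_id != background:
                visible[triangle_id] = 1

    # Append pass: collect flagged ids into a compact list.
    # A GPU append buffer gives no ordering guarantee; this serial
    # version happens to produce ascending ids.
    return [tri for tri, flag in enumerate(visible) if flag]


# A tiny 3 x 4 "render" of a 6-triangle model; triangles 1, 3 and 4 are hidden.
id_texture = [
    [-1, 0, 0, 2],
    [5, 5, 0, 2],
    [5, -1, -1, 2],
]
visible_ids = find_visible_triangles(id_texture, triangle_count=6)
```

Only the triangles in `visible_ids` are then drawn, so the per-frame cost tracks what is on screen rather than the total model size.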

More memory is required for the extra textures and buffers.

This limits the number of triangles that have to be drawn to the primary buffer every frame to a minimal and possibly consistent count.

Moving a mesh requires updating a buffer with the model attributes.

Other Concepts

Single ComputeBuffer for All Meshes

One compute buffer could be used to store all the attributes for all meshes with an offset buffer to help address a specific point in the attribute buffer to render. Multiple meshes could then be drawn by instancing and the instanceId can be used to address the specific mesh to draw.
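A minimal Python sketch of the shared buffer and offset lookup, assuming one offset per mesh and position-only attributes. The class and method names are hypothetical, and this idea is not implemented in the project.

```python
class SharedAttributeBuffer:
    """One flat attribute list for all meshes plus a per-mesh offset list."""

    def __init__(self):
        self.attributes = []   # stands in for the single attribute ComputeBuffer
        self.offsets = []      # stands in for the offset buffer, one entry per mesh

    def add_mesh(self, vertices):
        """Append a mesh and return its id (what instanceId would address)."""
        self.offsets.append(len(self.attributes))
        self.attributes.extend(vertices)
        return len(self.offsets) - 1

    def fetch(self, instance_id, vertex_id):
        """Mimic the shader lookup: mesh offset plus the vertex within that mesh."""
        return self.attributes[self.offsets[instance_id] + vertex_id]


shared = SharedAttributeBuffer()
triangle_id = shared.add_mesh([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
line_id = shared.add_mesh([(5.0, 5.0, 0.0), (6.0, 5.0, 0.0)])

first_line_vertex = shared.fetch(line_id, 0)
last_triangle_vertex = shared.fetch(triangle_id, 2)
```

The lookup works for meshes of any length, but nothing here tells the draw how many vertices each instance has.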

The biggest issue is that DrawProcedural() requires you to specify the number of vertices per draw, which means every mesh drawn in the same instanced call must have the same vertex count.