Draw call optimization in UE5

With the introduction of Nanite and Lumen, should we still put effort into authoring assets with draw call reduction in mind? This was a thing to worry about a few years back, but now I am seeing contradictory information. I would really appreciate hearing what people’s experience has been like in recent versions of Unreal.

Thanks,

Richard

This question is kind of coming at the problem backwards. It’s a bit like asking whether you need to care about what type of screwdriver you have without knowing what types of screws you need to turn. Your main goal should never be to simply reduce draw calls, but to improve performance to within a reasonable threshold. There’s no set rule for what is necessary or unnecessary; it is highly dependent on the specifics of your project and your target platform.

Nanite and Lumen are not magic bullets. Nanite may be able to reduce object-related draws, but it cannot reduce material-related draws. High-end systems can usually handle thousands of draw calls per frame without much issue, but if you have 100+ unique materials then performance will degrade fast, especially in a deferred pipeline, which Lumen requires.

The point is, ultimately you need to measure your performance, determine what the largest bottlenecks are, and focus on those to reach your target (for example, 4K at 60 FPS). If that means authoring assets to reduce draw calls, then that’s what you need to do, but it doesn’t start there. It starts with measuring performance. Otherwise, you are literally just guessing.
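
To make “measure first” a bit more concrete, a handful of stock console commands will tell you whether you’re actually bound on draw call submission or on the GPU before you touch any content:

```
stat unit
stat scenerendering
stat rhi
ProfileGPU
```

stat unit gives you the game/draw/GPU thread split at a glance, stat scenerendering and stat rhi show the draw call counts going through the render thread and RHI, and ProfileGPU dumps per-pass GPU timings for a single frame to the log. If the Draw thread is your long pole, draw call reduction is worth the effort; if the GPU is, shading cost is usually the better place to start. Unreal Insights is the next step up when you need a full trace.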

2 Likes

Thank you for the thoughtful and informative response.
Reducing draw calls has always been an essential goal when establishing material and asset pipelines. Most technical people know why it’s important and what it takes to reduce draw calls. Most of the time it was a clear goal, but achieving it meant restrictions and a tedious, ultimately slower art pipeline. No doubt there have been many efforts to establish slick workflows that reduced the pain. Recently I saw this video and was surprised to see that this developer’s efforts to reduce draw calls turned out to be futile. I agree that ultimately the solution depends on many factors, but I also understand that rendering tech is improving drastically, and oftentimes certain optimization techniques no longer apply. Certainly Epic’s efforts to develop the virtualized rendering pipeline are redefining the rules quite a bit.

It is more difficult to test performance when a project is just starting out, hence I wanted to see whether there are teams out there with established projects that are using the latest UE5 rendering features and still seeing benefits from chasing an authoring workflow that targets draw call reduction. Squeezing out every millisecond would certainly be crucial for an open-world game with rich environments.

TL;DR: yes, you still have to care about draw calls even with Nanite (and although that video you shared might be a bit overkill, the patterns it demonstrates point you in the right direction).

Why Reduce “Draw Calls”

Before I get too deep into talking about “draw calls”, I want to clarify that in Nanite Land our goal is to reduce raster and shading bins. Nanite’s VisBuffer pass rasterizes all geometry that shares the same raster bin (fixed function, or programmable for Masked materials, WPO, and PDO) and the same shader (descriptor heap + PSO combo) into a single shading bin. What that means for content creators is that we’re no longer concerned with unique geometry, only unique shaders.

Rasterization

On the VS side of things, Nanite’s fixed-function rasterization for non-deforming, opaque geometry is pretty darn fast. It can rasterize all of that completely static geometry in a single ExecuteIndirect call. To go a little deeper, there are separate fixed-function paths for Hardware Rasterization (triangles > r.Nanite.MaxPixelsPerEdge) and Software Rasterization (triangles <= r.Nanite.MaxPixelsPerEdge). Within the HW and SW paths, there are separate fixed-function bins for base, Spline, TwoSided, and Skinned (as well as some combinations of those). The SW path additionally has bins for tessellated geometry.

Shading

On the PS side of things, and for the purposes of content creators, “shader” effectively means “Material Instance”. That could be constant (i.e., a Material Instance Constant asset) or dynamic (a MID).

Again, the goal for tech artists is to try to reduce the number of unique bins of both kinds. This matters on PC especially, because the engine doesn’t yet have the capability to skip empty shading bins there.

Empty shading bins? Well, unfortunately the engine has to allocate those bins before they’re rasterized. In order to do that, the engine queues up a shading bin for each unique shader in the world. If a bin ends up not rasterizing any pixels, the engine can’t un-allocate that bin. Empty bins still require context switching, so that’s bad! On Gen9 platforms, though, the engine does have a few ways to skip over shaders that don’t end up with any on-screen pixels. Graham Wihlidal talks about that in his GDC 2024 presentation Nanite GPU-Driven Materials (Video | Slides).

Worst Practices

So keeping all that in mind, I think it’s important to highlight the absolute worst thing you can do for Nanite rendering. It looks something like this:

In order to improve artist workflows and reduce content browser clutter from Material Instance Constant assets, we expose certain parameters of a material to the content creators at the actor level. In the actor’s construction script, we create a Material Instance Dynamic for each material on every mesh component on the actor so that we can update the exposed parameters for the artists.

OH! And since we’re an isometric game, each of our base materials is masked. That way we can fade out the geometry when it’s between the character and the camera! And when you go into a building, we use World Position Offset to animate the roof popping off so you can see inside.

Now, instead of at least being able to combine all those unique materials into a single shading bin, this studio has instead broken every single primitive into its own raster and shading bin.
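
To make that concrete, here’s roughly what that pattern looks like written out in C++ rather than a construction script. The class and parameter names here are made up purely for illustration, and this is the pattern to avoid, not a recommendation:

```cpp
// Hypothetical actor illustrating the anti-pattern above -- do not copy this.
// ATintableProp and "TintColor" are placeholder names for illustration.
#include "GameFramework/Actor.h"
#include "Components/StaticMeshComponent.h"
#include "Materials/MaterialInstanceDynamic.h"
#include "TintableProp.generated.h"

UCLASS()
class ATintableProp : public AActor
{
    GENERATED_BODY()

public:
    UPROPERTY(EditAnywhere, Category = "Look")
    FLinearColor TintColor = FLinearColor::White;

    // Equivalent of the construction script described above: one MID per
    // material slot, per mesh component, per placed actor.
    virtual void OnConstruction(const FTransform& Transform) override
    {
        Super::OnConstruction(Transform);

        TArray<UStaticMeshComponent*> MeshComponents;
        GetComponents<UStaticMeshComponent>(MeshComponents);

        for (UStaticMeshComponent* Mesh : MeshComponents)
        {
            for (int32 SlotIndex = 0; SlotIndex < Mesh->GetNumMaterials(); ++SlotIndex)
            {
                // Every MID is a unique Material Instance, so under Nanite every
                // one of these becomes its own shading bin. A few hundred props
                // later, you have a few hundred bins that could have been one.
                UMaterialInstanceDynamic* MID = Mesh->CreateDynamicMaterialInstance(SlotIndex);
                if (MID)
                {
                    MID->SetVectorParameterValue(TEXT("TintColor"), TintColor);
                }
            }
        }
    }
};
```

The per-actor tinting itself isn’t the problem; it’s paying for it with a unique Material Instance per component. The tools further down get you the same knobs without the extra bins.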

A Tangent on Uber Materials

It’s vitally important to remember that a material graph is not one “shader”. It is an input into a system that first translates that graph to HLSL based on the permutations required. You do not save anything by putting your landscape material and your foliage material into the same graph. Even if you don’t hide functionality behind a bunch of static switch parameters, you’re still going to be checking multiple usage flags and setting Material Property Overrides on the MICs for both use cases, and all of those are going to create separate and unique shaders.

(This, of course, doesn’t even get into the permutation explosion that can happen with the existing Material Layers system)

How to Reduce Raster and Shading Bins

That video you shared is wild! It’s decidedly at the far end of the spectrum that runs from “Draw Calls Don’t Matter” to “You Should Only Have One Draw Call”, but from what I can tell the techniques they’re describing are more or less in line with what I was going to suggest anyway.

That said, I think there’s a balance between the effort of setting up that level of infrastructure yourself and some reasonable construction techniques. On the texture side of things, I feel the UDIM approach can be a bit cumbersome and restrictive. I know folks are excited about bindless resources as a potential avenue for being more expressive and less restrictive about content and packing textures into UDIMs. For our current purposes, I’m going to bless separate MICs when you need to change out textures.

For vector and scalar parameters, though, we’ve got three tools in our tool belt we can use to reduce the number of unique MICs in a project (there’s a small code sketch of all three after this list):

  • Custom Primitive Data - In your material you can specify a parameter to Use Custom Primitive Data and specify an index into an array of up to 32 floats which are passed in from the component. There’s an inspector panel in the Material editor so you can see which indexes are used throughout the material in order to avoid collisions. As of 5.0, the details panel of the component will show you the different parameters on the mesh and offer up color pickers for vector parameters!
  • Per-Instance Custom Data - When you’re using instanced mesh components, you can specify an array of up to 32 floats of per-instance custom data in addition to Custom Primitive Data (i.e., parameters passed to all instances). This uses a separate material expression, though. (Apparently our documentation on this is pretty thin, so I’ve linked the example of it I gave at GDC 2022.)
  • Material Parameter Collections - You can have up to two unique MPCs referenced in a material. The benefit is that you set the values on the MPC itself, rather than in the material. All materials that reference that MPC will get the updated values immediately and automatically! A couple of common use cases here would be things like rain or wind, but there’s a lot of really interesting uses of MPCs once you put your mind to it.
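
Here’s a minimal sketch of what driving each of those looks like from C++ at runtime. The parameter names, indices, and the MPC asset are placeholders, and your material graph still needs the matching Use Custom Primitive Data parameters, a PerInstanceCustomData expression, and Collection Parameter nodes for any of this to show up:

```cpp
// Minimal sketch of the three tools above; all names and indices are placeholders.
#include "Components/StaticMeshComponent.h"
#include "Components/InstancedStaticMeshComponent.h"
#include "Kismet/KismetMaterialLibrary.h"
#include "Materials/MaterialParameterCollection.h"

// Custom Primitive Data: the floats live on the component and the material reads
// them by index, so every prop can keep sharing one MIC (and one shading bin).
void ApplyPerActorTint(UStaticMeshComponent* Mesh, const FLinearColor& Tint)
{
    Mesh->SetCustomPrimitiveDataVector4(/*DataIndex=*/0, FVector4(Tint));
}

// Per-Instance Custom Data: same idea, but one small float array per instance of
// an instanced mesh component. NumCustomDataFloats must cover the indices you use.
void ApplyPerInstanceWetness(UInstancedStaticMeshComponent* ISM, int32 InstanceIndex, float Wetness)
{
    ISM->SetCustomDataValue(InstanceIndex, /*CustomDataIndex=*/0, Wetness, /*bMarkRenderStateDirty=*/true);
}

// Material Parameter Collection: one write updates every material referencing the
// collection -- handy for global state like rain or wind.
void ApplyGlobalRain(UObject* WorldContext, UMaterialParameterCollection* WeatherMPC, float RainAmount)
{
    UKismetMaterialLibrary::SetScalarParameterValue(WorldContext, WeatherMPC, TEXT("RainAmount"), RainAmount);
}
```

The common thread: none of these calls create a new Material Instance, so your shading bin count stays flat no matter how many actors or instances you’re dialing in.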

With Nanite, you don’t have the ability to vertex paint per-instance; storing XX million float3s per instance is downright prohibitive. Instead, as of UE 5.5, you can use the new mesh painting mode to paint into what amounts to a UDIM. You can set up your materials with a base layer and a blend layer, then paint the blend layer on a per-instance basis without fear of bloating your memory or of creating separate MICs and bins for different explicit mask textures.

Conclusion

I can sit here all day and say “use CPD and PICD to reduce your MICs”, but I do acknowledge that for larger scale productions there might still be some infrastructure required to make managing and updating that data easier for your artists. There’s nothing built in for “Custom Primitive Data Set Instances” or anything like that. I’ll leave it as an exercise to the reader to determine what’s best for their project.

I also don’t want to sit here and say that everyone should go out and immediately start retooling their material pipeline to try and reduce draw calls. You should still be profiling in order to identify the specific issues your project is facing and focusing your optimization efforts there instead of tilting at windmills because Don Quixote said it was a good idea.

At the end of the day, though, I think it’s important to understand how to build content optimally (as opposed to how to optimize content). If you’re at the stage of setting up material pipelines and defining production practices for a title, now is a great time to consider how you can set yourself up to keep your bin counts low without inflicting a lot of pain and suffering on your artists.

Hope that helps, lemme know if you have any questions!

3 Likes

Lots of great info here. Thank you so much for taking the time to explain in detail! Will certainly check out the linked resources and GDC presentation as well.
:+1: :folded_hands: