Tutorial: Deferred shading

By CRC Press

July 1st 2011 at 8:00AM

This excerpt from Game Engine Gems, co-written by staff at Black Rock Studio, offers advice on deferred shading.

Deferred shading is an increasingly popular technique used in video game rendering.

Geometry components such as depth, normals, and material colour are rendered into a geometry buffer (commonly referred to as a G-buffer), and then deferred passes are applied in screen space using the G-buffer components as inputs.

A particularly common and beneficial use of deferred shading is faster lighting. Because lighting is detached from scene rendering, lights no longer add to scene complexity, shader complexity, or batch count.

Another significant benefit of deferred lighting is that only relevant and visible pixels are lit by each light, leading to less pixel overdraw and better performance.

The traditional deferred lighting model usually includes a fullscreen lighting pass where global light properties, such as sun light and sun shadows, are applied.

However, this lighting pass can be very expensive due to the number of onscreen pixels and the complexity of the lighting shader required.

A more efficient approach is to take different shader paths for different parts of the scene according to which lighting calculations are actually required. A good example is the expensive filtering needed for soft shadow edges. Performance would improve significantly if this filtering were performed only on the areas of the screen known to lie at the edges of shadows.

Swoboda [2009] describes a technique that uses the PlayStation 3 SPUs to analyse the depth buffer and classify screen areas for improved performance in post-processing effects, such as depth of field.

Moore and Jefferies [2009] describe a technique that uses low-resolution screen-space shadow masks to classify screen areas as in shadow, not in shadow, or on the shadow edge for improved soft shadow rendering performance. They also describe a fast multisample anti-aliasing (MSAA) edge detection technique that improves deferred lighting performance.

These works provided the background and inspiration for this chapter, which extends their ideas by classifying screen areas according to the global light properties they require, thus minimising shader complexity for each area. This work has been successfully implemented with good results in Split/Second, a game developed by Disney’s Black Rock Studio. It is this implementation that we cover in this chapter because it gives a practical real-world example.

Overview of Method

The screen is divided into 4x4 pixel tiles. For every frame, each tile is classified according to the minimum global light properties it requires.

The seven global light properties used on Split/Second are the following:

1. Sky. These are the fastest pixels because they don’t require any lighting calculations at all. The sky colour is simply copied directly from the G-buffer.

2. Sunlight. Pixels facing the sun require sun and specular lighting calculations (unless they’re fully in shadow).

3. Solid shadow. Pixels fully in shadow don’t require any shadow or sun light calculations.

4. Soft shadow. Pixels at the edge of shadows require expensive eight-tap percentage closer filtering (PCF) unless they face away from the sun.

5. Shadow fade. Pixels near the end of the dynamic shadow draw distance fade from full shadow to no shadow to avoid pops as geometry moves out of the shadow range.

6. Light scattering. All but the nearest pixels have a light scattering calculation applied.

7. Antialiasing. Pixels at the edges of polygons require lighting calculations for both 2X MSAA fragments.

Calculating Light Properties

We calculate which light properties are required for each 4x4 pixel tile and store the result in a 7-bit classification ID. Some of these properties are mutually exclusive for a single pixel, such as sky and sunlight, but they can exist together when properties are combined into 4x4 pixel tiles.
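
To illustrate how mutually exclusive per-pixel properties can coexist in a tile, here is a minimal CPU-side sketch in Python. It assumes each pixel already carries a set of property flags and that a tile's classification ID is the bitwise OR of its pixels' flags. The bit assignments, the `LightProp` and `classify_tile` names, and the `pixel_props` layout are illustrative assumptions, not the Split/Second implementation.

```python
from enum import IntFlag

# One bit per global light property. The bit order is an assumption made
# for this sketch; the article only states that the ID is 7 bits wide.
class LightProp(IntFlag):
    SKY = 1 << 0
    SUNLIGHT = 1 << 1
    SOLID_SHADOW = 1 << 2
    SOFT_SHADOW = 1 << 3
    SHADOW_FADE = 1 << 4
    LIGHT_SCATTERING = 1 << 5
    ANTIALIASING = 1 << 6

TILE_SIZE = 4  # 4x4 pixel tiles, as in the article

def classify_tile(pixel_props, tile_x, tile_y):
    """OR together the flags of every pixel in one tile.

    pixel_props is a hypothetical 2D list indexed as pixel_props[y][x],
    holding a LightProp value for each screen pixel.
    """
    tile_id = LightProp(0)
    for y in range(tile_y * TILE_SIZE, (tile_y + 1) * TILE_SIZE):
        for x in range(tile_x * TILE_SIZE, (tile_x + 1) * TILE_SIZE):
            tile_id |= pixel_props[y][x]
    return tile_id
```

With this scheme, a tile that straddles the horizon, with some sky pixels and some sunlit pixels, ends up with both the sky and sunlight bits set, even though no single pixel has both.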

Once we’ve generated a classification ID for every tile, we then create an index buffer for each ID that points to the tiles with that ID and render it using a shader with the minimum lighting code required for those light properties.
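
The grouping step can be sketched as follows. This is a rough illustration only: it assumes a shared vertex buffer holding four vertices per tile in tile order, and it emits two triangles per tile. The `build_index_buffers` name and the vertex layout are hypothetical and are not taken from the game's code.

```python
from collections import defaultdict

def build_index_buffers(tile_ids):
    """Group tiles by classification ID into per-ID index lists.

    tile_ids is a flat list with one classification ID per tile.
    Assumes (for this sketch) that tile i owns vertices 4*i .. 4*i+3
    in a shared vertex buffer, ordered so that (0, 1, 2) and (2, 1, 3)
    form the two triangles of the tile's quad.
    """
    buffers = defaultdict(list)
    for tile_index, class_id in enumerate(tile_ids):
        base = tile_index * 4
        buffers[class_id].extend(
            (base, base + 1, base + 2, base + 2, base + 1, base + 3)
        )
    return dict(buffers)
```

Each resulting list would then be drawn once with the shader variant compiled for that ID, so a tile never pays for lighting code that none of its pixels need.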

We found that a 4x4 tile size gave the best balance between classification computation time and shader complexity, and therefore the best overall performance. Smaller tiles meant spending too much time classifying the tiles, and larger tiles meant more lighting properties affecting each tile, leading to more complex shaders.

A size of 4x4 pixels also conveniently matches the resolution of our existing screen-space shadow mask [Moore and Jefferies 2009], which simplifies the classification code, as explained later. For Split/Second, the use of 4x4 tiles adds up to 57,600 tiles at a resolution of 1280x720.
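
The tile count follows directly from the resolution: 1280 / 4 = 320 tiles across and 720 / 4 = 180 tiles down, giving 320 x 180 = 57,600 tiles. A small helper makes the arithmetic explicit. The `tile_grid` name is hypothetical, and rounding up for resolutions that are not a multiple of the tile size is an assumption of this sketch; it is not needed at 1280x720.

```python
def tile_grid(width, height, tile_size=4):
    """Return (tiles across, tiles down, total tiles) for a screen size."""
    # Round up so partial tiles at the right and bottom edges are still
    # covered (an assumption; 1280x720 divides evenly by 4).
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    return tiles_x, tiles_y, tiles_x * tiles_y
```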