Build a Game Engine · 3D Rendering

PBR Materials & Lighting

The mesh has a normal and the depth buffer resolves it correctly; now make it look like a real material. We build the metallic-roughness Cook-Torrance model that glTF standardizes and most engines ship: a microfacet BRDF, energy conservation, the right texture color spaces, normal mapping, and HDR tone mapping. The shader core is the centerpiece. We keep the honest framing throughout: this is physically based, not physically correct.

Time: ~55 min
Level: Senior
Prereqs: the Going 3D tutorial (mesh, normal, UBOs), Textures (sRGB vs linear), and 3D Math (the inverse-transpose)
Stack: GLSL · C++ & Rust

01 What PBR buys

Physically based shading models the surface response to light with quantities grounded in physics (the BRDF, a microfacet distribution), so a material authored once reads correctly under any lighting, and artists author measured properties (base color, metallic, roughness) instead of tuning a specular exponent per scene.

Physically based, not physically correct, and not one equation

It's an approximation under energy-conservation constraints; "based"/"plausible" is the honest word^[6]. And PBR is a family: metallic-roughness vs specular-glossiness parameterizations, GGX vs Beckmann distributions, Lambert vs Disney diffuse. This tutorial teaches the metallic-roughness Cook-Torrance model that glTF 2.0 standardizes^[2]; every "the model does X" below is scoped to that variant.

The canonical chart shows spheres across roughness and metallic under a moving light: the highlight tightens at low roughness and the diffuse vanishes as metallic rises.

02 The rendering equation

Everything traces back to Kajiya's rendering equation: the radiance leaving a point toward the viewer is the emitted radiance plus the integral, over the hemisphere of incoming directions, of the BRDF times the incoming radiance times the cosine term^[1].

eq. 1 · the rendering equation
$$L_o = L_e + \int_{\Omega} f_r(\omega_i, \omega_o) \, L_i(\omega_i) \, (n \cdot \omega_i) \, d\omega_i$$

The light leaving a point toward your eye (L_o) is whatever the surface emits (L_e, usually zero) plus a sum over every incoming direction. The ∫_Ω is that sum: it sweeps the hemisphere of directions ω_i above the surface. For each one, take the incoming light L_i, multiply by the BRDF (the fraction reflected toward the view direction ω_o), and weight by n · ω_i, the cosine that dims light arriving at a slant.

Need a refresher on what a hemisphere integral means?

The ∫ sign means "add up a quantity over a continuous range," the continuous version of a sum. Here the range is every direction in the hemisphere (Ω): the half-dome of sky above the surface point, since light can only arrive from in front of the surface, not through it.

Picture splitting that dome into many tiny patches. For each patch you have a direction ω_i, the light coming from it, and the cosine weight; you compute the reflected contribution and add them all up. The integral is the exact answer in the limit of infinitely small patches. Real-time rendering can't do that per pixel, so §7 swaps the integral for a finite sum over a few punctual lights, and §9 approximates the rest with image-based lighting.

L_o is outgoing radiance (power per area per solid angle) toward the viewer; L_e is emitted (nonzero only for emissive surfaces); f_r is the BRDF; L_i is incoming radiance; (n · ω_i) is the Lambert cosine foreshortening; the integral sweeps the hemisphere above the surface.
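The patch-summing picture can be sketched numerically. The C++ below is a minimal illustration, not engine code: it assumes a Lambert BRDF (albedo / π) and a uniform incoming radiance of 1 from every direction, and the function name and step counts are hypothetical. Under those assumptions the sum approaches the albedo itself, which is why the Lambert term carries the 1/π.

```cpp
#include <cmath>

// Midpoint Riemann sum of f_r * L_i * cos(theta) over the hemisphere.
// Assumes f_r = albedo / pi (Lambert) and L_i = 1 for every direction.
double reflectedRadianceLambert(double albedo, int thetaSteps, int phiSteps) {
    const double PI = 3.14159265358979323846;
    const double dTheta = (PI / 2.0) / thetaSteps;
    const double dPhi = (2.0 * PI) / phiSteps;
    double sum = 0.0;
    for (int i = 0; i < thetaSteps; ++i) {
        double theta = (i + 0.5) * dTheta;                     // angle from the normal
        double cosTheta = std::cos(theta);                     // n . omega_i
        double solidAngle = std::sin(theta) * dTheta * dPhi;   // size of one patch
        for (int j = 0; j < phiSteps; ++j) {
            sum += (albedo / PI) * 1.0 * cosTheta * solidAngle;
        }
    }
    return sum;
}
```

With finer patches the result converges on the albedo: a surface with albedo 0.8 under uniform unit radiance reflects 0.8 toward the viewer.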

This is the target, not what we evaluate

Real-time can't integrate the hemisphere every pixel, so it replaces the integral with a finite sum over punctual (delta) lights (§7) plus an image-based-lighting approximation (§9). Keep the cosine outside the BRDF (it's part of the equation, not f_r): in code, Lo += BRDF * radiance * NdotL^[5].

03 The microfacet BRDF

Model the surface as a statistical distribution of microscopic mirror facets; roughness controls how spread out their orientations are. The Cook-Torrance specular term is

eq. 2 · Cook-Torrance specular
$$f_{\text{specular}} = \frac{D \cdot F \cdot G}{4 \, (n \cdot v) \, (n \cdot l)}$$

Specular reflection is three statistical terms over a normalizing denominator. D says how many microfacets face the right way to bounce this light to your eye (it sets the highlight's tightness), F is the Fresnel reflectance at those facets, and G discounts facets blocked by their neighbours at grazing angles. The denominator 4 · (n·v) · (n·l) converts the facet statistics back into a per-surface reflectance. Multiply the three on top, divide, and you have the specular lobe.

Here h is the half-vector, h = normalize(v + l). The three terms: D is the fraction of facets aligned with h (the normal distribution), F is the Fresnel reflectance at those facets, and G is the fraction not masked or shadowed by neighbors. The choices below (GGX, Smith, Schlick) are the ones Karis adopted for Unreal Engine 4 and that became the real-time default^[3].

D, GGX/Trowbridge-Reitz: α² / (π·((n·h)²(α²−1)+1)²), with α = roughness². The long tail gives a bright core with a soft falloff.
G, Smith with Schlick-GGX: masking times shadowing, G₁(v)·G₁(l); the k remap differs for direct lights versus IBL.
F, Schlick: F₀ + (1−F₀)(1−(v·h))⁵. Reflectance rises to 1 at grazing angles (the rim).

α = roughness², the single most-missed line

The roughness in the UI and texture is perceptual; the BRDF uses its square. glTF, Karis, and Filament all remap alpha = perceptualRoughness²^[2]^[4]. Skip it and α is too large across the mid range (roughness² is smaller than roughness below 1), so mid-roughness highlights come out too broad and dull. Watch the naming hazard: in Filament the user-facing material parameter roughness is perceptual, while inside the BRDF the variable roughness (also written a) is the squared α. The same word covers both, so state your convention once and hold it.

The teaching form below is the textbook D·F·G/(4·NdotV·NdotL) (matching LearnOpenGL); production often folds G and the denominator into a single visibility term V so specular is D·V·F (Filament's height-correlated V_SmithGGXCorrelated)^[4]:

The PBR shader core: one punctual light (GLSL)

const float PI = 3.14159265359;

// D: GGX normal distribution. `alpha` is roughness SQUARED.
float distributionGGX(float NdotH, float alpha) {
    float a2 = alpha * alpha;
    float denom = NdotH * NdotH * (a2 - 1.0) + 1.0;
    return a2 / (PI * denom * denom);
}
// G1: Schlick-GGX for one direction. `k` is remapped from PERCEPTUAL roughness (direct-light form).
float geometrySchlickGGX(float NdotX, float k) {
    return NdotX / (NdotX * (1.0 - k) + k);
}
float geometrySmith(float NdotV, float NdotL, float k) {
    return geometrySchlickGGX(NdotV, k) * geometrySchlickGGX(NdotL, k);  // masking * shadowing
}
// F: Schlick Fresnel. F0 = normal-incidence reflectance (0.04 dielectric, baseColor metal).
vec3 fresnelSchlick(float cosTheta, vec3 F0) {
    return F0 + (1.0 - F0) * pow(clamp(1.0 - cosTheta, 0.0, 1.0), 5.0);
}

vec3 shade(vec3 N, vec3 V, vec3 L, vec3 radiance,
            vec3 albedo, float metallic, float perceptualRoughness) {
    vec3  H = normalize(V + L);
    float NdotV = max(dot(N, V), 0.0), NdotL = max(dot(N, L), 0.0), NdotH = max(dot(N, H), 0.0);
    float alpha = perceptualRoughness * perceptualRoughness;            // THE squaring
    float k = (perceptualRoughness + 1.0) * (perceptualRoughness + 1.0) / 8.0;  // direct-light k
    vec3  F0 = mix(vec3(0.04), albedo, metallic);                   // dielectric 0.04, metal = baseColor

    float D = distributionGGX(NdotH, alpha);
    float G = geometrySmith(NdotV, NdotL, k);
    vec3  F = fresnelSchlick(max(dot(H, V), 0.0), F0);
    vec3  specular = (D * G * F) / (4.0 * NdotV * NdotL + 0.0001);  // Cook-Torrance

    vec3 kD = (vec3(1.0) - F) * (1.0 - metallic);                   // energy split; metals: 0
    vec3 diffuse = kD * albedo / PI;                                   // Lambert
    return (diffuse + specular) * radiance * NdotL;                    // cosine OUTSIDE the BRDF
}
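For unit tests it can help to mirror the shader math on the CPU. The following is a hedged, single-channel C++ sketch of the same steps as shade() above. It takes the clamped dot products directly instead of vectors, and the names ShadeInputs and shadeScalar are hypothetical, not part of the tutorial's code.

```cpp
#include <algorithm>
#include <cmath>

// Single-channel CPU mirror of the GLSL shade() above, for unit tests.
// Inputs are the already-clamped dot products, not vectors.
struct ShadeInputs {
    double NdotV, NdotL, NdotH, VdotH;
    double radiance, albedo, metallic, perceptualRoughness;
};

double shadeScalar(const ShadeInputs& in) {
    const double PI = 3.14159265358979323846;
    double r = in.perceptualRoughness;
    double alpha = r * r;                                   // the squaring
    double k = (r + 1.0) * (r + 1.0) / 8.0;                 // direct-light k
    double F0 = 0.04 + (in.albedo - 0.04) * in.metallic;    // mix(0.04, albedo, metallic)

    double a2 = alpha * alpha;
    double d = in.NdotH * in.NdotH * (a2 - 1.0) + 1.0;
    double D = a2 / (PI * d * d);                           // GGX
    double G = (in.NdotV / (in.NdotV * (1.0 - k) + k)) *
               (in.NdotL / (in.NdotL * (1.0 - k) + k));     // Smith, Schlick-GGX G1
    double oneMinusCos = std::max(0.0, std::min(1.0, 1.0 - in.VdotH));
    double F = F0 + (1.0 - F0) * std::pow(oneMinusCos, 5.0);  // Schlick

    double specular = (D * G * F) / (4.0 * in.NdotV * in.NdotL + 0.0001);
    double kD = (1.0 - F) * (1.0 - in.metallic);            // metals: 0
    double diffuse = kD * in.albedo / PI;                   // Lambert
    return (diffuse + specular) * in.radiance * in.NdotL;   // cosine outside the BRDF
}
```

Two cheap checks fall out of this form: a light at or below the horizon (NdotL = 0) contributes nothing, and at metallic = 1 the diffuse term vanishes so only the specular lobe remains.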


04 Diffuse & energy conservation

The diffuse term is Lambert, albedo / π (albedo is the diffuse base color). Energy conservation splits the budget: specular keeps the Fresnel weight, diffuse gets the rest, k_d = (1−F)·(1−metallic). The (1−metallic) factor is why metals have zero diffuse. Reflected diffuse plus specular must not exceed the incoming light.

Energy conservation here is approximate

The single-scatter Cook-Torrance microfacet model loses energy at high roughness: light that should bounce between facets is dropped, so rough metals look too dark. The fix is multi-scatter compensation (Kulla-Conty; Filament ships a scaled version)^[8]. Name it as a known limitation, not a bug. Also, k_d = 1 − k_S is itself a simplification (it treats Fresnel as the whole specular weight): fine to teach, worth flagging.
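The energy loss can be measured with a white-furnace style check: force Fresnel to 1 and integrate the specular lobe times the cosine over all light directions. The C++ below is a sketch under stated assumptions, not a reference implementation: the view direction is fixed at the normal (so the lobe is symmetric and one polar integral suffices), G uses the separable Schlick-GGX form, and k = α / 2 is Karis's IBL remap, which this tutorial mentions but does not spell out. The function name is hypothetical.

```cpp
#include <cmath>

// Fraction of light a single-scatter GGX specular lobe reflects when viewed
// head-on (v = n) with Fresnel forced to 1.
// Assumptions: Schlick-GGX G1 with k = alpha / 2, midpoint rule over theta_l.
double specularAlbedoHeadOn(double perceptualRoughness, int steps) {
    const double PI = 3.14159265358979323846;
    double alpha = perceptualRoughness * perceptualRoughness;
    double a2 = alpha * alpha;
    double k = alpha / 2.0;
    double dTheta = (PI / 2.0) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) {
        double thetaL = (i + 0.5) * dTheta;
        double NdotL = std::cos(thetaL);
        double NdotH = std::cos(thetaL / 2.0);         // h bisects n (= v) and l
        double d = NdotH * NdotH * (a2 - 1.0) + 1.0;
        double D = a2 / (PI * d * d);
        double G = NdotL / (NdotL * (1.0 - k) + k);    // G1(v) = 1 when n.v = 1
        // f * cos = D * G / (4 * NdotV * NdotL) * NdotL, with NdotV = 1
        sum += (D * G / 4.0) * std::sin(thetaL) * dTheta;
    }
    return 2.0 * PI * sum;                             // symmetric in phi
}
```

A perfectly energy-preserving lobe would return 1. Under these assumptions the result stays close to 1 at low roughness and falls well below it at high roughness; that missing fraction is what multi-scatter compensation puts back.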

05 Metallic-roughness

The glTF 2.0 workflow has three core inputs: baseColor, metallic, and roughness. The derived shading quantities^[2]:

F₀ = lerp(0.04, baseColor, metallic), dielectric reflectance 0.04, metal reflectance is the base color.
c_diff = lerp(baseColor·(1−0.04), black, metallic).
α = roughness².

F0 = 0.04 is a common approximation, and the texture set has strict color spaces

0.04 corresponds to an index of refraction around 1.5; real dielectric F0 varies roughly 0.02 to 0.05, and some workflows expose IOR or a reflectance slider instead^[7]. Metals have colored (tinted) specular; dielectrics have achromatic specular plus colored diffuse. Color spaces in the texture set are a correctness issue (cross-ref Textures): baseColor and emissive are sRGB; metallic-roughness (green = roughness, blue = metallic), normal, and occlusion (red channel) are linear. Occlusion is often packed with metallic-roughness as ORM (R=occlusion, G=roughness, B=metallic). Sample a data texture through an sRGB view and you corrupt the values.

The material factor block (CPU side)

// C++ (glm). std140-friendly: vec4s keep 16-byte alignment (cross-ref Going 3D). Textures via descriptors.
#include <glm/glm.hpp>

struct MaterialFactors {
    glm::vec4 baseColorFactor;   // rgba
    float     metallicFactor;
    float     roughnessFactor;
    float     normalScale;
    float     occlusionStrength;
    glm::vec4 emissiveFactor;    // rgb + pad
};

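// Rust mirror: #[repr(C)] keeps the declared field order and C layout, so the bytes match the C++ struct.
#[repr(C)]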
#[derive(Clone, Copy)]
struct MaterialFactors {
    base_color_factor: [f32; 4],
    metallic_factor:   f32,
    roughness_factor:  f32,
    normal_scale:      f32,
    occlusion_strength: f32,
    emissive_factor:   [f32; 4],     // rgb + pad for std140
}
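The layout claim can be checked at compile time rather than trusted. The sketch below uses a plain-array mirror of the struct (the name MaterialFactorsPlain is hypothetical, chosen so the example compiles without glm) and assumes a 4-byte float.

```cpp
#include <cstddef>

// Plain-array mirror of MaterialFactors, so the check needs no glm.
struct MaterialFactorsPlain {
    float baseColorFactor[4];
    float metallicFactor;
    float roughnessFactor;
    float normalScale;
    float occlusionStrength;
    float emissiveFactor[4];
};

// std140 wants vec4 members on 16-byte offsets; the four scalars fill one slot.
static_assert(offsetof(MaterialFactorsPlain, metallicFactor) == 16,
              "scalars start right after the first vec4");
static_assert(offsetof(MaterialFactorsPlain, emissiveFactor) == 32,
              "second vec4 must sit on a 16-byte offset");
static_assert(sizeof(MaterialFactorsPlain) == 48,
              "three 16-byte slots, no hidden padding");
```

If someone later inserts a fifth float between the two vec4s, the second assertion fails at build time instead of the emissive color silently reading garbage on the GPU.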

06 Normal mapping

A normal map stores per-texel surface normals in tangent space (the surface's local frame). The TBN matrix (tangent, bitangent, normal) transforms a sampled normal into world space for lighting. Stored values are directions, not colors, hence linear. A flat texel encodes the out-of-surface +Z normal (0, 0, 1) as roughly (0.5, 0.5, 1.0), which is why normal maps look predominantly blue.

Sample it linear, and mind the green-channel convention

A normal map must be sampled as linear/UNORM, not sRGB. This is the classic bug (cross-ref Textures): an sRGB decode bends every vector and tilts the lighting. The green-channel convention also flips bumps: OpenGL is +Y (green up), DirectX is −Y (green down)^[11]. The same map under the wrong convention inverts every detail (rivets cave in). The tangent's handedness (the w sign in glTF tangents) feeds the bitangent: B = cross(N, T) * tangent.w. Drop it and mirrored UVs light wrong.
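To see how far an sRGB decode bends a normal, follow one channel through both paths. The C++ below is an illustrative sketch with hypothetical function names; srgbToLinear is the standard piecewise sRGB decode that an sRGB texture view applies on fetch.

```cpp
#include <cmath>

// Standard piecewise sRGB -> linear decode, as applied by an sRGB texture view.
double srgbToLinear(double c) {
    return (c <= 0.04045) ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

// Texel channel in [0,1] -> normal component in [-1,1].
double unpackNormalComponent(double texel) {
    return texel * 2.0 - 1.0;
}

// The bug: the hardware decodes sRGB before the shader unpacks the value.
double unpackThroughSrgbView(double texel) {
    return unpackNormalComponent(srgbToLinear(texel));
}
```

A stored 0.5 should unpack to 0, meaning no tilt on that axis. Through an sRGB view it decodes to about 0.214 first and unpacks to roughly −0.57, so a flat surface suddenly leans hard to one side.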

Tangent-space normal fetch + TBN (GLSL)

// from a LINEAR/UNORM texture, not sRGB. [0,1] -> [-1,1]
vec3 sampledNormal = texture(normalMap, uv).xyz * 2.0 - 1.0;
// OpenGL (+Y) assumed; for a DirectX (-Y) map: sampledNormal.y = -sampledNormal.y;

vec3 N = normalize(inNormalWS);
vec3 T = normalize(inTangentWS.xyz);
T = normalize(T - dot(T, N) * N);            // Gram-Schmidt re-orthogonalize
vec3 B = cross(N, T) * inTangentWS.w;        // handedness sign from the glTF tangent
mat3 TBN = mat3(T, B, N);
vec3 normalWS = normalize(TBN * sampledNormal);  // tangent space -> world space
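The handedness sign is easiest to trust after checking it on axis-aligned vectors. A minimal C++ sketch of the bitangent line above, with a hypothetical Vec3 type and helper names:

```cpp
struct Vec3 { double x, y, z; };

Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

// B = cross(N, T) * w, where w is the glTF tangent's fourth component (+1 or -1).
Vec3 bitangent(Vec3 n, Vec3 t, double w) {
    Vec3 c = cross(n, t);
    return Vec3{ c.x * w, c.y * w, c.z * w };
}
```

With N = +Z and T = +X, w = +1 yields B = +Y and w = −1 yields B = −Y. That flip is what keeps the green channel pointing the right way on mirrored UV islands.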


07 Punctual lights

Point, directional, and spot lights are punctual: a delta with zero solid angle, so the hemisphere integral collapses to a single BRDF evaluation per light, Lo += shade(...) summed over lights. Point lights fall off as inverse-square (1/d²); directional lights have constant radiance; spots add an angular cone.

Inverse-square, with a window

Raw 1/d² is the physically correct point-light falloff, but it reaches infinitely far and blows up at d→0, so engines add a smooth range window (Karis and Filament use a windowed inverse-square)^[3]. Mention the window; don't ship raw 1/d².

08 HDR & tone mapping

Lighting results are unbounded (a bright light times albedo easily exceeds 1.0), so you render to a float/HDR target, apply exposure, then a tone-mapping operator that compresses [0,∞) into [0,1], then sRGB-encode for the display.

Tone mapping is not a clamp, and ACES is a look, not "correct"

A clamp/saturate clips highlights flat to white (hard detail and hue loss); an operator compresses the rolloff so highlight detail survives. Reinhard is L/(1+L)^[9]; the popular ACES fits (Narkowicz, Hill) are approximations of a film-look pipeline, widely used but not perceptually neutral; the author of the simple fit notes it oversaturates brights^[10]. Order matters: exposure → tone-map → sRGB-encode. Do lighting in linear; a manual pow(1/2.2) on top of an sRGB swapchain is the double-correction bug (cross-ref Textures).

Tone-map + encode (GLSL)

vec3 color = hdrColor * exposure;         // HDR scene radiance, exposure-scaled
color = color / (color + vec3(1.0));    // Reinhard (swap an ACES fit in production)
// sRGB encode for an UNORM swapchain. If the swapchain is an sRGB format, DROP this
// (hardware encodes) -- doing both is the double-correction bug.
color = pow(color, vec3(1.0 / 2.2));
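The difference between clamping and tone mapping shows up as soon as two bright values are compared. A small C++ sketch with hypothetical function names; toDisplay follows the same exposure, Reinhard, gamma order as the GLSL above and assumes an UNORM (non-sRGB) target.

```cpp
#include <algorithm>
#include <cmath>

// Clamp: everything above 1 collapses to the same white.
double clampToDisplay(double L) {
    return std::min(1.0, std::max(0.0, L));
}

// Reinhard: compresses [0, inf) into [0, 1) while preserving ordering.
double reinhard(double L) {
    return L / (1.0 + L);
}

// exposure -> tone-map -> gamma encode, for a non-sRGB swapchain.
double toDisplay(double hdr, double exposure) {
    return std::pow(reinhard(hdr * exposure), 1.0 / 2.2);
}
```

Radiance values of 4 and 16 both clamp to 1.0, so the detail between them is gone. Reinhard maps them to 0.8 and about 0.94, which stay distinguishable after encoding.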

09 IBL overview

Punctual lights cover direct light; ambient/environment light comes from image-based lighting. Karis's split-sum precomputes three pieces^[3]: an irradiance map (the diffuse environment, convolved with the cosine lobe), a prefiltered environment map (the specular environment pre-blurred per roughness into mips), and a BRDF integration LUT (a 2D table on NdotV and roughness giving a scale and bias on F0). Specular IBL is then prefiltered · (F₀·scale + bias).

A preview, with a known error

This section is the map of the territory; cubemap capture, the importance-sampled prefilter, and generating the LUT are a later module. The split-sum's prefilter assumes view = normal = reflection, so it can't produce the stretched reflections at grazing angles, the most-cited split-sum limitation. And note: ambient occlusion darkens the ambient term in crevices. It's a coarse approximation of occluded ambient light, not global illumination (it doesn't bounce light).

Wrong answers, and why: a red metal has no diffuse and a red-tinted specular (not a plastic body, not black); and wrong normal-mapped lighting is the sRGB-sampling bug, not the base-color space or the roughness square.

10 Pitfalls

Roughness not squared. The BRDF uses α = roughness². Mid-roughness highlights come out too broad and dull otherwise.

Metal with a diffuse body. Metals have zero diffuse and colored specular (F0 = base color). Apply (1−metallic).

Normal map as sRGB. It stores directions; sample linear/UNORM or the lighting tilts.

Wrong green-channel convention. +Y (OpenGL) vs −Y (DirectX). Mismatch inverts every bump.

Data texture as sRGB. Metallic-roughness and occlusion are linear; only base color and emissive are sRGB.

Clamp instead of tone map. Clamping clips highlights to white; an operator compresses them.

Double gamma. Manual pow(1/2.2) on top of an sRGB swapchain over-brightens. Pick one.

"PBR is physically correct". It's physically based; single-scatter loses energy at high roughness.

11 What's next

The surface shades like a real material under direct light. It's still missing the thing that grounds objects in a scene: shadows. The next module builds shadow mapping (rendering depth from the light, the bias problem, PCF, and cascades), then deferred rendering and global illumination. The full 3D path is on the series hub.

[1] James T. Kajiya. "The Rendering Equation." SIGGRAPH 1986. overview. The integral form real-time rendering approximates.
[2] The Khronos Group. glTF 2.0 Specification (metallic-roughness material, Appendix B). registry.khronos.org. F0 = 0.04 dielectric, α = roughness², the texture channels and color spaces.
[3] Brian Karis. "Real Shading in Unreal Engine 4." SIGGRAPH 2013. selfshadow.com. The GGX/Smith/Schlick choices, windowed lights, and the split-sum IBL.
[4] Google. "Physically Based Rendering in Filament." google.github.io/filament. The most rigorous free reference: the BRDF GLSL, the roughness remap, and multi-scatter compensation.
[5] Joey de Vries. LearnOpenGL, "PBR" (Theory / Lighting / IBL). learnopengl.com. The canonical tutorial GLSL and the reflectance loop.
[6] Tomas Akenine-Möller, Eric Haines, Naty Hoffman, et al. Real-Time Rendering, 4th ed., ch. 9. realtimerendering.com. Microfacet theory and the physically-based (not correct) framing.
[7] Sébastien Lagarde. "Memo on Fresnel equations." seblagarde.wordpress.com. F0 from IOR (1.5 → 0.04) and the dielectric F0 range.
[8] Christopher Kulla and Alejandro Conty. "Revisiting Physically Based Shading at Imageworks." SIGGRAPH 2017. selfshadow.com. Multi-scatter energy loss at high roughness and the compensation fix.
[9] Erik Reinhard et al. "Photographic Tone Reproduction for Digital Images." SIGGRAPH 2002. cs.utah.edu. The L/(1+L) operator.
[10] Krzysztof Narkowicz. "ACES Filmic Tone Mapping Curve." knarkowicz.wordpress.com. The simple ACES fit and its "oversaturates brights" caveat (ACES is a look, not ground truth).
[11] Marmoset. "Tangent Space & Handedness." docs.marmoset.co. The OpenGL +Y vs DirectX −Y green-channel convention and the tangent handedness sign.