Build a Game Engine · 3D Milestone

The 3D-Game Capstone

Twenty-plus modules built the parts: a fixed loop, a Vulkan renderer, deferred PBR with shadows and post, skeletal animation, a physics solver, an audio engine, a gameplay layer, AI that paths and steers, networking, and the tools to profile it all. This is where they become one thing: a small, complete, playable 3D game. The only new lessons are the ones that live in the seams between subsystems.

Time: ~60 min
Level: Senior · capstone
Prereqs: The whole series. This integrates every module; each section links the one it draws on.
Stack: C++ & Rust

01 What "done" looks like

The capstone game is a third-person survival arena, the smallest design that forces every subsystem to fire without genre-specific tuning eating the budget:

The vertical slice

One level (a glTF scene with a baked navmesh), a player capsule (move + look, the character controller), waves of AI enemies that path and steer toward the player and attack in melee range, a dodge/attack combat loop, rendered with the PBR pipeline plus a shadow map and bloom/tonemap/TAA, audio cues on hits and spawns, and an ImGui debug overlay showing the live frame profiler. Win = survive N waves; lose = health hits zero. It's a vertical slice, not a shippable game: every subsystem fires once, and nothing is polished to ship.

02 The engine, layer by layer

The series built Gregory's runtime architecture bottom-up^[1]; the capstone stacks it back together. Subsystems start up in dependency order and shut down in reverse:

Platform & core: the window, the memory model + allocators, the job system, lock-free queues.
Resources: asset pipeline, streaming, compression.
The frame spine: the loop, input.
Runtime systems: the renderer, animation, physics, audio.
Gameplay & AI: the object model, behavior trees, navmesh + steering.
The game on top, calling down, called by nothing.

This is a canonical decomposition (layered, deferred, fixed-step), not the only one; forward/Forward+, ECS-everywhere, and variable-tick-with-substeps are all valid alternatives.
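
The "start up in dependency order, shut down in reverse" rule can be sketched as a small stack of subsystems. This is only an illustration under assumptions: `Subsystem`, `EngineStack`, and `runLayerDemo` are hypothetical names, not the series' actual API, and the callbacks just log instead of initializing anything real.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical subsystem record: a name plus start/stop callbacks.
struct Subsystem {
    std::string name;
    std::function<void()> start;
    std::function<void()> stop;
};

// Starts subsystems in the order they were added (dependency order)
// and stops the started ones in exactly the reverse order.
class EngineStack {
public:
    void add(Subsystem s) { subsystems_.push_back(std::move(s)); }

    void startAll() {
        for (; started_ < subsystems_.size(); ++started_) {
            subsystems_[started_].start();
        }
    }

    void stopAll() {
        while (started_ > 0) {
            subsystems_[--started_].stop();
        }
    }

private:
    std::vector<Subsystem> subsystems_;
    std::size_t started_ = 0;   // how many subsystems are currently running
};

// Builds the six layers listed above and returns the start/stop log.
std::vector<std::string> runLayerDemo() {
    std::vector<std::string> log;
    EngineStack engine;
    const char* layers[] = {"platform", "resources", "frame spine",
                            "runtime", "gameplay", "game"};
    for (const char* layer : layers) {
        const std::string name = layer;
        engine.add(Subsystem{name,
                             [&log, name] { log.push_back("start " + name); },
                             [&log, name] { log.push_back("stop " + name); }});
    }
    engine.startAll();
    engine.stopAll();
    return log;
}
```

Tracking a `started_` count rather than assuming every subsystem came up means a partial startup can still be unwound in reverse.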

03 The master frame loop

The single integration point. The whole engine hangs off the fixed-timestep loop from The Game Loop: one accumulator drives the fixed sim, the render interpolates with alpha and only reads the sim state, and the sim never calls the GPU^[2].

The master loop (the spine that ties every subsystem together)

const double dt = 1.0 / 60.0;        // the fixed sim step
double accumulator = 0.0;

while (running) {
    pumpEvents(input);                 // OS events -> this frame's input snapshot (Input module)
    double frameTime = clock.tick();
    accumulator += std::min(frameTime, 0.25);   // clamp = the spiral-of-death guard (Fiedler); std::min is in <algorithm>

    while (accumulator >= dt) {          // 0, 1, or N fixed steps, ONE accumulator
        previousState = currentState;   // keep prior transforms for interpolation
        simulate(world, input, dt);      // the whole fixed step (order below)
        accumulator -= dt;
    }
    double alpha = accumulator / dt;     // 0..1 blend between previous and current
    renderFrame(world, alpha);          // READ-ONLY: deferred PBR + shadows + skinned meshes + post
    imguiOverlay(profiler);            // debug UI on top of the final image (Tooling module)
    present();
}

void simulate(World& world, const Input& input, double dt) {
    applyInput(world, input);          // 1. resolve player actions
    aiTick(world, dt);                 // 2. behavior trees -> navmesh -> steering
    gameplayUpdate(world, dt);         // 3. component logic, events, spawns
    physicsStep(world, dt);            // 4. solver + character controller resolve
    reapDestroyed(world);             // 5. deferred destruction at step end
}

const DT: f64 = 1.0 / 60.0;          // the fixed sim step
let mut accumulator = 0.0;

while running {
    pump_events(&mut input);            // OS events -> this frame's input snapshot
    accumulator += clock.tick().min(0.25);  // spiral-of-death clamp

    while accumulator >= DT {            // ONE accumulator drives the fixed sim
        previous_state = current_state.clone();   // for interpolation
        simulate(&mut world, &input, DT);
        accumulator -= DT;
    }
    let alpha = accumulator / DT;
    render_frame(&world, alpha);        // READ-ONLY interpolated render
    imgui_overlay(&profiler);
    present();
}

fn simulate(world: &mut World, input: &Input, dt: f64) {
    apply_input(world, input);         // 1. actions
    ai_tick(world, dt);                // 2. BT -> navmesh -> steering
    gameplay_update(world, dt);        // 3. logic, events, spawns
    physics_step(world, dt);           // 4. solver + controller
    reap_destroyed(world);            // 5. deferred destruction
}

The three rules every capstone lives or dies by

One accumulator, not N. Physics, AI, and animation sampling are all things the one simulate(dt) does; nesting a second accumulator (the physics tutorial runs one in isolation) double-steps the sim.
Render is read-only. No advancing an animation timer or moving an entity during draw; render consumes lerp(previous, current, alpha) for positions and slerp for rotations (lerping a rotation shrinks it, the LBS failure again).
The sim never calls the GPU. It produces a scene snapshot the renderer consumes, which buys headless tests, clean interpolation, and a sim whose shape is net-ready.
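
The read-only interpolation rule can be made concrete with a small sketch. The `Vec3`, `Quat`, and `Transform` types and the `interpolate` helper below are assumptions for illustration, not the series' math library. `lerpRaw` is included only to show the shrink: a component-wise lerp of two unit quaternions is no longer unit length.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float w, x, y, z; };   // assumed unit length on input
struct Transform { Vec3 position; Quat rotation; };

Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
}

float dot(const Quat& a, const Quat& b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

float length(const Quat& q) { return std::sqrt(dot(q, q)); }

// Component-wise lerp WITHOUT renormalizing: the result is shorter than unit.
Quat lerpRaw(const Quat& a, const Quat& b, float t) {
    return { a.w + (b.w - a.w) * t, a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t };
}

Quat normalize(const Quat& q) {
    const float len = length(q);
    return { q.w / len, q.x / len, q.y / len, q.z / len };
}

// Spherical interpolation along the shortest arc; stays unit length.
Quat slerp(const Quat& a, Quat b, float t) {
    float cosTheta = dot(a, b);
    if (cosTheta < 0.0f) {                       // take the shorter way around
        b = { -b.w, -b.x, -b.y, -b.z };
        cosTheta = -cosTheta;
    }
    if (cosTheta > 0.9995f) {                    // nearly parallel: avoid dividing by ~0
        return normalize(lerpRaw(a, b, t));
    }
    const float theta = std::acos(cosTheta);
    const float invSin = 1.0f / std::sin(theta);
    const float wa = std::sin((1.0f - t) * theta) * invSin;
    const float wb = std::sin(t * theta) * invSin;
    return { wa * a.w + wb * b.w, wa * a.x + wb * b.x,
             wa * a.y + wb * b.y, wa * a.z + wb * b.z };
}

// What the renderer reads: a blend of the two sim states, mutating neither.
Transform interpolate(const Transform& previous, const Transform& current, float alpha) {
    return { lerp(previous.position, current.position, alpha),
             slerp(previous.rotation, current.rotation, alpha) };
}
```

Because `interpolate` takes both states by const reference and returns a new value, the render side has no way to write back into the sim.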

04 Update order & the AI stack

The order inside simulate isn't arbitrary: input precedes AI (AI reacts to this frame's commands), AI precedes gameplay (decisions set intents gameplay executes), gameplay precedes physics (physics resolves the requested motion), and physics is last (it produces the authoritative post-collision transforms)^[3]. The one subtlety: AI and gameplay must read a consistent transform snapshot, or you get one-frame-stale targeting.

The AI stack is four layers; don't conflate them

"The enemy moves toward the player" is four different modules: the behavior tree decides (returns SUCCESS/FAILURE/RUNNING), the navmesh + funnel plans the path (requestPath, stringPull), steering produces motion (arrive, clamped to maxForce/maxSpeed), and the character controller resolves it against geometry (collide-and-slide). Pathfinding is global and discrete; steering is local and continuous; they're not the same function.

One enemy through the four-layer AI stack

void updateEnemy(Enemy& e, const World& world, float dt) {
    e.behaviorTree.tick(e.blackboard, e.navAgent);   // DECIDE: MoveTo leaf calls navAgent.requestPath(target)
    if (e.navAgent.hasPath()) {                        // PLAN done: funnel-pulled corners
        Vec3 corner = e.navAgent.currentCorner();
        Vec3 steering = arrive(e.agent, corner, e.slowingRadius);  // MOVE: local force
        integrate(e.agent, steering, dt);             //   clamp to maxForce / maxSpeed
    }
    e.controller.move(e.agent.velocity, dt);          // RESOLVE: capsule collide-and-slide vs world
}

fn update_enemy(e: &mut Enemy, world: &World, dt: f32) {
    e.behavior_tree.tick(&mut e.blackboard, &mut e.nav_agent);   // DECIDE
    if e.nav_agent.has_path() {                        // PLAN done
        let corner = e.nav_agent.current_corner();
        let steering = arrive(&e.agent, corner, e.slowing_radius);  // MOVE
        integrate(&mut e.agent, steering, dt);
    }
    e.controller.move_(e.agent.velocity, dt);          // RESOLVE: collide-and-slide
}
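
The listings above call arrive and integrate without showing them. A minimal sketch of what they might look like follows, assuming unit mass (so force equals acceleration), a positive slowing radius, and a hypothetical `Agent` layout with `maxSpeed` and `maxForce` fields; the series' steering module may differ.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 v, float s) { return { v.x * s, v.y * s, v.z * s }; }
float length(Vec3 v) { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// Shorten v to maxLen if it is longer; leave it alone otherwise.
Vec3 truncate(Vec3 v, float maxLen) {
    const float len = length(v);
    return (len > maxLen && len > 0.0f) ? v * (maxLen / len) : v;
}

// Hypothetical agent layout (unit mass assumed).
struct Agent {
    Vec3 position;
    Vec3 velocity;
    float maxSpeed;
    float maxForce;
};

// Arrive: full speed far away, ramping linearly to zero inside slowingRadius.
Vec3 arrive(const Agent& a, Vec3 target, float slowingRadius) {
    const Vec3 toTarget = target - a.position;
    const float dist = length(toTarget);
    if (dist < 1e-5f) {
        return truncate(a.velocity * -1.0f, a.maxForce);   // on target: brake
    }
    const float speed = a.maxSpeed * std::fmin(dist / slowingRadius, 1.0f);
    const Vec3 desired = toTarget * (speed / dist);
    return truncate(desired - a.velocity, a.maxForce);     // steering = desired - current
}

// Apply the steering force for one fixed step, clamping force and speed.
void integrate(Agent& a, Vec3 steering, float dt) {
    a.velocity = truncate(a.velocity + truncate(steering, a.maxForce) * dt, a.maxSpeed);
    a.position = a.position + a.velocity * dt;
}
```

Steering only ever heads for the current corner; it knows nothing about walls, which is why the path comes from the navmesh and the final position comes from the controller.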

The character controller is the one piece with no prior tutorial: a thin kinematic capsule that collide-and-slides against the collision primitives the physics module built. The capstone adds it as glue; a full treatment is a future module's territory.
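
The core of collide-and-slide is removing the part of the velocity that points into a surface and keeping the tangential part. The sketch below is heavily simplified and rests on stated assumptions: a sphere stands in for the capsule, the world is a list of infinite planes, and the check is a discrete look-ahead rather than a swept test. `Plane`, `slide`, and `moveAndSlide` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 v, float s) { return { v.x * s, v.y * s, v.z * s }; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Infinite wall: points p with dot(normal, p) == d; normal is unit length
// and points away from the solid side.
struct Plane { Vec3 normal; float d; };

// Remove the component of v that pushes into the surface; keep the rest.
Vec3 slide(Vec3 v, Vec3 n) {
    const float into = dot(v, n);
    return into < 0.0f ? v - n * into : v;
}

// One controller step: clip the velocity against every wall the sphere
// would penetrate, repeating a few times so corners are handled too.
Vec3 moveAndSlide(Vec3 position, Vec3 velocity, float radius, float dt,
                  const std::vector<Plane>& walls) {
    Vec3 v = velocity;
    for (int iteration = 0; iteration < 4; ++iteration) {
        bool clipped = false;
        const Vec3 next = position + v * dt;
        for (const Plane& wall : walls) {
            const float distance = dot(wall.normal, next) - wall.d;
            if (distance < radius && dot(v, wall.normal) < 0.0f) {
                v = slide(v, wall.normal);
                clipped = true;
            }
        }
        if (!clipped) break;
    }
    return position + v * dt;
}
```

A production controller would sweep the capsule to the first contact before sliding; this version only shows why a diagonal push into a wall turns into motion along it.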

05 Assembling the render frame

renderFrame reads the interpolated, read-only snapshot and runs the deferred passes in order:

Shadow depth pass from the light (Shadows).
G-buffer geometry pass (Deferred): static meshes and skinned meshes via vertex-shader skinning from the per-frame matrix palette (Skeletal Animation), writing gAlbedo/gNormal/gMaterial.
SSAO (AO/GI).
Full-screen lighting: the cookTorrance loop (PBR) with shadowFactor + PCF folded in.
Forward pass for transparency (a G-buffer holds one opaque surface per pixel, so transparents can't be deferred, order-independent transparency aside).
Post: bloom pyramid → tonemap → TAA (Post-Processing).
ImGui overlay → present.

The new content here is purely the ordering; everything else is a call into a module you already built, reading transforms that were lerp/slerp-interpolated and mutating nothing.

06 Making it a game

The gameplay layer supplies the rest. Entities are components on game objects; the event bus decouples a hit from the health update, the UI, and the audio cue; generational handles let an AI's target survive that entity's death (resolve returns null on a stale handle, not a use-after-free); deferred destruction reaps the dead at step end. A game-state machine (menu / playing / paused / game-over) gates the sim, and waves spawn prefabs through the factory, configured by a Lua table so tuning needs no recompile.
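
The stale-handle behavior described above can be sketched with a small pool. `Handle` and `HandlePool` are hypothetical names and the layout is an assumption, not the gameplay module's actual code: each slot carries a generation counter that is bumped on destroy, so an old handle to a reused slot stops resolving.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// A handle is a slot index plus the generation it was issued for.
struct Handle {
    std::uint32_t index;
    std::uint32_t generation;
};

template <typename T>
class HandlePool {
public:
    Handle create(T value) {
        std::uint32_t index;
        if (!free_.empty()) {                       // reuse a dead slot
            index = free_.back();
            free_.pop_back();
        } else {                                    // or grow the pool
            index = static_cast<std::uint32_t>(slots_.size());
            slots_.emplace_back();
        }
        Slot& slot = slots_[index];
        slot.value = std::move(value);
        slot.alive = true;
        return { index, slot.generation };
    }

    // Returns nullptr for a stale or invalid handle, never a dangling pointer.
    T* resolve(Handle h) {
        if (h.index >= slots_.size()) return nullptr;
        Slot& slot = slots_[h.index];
        return (slot.alive && slot.generation == h.generation) ? &slot.value : nullptr;
    }

    void destroy(Handle h) {
        if (resolve(h) == nullptr) return;          // already dead: ignore
        Slot& slot = slots_[h.index];
        slot.alive = false;
        ++slot.generation;                          // invalidates every old handle
        free_.push_back(h.index);
    }

private:
    struct Slot {
        T value{};
        std::uint32_t generation = 0;
        bool alive = false;
    };
    std::vector<Slot> slots_;
    std::vector<std::uint32_t> free_;
};
```

An enemy that stored a handle to its target calls `resolve` each tick and treats `nullptr` as "target gone", which is the safe outcome the paragraph describes.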

A melee hit, end to end

An attack is one physics overlap query → an event on the bus → a damage-component update and an audio command pushed to the SPSC ring (the audio thread, fed lock-free, must never block) → and if health hits zero, the entity is marked for deferred destruction. Win when the survived-wave count reaches the target; lose when player health reaches zero. That single interaction touches physics, the event bus, audio, the component model, and the state machine: the whole engine in one swing.
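
The audio hand-off in that chain can be sketched as a single-producer, single-consumer ring. This is a minimal version under assumptions: `AudioCommand` and `SpscRing` are hypothetical names, capacity is a power of two, exactly one thread pushes and exactly one thread pops, and a full ring drops the command rather than waiting.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical command the game thread sends to the audio thread.
struct AudioCommand {
    int soundId;
    float volume;
};

template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "capacity must be a power of two");

public:
    // Producer side (game thread only). Returns false when full; never blocks.
    bool tryPush(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        buffer_[head & (Capacity - 1)] = item;
        head_.store(head + 1, std::memory_order_release);   // publish the write
        return true;
    }

    // Consumer side (audio thread only). Returns nullopt when empty; never blocks.
    std::optional<T> tryPop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail == head) return std::nullopt;
        T item = buffer_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);   // free the slot
        return item;
    }

private:
    std::array<T, Capacity> buffer_{};
    std::atomic<std::size_t> head_{0};   // next slot to write
    std::atomic<std::size_t> tail_{0};   // next slot to read
};
```

Dropping a cue when the ring is full is a deliberate choice here: a missed hit sound is recoverable, a stalled audio callback is not.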

07 Play it

The playable demo is a canvas approximation of the engine's systems: a top-down projection of the 3D arena. The systems are the real ones: the fixed-timestep accumulator with interpolation, A*-pathfinding enemies that route around the walls (a grid stand-in for the navmesh), steering, capsule collision, audio events (visualized as a pulse), and the win/lose loop. Move with WASD / arrows, attack with space.

08 Watch its frame

This is the engine's own frame budget: the live profiler from the Tooling module pointed at this game. Each subsystem gets a slice; the sim slices repeat per fixed step while the render slices run once. Push the load; the answer to "where do I optimize" is whatever slice is widest, not whatever you guessed.
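
A scoped-timer profiler of the kind described can be sketched in a few lines. `FrameProfiler` and `ScopedTimer` are hypothetical names, not the Tooling module's API. Each timer adds its elapsed time to a named slice when it leaves scope, so a sim slice that runs several fixed steps in one frame accumulates.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulated milliseconds per named slice for the current frame.
struct FrameProfiler {
    std::map<std::string, double> millis;

    void reset() { millis.clear(); }

    // The slice to optimize first: whichever is widest (empty if none recorded).
    std::string widest() const {
        std::string best;
        double bestMs = -1.0;
        for (const auto& [name, ms] : millis) {
            if (ms > bestMs) {
                bestMs = ms;
                best = name;
            }
        }
        return best;
    }
};

// RAII timer: measures from construction to destruction.
class ScopedTimer {
    using Clock = std::chrono::steady_clock;

public:
    ScopedTimer(FrameProfiler& profiler, std::string name)
        : profiler_(profiler), name_(std::move(name)), start_(Clock::now()) {}

    ~ScopedTimer() {
        const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        profiler_.millis[name_] += elapsed.count();   // += so repeated steps add up
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    FrameProfiler& profiler_;
    std::string name_;
    Clock::time_point start_;
};
```

Usage would be one line at the top of each subsystem call, for example `ScopedTimer t(profiler, "physics");` inside the fixed step, with `reset()` called once per frame.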

09 The data flow

How the subsystems connect: input feeds the sim; the sim produces a render description and audio commands; the renderer and audio thread consume them. In the interactive diagram, click a node to see its role, and try the forbidden edge.

10 If you added networking

The sim is already a pure simulate(world, input, dt) on a fixed timestep, exactly the precondition netcode needs. Lockstep would exchange input per tick and run the identical simulate on every peer; rollback would snapshot world, predict, rewind, and replay unacked inputs.
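
The rollback half of that sentence (snapshot, predict, rewind, replay) can be sketched on a toy world. Everything here is an assumption for illustration: the world is a single integer so the sim is trivially deterministic, the fixed dt is folded into the step, history is never trimmed so the history index equals the tick, and `RollbackSession` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Input { int dx; };
struct World { int x = 0; int tick = 0; };

// Pure fixed-step sim: same world + same input always gives the same result.
void simulate(World& world, const Input& input) {
    world.x += input.dx;
    ++world.tick;
}

class RollbackSession {
public:
    // Advance one tick with a (possibly predicted) input, remembering
    // the state from before the step so it can be restored later.
    void step(const Input& input) {
        history_.push_back(Frame{world_, input});
        simulate(world_, input);
    }

    // The real input for `tick` arrived: rewind to that tick,
    // swap in the real input, and replay every later tick.
    void correct(int tick, const Input& actual) {
        if (tick < 0 || static_cast<std::size_t>(tick) >= history_.size()) return;
        history_[tick].input = actual;
        world_ = history_[tick].before;                  // rewind
        for (std::size_t i = static_cast<std::size_t>(tick); i < history_.size(); ++i) {
            history_[i].before = world_;                 // refresh the snapshot
            simulate(world_, history_[i].input);         // replay
        }
    }

    const World& world() const { return world_; }

private:
    struct Frame {
        World before;   // snapshot taken before this tick ran
        Input input;    // input used for this tick
    };
    World world_;
    std::vector<Frame> history_;
};
```

The replay only reproduces the right state because `simulate` is a pure function of world and input; that is the property the fixed-step loop already provides in shape.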

The blocker is determinism, not architecture

A fixed step is necessary but not sufficient: float results differ across toolchains and architectures, so making this net-ready is a project (fixed-point or tightly controlled float), not a drop-in. This capstone is single-player, and its float nondeterminism is fine; determinism only matters once you add lockstep or rollback. Don't claim it's net-ready; claim its shape is.
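
To make the fixed-point option concrete, here is a minimal 16.16 sketch. The `Fixed` type and its helpers are hypothetical and cover only what the example needs (no overflow handling, no division). The point is that every operation is integer arithmetic, so the result is the same bit pattern on any conforming compiler.

```cpp
#include <cassert>
#include <cstdint>

// 16.16 fixed point: the value is raw / 65536.
struct Fixed {
    std::int32_t raw;
};

constexpr std::int32_t kOne = 65536;

constexpr Fixed fromInt(int v) { return Fixed{ v * kOne }; }

constexpr Fixed operator+(Fixed a, Fixed b) { return Fixed{ a.raw + b.raw }; }

// Widen to 64 bits for the product, then scale back down.
// Integer division truncates toward zero, identically everywhere.
constexpr Fixed operator*(Fixed a, Fixed b) {
    return Fixed{ static_cast<std::int32_t>(
        (static_cast<std::int64_t>(a.raw) * b.raw) / kOne) };
}

// The fixed step, 1/60 s, truncated to 1092/65536.
constexpr Fixed kDt{ kOne / 60 };

// Advance a 1D position by `steps` fixed steps at constant velocity.
Fixed integrate(Fixed position, Fixed velocity, int steps) {
    for (int i = 0; i < steps; ++i) {
        position = position + velocity * kDt;
    }
    return position;
}
```

Sixty steps at one unit per second land at 65520/65536 rather than exactly one unit: the truncated dt costs precision, but the error is the same on every peer, which is what lockstep and rollback need.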

11 What every module gave

The synthesis payoff: every module, and the exact thing the capstone uses it for.

Module | What the capstone uses it for
Allocators | Arena/pool for per-frame and entity allocation; the handle backing store
Job Systems | Parallelize within a step: culling, skinning prep, lighting tiles (fork/join)
Lock-free Queues | The SPSC ring to the audio thread; the job pool deque
3D Math | Vec/Mat/Quat, the MVP chain, lerp for positions and slerp for render rotations
Floating Point | Why this single-player sim needn't be bit-deterministic; the netcode precondition
Game Loop | The one accumulator, dt/alpha, interpolation, the spiral clamp, the spine
Input | The per-frame snapshot, the action map (move/look/attack/dodge)
Asset Pipeline | Mounting the package; the glTF/texture/skin/navmesh load; prefabs; hot-reload
Going 3D | Perspective camera, depth, glTF meshes, the scene graph (world = parent · local)
PBR | cookTorrance in the lighting pass; the GGX/Smith/Schlick BRDF; tonemap
Shadows | The directional shadow map + PCF, folded into the lighting pass
Deferred | The G-buffer + full-screen lighting, the render architecture; transparency goes forward
AO/GI | The SSAO pass between geometry and lighting
Post/AA | Bloom pyramid → tonemap → TAA at the back of the frame
Skeletal Animation | Skinned characters: the matrix palette, vertex-shader skinning, clip slerp
Gameplay Layer | Game objects, the event bus, generational handles, deferred destruction, prefabs, Lua
Behavior Trees | Enemy decisions; the MoveTo leaf and the RUNNING/SUCCESS/FAILURE protocol
Navmesh & Steering | requestPath/hasArrived, the funnel, seek/arrive, local avoidance
Graph Traversal | A* over the navmesh dual graph
Physics | GJK/SAT + the impulse solver; the primitives the character controller slides against
Audio | The mixer, voices, equal-power pan + distance attenuation, the hard-real-time callback
Netcode | The "how it would slot in" section: the deterministic-sim precondition
Tooling | The ImGui overlay and the scoped-timer profiler watching the frame

The two answers to carry away, and why the alternatives are wrong: the loop owns one accumulator and the render is read-only and interpolated (per-subsystem clocks double-step; no accumulator couples the sim to frame rate); and the AI stack is four distinct layers (steering alone gets stuck on walls; the behavior tree decides and requests, it doesn't path or move).

12 Retrospective & pitfalls

What this isn't: production-complete. A vertical slice exercises every subsystem once; it skips streaming open worlds, a full editor, GPU-driven rendering, animation state machines and IK, save/load (object-graph serialization with handle remapping), networking polish, LOD, and full audio DSP. Each is a module's worth of depth the capstone deliberately leaves on the table.

Two accumulators: The loop owns the timestep. Physics/AI/anim run inside the one step.
Mutating sim in render: Render is read-only. Consume lerp/slerp; emit GPU commands only.
Sim calls the GPU: Breaks headless tests + interpolation. Sim makes a snapshot; renderer reads it.
Lerping rotations: Slerp them; lerp shrinks the rotation (the LBS failure).
Conflating the AI stack: Decide (BT) → plan (navmesh) → move (steering) → resolve (controller).
Deleting mid-iteration: Mark for death; reap at step end.
Blocking the audio callback: No locks/alloc/IO. Drain the SPSC ring wait-free.
"It's net-ready": The shape is; bit-determinism isn't. Float varies across machines.

13 What's next

That's the series: from the cache line to a playable 3D game, in C++ and Rust, every subsystem built from scratch and assembled here. Where to go from a vertical slice: pick one subsystem and go deep (streaming, an editor, a real animation graph, deterministic netcode), or build a second game on the same engine, the truest test of whether what you built is really an engine. The full map is on the series hub. Now go make something.

[1] Jason Gregory. Game Engine Architecture, 3rd ed. gameenginebook.com. The layered runtime architecture, ordered subsystem startup/shutdown, and the game-loop / object-update structure this capstone assembles.
[2] Glenn Fiedler. "Fix Your Timestep!" gafferongames.com. The accumulator, the 0.25 spiral-of-death clamp, and alpha interpolation (not extrapolation) the master loop is built on.
[3] Robert Nystrom. Game Programming Patterns ("Game Loop", "Update Method", "State"). gameprogrammingpatterns.com. The input→update→render sequencing, per-object update, and the game-state machine.